# Configuration

Funflow supports configuring a Flow via a YAML config file or environment variables. Support for automatically generated CLI flags is planned but, as of this writing, has not been implemented.

If a Task you are using contains a Configurable argument, you will need to specify its configuration when you write your Flow, using one of three constructors: Literal, ConfigFromEnv, or ConfigFromFile. For example, the args :: [Arg] field of DockerTaskConfig supports configurable args. Let's look at a few examples; first, here are the main imports and extensions we'll use.

In [1]:
:opt no-lint    -- Hide unused pragma warnings.

{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE GADTs #-}
-- Note: Using OverloadedStrings with DockerTask since it will automatically
-- make sure that any `Literal` strings we write are of type `Arg`
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE QuasiQuotes #-}
{-# LANGUAGE RankNTypes #-}

import Control.Exception.Safe (StringException(..), try)
import Control.Monad.Catch (handleIf)
import Data.Maybe (fromMaybe, isJust, isNothing)

import qualified Data.Map as Map
import Path (Abs, Dir, File, Path, Rel, parseAbsDir, reldir, relfile, toFilePath, (</>))
import System.Directory (getCurrentDirectory)
import System.Environment (lookupEnv, setEnv, unsetEnv)

import qualified Data.CAS.ContentStore as CS
import Funflow
import Funflow.Tasks.Docker
import Funflow.Config (Configurable (Literal, ConfigFromFile, ConfigFromEnv))

## Environment Variables

In [2]:
flow1 = dockerFlow $ 
    DockerTaskConfig {
        image="alpine:latest",
        command="echo",
        args=["this is a hard-coded literal value, the next value is:", Arg $ ConfigFromEnv "CONFIGURING_FLOWS"]
    }

We've declared that our task will use two arguments: a literal value that will simply be echoed back to us, and a value to be determined by an environment variable. Now we just need to set the CONFIGURING_FLOWS environment variable and run the task:

In [3]:
setEnv "CONFIGURING_FLOWS" "'hello from an environment variable!'"

runFlow flow1 DockerTaskInput {inputBindings = [], argsVals = mempty} :: IO (CS.Item)
Found docker images, pulling...
Pulling docker image: alpine:latest
2022-10-07T09:23:29.522362181Z this is a hard-coded literal value, the next value is: hello from an environment variable!
Item {itemHash = ContentHash "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"}

## YAML File

To configure a task via a config file, use the ConfigFromFile constructor and pass a file path to runFlowWithConfig.

In [4]:
getStoreAndConfig :: IO (Path Abs Dir, Path Abs File)
getStoreAndConfig = do
    cwd <- parseAbsDir =<< getCurrentDirectory
    let storeDirPath = cwd </> [reldir|./.tmp/store|]
        configFilePath = cwd </> [relfile|./flow.yaml|]
    return (storeDirPath, configFilePath)
    
getTempRunConf :: IO RunFlowConfig
getTempRunConf = (\(d, f) -> RunFlowConfig{ configFile = Just f, storePath = d }) <$> getStoreAndConfig
In [5]:
-- Inspect the config file and create the related Arg.
lines <$> ((toFilePath . snd <$> getStoreAndConfig) >>= readFile)
goodFileArg = Arg $ ConfigFromFile "ourMessage"
["ourMessage: \"Hello from the flow.yaml\"","ourOtherValue: 42"]
In [6]:
-- helper to run flows for this section of the demo
runWithEmptyInput :: [Arg] -> IO CS.Item
runWithEmptyInput confArgs = 
    let taskConf = DockerTaskConfig{ image = "alpine:latest", command = "echo", args = confArgs }
    in getTempRunConf >>= (\runCfg -> runFlowWithConfig runCfg (dockerFlow taskConf) (mempty :: DockerTaskInput) )
In [7]:
runWithEmptyInput [goodFileArg]
Found docker images, pulling...
Pulling docker image: alpine:latest
2022-10-07T09:23:32.315382937Z Hello from the flow.yaml
Item {itemHash = ContentHash "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"}
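
The file-based cells above read flow.yaml from the current working directory. To reproduce them locally, a minimal sketch like the following recreates the file from the contents printed in cell 5. It uses only base; the names flowYamlLines and writeFlowYaml are illustrative and not part of Funflow.

```haskell
-- Recreate the flow.yaml used in this tutorial (contents as printed in cell 5).
-- `flowYamlLines` and `writeFlowYaml` are illustrative names, not Funflow API.
flowYamlLines :: [String]
flowYamlLines =
  [ "ourMessage: \"Hello from the flow.yaml\""
  , "ourOtherValue: 42"
  ]

-- Write the config to the given path; the tutorial expects ./flow.yaml.
writeFlowYaml :: FilePath -> IO ()
writeFlowYaml path = writeFile path (unlines flowYamlLines)
```

Calling writeFlowYaml "flow.yaml" from the directory where the notebook runs should produce a file that getStoreAndConfig can find.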

## Mixing Configurables

As noted, a Configurable argument may be constructed in one of three ways:

  • Literal
  • ConfigFromEnv
  • ConfigFromFile

Since each yields a value of type Configurable a, though, we may mix these when we provide the args list to a task configuration.

In [8]:
-- Mixing literal and file config
runWithEmptyInput ["I'm a literal", goodFileArg]
Found docker images, pulling...
Pulling docker image: alpine:latest
2022-10-07T09:23:34.231553696Z I'm a literal Hello from the flow.yaml
Item {itemHash = ContentHash "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"}
In [9]:
setEnv "SECOND_GREETING" "I'm from an env var!"
lookupEnv "SECOND_GREETING"
Just "I'm from an env var!"
In [10]:
-- Mixing config file and env var.
runWithEmptyInput [goodFileArg, Arg $ ConfigFromEnv "SECOND_GREETING"]
Found docker images, pulling...
Pulling docker image: alpine:latest
2022-10-07T09:23:36.538071830Z Hello from the flow.yaml I'm from an env var!
Item {itemHash = ContentHash "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"}

We can also mix Placeholders with Configurables:

In [11]:
currArgs  = [goodFileArg, Arg $ ConfigFromEnv "SECOND_GREETING", Placeholder "par3"]
currInput = DockerTaskInput{ inputBindings = [], argsVals = Map.fromList [("par3", "hello-from-placeholder")] }
taskConf  = DockerTaskConfig{ image = "alpine:latest", command = "echo", args = currArgs }
getTempRunConf >>= (\runCfg -> runFlowWithConfig runCfg (dockerFlow taskConf) currInput :: IO CS.Item)
Found docker images, pulling...
Pulling docker image: alpine:latest
2022-10-07T09:23:38.460348004Z Hello from the flow.yaml I'm from an env var! hello-from-placeholder
Item {itemHash = ContentHash "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"}

## Safety for efficiency: fail fast to avoid lost compute time and resources

Another key point is that configuration precedes execution. This means that if a failure is inevitable due to an incomplete or illegal configuration, we can fail early by detecting that fact before we ever actually run anything. Here we set up simple examples to demonstrate, but imagine the time (and perhaps money!) that could be saved when the failure would otherwise occur only after a long-running computation.
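
The idea of checking configuration before execution can be sketched with a toy model. This is not Funflow's implementation: Source, KeyValues, resolveAll, and runChecked are hypothetical names, the environment and config file are modeled as plain association lists, and the error text is only modeled on the messages shown in this tutorial.

```haskell
import Data.Either (partitionEithers)

-- Hypothetical stand-in for a configurable argument; not Funflow's own type.
data Source
  = Lit String       -- value written directly in the flow
  | FromEnv String   -- key looked up among environment variables
  | FromFile String  -- key looked up in the parsed config file
  deriving (Eq, Show)

-- Environment and config file are modeled as simple key/value lists.
type KeyValues = [(String, String)]

-- Resolve every argument before running anything.
-- Left collects all missing keys, so nothing runs on a partial configuration.
resolveAll :: KeyValues -> KeyValues -> [Source] -> Either [String] [String]
resolveAll env file sources =
  case partitionEithers (map resolve sources) of
    ([], values) -> Right values
    (missing, _) -> Left missing
  where
    resolve (Lit v)      = Right v
    resolve (FromEnv k)  = maybe (Left k) Right (lookup k env)
    resolve (FromFile k) = maybe (Left k) Right (lookup k file)

-- Only "run" the task (here: join the arguments like echo) once
-- configuration is complete.
runChecked :: KeyValues -> KeyValues -> [Source] -> Either String String
runChecked env file sources =
  case resolveAll env file sources of
    Left missing -> Left ("Missing the following required config keys: " ++ show missing)
    Right values -> Right (unwords values)
```

In this sketch the expensive step (the body of the Right branch) is unreachable until every key has resolved, which is the property the sections below demonstrate with real flows.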

### Argument configurability as a prerequisite for execution

First, note that regardless of how an argument is constructed, execution requires a value for the argument (though for a Literal this is trivial). To see this, we'll catch exceptions of type StringException, since that's what should arise when a configuration is illegal or incomplete. Any other result will yield an alarming message. We thus define a helper:

In [12]:
-- First, define a helper for running a simple echo command
import Control.Exception (SomeException)

runEcho1 :: Arg -> Maybe RunFlowConfig -> Maybe DockerTaskInput -> IO ()
runEcho1 arg runConfOpt taskInOpt = 
    let taskConf  = DockerTaskConfig{ image = "alpine:latest", command = "echo", args = [arg] }
        taskIn    = fromMaybe mempty taskInOpt
        getStore  = (</> [reldir|./.tmp/store|]) <$> (getCurrentDirectory >>= parseAbsDir)
        mkRunConf = case runConfOpt of 
            Nothing -> (\s -> RunFlowConfig{ configFile = Nothing, storePath = s }) <$> getStore
            Just rc -> pure rc
    in do
        runCfg <- mkRunConf
        result <- try $ (runFlowWithConfig runCfg (dockerFlow taskConf) taskIn :: IO CS.Item)
        case result of
            Left (StringException msg _) -> putStrLn ("Successfully caught exception: " ++ msg)
            Left ex                      -> putStrLn ("Unexpected error: " ++ show ex)
            Right _                      -> putStrLn "Unintended success :/"

Note that this works when the argument is a literal: whatever we provide is simply echoed back to us, so the helper prints its "Unintended success" message:

In [13]:
runEcho1 "salut, funflow" Nothing Nothing
Found docker images, pulling...
Pulling docker image: alpine:latest
2022-10-07T09:23:40.485880851Z salut, funflow
Unintended success :/

But if we set up an argument as a placeholder or as configurable by environment variable or configuration file, we will get an error when the argument can't be configured:

In [14]:
-- Set up env var config failure.
isNothing <$> lookupEnv "MY_TMP_EV"
True
In [15]:
-- failure from environment variable
runEcho1 (Arg $ ConfigFromEnv "MY_TMP_EV") Nothing Nothing
Successfully caught exception: Missing the following required config keys: ["MY_TMP_EV"]
In [16]:
-- failure from file configurable
do
    (d, f) <- getStoreAndConfig
    readFile (toFilePath f) >>= print . lines
    runEcho1 (Arg $ ConfigFromFile "MY_TMP_EV") (Just RunFlowConfig{ configFile = Just f, storePath = d }) Nothing
["ourMessage: \"Hello from the flow.yaml\"","ourOtherValue: 42"]
Successfully caught exception: Missing the following required config keys: ["MY_TMP_EV"]
In [17]:
-- failure from placeholder
runEcho1 (Placeholder "WONT_BE_FILLED") Nothing (Just DockerTaskInput{ inputBindings = [], argsVals = mempty })
Found docker images, pulling...
Pulling docker image: alpine:latest
Successfully caught exception: Docker task failed with configuration errors: ["Unfilled label (WONT_BE_FILLED)"]

When the argument can be configured, though, as expected there's no issue:

In [18]:
-- now the positive case, to show that the error catch is specific to an actual error and doesn't just occur generally
runEcho1 (Placeholder "WONT_BE_FILLED") Nothing (Just DockerTaskInput{ inputBindings = [], argsVals = Map.fromList [("WONT_BE_FILLED", "surprise!")] })
Found docker images, pulling...
Pulling docker image: alpine:latest
2022-10-07T09:23:44.049747695Z surprise!
Unintended success :/

### No partial execution: the argument configurability requirement is total

In [19]:
-- Set stage for config-time error.
unsetEnv "SECOND_GREETING"
isNothing <$> lookupEnv "SECOND_GREETING"
True
In [20]:
-- Trigger a config-time error.
do
    res <- try (runWithEmptyInput [goodFileArg, Arg $ ConfigFromEnv "SECOND_GREETING"])
    case res of 
        Left (StringException msg _) -> putStrLn ("Caught error: " ++ msg)
        Right _ -> error "Unexpected success!"
Caught error: Missing the following required config keys: ["SECOND_GREETING"]

Note here that execution fails right away (in fact, while the flow is configuring the task, before any execution really begins). More specifically, rather than first echoing the value that our config file assigns to "ourMessage", the flow catches the error before any action occurs (i.e., when the underlying task is configured, rather than as it's running).

### Configuration is dynamic and prompted by a run attempt, not static and fixed by construction

Also important is that configuration is dynamic. Although task execution is decoupled from configuration, a new run triggers a fresh interpretation, which allows a flow's configuration to be reconsidered before it's run again.

In [21]:
-- pretest
unsetEnv "NOT_STATIC"
isNothing <$> lookupEnv "NOT_STATIC"
True
In [22]:
type DockFlow = Flow DockerTaskInput CS.Item

-- First, a little helper to run flows for this demo
runDynaDemo :: DockFlow -> IO ()
runDynaDemo currFlow = do
    (s, _) <- getStoreAndConfig
    result <- try ( runFlowWithConfig RunFlowConfig{ configFile = Nothing, storePath = s } currFlow (mempty :: DockerTaskInput) :: IO CS.Item )
    case result of
        Left (StringException msg _) -> putStrLn ("flow failed: " ++ msg)
        Right _                      -> putStrLn "flow succeeded!"
In [23]:
-- Build the flow that will first fail, then succeed.
dynaFlow = dockerFlow DockerTaskConfig{ image = "alpine:latest", command = "echo", args = [Arg $ ConfigFromEnv "NOT_STATIC"] }
In [24]:
runDynaDemo dynaFlow
flow failed: Missing the following required config keys: ["NOT_STATIC"]
In [25]:
setEnv "NOT_STATIC" "works now!"
isJust <$> lookupEnv "NOT_STATIC"
True
In [26]:
runDynaDemo dynaFlow
Found docker images, pulling...
Pulling docker image: alpine:latest
2022-10-07T09:23:49.078870747Z works now!
flow succeeded!

Besides its own practical value, this property strengthens the equivalence of functionality between files and environment variables with respect to filling configurables. Namely, each call to runFlowWithConfig provides an opportunity for a different configuration through RunFlowConfig, so it's natural for any configuration that's influenced by environment variables to be reconsidered on a per-run basis.
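
The per-run lookup described here can be illustrated with a small sketch that uses only System.Environment from base. It is a toy model rather than Funflow code: EnvFlow, runEnvFlow, demo, and the variable name NOT_STATIC_SKETCH are hypothetical, and the error text merely imitates the message seen above.

```haskell
import System.Environment (lookupEnv, setEnv, unsetEnv)

-- Hypothetical stand-in for a flow whose single argument comes from an
-- environment variable. The flow value stores only the key, never the value.
newtype EnvFlow = EnvFlow String

-- Each run looks the key up afresh, so the outcome can change between runs.
runEnvFlow :: EnvFlow -> IO (Either String String)
runEnvFlow (EnvFlow key) = do
  found <- lookupEnv key
  pure $ case found of
    Nothing    -> Left ("Missing the following required config keys: " ++ show [key])
    Just value -> Right value

-- Run the same flow value before and after setting the variable.
demo :: IO (Either String String, Either String String)
demo = do
  let flow = EnvFlow "NOT_STATIC_SKETCH"
  unsetEnv "NOT_STATIC_SKETCH"
  before <- runEnvFlow flow
  setEnv "NOT_STATIC_SKETCH" "works now!"
  after <- runEnvFlow flow
  pure (before, after)
```

Because the flow value holds only the key, the same value fails on the first run and succeeds on the second, mirroring the behavior of dynaFlow in cells 24 and 26.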