# First steps with funflow

## Introduction

funflow is a Haskell library for defining and running workflows.

A workflow specifies a pipeline of tasks structured as a Directed Acyclic Graph (DAG).

Workflows in funflow are composable, which means that you can share components across workflows and combine them into larger ones.

funflow supports type checking, result caching, and other features that simplify setting up your pipeline.

Let's get started.

## Anatomy of a Flow

The Flow type captures the notion of a workflow; it takes an input and produces an output:

flow :: Flow input output

input and output are the types of the input and output values of the flow. For instance, a flow working on numbers might have the following type signature:

flow :: Flow Int Int

It takes an integer as input and produces an integer as its output. A flow that doesn't take any input can be written as:

flow :: Flow () Int

Such a flow might request some user input or download some data.

## Tasks

A Flow is a DAG comprising one or more Tasks which describe what you would like to execute.

funflow works with a wide range of task granularities. A Task can be a simple Haskell function, a database query, a command to run in a Docker container, or more.

Accordingly, funflow provides several task types to support these different kinds of computations. These Tasks are defined in the Funflow.Tasks subpackage. Of these types, PureTask is the simplest: it represents a Haskell function with no side effects, that is, one that does not read files, run commands, or perform any other I/O. Other task datatypes include IOTask, which runs a Haskell function that can perform I/O (e.g. reading a file), and DockerTask, which runs a Docker container.

## How to create a flow

A Flow is most easily built either...

  1. ...with the toFlow function, providing a task (or any value of a type with an IsFlow instance)
  2. ...with a smart constructor, providing either a function or configuration value. Funflow provides three such smart constructors:
    • pureFlow, building a flow from a non-effectful function
    • ioFlow, building a flow from an effectful function
    • dockerFlow, building a flow from a DockerTaskConfig

### toFlow: create a flow from a task

To create a Flow value, you can use the function toFlow, which is defined in Funflow.Flow and can be imported from the top-level Funflow module. The argument to toFlow is often a Task value, but any value whose type has an IsFlow instance works, since IsFlow is the class that declares toFlow. The resulting Flow value can then be composed with other flows into a larger, final Flow DAG.

Here is a Flow that runs a PureTask, incrementing its input by 1.

import Funflow (Flow, toFlow)
import Funflow.Tasks.Simple (SimpleTask (PureTask))

flow :: Flow Int Int
flow = toFlow $ PureTask (+1)

In this example, flow is essentially a DAG with one node, PureTask (+1).

Here is a flow that runs a simple IO task which prints its input.

import Funflow.Tasks.Simple (SimpleTask (IOTask))

flow :: Flow String ()
flow = toFlow $ IOTask putStrLn

### Smart constructors: create a flow from a function (or config value)

A single-task Flow like the ones above can also be created directly with a smart constructor. For instance, instead of the toFlow examples above, one can write:

-- pure function, pure flow
flow :: Flow Int Int
flow = pureFlow (+1)

or

-- impure function, IO flow
flow :: Flow String ()
flow = ioFlow putStrLn

An additional smart constructor, dockerFlow, is defined in Funflow.Flow.
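Single-task flows built this way become more useful once they are chained together. The following is a minimal sketch of such a composition, under two assumptions that the text above does not state: that Flow has an Arrow instance, so `>>>` from Control.Arrow applies to it, and that pureFlow and ioFlow can be imported from the top-level Funflow module. The names `increment`, `render`, `report`, and `pipeline` are illustrative only. It uses runFlow, which is covered in the next section, to execute the result.

```haskell
module Main where

import Control.Arrow ((>>>))
import Funflow (Flow, ioFlow, pureFlow, runFlow)

-- Pure step: add 1 to the input.
increment :: Flow Int Int
increment = pureFlow (+ 1)

-- Pure step: turn the number into text.
render :: Flow Int String
render = pureFlow show

-- Effectful step: print the text.
report :: Flow String ()
report = ioFlow putStrLn

-- Assumption: Flow is an Arrow, so (>>>) feeds each output into the next input.
pipeline :: Flow Int ()
pipeline = increment >>> render >>> report

main :: IO ()
main = runFlow pipeline (1 :: Int)
```

The output type of each step has to match the input type of the next one, so a mismatched pipeline is rejected at compile time rather than at run time.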

## Execute a flow

Everything needed to run a flow is available in the module Funflow.Run. The function runFlow is the main way to do so:

runFlow flow input

where

  • flow is the Flow to run
  • input is the input, with the same type as the input type of flow

It returns a result of type IO output, where output is the output type of flow.

Let's run the Flow Int Int defined earlier:

import Funflow (runFlow)

runFlow flow (1 :: Int) :: IO Int
2

As expected, it returned 2.
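Outside a notebook, the same steps can be put together as a standalone program. This sketch uses only the names introduced above (Flow, toFlow, PureTask, runFlow); the name `incrementFlow` and the surrounding `main` are illustrative, and building it assumes the funflow package is available to your compiler.

```haskell
module Main where

import Funflow (Flow, runFlow, toFlow)
import Funflow.Tasks.Simple (SimpleTask (PureTask))

-- The single-node flow from earlier: increment the input by 1.
incrementFlow :: Flow Int Int
incrementFlow = toFlow $ PureTask (+ 1)

main :: IO ()
main = do
  -- runFlow yields an IO action, so bind its result before using it.
  result <- runFlow incrementFlow (1 :: Int)
  print result
```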

Astute readers may have noticed that the output of runFlow is of type IO output and not simply output. This is because runFlow has to be able to execute any task type, not just pure ones. Since runFlow supports IO and Docker tasks, each of which performs I/O, its result must be wrapped in IO even when the flow being run contains only pure tasks.
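To see why a single runner for mixed task types ends up in IO, here is a toy model that needs only the base library. It is not funflow's actual definition of Flow: `ToyFlow`, `toyPure`, `toyIO`, and `runToyFlow` are hypothetical stand-ins built on `Kleisli IO` from Control.Arrow, chosen only to mirror the shape of pureFlow, ioFlow, and runFlow.

```haskell
module Main where

import Control.Arrow (Kleisli (..), arr, (>>>))

-- Stand-in for Flow in this sketch only; not how funflow defines it.
type ToyFlow i o = Kleisli IO i o

-- Lift a pure function; the result is wrapped with 'return' internally.
toyPure :: (i -> o) -> ToyFlow i o
toyPure = arr

-- Lift an effectful function as-is.
toyIO :: (i -> IO o) -> ToyFlow i o
toyIO = Kleisli

-- One runner for every kind of step, so its result must live in IO.
runToyFlow :: ToyFlow i o -> i -> IO o
runToyFlow = runKleisli

-- Only pure steps, yet running it still yields an IO Int.
pureOnly :: ToyFlow Int Int
pureOnly = toyPure (+ 1) >>> toyPure (* 2)

-- A pure step followed by an effectful one that logs and passes the value on.
mixed :: ToyFlow Int Int
mixed = toyPure (+ 1) >>> toyIO (\n -> putStrLn ("got " ++ show n) >> pure n)

main :: IO ()
main = do
  a <- runToyFlow pureOnly 1
  print a -- prints 4
  b <- runToyFlow mixed 1 -- prints "got 2"
  print b -- prints 2
```

Because `pureOnly` and `mixed` share one type, the runner cannot tell them apart, and so it returns IO for both.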

## Next Steps

With the basics out of the way, you should be ready to write and run your first Flow!

Check out the wordcount flow tutorial for a guided example.