Getting started: Hello world

Anduril workflows are constructed using Scala 2.11. Let’s start with the classic “Hello world” program. Below, BashEvaluate is a function provided by Anduril that indirectly invokes the shell interpreter bash.

#!/usr/bin/env anduril

import anduril.builtin._
import org.anduril.runtime._

object HelloWorld {
    println("Beginning workflow construction")
    val helloWorld = BashEvaluate(script = "echo Hello world!")
    println("Ending workflow construction")
}

You might expect the program to print the three messages to the console in the order given in the source code. As we will see, this is not the case. There are key differences between workflow systems, such as Anduril, and non-workflow systems, such as R. There are also differences between how Anduril uses Scala and how standalone Scala programs are written.

Running the workflow

When the workflow (stored in hello-world.scala) is executed, it prints the following:

$ ./hello-world.scala
Beginning workflow construction
Ending workflow construction
[INFO <run-workflow>] Current ready queue: helloWorld (READY-QUEUE 1)
[INFO helloWorld] Executing helloWorld ( (SOURCE hello-world.scala:10) (COMPONENT-STARTED) (2016-04-27 16:27:52)
[INFO helloWorld] Component finished with success (COMPONENT-FINISHED-OK) (2016-04-27 16:27:52)
[INFO helloWorld] Current ready queue: (empty) (READY-QUEUE 0)
[INFO <run-workflow>] Done. No errors occurred.

In the file system, the file result_hello-world/helloWorld/stdOut has appeared (among some other files), and its contents are:

Hello world!

Understanding the workflow

Let’s focus on the important points of this workflow. The script starts with #!/usr/bin/env anduril, a so-called hashbang (or shebang) line, which indicates that the script is executed using the anduril program. Anduril workflow files are not standalone Scala programs and cannot be executed using the scala executable, although they are syntactically Scala code and can be edited using any Scala editor. Anduril takes your workflow definition as input and uses it to construct, verify and execute a workflow.

The two main computational steps in Anduril are workflow construction and execution. They are separate steps, as the above console output hints. In the workflow construction phase, the Scala code is executed in a special Anduril environment. The code prints the messages Beginning workflow construction and Ending workflow construction. Between these messages, the BashEvaluate function call inserts a task into the workflow. In Anduril, such tasks are called components. The BashEvaluate component is not yet executed at this stage. At the end of the workflow construction phase, Anduril has in memory a representation of a workflow consisting of (just) one component.
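The separation between the two phases can be illustrated with a toy model in plain Scala. This is only a sketch of the idea, not Anduril's implementation: the names TwoPhaseSketch, Component, register, construct and execute are hypothetical, and register merely stands in for a call such as BashEvaluate.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical toy model of two-phase execution; not Anduril's actual code.
object TwoPhaseSketch {
  // A component is a name plus an action that is deferred until execution.
  final case class Component(name: String, action: () => String)

  // The workflow under construction: components are recorded, not run.
  val workflow = ArrayBuffer.empty[Component]
  // Ordered record of events, so the interleaving of the phases is visible.
  val log = ArrayBuffer.empty[String]

  // Stand-in for a call such as BashEvaluate(...): it only registers the task.
  def register(name: String)(action: => String): Component = {
    val c = Component(name, () => action)
    workflow += c
    c
  }

  // Phase 1: running the workflow script builds the in-memory representation.
  def construct(): Unit = {
    log += "Beginning workflow construction"
    register("helloWorld") { log += "helloWorld ran"; "Hello world!" }
    log += "Ending workflow construction"
  }

  // Phase 2: the engine executes the recorded components.
  def execute(): Seq[String] = workflow.map(_.action()).toList

  def main(args: Array[String]): Unit = {
    construct()
    val outputs = execute()
    log.foreach(println)
    outputs.foreach(println)
  }
}
```

In this model, both construction messages are logged before "helloWorld ran", because the body passed to register is a by-name argument that is evaluated only when execute() is called.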

In the workflow execution phase, the built-in Anduril workflow engine executes the components in the workflow. This phase prints several [INFO] messages to the console, including lines that indicate that the helloWorld step was executed. You can see from the tag (SOURCE hello-world.scala:10) that the workflow engine can map workflow components back to your Scala source file, and from the message Component finished with success you see that the component was executed successfully.

Most Anduril workflow components involve operations on the file system, and the workflow engine maintains a folder that hierarchically stores the outputs of all components. The location of this folder can be configured, but by default it is result_hello-world (named after your source file). The result files of the component instance helloWorld are in the subfolder result_hello-world/helloWorld. Anduril components can in general write multiple output files, so the standard output of BashEvaluate is in the file result_hello-world/helloWorld/stdOut.
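The folder layout described above can be written down as a small plain-Scala helper. This is a sketch under assumptions: the rule "result_ plus the source file name without its .scala suffix" is inferred from this single example, and ResultLayoutSketch, resultFolder and outputFile are hypothetical names that are not part of Anduril.

```scala
import java.nio.file.{Path, Paths}

// Hypothetical helper mirroring the layout described in the text.
object ResultLayoutSketch {
  // Assumed default: "result_" plus the source file name without ".scala".
  def resultFolder(sourceFile: String): Path =
    Paths.get("result_" + sourceFile.stripSuffix(".scala"))

  // Output file of one component instance:
  // <result folder>/<instance name>/<output name>
  def outputFile(sourceFile: String, instance: String, output: String): Path =
    resultFolder(sourceFile).resolve(instance).resolve(output)

  def main(args: Array[String]): Unit = {
    // Location of the standard output of the helloWorld instance.
    println(outputFile("hello-world.scala", "helloWorld", "stdOut"))
  }
}
```

On a Unix-like system, the main method prints the relative path result_hello-world/helloWorld/stdOut, matching the file named in the text.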

Optimized re-execution

You may wonder what the benefit of such two-phase execution, and of workflows in general, is. One benefit becomes apparent when you run the same workflow again:

$ ./hello-world.scala
Beginning workflow construction
Ending workflow construction
[INFO <run-workflow>] Nothing to execute (READY-QUEUE 0)

Anduril detected that your workflow is synchronized with the results on disk, and did not execute any components. Anduril has sophisticated logic to detect configuration changes of individual components, and selectively re-executes only the minimal set of components affected by a change.
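One common way to implement this kind of change detection is to store a fingerprint of each component's configuration and compare it on the next run. The plain-Scala sketch below illustrates that general technique only; it is not Anduril's actual logic, and ReexecutionSketch, Config, fingerprint and stale are hypothetical names.

```scala
import java.security.MessageDigest

// Hypothetical illustration of change detection; not Anduril's real logic.
object ReexecutionSketch {
  // A component's configuration: name, parameters and upstream dependencies.
  final case class Config(name: String, params: Map[String, String], deps: List[String])

  // Fingerprint: SHA-256 over a canonical (sorted) rendering of the config.
  def fingerprint(c: Config): String = {
    val canonical = c.name + "|" +
      c.params.toList.sorted.map { case (k, v) => k + "=" + v }.mkString(";")
    MessageDigest.getInstance("SHA-256")
      .digest(canonical.getBytes("UTF-8"))
      .map(b => "%02x".format(b)).mkString
  }

  // Components to re-run: those whose fingerprint differs from the stored one,
  // plus everything downstream of them. `workflow` must be in dependency order.
  def stale(workflow: List[Config], stored: Map[String, String]): List[String] =
    workflow.foldLeft(List.empty[String]) { (dirty, c) =>
      val changed = !stored.get(c.name).contains(fingerprint(c))
      if (changed || c.deps.exists(d => dirty.contains(d))) dirty :+ c.name else dirty
    }

  def main(args: Array[String]): Unit = {
    val hello = Config("helloWorld", Map("script" -> "echo Hello world!"), Nil)
    val stored = Map(hello.name -> fingerprint(hello)) // state left by the first run
    println(stale(List(hello), stored))                // List(): nothing to execute
    val edited = hello.copy(params = Map("script" -> "echo Hi!"))
    println(stale(List(edited), stored))               // List(helloWorld)
  }
}
```

With an unchanged configuration the stale list is empty, which corresponds to the "Nothing to execute" message above. Editing the script parameter makes helloWorld, and anything that depends on it, stale again.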