1. Getting started: Hello world
Anduril workflows are constructed using Scala 2.11.
Let’s start with the classic “Hello world” program. Below, BashEvaluate is a function provided by Anduril that indirectly invokes the shell interpreter bash.
#!/usr/bin/env anduril
import anduril.builtin._
import anduril.tools._
import org.anduril.runtime._
object HelloWorld {
    info("Beginning workflow construction")
    val helloWorld = BashEvaluate(script = "echo Hello world!")
    info("Ending workflow construction")
}
You might expect the program to print the three messages to the console in the order given in the source code. As we will see, this is not the case. There are key differences between workflow systems, such as Anduril, and non-workflow systems, such as R. There are also differences between how Anduril uses Scala and how standalone Scala programs are written.
Running the workflow
When the workflow (stored in hello-world.scala) is executed, it prints the following:
$ ./hello-world.scala
[INFO <runtime>] Beginning workflow construction
[INFO <runtime>] Ending workflow construction
[INFO <run-workflow>] Current ready queue: helloWorld (READY-QUEUE 1)
[INFO helloWorld] Executing helloWorld (anduril.tools.BashEvaluate) (SOURCE hello-world.scala:10) (COMPONENT-STARTED) (2016-04-27 16:27:52)
[INFO helloWorld] Component finished with success (COMPONENT-FINISHED-OK) (2016-04-27 16:27:52)
[INFO helloWorld] Current ready queue: (empty) (READY-QUEUE 0)
[INFO <run-workflow>] Done. No errors occurred.
In the file system, the file result_hello-world/helloWorld/stdOut has appeared (among some other files), and its contents are:
Hello world!
Understanding the workflow
Let’s focus on the important points of this workflow. The script starts with #!/usr/bin/env anduril, a so-called hashbang/shebang line, which indicates that the script is executed using the anduril program. Anduril workflow files are not standalone Scala programs and cannot be executed using the scala executable, although they are syntactically Scala code and can be edited using any Scala editor. Anduril takes your workflow definition as input, and uses it to construct, verify and execute a workflow.
The two main computational steps in Anduril are workflow construction and execution. They are separate steps, as the above console output hints. In the workflow construction phase, the Scala code is executed in a special Anduril environment. The code prints the messages Beginning workflow construction and Ending workflow construction. Between these messages, the BashEvaluate function call inserts a task into the workflow. In Anduril, such tasks are called components. The BashEvaluate component is not yet executed at this stage. At the end of the workflow construction phase, Anduril has in memory a representation of a workflow consisting of (just) one component.
In the workflow execution phase, the built-in Anduril workflow engine executes the components in the workflow. This phase prints several [INFO] messages to the console, including lines that indicate that the helloWorld step was executed. You can see from the tag (SOURCE hello-world.scala:10) that the workflow engine can map workflow components back to your Scala source file, and from the message Component finished with success you can see that the component was executed successfully.
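
The separation between construction and execution can be illustrated with a small standalone Scala sketch. This is a toy model only: Workflow, register and build are hypothetical names made up for this illustration and are not part of Anduril's API. The sketch only shows why the two info-style messages appear before the output of the component.

```scala
import scala.collection.mutable.ArrayBuffer

// Toy model of two-phase execution. All names are hypothetical; this is NOT Anduril's API.
object TwoPhaseSketch {
  final class Workflow {
    // Everything "printed" is collected here so the order can be inspected.
    val log = ArrayBuffer.empty[String]
    // Registered components: a name plus work that is deferred until execute().
    private val components = ArrayBuffer.empty[(String, () => Unit)]

    // Construction-time call: stores the task without running it.
    def register(name: String)(body: => Unit): Unit =
      components += ((name, () => body))

    // Execution phase: run every registered component in insertion order.
    def execute(): Unit = components.foreach { case (_, run) => run() }
  }

  // Phase 1: ordinary code runs top to bottom and only builds the workflow.
  def build(): Workflow = {
    val wf = new Workflow
    wf.log += "Beginning workflow construction"
    wf.register("helloWorld") { wf.log += "Hello world!" }
    wf.log += "Ending workflow construction"
    wf
  }

  def main(args: Array[String]): Unit = {
    val wf = build()   // construction: two messages logged, nothing executed yet
    wf.execute()       // execution: the deferred component runs now
    wf.log.foreach(println)
    // prints:
    // Beginning workflow construction
    // Ending workflow construction
    // Hello world!
  }
}
```

The by-name parameter of register is what defers the work in this sketch; the real engine builds a richer in-memory representation, as described above.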
Most Anduril workflow components involve operations on the file system, and the workflow engine maintains a folder that hierarchically stores outputs of all components. The location of this folder can be configured, but by default it is result_hello-world (named after your source file). The result files of the component instance helloWorld are in the subfolder result_hello-world/helloWorld. Anduril components can in general write multiple output files, so the standard output of BashEvaluate is in the file result_hello-world/helloWorld/stdOut.
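
The default layout described above can be sketched as a small path computation. The naming rule used here (prefix result_, source file name without the .scala extension) is an assumption inferred from the single example in this section, and resultDir and outputFile are hypothetical helpers, not Anduril functions.

```scala
import java.nio.file.{Path, Paths}

// Sketch of the default result layout. Helper names are hypothetical, not Anduril's API.
object ResultLayoutSketch {
  // Assumed rule: "result_" followed by the source file name without ".scala".
  def resultDir(sourceFile: String): Path = {
    val base = Paths.get(sourceFile).getFileName.toString.stripSuffix(".scala")
    Paths.get("result_" + base)
  }

  // One subfolder per component instance, one file per named output.
  def outputFile(sourceFile: String, instance: String, outputName: String): Path =
    resultDir(sourceFile).resolve(instance).resolve(outputName)

  def main(args: Array[String]): Unit =
    println(outputFile("./hello-world.scala", "helloWorld", "stdOut"))
}
```

On a Unix-like system this yields result_hello-world/helloWorld/stdOut for the example workflow, matching the file shown earlier.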
Optimized re-execution
You may wonder what the benefit of such two-phase execution, and of workflows in general, is. One such benefit is apparent when you run the same workflow again:
$ ./hello-world.scala
[INFO <runtime>] Beginning workflow construction
[INFO <runtime>] Ending workflow construction
[INFO <run-workflow>] Nothing to execute (READY-QUEUE 0)
Anduril detected that your workflow is synchronized with the results on disk, and did not execute any components. Anduril has sophisticated logic to detect configuration changes of individual components, and selectively re-executes only the minimal set of affected components.
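
One common way to implement this kind of change detection is to store a fingerprint of each component's configuration and compare it on the next run. The sketch below illustrates that general idea only. It is not Anduril's actual algorithm, it ignores dependencies between components, and fingerprint, record and stale are hypothetical names.

```scala
import java.security.MessageDigest

// Illustration of fingerprint-based change detection. NOT Anduril's actual logic.
object RerunSketch {
  // Hex-encoded SHA-256 digest of a component's configuration string.
  def fingerprint(config: String): String =
    MessageDigest.getInstance("SHA-256")
      .digest(config.getBytes("UTF-8"))
      .map(b => f"${b & 0xff}%02x")
      .mkString

  // What would be saved next to the results after a successful run.
  def record(configs: Map[String, String]): Map[String, String] =
    configs.map { case (name, cfg) => name -> fingerprint(cfg) }

  // Components whose current configuration does not match the stored fingerprint.
  def stale(configs: Map[String, String], stored: Map[String, String]): Set[String] =
    configs.collect {
      case (name, cfg) if !stored.get(name).contains(fingerprint(cfg)) => name
    }.toSet

  def main(args: Array[String]): Unit = {
    val first = Map("helloWorld" -> "echo Hello world!")
    println(stale(first, Map.empty))       // first run: helloWorld must execute
    val stored = record(first)
    println(stale(first, stored))          // unchanged: nothing to execute
    val edited = Map("helloWorld" -> "echo Hello again!")
    println(stale(edited, stored))         // script changed: helloWorld re-executes
  }
}
```

In this model, the second run of hello-world.scala corresponds to the case where stale returns an empty set, which is when the engine would report that there is nothing to execute.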