External scripts


Component       Shorthand    Bundle  Description
--------------  -----------  ------  -----------------------------------
BashEvaluate    bash"..."    tools   Invoke Bash script
PythonEvaluate  python"..."  tools   Invoke Python script
MatlabEvaluate  -            tools   Invoke Matlab script
QuickBash       -            tools   Invoke Bash script with fewer inputs
REvaluate       R"..."       tools   Invoke R script
ScalaEvaluate   scala"..."   tools   Invoke Scala script
TableQuery      sql"..."     tools   Invoke SQL query on CSV files


Anduril provides facilities to invoke scripts written in external languages. The components that provide this all have similar interfaces: they take a script and a number of data files as input, and produce a number of data files as output.

There are two ways to call these components. First, they can be invoked as regular components using the function call syntax of Scala. Second, they have a shorthand syntax based on Scala string interpolation that makes common use cases more convenient. Regular invocation is needed if you want to place the external script in its own file or need to modify parameters; otherwise, the shorthand syntax can be used.

The general form of the shorthand syntax is prefix"script" or prefix"""long script""" (for multi-line strings). Here, prefix is a language-dependent identifier such as bash or python. You can use Scala variables inside the strings using ${variable}: each one is expanded to the value of the variable. Scala expressions such as ${component.port} are also supported. When an output port of a component is expanded, Anduril also adds the corresponding dependency to the workflow.

These components are in the tools bundle, so remember to put import anduril.tools._ in your scripts.


The script below shows three equivalent ways of filtering a CSV file using grep to obtain lines that contain “gene01”. The shorthand syntax bash"..." expands the script inside the quotes using the variables grepArguments and data to produce a final command like grep gene01 /home/user/data/data.csv. Dependencies are properly configured: filtered1 depends on data in the workflow.

QuickBash is a simplified interface with fewer ports and is suitable when you have one input (visible in Bash as $in) and one output ($out). BashEvaluate is the verbose version of the shorthand syntax, in which parameter substitution is done using templates of the form @arg@.

The multiple component demonstrates chaining Bash calls together, having multiple statements in the script, and writing to multiple output ports.

#!/usr/bin/env anduril

import anduril.builtin._
import anduril.tools._
import org.anduril.runtime._

object Bash {
    val data = INPUT("data.csv")
    val grepArguments = "gene01"
    val filtered1 = bash"grep ${grepArguments} ${data.out}"
    val filtered2 = QuickBash(script = "grep " + grepArguments + " $in > $out",
                              in = data)
    val filtered3 = BashEvaluate(script = "grep @param1@ @var1@",
                                 var1 = data,
                                 param1 = grepArguments)

    val multiple = bash"""
        cat ${filtered1.stdOut} ${filtered2.out} ${filtered3.stdOut} > @out1@
        echo OK > @out2@"""
}
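
The @arg@ substitution used by BashEvaluate can be pictured as plain text replacement. The following Python sketch is only an illustration of that idea, not Anduril's implementation: the function name expand_template and the bindings dictionary are hypothetical, and the file path is the example path from the text above.

```python
import re


def expand_template(script, bindings):
    """Replace each @name@ placeholder with bindings[name].

    Hypothetical helper for illustration only. Placeholders that
    have no binding are left untouched.
    """
    def substitute(match):
        name = match.group(1)
        return str(bindings.get(name, match.group(0)))

    return re.sub(r"@(\w+)@", substitute, script)


# The BashEvaluate call above binds param1 and var1.
command = expand_template(
    "grep @param1@ @var1@",
    {"param1": "gene01", "var1": "/home/user/data/data.csv"},
)
print(command)  # grep gene01 /home/user/data/data.csv
```

In the real component, port placeholders such as @var1@ resolve to file paths chosen by Anduril, so the path shown here stands in for whatever location the engine assigns.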


In the following example, we have two CSV files with columns Gene, Value and QualityOK, and want to compute the mean Value for genes that are present in both files and have QualityOK = 1. TableQuery and sql"..." construct a temporary in-memory database that can be used for executing a query. We use the multi-line form of string interpolation with triple quotation marks. The sql shorthand syntax expands file references like data1 into table names like table1; we can rename them using AS to be explicit. The default SQL engine is HSQLDB.

#!/usr/bin/env anduril

import anduril.builtin._
import anduril.tools._
import org.anduril.runtime._

object SQL {
    val data1 = INPUT("data1.csv")
    val data2 = INPUT("data2.csv")

    val qualityCondition = 1
    val joined = sql"""
        SELECT data1."Gene",
               (data1."Value"+data2."Value")/2 AS "MeanValue"
        FROM ${data1} AS data1,
             ${data2} AS data2
        WHERE data1."Gene" = data2."Gene"
          AND data1."QualityOK" = ${qualityCondition}
          AND data2."QualityOK" = ${qualityCondition}"""
}
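
To see what the query computes, the same join can be tried outside Anduril. The sketch below uses Python's sqlite3 module with an in-memory database as a stand-in for the HSQLDB engine that TableQuery uses by default; the table contents are made-up sample rows, and the tables are created by hand rather than loaded from CSV files.

```python
import sqlite3

# In-memory database; SQLite is a stand-in for HSQLDB here.
conn = sqlite3.connect(":memory:")
for table in ("data1", "data2"):
    conn.execute(
        f'CREATE TABLE {table} ("Gene" TEXT, "Value" REAL, "QualityOK" INTEGER)'
    )

# Made-up sample rows for illustration.
conn.executemany(
    "INSERT INTO data1 VALUES (?, ?, ?)",
    [("gene01", 2.0, 1), ("gene02", 4.0, 0), ("gene03", 6.0, 1)],
)
conn.executemany(
    "INSERT INTO data2 VALUES (?, ?, ?)",
    [("gene01", 4.0, 1), ("gene02", 8.0, 1), ("gene03", 10.0, 1), ("gene04", 1.0, 1)],
)

quality_condition = 1
rows = conn.execute(
    """
    SELECT data1."Gene",
           (data1."Value" + data2."Value") / 2 AS "MeanValue"
    FROM data1, data2
    WHERE data1."Gene" = data2."Gene"
      AND data1."QualityOK" = ?
      AND data2."QualityOK" = ?
    ORDER BY data1."Gene"
    """,
    (quality_condition, quality_condition),
).fetchall()
conn.close()

print(rows)  # [('gene01', 3.0), ('gene03', 8.0)]
```

gene02 is dropped because its QualityOK is 0 in the first table, and gene04 is dropped because it appears in only one table.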


In this example, we compute the sum of the Value column in a CSV file, and write the result into a CSV file. PythonEvaluate provides access to an Anduril API that can be used for such tasks, and pre-populates certain variables. Alternatively, the Python standard library (such as csv) or external libraries can be used.

Here, we chose to place the Python script into an external file and invoke PythonEvaluate using the verbose syntax. The Python script is:

import anduril
value_sum = 0
for row in table1:
    value_sum += row['Value']

Workflow configuration:

#!/usr/bin/env anduril

import anduril.builtin._
import anduril.tools._
import org.anduril.runtime._

object Python {
    val data = INPUT("data.csv")
    val script = INPUT("script.py")
    val result = PythonEvaluate(scriptIn = script, table1 = data)
}
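
As mentioned above, the standard library csv module is an alternative to the Anduril API. The following standalone sketch sums the Value column and writes the result as a one-column CSV. It assumes a comma-delimited file with a header row; the helper names, the ValueSum output column and the sample data are hypothetical, and in-memory buffers stand in for the input and output files that Anduril would supply.

```python
import csv
import io


def sum_value_column(csv_file, column="Value"):
    """Sum a numeric column of a CSV file that has a header row."""
    reader = csv.DictReader(csv_file)
    # csv yields strings, so convert each cell explicitly.
    return sum(float(row[column]) for row in reader)


def write_result(total, out_file):
    """Write the sum as a CSV file with a single ValueSum column."""
    writer = csv.writer(out_file)
    writer.writerow(["ValueSum"])
    writer.writerow([total])


# Made-up sample data in place of data.csv.
sample = io.StringIO("Gene,Value,QualityOK\ngene01,2.5,1\ngene02,4.0,0\n")
total = sum_value_column(sample)

result = io.StringIO()
write_result(total, result)
print(total)  # 6.5
```

Unlike the Anduril API version, this variant has to convert the cell values from strings itself, which is why float() appears in the sum.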

We can also use the shorthand syntax. Here, we have to take care of proper indentation, because Python uses whitespace to mark code structure. The python"..." syntax supports the Scala stripMargin feature, in which whitespace before the initial | on each line is removed. Note that this feature is specific to the Python shorthand syntax. Also, the data file is imported as a generic file, not a CSV table, so we manually construct a CSV iterator.

#!/usr/bin/env anduril

import anduril.builtin._
import anduril.tools._
import org.anduril.runtime._

object PythonInline {
    val data = INPUT("data.csv")
    val result = python"""|
                          |import anduril
                          |value_sum = 0
                          |for row in anduril.TableReader(${data}):
                          |    value_sum += row['Value']"""
}
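
The effect of margin stripping on the inline script can be sketched in plain Python. The function strip_margin below is a hypothetical re-implementation written for illustration; it approximates the behaviour of Scala's stripMargin and is not the code Anduril runs.

```python
def strip_margin(text, margin="|"):
    """Drop leading whitespace up to and including the margin character.

    Lines that do not start with the margin character (after leading
    whitespace) are kept unchanged.
    """
    lines = []
    for line in text.split("\n"):
        stripped = line.lstrip()
        if stripped.startswith(margin):
            lines.append(stripped[len(margin):])
        else:
            lines.append(line)
    return "\n".join(lines)


# Indented the way a script sits inside a Scala workflow file.
raw = (
    "|\n"
    "        |value_sum = 0\n"
    "        |for row in rows:\n"
    "        |    value_sum += row['Value']"
)
script = strip_margin(raw)
print(script)
```

After stripping, the for statement starts at column 0 and its body keeps its four-space indent, which is what the Python interpreter needs.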