10. Reference: Component naming

When using Scala to insert components into a workflow, you need to follow basic code patterns that ensure each component gets a consistent, human-readable name in the workflow. This makes it possible to trace components between the Scala code and the workflow results (log messages and execution folder), and Anduril requires it to correctly determine whether a component has changed and needs to be re-executed.

Summary of supported Scala patterns:

Code                                            Component name(s)

val simple = INPUT("data.csv")                  simple

val map = NamedMap[INPUT]("parent")
map("child") = INPUT("data.csv")                parent_child

val seq = NamedSeq[INPUT]("parent")
seq += INPUT("data.csv")                        parent_0

withName("parent") {
    val child = INPUT("data.csv")
}                                               parent-child

val parent = CSVSort(INPUT("data.csv"))         parent, parent_in

def f() = { val child = INPUT("data.csv") }
val parent = f()                                parent-child

def f() = { val child = INPUT("data.csv") }
f()                                             child

INPUT("data.csv", _name = "explicit")           explicit

These patterns are illustrated in the following code:

#!/usr/bin/env anduril

import anduril.builtin._
import anduril.tools._
import org.anduril.runtime._

object ComponentNaming {

    // 1. Name: data
    val data = INPUT(path = "data.csv")

    val mySequence = NamedSeq[INPUT]("dataSeq")
    val myMap = NamedMap[INPUT]("dataMap")

    for (key <- Seq("data1", "data2")) {
        withName(key) {
            // 2. Names: data1-encapsulated, data2-encapsulated
            val encapsulated = INPUT(path = key+".csv")
        }

        // 3. (Seq) Names: dataSeq_0, dataSeq_1
        mySequence += INPUT(path = key+".csv")

        // 3. (Map) Names: dataMap_data1, dataMap_data2    
        myMap(key) = INPUT(path = key+".csv")
    }

    // 4. Names: embedded (CSVSort), embedded-in (INPUT)
    val embedded = CSVSort(
        INPUT(path = "data.csv")
    )

    // 5. Name: explicitName
    CSVSort(data, _name = "explicitName")

    // 6. Name: fromFunction-sorted
    def subFunction() = {
        val sorted = CSVSort(data)
        // Or any other pattern from above
    }
    val fromFunction = subFunction()
}

Explanation of Scala patterns

Basic pattern: val

The most basic pattern is val. The start of the component constructor call, as in val name = Component(, must be on the same line as val, but the argument list can extend over multiple lines.

Iteration: NamedMap, NamedSeq, withName

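NamedMap, NamedSeq and withName are used in iterative structures such as for loops, where a plain val would give every iteration the same name. NamedMap and NamedSeq name each element after the collection name followed by the key or index (parent_child, parent_0). withName prefixes the names of the components created inside its block (parent-child).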

Embedded calls

Embedded calls like f(g(x)) are supported one level deep. g(x) does not have to be on the same line as f. Two levels of embedding, as in f(g(h(x))), are not supported.

Function calls

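Arbitrarily nested function calls are supported and result in names like parent1-parent2-child. Inside a function, the hierarchical prefix (such as parent1-parent2) is prepended to all generated names. If the result of a function call is not assigned to a val, that call adds no prefix, so the component is simply named child. All supported naming patterns are available inside functions.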
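The way hierarchical names are composed can be illustrated with a small stand-alone model. This is only a sketch of the naming rule described above, not Anduril's implementation: NamingSketch, scoped, fullName and elementName are hypothetical names introduced for illustration, and the sketch assumes that named-collection elements inside a function also receive the prefix.

```scala
// Hypothetical model of hierarchical component naming (not Anduril code).
object NamingSketch {
  // Enclosing scopes, outermost first (e.g. the vals that function results are assigned to).
  private var scopes: List[String] = Nil

  // Evaluate `body` with `parent` added as the innermost scope, then restore the old scopes.
  def scoped[A](parent: String)(body: => A): A = {
    val saved = scopes
    scopes = scopes :+ parent
    try body finally scopes = saved
  }

  // Full name of a component with local name `local`: scopes and local name joined by "-".
  def fullName(local: String): String = (scopes :+ local).mkString("-")

  // Name of an element in a named collection: collection name, "_", then key or index.
  // Assumption: the enclosing prefix applies here as well.
  def elementName(collection: String, key: Any): String =
    fullName(s"${collection}_$key")
}
```

In this model, NamingSketch.scoped("parent1") { NamingSketch.scoped("parent2") { NamingSketch.fullName("child") } } yields parent1-parent2-child, while NamingSketch.fullName("child") outside any scope yields plain child, which mirrors the case of a function call whose result is not assigned to a val.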

Explicit naming

The _name annotation can be used in the component constructor to assign explicit names. This always gives proper names, but should be avoided due to the manual work involved.

Invalid solutions

The following code demonstrates some antipatterns that do not result in consistent names. The problems are:

  1. Orphan component that has neither a val or var nor a _name annotation. Fix: use val or _name.
  2. val is on a different line from the component constructor. Fix: use _name, or put the constructor on the same line as val.
  3. Name is reused in a for loop. Fix: use withName, NamedSeq or NamedMap.
  4. Component is inserted into a plain Scala collection (Seq, Map, etc.). Fix: use NamedSeq or NamedMap.

#!/usr/bin/env anduril

import anduril.builtin._
import anduril.tools._
import org.anduril.runtime._

object ComponentNamingBad {

    // 1. BAD EXAMPLE: orphan
    INPUT(path = "data.csv")

    // 2. BAD EXAMPLE: different line
    val flag = true
    val conditional = if (flag) {
        INPUT(path = "data1.csv")
    } else {
        INPUT(path = "data2.csv")
    }

    // 3. BAD EXAMPLE: name reused
    for (key <- Seq("data1", "data2")) {
        val data = INPUT(path = key+".csv")
    }

    // 4. BAD EXAMPLE: plain collection
    val plainSeq = scala.collection.mutable.ArrayBuffer[INPUT]()
    for (key <- Seq("data1", "data2")) {
        plainSeq += INPUT(path = key+".csv")
    }
}
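
The four problems above could be corrected roughly as follows. This is a sketch that uses only constructs already shown in this section (val, withName, NamedSeq); the object name ComponentNamingFixed and the val names are illustrative, the conditional fix assumes that path accepts any String expression, and the script needs the Anduril runtime and the referenced CSV files to run.

```scala
#!/usr/bin/env anduril

import anduril.builtin._
import anduril.tools._
import org.anduril.runtime._

object ComponentNamingFixed {

    // 1. Fixed orphan: bind the component to a val (passing _name would also work).
    val data = INPUT(path = "data.csv")

    // 2. Fixed different line: keep the constructor on the val line and
    //    move the condition into the argument.
    val flag = true
    val conditional = INPUT(path = if (flag) "data1.csv" else "data2.csv")

    // 3. Fixed name reuse: withName gives each iteration its own prefix.
    for (key <- Seq("data1", "data2")) {
        withName(key) {
            val loaded = INPUT(path = key+".csv")
        }
    }

    // 4. Fixed plain collection: NamedSeq names elements after the collection.
    val namedSeq = NamedSeq[INPUT]("namedSeq")
    for (key <- Seq("data1", "data2")) {
        namedSeq += INPUT(path = key+".csv")
    }
}
```

Moving the condition into the argument keeps a single component constructor on the val line. Where the branches need different component types, giving each branch an explicit _name is the alternative named in the list above.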