Anduril 1 (legacy)
Anduril 1 is a legacy version that will continue to be available for download, but is no longer actively maintained. Use Anduril 2 for new installations.
License
Anduril 1 is licensed under the GNU General Public License. Note that Anduril 2 uses a different license.
Documentation
- User Guide in PDF
- Quick Reference card
- Component documentation (all bundles)
- Tutorials are stored with the source code of Anduril in the folder doc/tutorial
- ChangeLog of Anduril 1.x core
Anduril 1 API
- Java API documentation for the engine
- Scala API documentation
- R API documentation
- BASH API documentation
- MATLAB API documentation
- Python API documentation
- Anduril Maintenance Guide (PDF) (provides a formal and technical description of Anduril core architecture)
Download
Anduril is an integrator of multiple analysis tools, and thus it depends on a large set of libraries and software. If you plan to test drive Anduril only, it may be wise to start out with the preinstalled VirtualBox image, or with Docker.
VirtualBox
Docker
Docker is a container platform – almost like a virtual machine, but it runs directly on the current operating system kernel. Anduril on Docker can use all the computer resources, but it requires a Linux operating system.
We publish many flavors of Anduril on Docker Hub.
Installation on Ubuntu/Debian
Because Anduril and its bundles depend on many system packages, a full installation always requires sudo or root access to the system.
Binary Installation
Example for Ubuntu Trusty (14.04 LTS): Add the following repositories to your 3rd party sources:
deb http://anduril.org/linux/ binary/
deb http://cran.at.r-project.org/bin/linux/ubuntu trusty/
You can add them by copy/pasting the above lines in the file: /etc/apt/sources.list.d/anduril.list
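As a minimal sketch of the step above, the two repository lines can also be generated from the shell instead of pasted by hand. The function name anduril_sources is hypothetical; the repository lines and the target path are the ones given above.

```shell
# Hypothetical helper: print the two repository lines from the
# Ubuntu Trusty example, one per line.
anduril_sources() {
  printf '%s\n' \
    'deb http://anduril.org/linux/ binary/' \
    'deb http://cran.at.r-project.org/bin/linux/ubuntu trusty/'
}

# Usage (writes the file as root):
#   anduril_sources | sudo tee /etc/apt/sources.list.d/anduril.list
```

Piping through sudo tee keeps the privileged step limited to writing the file.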
Next, add the signature for our repository, and CRAN R:
wget http://anduril.org/linux/anduril_pub.gpg -O - | sudo apt-key add -
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E084DAB9
Update your package lists and install anduril:
sudo apt-get update
sudo apt-get install anduril
Note that R packages must be installed separately. Many of the components use Bioconductor packages. If you want to automatically install requirements of components, such as R packages, use the InstallRequirements component:
cd /tmp
sudo ANDURIL_HOME=/usr/share/anduril anduril run-component InstallRequirements
Refer to the documentation of InstallRequirements prior to running it.
Source Installation
Installation of Anduril 1.x
To install dependencies, follow the binary installation example above, but install the package anduril-meta instead of the package anduril.
# Clone the repository:
hg clone https://bitbucket.org/anduril-dev/anduril -r anduril1 anduril
# set up environment
cd anduril
export ANDURIL_HOME=$( pwd )
export PATH=$ANDURIL_HOME/bin:$ANDURIL_HOME/utils:$PATH
# compile
ant anduril.jar
For each of the bundles you want to install, find the source code URL, and:
# Clone the repository to ANDURIL_HOME
cd $ANDURIL_HOME
hg clone https://bitbucket.org/anduril-dev/[bundleRepo] -r default [bundle_name]
cd [bundle_name]
ant setup
sudo $ANDURIL_HOME/utils/anduril-install-requirements -b . '*'
Eclipse integration
The integration of Anduril into the Eclipse IDE makes it possible to edit AndurilScript source and to invoke the workflow engine from Eclipse. The plugin implements syntax and error highlighting. See the User Guide for instructions on installation and use. The AndurilEclipse plugin is installed using the Software Updates feature in Eclipse, with the following URL: http://anduril.org/pub/anduril_eclipse.
Bundles
List of bundles hosted by the Anduril development team:
Name | Description |
---|---|
Builtin | Builtin bundle is shipped with Anduril, and includes input and output components. |
Anima | Anduril IMage Analysis bundle. APIs for popular scientific image analysis platforms, and convenience components. |
FlowAnd | Flow cytometry analysis tools. |
Microarray | Microarray analysis. |
Moksiskaan | Generic database and a toolkit for integrating information on connections between genes, proteins, pathways, drugs, and other biological entities. |
Sequencing | Deep sequencing data analysis. |
TCGA | Routines for TCGA microarray, clinical and sequencing data as well as TCGA data importing. |
Tools | All those generic little tools to help you, CSV handling, plotting etc. |
Builtin
Includes the very basic components.
- Source code Bitbucket
- Documentation for Anduril 1.x
Anima
ANduril IMage Analysis bundle
- Home page
- Source code for Anduril 1.x
- Component documentation for Anduril 1.x (together with all bundles)
- Change log for Anduril 1.x
FlowAnd
Flow cytometry analysis for Anduril
- Home page
- Source code
- Component documentation for Anduril 1.x (together with all bundles)
Microarray
Provides components for several types of analysis, such as
- gene expression,
- SNP,
- ChIP-on-chip,
- comparative genomic hybridization and
- exon microarray analysis as well as
- short-read sequencing.
Links:
- Source code
- Component documentation for Anduril 1.x (together with all bundles)
- Change log for Anduril 1.x
Moksiskaan
Moksiskaan is a generic database and a toolkit that can be used to integrate information about the connections between genes, proteins, pathways, drugs, and other biological entities. The database is used to combine various existing databases to find biological relationships between the genes of interest and to predict their interactions.
- Home page
- Source code for Anduril 1.x
- Component documentation
Sequencing
Bundle intended for sequencing analysis.
- Source code for Anduril 1.x
- Component documentation for Anduril 1.x (together with all bundles)
- Change log for Anduril 1.x
TCGA
The bundle encompasses routines to handle TCGA microarray, clinical and sequencing data as well as importing TCGA data into pipelines. The bundle’s components automate the download of data from the TCGA data portal and support TCGA-specific features such as data levels and batches. The download components automatically annotate array files with their TCGA sample codes.
Tools
- Source code
- Component documentation for Anduril 1.x (together with all bundles)
- Change log for Anduril 1.x
Frequently Asked Questions
Discussion forum
If your question is not answered here, come and ask us on the discussion forum! Q&A Forum
Distributed Execution
Anduril supports Slurm out of the box, and other schedulers via a custom prefix mode.
Slurm
To use Slurm with Anduril, specify --exec-mode slurm on the Anduril command line.
Example: Submit each component in awesome_workflow as a job to Slurm
anduril run awesome_workflow.and --exec-mode slurm -b awesome_bundle
Anduril uses the Slurm srun command to launch components. To pass arguments to srun, use the --slurm-args [arguments] switch. Dashes in the arguments must be replaced with %-signs.
Example: Allocate 5 CPUs and 10000 MB (about ten gigabytes) of memory for the components
anduril run awesome_workflow.and --exec-mode slurm --slurm-args "%c 5 %%mem=10000" -b awesome_bundle
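The dash-to-percent rule can be sketched as a small shell helper. The name encode_slurm_args is hypothetical, and the sketch assumes the rule applies to every dash in the argument string; option values that themselves contain dashes are not treated specially here.

```shell
# Hypothetical helper: rewrite ordinary srun options into the
# %-encoded form expected by --slurm-args.
# Assumption: every dash is replaced, including dashes inside values.
encode_slurm_args() {
  printf '%s\n' "$*" | sed 's/-/%/g'
}

# Usage:
#   anduril run awesome_workflow.and --exec-mode slurm \
#     --slurm-args "$(encode_slurm_args -c 5 --mem=10000)" -b awesome_bundle
# encode_slurm_args -c 5 --mem=10000   prints: %c 5 %%mem=10000
```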
If you want to pass custom resource requirements to Slurm at the component level, you can use the @cpu and @memory annotations in your workflow. The specified values are passed to the srun command.
Example: Allocate 5 CPUs and one gigabyte of memory for the component
cB = CSVCleaner(original = in, rename = "number=value", @cpu=5, @memory=1024)
Similarly you can tell Slurm which node to use to run a specific component.
cB = CSVCleaner(original = in, rename = "number=value", @host="node3")
Prefix scripts
By using prefix mode it is possible to run Anduril components with another scheduler or any other program, or even to specify custom logic for each component. Prefix mode simply prepends a custom prefix to the component execution string, so that the component launch string is passed to the prefix as parameters. Prefix mode is enabled with the --exec-mode prefix switch, and the prefix is specified with --prefix [script-name].
Example: Execute custom_prefix as part of each component’s execution.
anduril run awesome_workflow.and --exec-mode prefix --prefix custom_prefix -b awesome_bundle
One way to introduce custom logic for executing components is to use a Bash script as a prefix. Refer to doc/templates/prefix_template.sh as an example for such a script.
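A minimal sketch of such a prefix, relying only on the behavior described above: the component launch string arrives as the positional parameters. The function name run_with_prefix and the log message are hypothetical. The template shipped with Anduril remains the reference, and this sketch does not handle annotations.

```shell
#!/bin/sh
# Hypothetical prefix logic: log the component command, then run it
# unchanged. Not taken from prefix_template.sh.
run_with_prefix() {
  # Log to stderr so the component's own stdout is left untouched.
  echo "prefix: launching $*" >&2
  # Run the command exactly as received, preserving argument boundaries.
  # Its exit status becomes the function's exit status.
  "$@"
}

# When saved as an executable script, forward the script's arguments:
#   run_with_prefix "$@"
```

A prefix for another scheduler would replace the plain "$@" call with that scheduler's submit command followed by "$@".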
Prefix mode also supports the @cpu, @memory, and @host annotations. To use them, you must specify the execution logic in a prefix script. The prefix template script contains an example of how these annotations are applied to a prefix.
How do I simulate components and unavailable resources?
Records can be used to alter the output interface of the components. You may use this to:
- replace one component with another one that has different names for its output ports or even lacks some of them;
- simulate unavailable resources such as old databases or components that would not work on your environment;
- substitute real inputs with some test data.
The real advantage of using records is in runtime switching between the actual implementations.
if (useOldData) {
    dirOldData = "execBak/"
    mdIn1 = INPUT(path=dirOldData+"dbRead/idlist.lst")
    mdIn2 = INPUT(path=dirOldData+"dbRead/annotations.csv")
    myData = record(ids = mdIn1,         // rename idlist to ids
                    annotations = mdIn2,
                    report = null)       // LatexCombiner will skip this automatically
} else {
    myData = MyDatabaseReader()          // outputs are: ids, annotations, and report
}
How do I generate unique row identifiers for a CSV file?
You can use the TableQuery component and SQL sequences for this. The following SQL adds an “id” column to the given file. The values are of the form id#, where # runs from one to the number of rows in the file.
CREATE SEQUENCE seqMY_id AS INTEGER START WITH 1 INCREMENT BY 1;
SELECT 'id'||NEXT VALUE FOR seqMY_id AS "id", table1.* FROM table1
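To illustrate the shape of the result outside Anduril, the same numbering can be sketched with awk. This is a different technique, not the TableQuery component. The function name add_row_ids is hypothetical, and the sketch assumes a tab-delimited file with a single header row and no quoting.

```shell
# Hypothetical helper: prepend an "id" column with values id1..idN.
# Assumptions: tab-delimited input, first line is the header.
add_row_ids() {
  awk 'BEGIN { FS = OFS = "\t" }
       NR == 1 { print "id", $0; next }   # header row gets the column name
       { print "id" (NR - 1), $0 }        # data rows are numbered from 1
      ' "$1"
}

# Usage:
#   add_row_ids input.csv > with_ids.csv
```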
How do I define inputs that accept files and folders?
You may use generic data types to define component inputs that accept both files and folders. The StandardProcess component is an example of this. The component can be found in the Microarray bundle.
How do I define public constants?
You may have a set of universal constants you would like to use in various pipelines. You can wrap these constants into a public function that can be called to make them visible. The same function can be used to include these values into your bundle API.
First you will need a function that is used to declare the constants. This same initialization function may also carry out other preparations for the end user. Here is an example body that could be used:
include "doc-files/myConstants.and"
function MyInit {
// You may add some logic in here!
}
The constants are defined in doc-files/myConstants.and, which is now an independent file. This file can be generated and maintained without a need to worry about the function itself.
The doc element of your component.xml may contain something like:
This function declares a set of useful
<a href="myConstants.and">constants for me</a>.
Now the actual values of your constants are included into the API.
How do I execute quick-and-dirty or standalone workflows?
Sometimes you want to use Anduril for a quick task and do not want to create various files (workflow configuration, CSV files, etc) as done with more complex problems. Or, you may want to avoid polluting the file system with too many files and want to have a standalone workflow file.
The following example shows how Bash and inline file generation can be used to create standalone workflows. This simple example creates a 5×5 random matrix and converts it into an Excel spreadsheet with style information. The style sheet is stored in a temporary file; this has the disadvantage that the name and timestamp of the file change on each execution, so CSV2Excel is re-executed each time. An alternative is to place the style sheet into a separate file.
#!/bin/bash
# mktemp replaces the deprecated Debian tempfile command
STYLE_FILE=$(mktemp)
cat >$STYLE_FILE <<EOF
Row Column Bold
1 * true
EOF
EXECUTE_DIR=execute
# When - is given as workflow file, Anduril reads standard input.
anduril run - -d $EXECUTE_DIR <<EOF
style = INPUT(path="$STYLE_FILE")
matrix = Randomizer(rows=5, columns=5)
excel = CSV2Excel(csv=matrix, style=style)
OUTPUT(excel.excelFile)
EOF
rm $STYLE_FILE
Why do my INPUT component instances get re-executed even if I haven’t changed them?
The likely reason your INPUTs get re-executed even if you haven’t changed them is that you are using them directly as inputs to downstream components without first creating a named instance, like this:
myComponent = SomeComponent(INPUT(path="myFile.csv"), ...)
If you then introduce a new INPUT component instance upstream of these, their dynamic names change and the INPUTs are re-executed. To circumvent this problem, simply define a named instance for your inputs:
inputInstance = INPUT(path="myFile.csv")
otherComp = OtherComponent(inputInstance)