Identify user-defined spatio-temporal patterns in ChIP-seq and other sequencing data using the SPINLONG method. SPINLONG stands for Spatial Pattern Identification using Non-Linear OptimizatioN with Global constraints. Its applications include analysis of time-series RNA polymerase and histone modification data.
Version | 1.0 |
---|---|
Bundle | sequencing |
Categories | Short-read Sequencing |
Authors | Kristian Ovaska (kristian.ovaska@helsinki.fi) |
Issue tracker | View/Report issues |
Requires | commons-primitives-1.0.jar (jar) ; commons-math-2.1.jar (jar) ; csbl-javatools.jar (jar) ; jahmm-0.6.1.jar (jar) ; jcommon-1.0.16.jar (jar) ; jfreechart-1.0.13.jar (jar) ; saxon9he.jar (jar) |
Source files | component.xml |
Usage | Example with default values |
Name | Type | Mandatory | Description |
---|---|---|---|
bedFiles | BinaryFolder | Mandatory | Short reads in BED format. The files may optionally be gzipped. |
patterns | XML | Mandatory | XML file defining the patterns. |
regions | CSV | Mandatory | Genomic regions. |
chromosomes | CSV | Optional | Metadata for chromosomes. Must contain the columns Chromosome and Size, where Size gives the length of the chromosome. If not present, the parameter chromosomePreset is used. |
mappability | BinaryFile | Optional | If given, contains the uniquely mappable genomic regions, in BED format. This is used to scale short read densities. |
control | BinaryFile | Optional | If given, contains a control track in BED format that is subtracted from primary tracks (bedFiles input). |
Name | Type | Description |
---|---|---|
scores | CSV | Segment scores. |
plots | ImageList | Segment plots and optionally optimizer plots. If both plotSegments and plotOptimizer are false, this output is empty. |
patternsDump | HTML | Patterns formatted as HTML. |
Name | Type | Default | Description |
---|---|---|---|
chromosomePattern | string | "" | Java regular expression that selects the chromosomes to be used in analysis. The empty value selects all chromosomes. The pattern is matched against the first column of BED files. |
chromosomePreset | string | "hs37" | If a custom chromosomes input is not given, this parameter is used to select a predefined chromosome set. Currently the legal values are hs36 (Homo sapiens, genome build 36) and hs37 (Homo sapiens, build 37). |
fragmentSize | int | 200 | Size of sequenced DNA fragments. The short reads are elongated so that their final length matches this number. Setting this to 0 disables elongation. |
maxDuplicateReads | int | 0 | Maximum number of duplicate short reads (same position and strand) used for each position; the rest are filtered out. This allows removing reads that may be technical artefacts. If 0 or negative, all duplicates are used. |
minRegionLength | int | 0 | Minimum length of input regions that are processed. If 0, all regions are processed. This allows filtering out very short genes. |
plotOptimizer | boolean | false | If true, visualize the progression of the optimizer. If false, omit optimizer plotting. |
plotSegments | boolean | true | If true, visualize results of regions whose score is over the threshold. If false, omit plotting. |
scoreThreshold | float | 0.1 | Minimum score for including a pattern match in the score and plot outputs. Note that the score distribution depends on the scoring method. |
seed | int | -1 | Seed for random number generator. If negative, an automatically generated seed is used. Using a pre-defined seed ensures that results are deterministic. |
threads | int | 4 | Maximum number of threads to use. |
yLog | boolean | false | If true, the Y axis is plotted using logarithmic scale. |
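
The read-level parameters above (chromosomePattern, fragmentSize and maxDuplicateReads) can be illustrated with a minimal Java sketch. It is not SPINLONG's actual implementation: the class `ReadPreprocessingSketch`, the `Read` type and all method names are hypothetical. It also assumes that the regular expression must match the whole chromosome name, that reads are elongated in their 3' direction, that reads already longer than fragmentSize are left unchanged, and that "same position" means the same 5' end.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

public class ReadPreprocessingSketch {
    /** Hypothetical BED-like read: 0-based start, exclusive end, strand '+' or '-'. */
    static final class Read {
        final String chrom;
        final int start;
        final int end;
        final char strand;

        Read(String chrom, int start, int end, char strand) {
            this.chrom = chrom;
            this.start = start;
            this.end = end;
            this.strand = strand;
        }
    }

    /** chromosomePattern: the empty pattern selects every chromosome. */
    static boolean chromosomeSelected(String chromosomePattern, String chrom) {
        if (chromosomePattern.isEmpty()) {
            return true;
        }
        // Assumption: a full match is required (Pattern.matches), not a substring search.
        return Pattern.matches(chromosomePattern, chrom);
    }

    /** fragmentSize: extend a read to the fragment length; 0 disables elongation. */
    static Read elongate(Read r, int fragmentSize) {
        int length = r.end - r.start;
        // Assumption: non-positive sizes and reads that are already long enough are left as-is.
        if (fragmentSize <= 0 || length >= fragmentSize) {
            return r;
        }
        if (r.strand == '-') {
            // Reverse-strand reads grow towards smaller coordinates, clamped at 0.
            return new Read(r.chrom, Math.max(0, r.end - fragmentSize), r.end, r.strand);
        }
        return new Read(r.chrom, r.start, r.start + fragmentSize, r.strand);
    }

    /** maxDuplicateReads: keep at most this many reads per position and strand. */
    static List<Read> limitDuplicates(List<Read> reads, int maxDuplicateReads) {
        if (maxDuplicateReads <= 0) {
            return new ArrayList<>(reads); // 0 or negative: keep everything
        }
        Map<String, Integer> seen = new HashMap<>();
        List<Read> kept = new ArrayList<>();
        for (Read r : reads) {
            // Assumption: the position of a read is its 5' end.
            int fivePrime = (r.strand == '-') ? r.end : r.start;
            String key = r.chrom + ":" + fivePrime + ":" + r.strand;
            int count = seen.merge(key, 1, Integer::sum);
            if (count <= maxDuplicateReads) {
                kept.add(r);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Read r = elongate(new Read("chr1", 100, 136, '+'), 200);
        System.out.println(r.chrom + "\t" + r.start + "\t" + r.end + "\t" + r.strand);
        System.out.println(chromosomeSelected("chr[0-9]+", "chrX"));
    }
}
```

Under the full-match assumption, a pattern such as `chr[0-9]+` would select the numbered chromosomes only, provided the BED files use the `chr` prefix in their first column.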
Test case | Parameters | IN bedFiles | IN patterns | IN regions | IN chromosomes | IN mappability | IN control | OUT scores | OUT plots | OUT patternsDump |
---|---|---|---|---|---|---|---|---|---|---|
case1 | properties: scoreThreshold=0 | bedFiles | patterns | regions | (missing) | (missing) | (missing) | scores | (missing) | (missing) |
case2_chr | properties: scoreThreshold=0 | bedFiles | patterns | regions | (missing) | (missing) | (missing) | scores | (missing) | (missing) |
case3_xslt | properties: scoreThreshold=0.325 | bedFiles | patterns | regions | (missing) | (missing) | (missing) | scores | (missing) | (missing) |
case4_xslt_simple | properties: scoreThreshold=0 | bedFiles | patterns | regions | (missing) | (missing) | (missing) | scores | (missing) | (missing) |