Aligns BS or RRBS data through the BSMAP software, version 2.74. The software works only with base-space data and does not perform methylation calling (for that purpose, use the MethylCall component).
Version | 1.0 |
---|---|
Bundle | sequencing |
Categories | Alignment DNA Methylation |
Authors | Chiara Facciotto (chiara.facciotto@helsinki.fi) |
Issue tracker | View/Report issues |
Requires | download (bash) ; bsmap ; samtools |
Source files | component.xml BSAlign.sh |
Usage | Example with default values |
Deprecated | BismarkAlign is a better aligner. |
Name | Type | Mandatory | Description |
---|---|---|---|
reference | FASTA | Mandatory | The reference genome file in FASTA format. Gzipped FASTA is also supported. |
reads | BinaryFile | Mandatory | Reads in FASTA, FASTQ or BAM format. Gzipped FASTA/FASTQ is also supported. Use the inputType parameter to define the format. For paired-end alignment from a single BAM input, assign the same BAM file to both 'reads' and 'mates'. |
mates | BinaryFile | Optional | Mate reads in FASTA, FASTQ or BAM format. Gzipped FASTA/FASTQ is also supported. If the file format is BAM, assign the same input BAM file to both 'reads' and 'mates'. |
Name | Type | Description |
---|---|---|
alignedReads | BAM | Aligned reads in compressed bam format. |
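The inputs and the output above map onto a BSMAP invocation. The sketch below is only an illustration of that mapping: the function name `build_bsmap_command` and the `reads_are_bam` flag are hypothetical, the flags `-a`, `-b`, `-d` and `-o` are taken from BSMAP's own command-line help rather than from this page, and the way `BSAlign.sh` actually assembles its command is not shown here.

```python
import shlex
from typing import List, Optional


def build_bsmap_command(reference: str, reads: str, output: str,
                        mates: Optional[str] = None,
                        reads_are_bam: bool = False) -> List[str]:
    """Assemble a BSMAP command line from the component's inputs (hypothetical helper)."""
    # Rule from the inputs table: with BAM input, paired-end data must be
    # passed as the same file on both 'reads' and 'mates'.
    if reads_are_bam and mates is not None and mates != reads:
        raise ValueError("for paired-end BAM input, 'mates' must be the same file as 'reads'")

    # -a: query file, -d: reference sequences, -o: output alignment file
    cmd = ["bsmap", "-a", reads, "-d", reference, "-o", output]
    if mates is not None:
        # -b: second query file, only present for paired-end alignment
        cmd += ["-b", mates]
    return cmd


if __name__ == "__main__":
    # File names below are placeholders, not files shipped with the component.
    single = build_bsmap_command("genome.fa", "reads.fastq.gz", "aligned.bam")
    paired = build_bsmap_command("genome.fa", "sample.bam", "aligned.bam",
                                 mates="sample.bam", reads_are_bam=True)
    print(shlex.join(single))
    print(shlex.join(paired))
```

Returning a list of arguments rather than a single string keeps file names with spaces intact when the command is handed to `subprocess.run`.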
Name | Type | Default | Description |
---|---|---|---|
gapSize | int | 0 | Gap size. BSMAP allows only one continuous gap (insertion or deletion) of up to 3 nucleotides. Gaps are not allowed within 6 nt of the read edges. The number of mismatches of a gapped alignment is calculated as #gap_size + #mismatches + 1. |
mismatches | float | 0.08 | Number of allowed mismatches. If this value is between 0 and 1, it is interpreted as the mismatch rate relative to the read length. Otherwise it is interpreted as the maximum number of mismatches allowed on a read (maximum 15). Example: mismatches = 5 (max #mismatches = 5), mismatches = 0.1 (max #mismatches = read_length * 10%). |
optionsAlignment | string | "" | Other options for alignment. This parameter is passed verbatim to the aligner command. Example: "-H -s 10" omits the header information from the SAM file and sets the seed size to 10 (Note: the '-s' option is valid only in WGBS mode). |
restrictionSite | string | "" | Restriction site recognized by the restriction enzyme used in the experimental procedure. Setting it defines the enzyme digestion site and activates reduced representation bisulfite mapping mode (RRBS mode). Possible restriction sites are "C-CGG" (recognized by MspI, the most commonly used restriction enzyme) and "T-CGA" (recognized by TaqI), where the symbol "-" marks the position at which the fragment is digested. Note: to analyze whole genome bisulfite sequencing data (WGBS mode), set restrictionSite = "" (default mode). |
threads | int | 1 | The number of processors to use. Setting the parameter to '-1' lets BSMAP use its own default, which is the number of detected CPU cores (up to 8 threads). Note: parallel performance scales well up to 12 threads; there is no significant speed gain beyond 12 threads. |
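The two numeric rules in the table above (how `mismatches` is interpreted, and how a gapped alignment is counted) can be sketched as follows. This is an illustration only: the function names are hypothetical, rounding the rate-based limit down and capping it at 15 are assumptions, and treating a value of exactly 1 as an absolute count is also an assumption, since the description does not say how the boundary is handled.

```python
import math

# Upper bound stated in the 'mismatches' description.
MAX_ABSOLUTE_MISMATCHES = 15


def max_mismatches(mismatches: float, read_length: int) -> int:
    """Return the mismatch budget for one read (hypothetical helper)."""
    if mismatches < 0:
        raise ValueError("mismatches must be non-negative")
    if mismatches < 1:
        # Interpreted as a rate relative to the read length.
        # Assumption: the product is rounded down and capped at 15.
        return min(math.floor(mismatches * read_length), MAX_ABSOLUTE_MISMATCHES)
    if mismatches > MAX_ABSOLUTE_MISMATCHES:
        raise ValueError("absolute mismatch count cannot exceed 15")
    # Interpreted as an absolute number of mismatches per read.
    return int(mismatches)


def gapped_alignment_mismatches(gap_size: int, mismatches: int) -> int:
    """Mismatch count of a gapped alignment: gap size + mismatches + 1."""
    return gap_size + mismatches + 1


if __name__ == "__main__":
    budget = max_mismatches(0.1, 50)           # rate of 10% on a 50 nt read
    cost = gapped_alignment_mismatches(3, 1)   # 3 nt gap plus 1 mismatch
    print(budget, cost, cost <= budget)
```

Under these assumptions, a gapped alignment is acceptable only when its computed count fits within the budget for that read, which is why a large `gapSize` combined with a small `mismatches` value can rule out gapped alignments entirely.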
Test case | Parameters | IN reference | IN reads | IN mates | OUT alignedReads |
---|---|---|---|---|---|
case1_reads | (missing) | reference | reads | (missing) | alignedReads |
case2_reads_and_mate | (missing) | reference | reads | mates | alignedReads |
case3_restriction_site | properties | reference | reads | mates | alignedReads |
case4_mismatches_gapSize | properties | reference | reads | mates | alignedReads |
case5_optionsAlignment | properties | reference | reads | mates | alignedReads |