This function uses the Picard Java library to convert SAM/BAM alignment format back to FASTQ sequence format. The resulting raw reads, including their quality values, can then be re-aligned using an alignment component. The SAM->FASTQ->SAM procedure can be useful when re-processing old alignments.
Normally, single-end reads are written to 'reads' and paired-end reads to 'reads' and 'mate'. If 'perReadgroup' is set to 'true', all read-group-specific output files are written to 'folder'. Use 'options' to add any number of additional picard-tools options to the run command. See the picard-tools manual for more information about SamToFastq.
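The way the parameters map onto a SamToFastq run can be sketched as follows. This is a minimal illustration, not the component's actual implementation: the helper name `build_sam_to_fastq_command` is hypothetical, and it assumes the Picard directory contains a per-tool jar named `SamToFastq.jar`. The `INPUT`, `FASTQ`, `SECOND_END_FASTQ`, `OUTPUT_PER_RG` and `OUTPUT_DIR` arguments are standard SamToFastq options.

```python
import os
import shlex


def build_sam_to_fastq_command(alignment, reads=None, mate=None, folder=None,
                               per_readgroup=False, options="",
                               memory="2g", picard="/opt/share/picard/"):
    """Assemble a SamToFastq argument list (hypothetical helper)."""
    # Assumption: a per-tool jar named SamToFastq.jar lives in the Picard directory.
    jar = os.path.join(picard, "SamToFastq.jar")
    # For simplicity this sketch always passes -Xmx, even for the default value.
    cmd = ["java", "-Xmx" + memory, "-jar", jar, "INPUT=" + alignment]
    if per_readgroup:
        if folder is None:
            raise ValueError("'folder' is required when per_readgroup is true")
        # One FASTQ file (or a pair of files) per read group, all in 'folder'.
        cmd += ["OUTPUT_PER_RG=true", "OUTPUT_DIR=" + folder]
    else:
        if reads is None:
            raise ValueError("'reads' is required when per_readgroup is false")
        cmd.append("FASTQ=" + reads)
        if mate is not None:
            # Second reads of paired-end data go to the 'mate' output.
            cmd.append("SECOND_END_FASTQ=" + mate)
    # Extra options are passed through as written, split on whitespace.
    cmd += shlex.split(options)
    return cmd
```

The returned list can be handed to `subprocess.run` or joined with spaces for logging. Passing 'options' through unchanged mirrors the description above, so any validation of those options is left to Picard itself.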
Version | 1.0 |
---|---|
Bundle | sequencing |
Categories | Alignment |
Authors | Rony Lindell (rony.lindell@helsinki.fi) |
Issue tracker | View/Report issues |
Requires | picard-tools |
Source files | component.xml function.scala |
Usage | Example with default values |
Name | Type | Mandatory | Description |
---|---|---|---|
alignment | AlignedReadSet | Mandatory | Input SAM/BAM alignment file to convert. The file should have the correct extension (.sam or .bam) so that Picard can determine the file type. |
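Because the file type is determined from the extension, it can help to check the input name before starting a run. The sketch below is only an illustration: `alignment_format` is a hypothetical helper, and treating the match as case-sensitive is an assumption made here, not documented Picard behaviour.

```python
import os


def alignment_format(path):
    """Return "SAM" or "BAM" based on the file extension (hypothetical helper)."""
    ext = os.path.splitext(path)[1]
    # Assumption: only the lowercase extensions named in the table are accepted.
    if ext not in (".sam", ".bam"):
        raise ValueError("expected a .sam or .bam file, got: " + path)
    return ext[1:].upper()
```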
Name | Type | Description |
---|---|---|
folder | BinaryFolder | Folder to which FASTQ files are written when 'perReadgroup' is set to true. |
reads | FASTQ | Output FASTQ file containing single-end reads, or the first read of each pair in paired-end data. |
mate | FASTQ | Output FASTQ file containing the second read of each pair in paired-end data. |
Name | Type | Default | Description |
---|---|---|---|
memory | string | "2g" | A non-default value appends -XmxVALUE to the java command to specify the maximum size, in bytes, of the memory allocation pool. This value must be a multiple of 1024 and greater than 2 MB. Append the letter k or K to indicate kilobytes, m or M to indicate megabytes, or g or G to indicate gigabytes. Without -Xmx, the JVM chooses the maximum at runtime based on system configuration. For example, a value of "4g" allocates 4 gigabytes of memory and a value of "512m" allocates 512 megabytes. Note: Picard developers recommend a value of at least "2g", which is why it is the default. |
options | string | "" | Any additional picard-tools options can be added here. This string is added to the run command as written. See the 'SamToFastq' part of the picard-tools manual for more information. Useful options include clipping and trimming. Example: options="CLIPPING_ACTION=X READ1_TRIM=5 READ2_TRIM=5 RE_REVERSE=false" |
perReadgroup | boolean | false | Output a FASTQ file per read group (two FASTQ files per read group if the group is paired). All FASTQ files will be written to the 'folder' output. |
picard | string | "/opt/share/picard/" | Path to the Picard directory, e.g. "/opt/share/picard", which contains the Picard-tools .jar files. If an empty string is given, the Picard version in the sequencing lib directory will be used. |
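The rules for 'memory' in the table above can be checked before Java is launched. The following is a minimal sketch under the stated rules only (an integer with an optional k/K, m/M or g/G suffix, a multiple of 1024, greater than 2 MB). `parse_memory` is a hypothetical helper, and the JVM performs its own validation regardless.

```python
import re

# Multipliers for the suffixes described in the 'memory' row.
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_memory(value):
    """Convert a value such as "2g" or "512m" to bytes (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)([kKmMgG]?)", value)
    if match is None:
        raise ValueError("not a valid memory value: " + repr(value))
    size = int(match.group(1)) * _UNITS[match.group(2).lower()]
    # The value must be a multiple of 1024 and greater than 2 MB.
    if size % 1024 != 0 or size <= 2 * 1024 ** 2:
        raise ValueError("memory must be a multiple of 1024 greater than 2 MB")
    return size
```

For instance, `parse_memory("4g")` gives the byte count for 4 gigabytes, while a value such as "1m" is rejected because it does not exceed 2 MB.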
Test case | Parameters | IN alignment | OUT folder | OUT reads | OUT mate | Description |
---|---|---|---|---|---|---|
case1 | properties | alignment | (missing) | reads | (missing) | Test conversion; input SAM, |
case2 | properties | alignment | folder | (missing) | (missing) | Test conversion; input SAM, |
case3 | properties | alignment | (missing) | reads | mate | Convert paired-end alignment, |