Uses FASTX-Toolkit to perform adapter trimming, artifact filtering, base-quality filtering, and read trimming on single-end read data. This function is executed within the larger QCFasta component when the parameter tool=fastx is set.

Version | 1.0 |
---|---|
Bundle | sequencing |
Categories | Preprocessing smallRNA |
Authors | Katherine Icay (katherine.icay@helsinki.fi) |
Issue tracker | View/Report issues |
Requires | FASTX-Toolkit |
Source files | component.xml function.scala |
Usage | Example with default values |

Inputs

Name | Type | Mandatory | Description |
---|---|---|---|
reads | FASTQ | Mandatory | Input file in FASTQ format. |

Outputs

Name | Type | Description |
---|---|---|
fastq | FASTQ | Trimmed, high-quality reads. Must be in fastq/fq format and should not be zipped. |
stats | CSV | Count statistics of reads before and after processing. |
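
The stats output is described only as read counts before and after processing. As a rough illustration of what such a summary involves, the sketch below counts records in an uncompressed FASTQ stream and writes a two-row CSV. The function names and the column names `stage` and `reads` are hypothetical; the actual layout of the component's CSV is not documented here.

```python
import csv
import io


def count_fastq_reads(lines):
    """Count records in an uncompressed FASTQ stream (four lines per record)."""
    n = 0
    for i, line in enumerate(lines):
        # Every fourth line, starting with the first, is a record header.
        if i % 4 == 0 and line.strip():
            if not line.startswith("@"):
                raise ValueError(f"line {i + 1} is not a FASTQ header: {line!r}")
            n += 1
    return n


def stats_csv(before, after):
    """Render read counts as CSV text (hypothetical column names)."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["stage", "reads"])
    writer.writerow(["before", before])
    writer.writerow(["after", after])
    return buf.getvalue()
```

In practice `count_fastq_reads` would be called once on the open input file and once on the processed output file, and the two counts passed to `stats_csv`.
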

Parameters

Name | Type | Default | Description |
---|---|---|---|
Lmax | int | 32 | Maximum acceptable sequence length. |
Lmin | int | 15 | Minimum acceptable sequence length. |
M | int | 6 | Minimum length of a partial adapter match required for removal to occur. |
adapter | string | "ATCTCGTATGCCGTCTTCTGCTT" | Adapter sequence to remove. The default is the Illumina smallRNA-seq adapter. A value of "NA" disables trimming and only calculates total read lengths per sample. |
extra | string | "-n" | Extra parameters for fastx_clipper. E.g. "-n" keeps sequences containing N, "-c" discards non-clipped sequences. |
minPercent | int | 20 | Minimum percentage of bases that must have a quality score of at least minQ for a read to be kept. |
minQ | int | 30 | Minimum base quality score to keep. |
qual | string | "-Q64" | Quality score encoding of the input sequences: "-Q64" for Phred+64 scores (Illumina 1.3 to 1.7), "-Q33" for Phred+33 scores (Sanger, Illumina 1.8+). |
zip | boolean | false | Defines if the output sequences should be gzipped or not. |
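
To make the parameter table more concrete, here is a minimal sketch of how these parameters could map onto FASTX-Toolkit command lines. The helper `build_fastx_commands`, the order of the steps, and the exact parameter-to-flag mapping are assumptions; the component's function.scala is not shown here and may do this differently. The flags themselves (`-a`, `-l`, `-M` for fastx_clipper; `-q`, `-p` for fastq_quality_filter; `-l`, `-z` for fastx_trimmer) are standard FASTX-Toolkit options.

```python
def build_fastx_commands(adapter="ATCTCGTATGCCGTCTTCTGCTT", Lmin=15, Lmax=32,
                         M=6, extra="-n", minQ=30, minPercent=20,
                         qual="-Q64", zip_output=False):
    """Return one argument list per FASTX-Toolkit step (hypothetical mapping)."""
    steps = []
    if adapter != "NA":
        # Adapter clipping: -l drops reads shorter than Lmin after clipping,
        # -M is the minimum adapter match length.
        steps.append(["fastx_clipper", qual, "-a", adapter,
                      "-l", str(Lmin), "-M", str(M)] + extra.split())
    # Assumption: with adapter="NA" only the clipping step is skipped.
    steps.append(["fastx_artifacts_filter", qual])
    # Keep reads in which at least minPercent % of bases have quality >= minQ.
    steps.append(["fastq_quality_filter", qual,
                  "-q", str(minQ), "-p", str(minPercent)])
    # Trim reads so that no read is longer than Lmax bases.
    trimmer = ["fastx_trimmer", qual, "-l", str(Lmax)]
    if zip_output:
        trimmer.append("-z")  # gzip the final output
    steps.append(trimmer)
    return steps


def as_pipeline(steps):
    """Join the steps into a single shell pipeline string for display."""
    return " | ".join(" ".join(step) for step in steps)
```

Calling `as_pipeline(build_fastx_commands())` gives a readable one-line view of the default settings; the argument lists themselves are suitable for `subprocess` if the tools are installed.
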

Test cases

Test case | Parameters | IN reads | OUT fastq | OUT stats |
---|---|---|---|---|
case1 | (missing) | reads | (missing) | (missing) |