Quantify reads mapped to EnsemblID transcripts for a given sample. Requires htseq-count and picard in PATH. Similar to HTSeqBam2Counts, but developed for smallRNA, with more htseq-count parameters available and the option to use custom annotation files (e.g. known and novel mature miRNAs).
Version | 1.0 |
---|---|
Bundle | sequencing |
Categories | Expression |
Authors | Katherine Icay (katherine.icay@helsinki.fi) |
Issue tracker | View/Report issues |
Requires | python2.7-dev (DEB) ; HTSeq (python) ; picard ; installer (bash) |
Source files | component.xml htseq.sh |
Usage | Example with default values |
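
Since the component expects its external tools to be on PATH, a quick pre-flight check can save a failed run. The following is a minimal sketch, not part of the component: the helper name `missing_tools` is hypothetical, and only `htseq-count` is checked by default because the executable name of a picard installation varies (the component locates picard through its `picardPath` parameter). Pass additional names to check other tools.

```python
import shutil


def missing_tools(tools=("htseq-count",), which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH.

    `which` is injectable so the lookup can be replaced in tests;
    by default it is the standard library's shutil.which.
    """
    return [name for name in tools if which(name) is None]


# Report anything that is not available before starting a run.
missing = missing_tools()
if missing:
    print("Not found on PATH: " + ", ".join(missing))
else:
    print("All required tools found on PATH")
```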
Name | Type | Mandatory | Description |
---|---|---|---|
bam | BAM | Mandatory | Sample file containing reads aligned to Ensembl transcripts. |
annotation_mirbase | BinaryFile | Mandatory | GFF/GFF3 file containing feature information on the alignment of the bam input. This can be the 'File' column of the AnnotMirbase output or a filtered Ensembl GFF file of ncRNAs. |
Name | Type | Description |
---|---|---|
counts | CSV | Quantified expression of reads in a sample. |
Name | Type | Default | Description |
---|---|---|---|
byStrand | string | "no" | Should expression be strand-specific? Use "yes" when the sequences were processed with strand information; reads are then counted only if they overlap the same strand and region as an annotation feature. If "no", reads are counted whether they map to the same or the opposite strand as the feature. |
featureID | string | "Name" | GFF/GFF3 attribute to use as feature ID to count. Counts will be combined for rows in the GFF/GFF3 file sharing the same feature ID. Default is to quantify reads aligned to mature sequences of a miRBase GFF3 file. To quantify on a hairpin/transcript level of a miRBase GFF3 file, use 'type=miRNA_primary_transcript'. |
featureType | string | "miRNA" | Feature type (3rd column of the GFF/GFF3 file) to quantify reads against; all other types are ignored. For miRBase GFF3, this is either (mature) "miRNA" or "miRNA_primary_transcript". For Ensembl transcripts, this is usually "exon". |
mirbase_gff | boolean | true | Is the annotation file from miRBase (GFF3) or Ensembl (GFF)? |
order | string | "pos" | Sort order of the alignment: either "name" (read name) or "pos" (alignment position). |
overlap_rule | string | "intersection-nonempty" | One of the three decision-making modes htseq-count uses to assign reads to features. The other options are 'union' and 'intersection-strict'. See the HTSeq documentation for more details. |
picardPath | string | "../../lib/picard" | Path to picard installation. |
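
Several of the parameters above have direct counterparts among htseq-count's command-line options (`--stranded`, `--idattr`, `--type`, `--order`, `--mode`). The sketch below shows one plausible mapping, assuming the values are passed through unchanged. It is not taken from htseq.sh: the function name `build_htseq_count_command` is hypothetical, and how the component actually assembles its call, or where it uses picard, is not shown on this page.

```python
def build_htseq_count_command(bam, annotation, by_strand="no",
                              feature_id="Name", feature_type="miRNA",
                              order="pos",
                              overlap_rule="intersection-nonempty"):
    """Build an htseq-count argument list from component-style parameters.

    Defaults mirror the parameter table; the mapping itself is an
    assumption made for illustration.
    """
    allowed_modes = ("union", "intersection-strict", "intersection-nonempty")
    if overlap_rule not in allowed_modes:
        raise ValueError("overlap_rule must be one of: " + ", ".join(allowed_modes))
    if order not in ("pos", "name"):
        raise ValueError('order must be "pos" or "name"')
    if by_strand not in ("yes", "no"):
        raise ValueError('by_strand must be "yes" or "no"')

    return [
        "htseq-count",
        "--format=bam",              # the input is a BAM file
        "--order=" + order,          # order
        "--stranded=" + by_strand,   # byStrand
        "--type=" + feature_type,    # featureType
        "--idattr=" + feature_id,    # featureID
        "--mode=" + overlap_rule,    # overlap_rule
        bam,
        annotation,
    ]


# Hypothetical file names, used only to show the resulting command line.
print(" ".join(build_htseq_count_command("sample.bam", "mirbase.gff3")))
```

The returned list can be handed to `subprocess.run` as is. Validating the three enumerated parameters up front means a typo such as "intersection_strict" fails immediately rather than partway through a run.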