
HTSeqExprMatrix

Merges a CSV list of HTSeqCount outputs into an expression matrix. Similar to ExprTable, but designed with smallRNA data and the output of htseq-count in mind.

Version 0.1
Bundle sequencing
Categories Expression
Authors Katherine Icay (katherine.icay@helsinki.fi)
Requires R
Source files component.xml HTSeqExprMatrix.r

Inputs

Name Type Mandatory Description
samples CSV Mandatory 2-column list of HTSeq outputs to merge into an expression matrix. The first column, "Key", contains the sample id; the second column, "File", contains the path to that sample's HTSeq feature count output.
TPM CSV Optional Table from which the total mapped reads are extracted to calculate TPM. Only used if parameter TPMcolumn != "".
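
The merge described for the samples input can be sketched as follows. This is a minimal illustration in Python, not the component's R implementation. It assumes a tab-delimited sample list with "Key" and "File" columns, and count files in the plain htseq-count layout (feature ID, tab, count, with special counters prefixed by "__"). The function names and the "featureID" header are hypothetical.

```python
import csv
import io


def read_sample_list(text, delimiter="\t"):
    """Parse the 2-column sample list into a {Key: File} mapping.

    Assumption: the list has a header row with "Key" and "File" columns.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return {row["Key"]: row["File"] for row in reader}


def parse_htseq_counts(lines):
    """Parse htseq-count style lines into {feature_id: count}.

    Special counters such as "__no_feature" are skipped here.
    """
    counts = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        feature_id, value = line.split("\t")
        if feature_id.startswith("__"):
            continue  # special counter, not a feature
        counts[feature_id] = int(value)
    return counts


def merge_counts(per_sample):
    """Merge {sample_key: {feature_id: count}} into (header, rows).

    Features absent from a sample are given a count of 0.
    """
    keys = list(per_sample)
    features = sorted({f for counts in per_sample.values() for f in counts})
    header = ["featureID"] + keys
    rows = [[f] + [per_sample[k].get(f, 0) for k in keys] for f in features]
    return header, rows
```

With real files, each path from `read_sample_list` could be opened and passed to `parse_htseq_counts` before calling `merge_counts` on the collected results.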

Outputs

Name Type Description
countArray Array<CSV> Array of expression matrices. Keys include all and filtered and, if TPM calculation is activated, allTPM and filteredTPM. Count matrices labelled "all" contain all quantified features, including those with 0 mapped reads across all samples and those that fall below the specified minCount and minPercent thresholds. All "filtered" count matrices exclude these features.
no_feature CSV Count matrix of reads that HTSeq did not map to any features. Large values here relative to the expression matrices indicate problems and/or inefficiencies in the expression quantification step, usually errors in the feature selection.
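
One way to judge whether a no_feature value is "large" is to compare it with the reads that were assigned to features in the same sample. The sketch below assumes the plain htseq-count layout, in which unassigned reads appear on a line named "__no_feature". The function name and the choice of denominator (assigned reads plus no_feature reads) are illustrative assumptions, not something the component defines.

```python
def no_feature_fraction(lines):
    """Return (no_feature_count, fraction) for one htseq-count style file.

    The fraction is no_feature / (reads assigned to features + no_feature).
    Other special counters (for example "__ambiguous") are ignored.
    """
    no_feature = 0
    assigned = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        feature_id, value = line.split("\t")
        if feature_id == "__no_feature":
            no_feature = int(value)
        elif not feature_id.startswith("__"):
            assigned += int(value)
    total = no_feature + assigned
    fraction = no_feature / total if total else 0.0
    return no_feature, fraction
```

A fraction close to 1 for most samples would point at the feature selection rather than at individual samples.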

Parameters

Name Type Default Description
TPMcolumn string "" Name of the column containing the total number of mapped transcripts, used to calculate the transcripts-per-million (TPM) value for scale-normalizing samples. The default ("") skips this step.
inclusionKey string "hsa" Species name(s) used as the inclusion term for the count matrix (everything else is filtered out of the output). These should match the featureID values of HTSeqCount. The value is searched as a regular expression, so multiple inclusion keys can be combined, e.g. "hsa|hiv" (human and human immunodeficiency virus miRNAs).
minCount int 1 Minimum number of read counts for a miRNA/smallRNA feature to be considered "expressed" in a sample.
minPercent float 0.0 Minimum percentage of samples in which a miRNA/smallRNA feature must be considered "expressed" for it to be included in the final matrix. The default value of 0 means no filtering.
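
The interplay of the parameters above can be sketched as follows. This is a hedged illustration in Python rather than the component's own R code. It assumes that minPercent is on a 0 to 100 scale, that a feature is "expressed" in a sample when its count is at least minCount, that features with 0 reads in every sample are always dropped from the filtered matrix, and that TPM is computed as count * 1,000,000 / total mapped reads of the sample. The function names are hypothetical.

```python
import re


def filter_matrix(rows, inclusion_key="hsa", min_count=1, min_percent=0.0):
    """Filter (feature_id, [count per sample]) rows.

    A row is kept when its feature ID matches the inclusion_key regular
    expression, it has at least one read in some sample, and the percentage
    of samples with count >= min_count is at least min_percent.
    """
    pattern = re.compile(inclusion_key)
    kept = []
    for feature_id, counts in rows:
        if not counts or not pattern.search(feature_id):
            continue  # no samples, or feature not covered by the inclusion key
        if sum(counts) == 0:
            continue  # 0 mapped reads across all samples
        expressed = sum(1 for c in counts if c >= min_count)
        percent = 100.0 * expressed / len(counts)
        if percent >= min_percent:
            kept.append((feature_id, counts))
    return kept


def to_tpm(counts, totals):
    """Scale one row of counts by the per-sample totals of mapped reads."""
    return [c * 1_000_000 / t for c, t in zip(counts, totals)]
```

For example, `filter_matrix(rows, "hsa|hiv", min_count=2, min_percent=50)` would keep only human or HIV features that have at least 2 reads in at least half of the samples.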

Test cases

Test case Parameters IN samples IN TPM OUT countArray OUT no_feature
case1 properties (inclusionKey=mir) samples (missing) countArray no_feature


Generated 2019-02-08 07:42:12 by Anduril 2.0.0