Given an expression matrix and a CSV file containing sample IDs and treatment groups, calculate basic per-gene statistics for each treatment group: mean, median, and standard deviation of expression. Visualizations (histogram, density plot, boxplot, hierarchical clustering) are generated if parameter makeVisuals is true.
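
The per-group calculation described above can be sketched as follows. This is a minimal illustration in Python, not the component's own R code (ExpressionStats.r). It assumes the first column of each CSV holds the IDs and that the sample standard deviation (n - 1 denominator) is the one intended; the function name `group_stats` and the toy data are hypothetical.

```python
import csv
import io
import statistics
from collections import defaultdict


def group_stats(expr_csv, ref_csv, ref_group="Treatment"):
    """Return {gene: {group: (mean, median, sd)}} from two CSV texts."""
    ref_rows = list(csv.DictReader(io.StringIO(ref_csv)))
    id_col = next(iter(ref_rows[0]))  # assumption: first ref column holds sample IDs
    samples_by_group = defaultdict(list)
    for row in ref_rows:
        samples_by_group[row[ref_group]].append(row[id_col])

    result = {}
    for row in csv.DictReader(io.StringIO(expr_csv)):
        gene_col = next(iter(row))  # assumption: first expr column holds gene IDs
        per_group = {}
        for group, samples in samples_by_group.items():
            values = [float(row[s]) for s in samples]
            # Sample standard deviation is undefined for a single value.
            sd = statistics.stdev(values) if len(values) > 1 else float("nan")
            per_group[group] = (statistics.mean(values), statistics.median(values), sd)
        result[row[gene_col]] = per_group
    return result


# Hypothetical toy inputs: two genes, four samples, two treatment groups.
EXPR = "gene,s1,s2,s3,s4\ngeneA,1,3,10,20\ngeneB,5,5,0,4\n"
REF = "sample,Treatment\ns1,control\ns2,control\ns3,treated\ns4,treated\n"

stats = group_stats(EXPR, REF)
```

Here `stats["geneA"]["control"]` holds the mean, median, and standard deviation of geneA across samples s1 and s2.
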
Version | 0.1 |
---|---|
Bundle | sequencing |
Categories | Expression smallRNA |
Authors | Katherine Icay (katherine.icay@helsinki.fi), Alejandra Cervera (alejandra.cervera@helsinki.fi) |
Issue tracker | View/Report issues |
Requires | R ; MASS (R-package) ; ggplot2 (R-package) ; reshape (R-package) |
Source files | component.xml ExpressionStats.r |
Usage | Example with default values |
Name | Type | Mandatory | Description |
---|---|---|---|
expr | CSV | Mandatory | Expression matrix. If matrix has an additional bio_type column annotated, parameter biotype should not be "skip". |
ref | CSV | Mandatory | CSV file containing sample names and treatment groups. Sample names must match column names of expr . |
geneSet | CSV | Optional | 2-column list of interesting genes (gene id, gene name) to create heatmap with. Id column should be the same name as defined in parameter exprID . |
bodyMap | CSV | Optional | Illumina body map. CSV file of geneIds (rows) per tissue (columns). Ids used should be the same as in expr (e.g. Ensembl gene id). |
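
Because sample names in ref must match column names of expr, it can help to check the two files before running the component. The sketch below is one possible pre-check in Python and is not part of the component; it assumes the first column of each file holds the IDs, and the helper name `unmatched_samples` is hypothetical.

```python
import csv
import io


def unmatched_samples(expr_csv, ref_csv):
    """Return ref sample names that have no matching column in expr."""
    expr_header = next(csv.reader(io.StringIO(expr_csv)))
    expr_samples = set(expr_header[1:])  # assumption: first column is the gene ID
    ref_reader = csv.reader(io.StringIO(ref_csv))
    next(ref_reader)  # skip the ref header row
    # Assumption: the first ref column holds the sample names.
    return [row[0] for row in ref_reader if row and row[0] not in expr_samples]


# Hypothetical inputs: s3 is listed in ref but absent from expr.
missing = unmatched_samples(
    "gene,s1,s2\ngeneA,1,2\n",
    "sample,Treatment\ns1,control\ns3,treated\n",
)
```

An empty result means every sample in ref has a corresponding expr column.
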
Name | Type | Description |
---|---|---|
stats | CSV | Rows of genes with columns containing the mean, median and standard deviation of each gene's expression by treatment group. |
statsArray | Array<CSV> | Additional statistics (topExpressed, topExpressedByTissue, zero, low, high, and summary) when makeVisuals is true and biotype is not skipped. |
report | Latex | LaTeX report containing expression visualizations that use the additional inputs geneSet and bodyMap . Produced only if parameter makeVisuals is true. |
Name | Type | Default | Description |
---|---|---|---|
biotype | string | "skip" | Is bio_type annotation added to input expr file? Either blank "" or "skip". Default is to skip this step. |
biotype_min | float | 0 | When biotype is blank "", this is the minimum expression value to consider when calculating additional statistics. |
bodySite | string | "body" | Any body tissue from the Illumina Body Map, e.g. heart, stomach, brain. |
exprID | string | "" | Column name of expr file containing the unique gene IDs that statistics are calculated for. If blank, the first column is used. |
makeVisuals | boolean | false | Perform additional analysis of geneSet. When true, geneSet and bodyMap should be defined. |
refGroup | string | "Treatment" | Column name of ref file containing the treatment information corresponding to each sample ID. If blank, the component will look for a "Treatment" column. |
refID | string | "" | Column name of ref file containing the reference names to match to the expr columns. If blank, the first column is used. |
topNum | int | 10 | Number of top genes to be reported. |
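
The fallback rules for exprID, refID, and refGroup (a blank value selects the first column, or the "Treatment" column for refGroup) can be illustrated with a short Python sketch. This only mirrors the behavior described in the table; the helper `resolve_column` and the example headers are hypothetical, and the error raised for a missing column is an assumption, not documented behavior.

```python
def resolve_column(header, requested, fallback=None):
    """Pick a column name: the requested one, else a fallback, else the first column."""
    name = requested or fallback or header[0]
    if name not in header:
        # Assumption: an unknown column is treated as an error.
        raise ValueError(f"column {name!r} not found in {header}")
    return name


# Hypothetical header rows of the expr and ref files.
expr_header = ["ensembl_id", "s1", "s2"]
ref_header = ["sample", "Treatment"]

expr_id = resolve_column(expr_header, "")  # blank exprID -> first column
ref_id = resolve_column(ref_header, "")  # blank refID -> first column
ref_group = resolve_column(ref_header, "", fallback="Treatment")  # blank refGroup
```
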
Test case | Parameters | IN expr | IN ref | IN geneSet | IN bodyMap | OUT stats | OUT statsArray | OUT report |
---|---|---|---|---|---|---|---|---|
case1 | (missing) | expr | ref | (missing) | (missing) | stats | (missing) | (missing) |