Creates a -1/0/1 matrix that indicates whether a given gene/probe is differentially expressed in individual samples. The decision is made by (a) thresholding fold changes, (b) selecting the P largest and P smallest log ratios, where P is a proportion, or (c) selecting log ratios that lie far enough below or above zero, where the required distance is given as a multiple of the standard deviation or the median absolute deviation.
The expression input can contain either log ratios or channel values, which are handled differently. Log ratios are used directly for fold change thresholding. Channel values are converted to log ratios by dividing them by the reference values. In this case, the expression matrix is assumed to contain both the target and the reference samples. The reference samples are collapsed into a single column by taking their median. The columns that contain reference values are defined in the groups input. If the groups input is not given, the values are assumed to be log ratios.
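The conversion from channel values to log ratios described above can be sketched as follows. This is a minimal illustration, not the component's R implementation: it assumes linear-scale channel values, a base-2 logarithm, and a median taken per gene across the reference columns. The function name `channel_to_log_ratios` and the sample column names are hypothetical, and the sketch keeps the reference columns in the result, which the component may not do.

```python
import math
from statistics import median

def channel_to_log_ratios(expr, columns, reference_columns):
    """Convert channel values to log2 ratios against the per-gene reference median.

    expr: list of rows (one per gene), each a list of linear-scale channel values.
    columns: column names, in the same order as the values in each row.
    reference_columns: names of the reference (control) columns.
    """
    ref_idx = [columns.index(name) for name in reference_columns]
    log_ratios = []
    for row in expr:
        # Collapse the reference samples of this gene into a single value.
        ref = median([row[i] for i in ref_idx])
        # Divide every sample by the reference and take the logarithm
        # (base 2 is an assumption of this sketch).
        log_ratios.append([math.log2(value / ref) for value in row])
    return log_ratios

# Hypothetical data: two genes, two reference samples, two target samples.
columns = ["ref1", "ref2", "tumor1", "tumor2"]
expr = [
    [100.0, 100.0, 400.0, 25.0],
    [50.0, 150.0, 100.0, 100.0],
]
log_ratios = channel_to_log_ratios(expr, columns, ["ref1", "ref2"])
```

For the first gene the reference median is 100, so a target value of 400 is a four-fold increase and a value of 25 is a four-fold decrease.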
Version | 1.1 |
---|---|
Bundle | tools |
Categories | DEG |
Authors | Kristian Ovaska (kristian.ovaska@helsinki.fi), Marko Laakso (Marko.Laakso@Helsinki.FI), Riku Louhimo (Riku.Louhimo@Helsinki.FI) |
Issue tracker | View/Report issues |
Requires | R |
Source files | component.xml SampleExpression.r |
Usage | Example with default values |
Name | Type | Mandatory | Description |
---|---|---|---|
expr | LogMatrix | Mandatory | Expression matrix. The values are assumed to be log ratios if the "groups" input is not given; otherwise, they are assumed to be channel values. |
groups | SampleGroupTable | Optional | Sample groups that define reference (control) columns in the expression input. The reference group name is given with the referenceGroup parameter. |
Name | Type | Description |
---|---|---|
indicator | Matrix | Matrix containing -1/0/1 for under/neutral/overexpression, respectively. The matrix has the same dimensions as the expr input. The value at position i,j is -1 if gene i is underexpressed in sample j, 1 if it is overexpressed, and 0 otherwise. |
logratio | LogMatrix | Log ratios that were computed based on the expr and groups inputs. If "groups" was not provided (i.e., log ratios were not computed in this component), this output is empty and should not be used. |
deviation | Matrix | Depending on 'thresholdType', this is a matrix of standard deviations (sd), median absolute deviations (mad) or empty (fold-change, top-most). |
Name | Type | Default | Description |
---|---|---|---|
referenceGroup | string | "" | Group name in the groups input that contains reference samples. Used to compute log ratios from channel values. |
threshold | float | 4 | Threshold that determines whether a gene is under- or overexpressed. Interpretation of the value depends on thresholdType. |
thresholdType | string | "fold-change" | Defines how differential expression is decided. Legal values are "fold-change", "top-most", "sd" and "mad". For "fold-change", the threshold parameter is a linear fold change threshold. A gene is overexpressed if its ratio on the linear scale is over the threshold and underexpressed if the ratio is under 1/threshold (equivalently, if its log ratio is above log(threshold) or below -log(threshold)). For "top-most", threshold is a fraction between 0 and 1 that defines how many genes with the highest and lowest expression are selected. For example, if threshold is 0.1, the top 10% of genes are overexpressed and the bottom 10% are underexpressed. For "sd" and "mad", the threshold is the coefficient of the standard deviation or median absolute deviation, respectively. The deviations are calculated for the log ratios. |
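The threshold types described above can be sketched as follows. This is an illustration under stated assumptions, not the component's R code: log ratios are taken to be base 2, "top-most" is applied to one sample (column) at a time, the deviation is computed over whatever vector of log ratios is passed in, and the MAD is unscaled (R's `mad()` multiplies by 1.4826 by default). All function names are hypothetical.

```python
import math
from statistics import median, stdev

def fold_change_indicator(log_ratios, threshold):
    """-1/0/1 per value; threshold is a linear fold change (log2 scale assumed)."""
    cutoff = math.log2(threshold)
    return [[1 if v > cutoff else -1 if v < -cutoff else 0 for v in row]
            for row in log_ratios]

def top_most_indicator(column, fraction):
    """Mark the given fraction of highest values as 1 and of lowest values as -1."""
    n = int(len(column) * fraction)  # number of genes selected at each end
    order = sorted(range(len(column)), key=lambda i: column[i])
    result = [0] * len(column)
    for i in order[:n]:
        result[i] = -1
    for i in order[len(column) - n:]:
        result[i] = 1
    return result

def mad(values):
    """Unscaled median absolute deviation (an assumption of this sketch)."""
    center = median(values)
    return median([abs(v - center) for v in values])

def deviation_indicator(values, coefficient, kind="sd"):
    """Mark values further from zero than coefficient * deviation."""
    spread = stdev(values) if kind == "sd" else mad(values)
    cutoff = coefficient * spread
    return [1 if v > cutoff else -1 if v < -cutoff else 0 for v in values]

# Hypothetical log ratios.
fc = fold_change_indicator([[2.5, -3.0, 0.5]], threshold=4)
top = top_most_indicator([0.1, -2.0, 3.0, 0.0, 1.0, -0.5, 0.2, 0.3, -0.1, 0.4], 0.1)
by_sd = deviation_indicator([-4.0, -1.0, 0.0, 1.0, 4.0], 1.0, kind="sd")
by_mad = deviation_indicator([-4.0, -1.0, 0.0, 1.0, 4.0], 2.0, kind="mad")
```

With a fold change threshold of 4, the log2 cutoff is 2, so only log ratios above 2 or below -2 are marked. In the deviation-based variants the cutoff scales with the spread of the data, so the same coefficient selects different absolute log ratios for different inputs.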
Test case | Parameters | IN expr | IN groups | OUT indicator | OUT logratio | OUT deviation |
---|---|---|---|---|---|---|
case1_fc | properties: threshold=4 | expr | (missing) | indicator | logratio | (missing) |
case2_fc_groups | properties: threshold=16, | expr | groups | indicator | logratio | (missing) |
case3_thold | properties: threshold = 0.1, | expr | (missing) | indicator | (missing) | (missing) |
case4_sd | properties: thresholdType = sd, | expr | groups | indicator | logratio | deviation |
case5_mad | properties: thresholdType = mad, | expr | groups | indicator | logratio | (missing) |