

Creates a -1/0/1 matrix that indicates whether a given gene or probe is differentially expressed in each individual sample. Differential expression is decided by one of three methods: (a) fold change thresholding, (b) selecting the P largest and P smallest log ratios, where P is a proportion, or (c) selecting log ratios that lie far enough below or above zero, where the required distance is given as a multiple of the standard deviation or the median absolute deviation.
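
As a rough illustration of method (a), the following sketch marks each value of a small log ratio matrix as -1, 0 or 1. It is not the component's R code: the function name `fold_change_indicator`, the base-2 logarithm, the strict inequalities and the sample values are all assumptions made for the example.

```python
import math

def fold_change_indicator(log_ratios, threshold=4.0):
    """Return a -1/0/1 matrix for a matrix of log ratios (list of rows).

    Assumption: the log ratios are base 2 and `threshold` is a linear
    fold change, so the cutoff on the log scale is log2(threshold).
    """
    cutoff = math.log2(threshold)
    return [
        [1 if value > cutoff else -1 if value < -cutoff else 0 for value in row]
        for row in log_ratios
    ]

# Rows are genes, columns are samples (hypothetical values).
log_ratios = [
    [2.5, 0.3, -2.2],
    [-0.1, 3.0, 1.9],
]
indicator = fold_change_indicator(log_ratios, threshold=4.0)
print(indicator)  # [[1, 0, -1], [0, 1, 0]]
```

With a threshold of 4 the cutoff on the base-2 log scale is 2, so only values beyond +2 or -2 are flagged.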

The expression input can contain either log ratios or channel values, which are handled differently. Log ratios are used directly for fold change thresholding. Channel values are first converted to log ratios by dividing them by the reference samples. In this case, the expression matrix is assumed to contain both the target and the reference samples. The reference samples are collapsed into a single column by taking their median. The columns that contain reference values are defined in the groups input. If the groups input is not given, the values are assumed to be log ratios.
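
The conversion from channel values to log ratios can be sketched as follows. This is a minimal illustration, not the component's implementation. It assumes that the channel values are on a linear scale (for values that are already log-transformed the division would become a subtraction), that the logarithm is base 2, and that the reference columns are left out of the result. The function name `channel_to_log_ratios` and the column names are hypothetical.

```python
import math
from statistics import median

def channel_to_log_ratios(expr, columns, reference_columns):
    """Convert channel values to log2 ratios against the reference median.

    expr: list of rows (one per gene) of positive channel values.
    columns: sample name of each column in expr.
    reference_columns: names of the columns that hold reference samples.
    Returns (target column names, matrix of log2 ratios).
    """
    ref_idx = [columns.index(name) for name in reference_columns]
    target_idx = [i for i in range(len(columns)) if i not in ref_idx]
    log_ratios = []
    for row in expr:
        # Collapse the reference samples of this gene into a single value.
        reference = median(row[i] for i in ref_idx)
        log_ratios.append([math.log2(row[i] / reference) for i in target_idx])
    return [columns[i] for i in target_idx], log_ratios

columns = ["tumor1", "tumor2", "ref1", "ref2", "ref3"]
expr = [
    [400.0, 100.0, 90.0, 100.0, 130.0],  # reference median is 100
    [50.0, 25.0, 200.0, 180.0, 260.0],   # reference median is 200
]
names, log_ratios = channel_to_log_ratios(expr, columns, ["ref1", "ref2", "ref3"])
print(names)       # ['tumor1', 'tumor2']
print(log_ratios)  # [[2.0, 0.0], [-2.0, -3.0]]
```

Using the median rather than the mean keeps a single outlying reference sample from shifting the baseline.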

Version: 1.1
Bundle: tools
Categories: DEG
Authors: Kristian Ovaska (kristian.ovaska@helsinki.fi), Marko Laakso (Marko.Laakso@Helsinki.FI), Riku Louhimo (Riku.Louhimo@Helsinki.FI)
Requires: R
Source files: component.xml, SampleExpression.r
Usage: Example with default values


Inputs

Name Type Mandatory Description
expr LogMatrix Mandatory Expression matrix. The values are assumed to be log ratios if the "groups" input is not given; otherwise, they are assumed to be channel values.
groups SampleGroupTable Optional Sample groups that define reference (control) columns in the expression input. The reference group name is given with the referenceGroup parameter.


Outputs

Name Type Description
indicator Matrix Matrix containing -1/0/1 for under/neutral/overexpression, respectively. The matrix has the same dimensions as the expr input. The value at position i,j is -1 if gene i is underexpressed in sample j, 1 if it is overexpressed, and 0 otherwise.
logratio LogMatrix Log ratios that were computed based on the expr and groups inputs. If "groups" was not provided (i.e., log ratios were not computed in this component), this output is empty and should not be used.
deviation Matrix Depending on thresholdType, this is a matrix of standard deviations ("sd") or median absolute deviations ("mad"), or it is empty ("fold-change", "top-most").


Parameters

Name Type Default Description
referenceGroup string "" Group name in the groups input that contains reference samples. Used to compute log ratios from channel values.
threshold float 4 Threshold that determines whether a gene is under- or overexpressed. Interpretation of the value depends on thresholdType.
thresholdType string "fold-change" Defines how differential expression is decided. Legal values are "fold-change", "top-most", "sd" and "mad". For "fold-change", the threshold parameter is a linear fold change threshold: a gene is overexpressed if its fold change is over the threshold and underexpressed if its fold change is under 1/threshold, which on the log scale corresponds to a log ratio above log(threshold) or below -log(threshold). For "top-most", threshold is a fraction between 0 and 1 that defines how many genes with the highest and lowest expression are selected. For example, if threshold is 0.1, the top 10% of genes are overexpressed and the bottom 10% are underexpressed. For "sd" and "mad", the threshold is a multiplier of the standard deviation or the median absolute deviation, respectively, and it sets how far below or above zero a log ratio must lie. The deviations are calculated from the log ratios.
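
The "top-most", "sd" and "mad" rules described above can be sketched for a single vector of log ratios as shown below. This is an illustration under stated assumptions, not the component's code. The function names are hypothetical, the number of selected genes is rounded down, and the default MAD scaling constant 1.4826 is the default of R's `mad` function, which the component may or may not use. Whether the component computes deviations per gene or per sample is not stated here, so the sketch leaves that choice to the caller.

```python
from statistics import median, stdev

def top_most_indicator(values, fraction):
    """Mark the `fraction` largest values as 1 and the smallest as -1."""
    count = int(len(values) * fraction)  # assumption: round down
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for i in order[:count]:
        result[i] = -1
    for i in order[len(values) - count:]:
        result[i] = 1
    return result

def mad(values, scale=1.4826):
    """Median absolute deviation around the median, scaled as in R's mad()."""
    center = median(values)
    return scale * median(abs(v - center) for v in values)

def deviation_indicator(values, coefficient, kind="sd"):
    """Mark values farther from zero than coefficient * deviation."""
    deviation = stdev(values) if kind == "sd" else mad(values)
    cutoff = coefficient * deviation
    return [1 if v > cutoff else -1 if v < -cutoff else 0 for v in values]

# Hypothetical log ratios with one low and one high outlier.
log_ratios = [-3.0, -0.5, -0.2, 0.0, 0.1, 0.2, 0.4, 0.5, 0.6, 3.5]

print(top_most_indicator(log_ratios, 0.2))
# [-1, -1, 0, 0, 0, 0, 0, 0, 1, 1]
print(deviation_indicator(log_ratios, 2, kind="sd"))
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
print(deviation_indicator(log_ratios, 3, kind="mad"))
# [-1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
```

In this example the two outliers inflate the standard deviation enough that -3.0 is no longer flagged at 2 standard deviations, while the median absolute deviation is barely affected by them and flags both extremes even with a multiplier of 3.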

Test cases

Test case Parameters IN (expr, groups) OUT (indicator, logratio, deviation)
case1_fc properties expr, (missing) indicator, logratio, (missing)
case2_fc_groups properties expr, groups indicator, logratio, (missing)
case3_thold properties (threshold = 0.1, thresholdType = top-most) expr, (missing) indicator, (missing), (missing)
case4_sd properties (thresholdType = sd, referenceGroup = ref, threshold = 2) expr, groups indicator, logratio, deviation
case5_mad properties (thresholdType = mad, referenceGroup = ref, threshold = 3) expr, groups indicator, logratio, (missing)

Generated 2019-02-08 07:42:19 by Anduril 2.0.0