Combines expression data from several samples into one by taking means, medians or log ratios. Combining is done using a sample group table. Each row of the table specifies how one sample group is created from a list of source groups.
If the expression input file contains log ratios, both median and log ratio combining are possible, and the result is log ratios. If the input contains channel values, the groups may be either median groups (result type: channel values) or log ratio groups (result type: log ratio), but not both.
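
As a rough illustration of the two group types described above, the Python sketch below combines one row of log-scale values by median and forms a log ratio from two log-scale channel values. The function names are hypothetical, and the handling of missing values (ignored for the median, propagated for the log ratio) is an assumption, not the documented behavior of SampleCombiner.r.

```python
from statistics import median


def combine_median(values):
    """Median of the non-missing values in one row (None marks a missing value).

    Assumption: missing values are ignored; returns None if nothing is left.
    """
    present = [v for v in values if v is not None]
    return median(present) if present else None


def combine_log_ratio(numerator_log, denominator_log):
    """Log ratio of two log-scale values, using log(a/b) = log(a) - log(b).

    Assumption: a missing value on either side gives a missing result.
    """
    if numerator_log is None or denominator_log is None:
        return None
    return numerator_log - denominator_log


row = [1.0, None, 3.0, 2.0]
print(combine_median(row))          # 2.0
print(combine_log_ratio(5.0, 3.5))  # 1.5
```

Because the values are already logarithmic, the log ratio reduces to a subtraction and no conversion to the linear scale is needed.
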
Version | 1.0 |
---|---|
Bundle | tools |
Categories | Filter |
Authors | Kari Nousiainen (Kari.Nousiainen@Helsinki.FI), Kristian Ovaska (kristian.ovaska@helsinki.fi) |
Requires | R |
Source files | component.xml, SampleCombiner.r |

Inputs

Name | Type | Mandatory | Description |
---|---|---|---|
in | LogMatrix | Mandatory | Expression data, either log ratios or channel expressions. |
groups | SampleGroupTable | Mandatory | Sample group specification. |

Outputs

Name | Type | Description |
---|---|---|
out | LogMatrix | Modified expression data. |

Parameters

Name | Type | Default | Description |
---|---|---|---|
geometricMean | boolean | true | Indicates whether the means will be calculated using geometric (true) or arithmetic mean (false). The calculation is done after transforming logarithmic values to linear values. |
groupIDs | string | "" | Comma-separated list of sample group IDs. If given, combine only these groups. If not given (empty), combine all sample groups. |
includeOriginal | boolean | false | If true, put input groups into the result even if they are not present in 'groups'. |
thresholdRatio | float | 0 | Omits genes that have missing values in too many samples. The actual threshold T is ceiling(thresholdRatio*n), where n is the number of samples used in combining. Only genes that have non-missing values in at least T columns are included; other genes have the combined value set to NA. Note: this parameter is a ratio rather than an integer threshold because the number of samples may differ between sample groups. |
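
The interaction of geometricMean and thresholdRatio can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code of SampleCombiner.r: the function names are hypothetical, logarithms are assumed to be base 2, missing values are represented by None, and the combined mean is assumed to be converted back to the log scale.

```python
import math


def threshold_count(threshold_ratio, n_samples):
    """T = ceiling(thresholdRatio * n), as given in the parameter table."""
    return math.ceil(threshold_ratio * n_samples)


def combine_mean(log_values, geometric_mean=True, threshold_ratio=0.0, base=2.0):
    """Combine one gene's log-scale values from n samples into one value.

    Returns None (NA) when fewer than T values are non-missing.
    Assumptions: base-2 logs, result returned on the log scale.
    """
    present = [v for v in log_values if v is not None]
    t = threshold_count(threshold_ratio, len(log_values))
    if not present or len(present) < t:
        return None
    # Transform logarithmic values to linear values before averaging.
    linear = [base ** v for v in present]
    if geometric_mean:
        mean = math.prod(linear) ** (1.0 / len(linear))
    else:
        mean = sum(linear) / len(linear)
    return math.log(mean, base)


print(threshold_count(0.5, 3))  # 2
print(combine_mean([1.0, None, None, None], threshold_ratio=0.5))  # None
```

With log2 values 1 and 3 (linear values 2 and 8), the geometric mean is 4, which is 2 on the log2 scale, while the arithmetic mean is 5, which is about 2.32. In the last call above, n = 4 gives T = 2, so a gene with a single non-missing value is set to NA.
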

Test case | Parameters | IN in | IN groups | OUT out |
---|---|---|---|---|
case1 | (missing) | in | groups | out |
case2 | (missing) | in | groups | out |
case3 | groupIDs=Group3,Group2 | in | groups | out |
case4_include_orig | includeOriginal = true, | in | groups | out |
case5_ratio | (missing) | in | groups | out |
case6_ratio2 | (missing) | in | groups | out |