Performs pathway analysis by computing the NEk and NTk statistics described in Tian et al. (2005).
The NEk statistic is computed by permuting sample group labels, so enough samples are needed for the permutations to be informative. You can study only the NTk statistic by setting the allpathways parameter to true. Currently, only the first ID is used for genes with multiple comma-separated IDs.
More details can be found in the sigPathway R package documentation.
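To illustrate why permutation-based statistics need enough samples, here is a minimal Python sketch of a generic two-group label-permutation test on a difference in means. It is not the NEk statistic and not the code used by sigPathway; the function name `permutation_p_value` and its arguments are hypothetical and chosen only for this illustration.

```python
import random
from statistics import mean


def permutation_p_value(values, labels, nsim=1000, seed=1):
    """Generic two-group label-permutation test (illustration only).

    The statistic is the absolute difference in group means, which is an
    assumption of this sketch and not the NEk statistic.
    """
    rng = random.Random(seed)
    groups = sorted(set(labels))
    if len(groups) != 2:
        raise ValueError("this sketch handles exactly two groups")

    def stat(lbls):
        a = [v for v, g in zip(values, lbls) if g == groups[0]]
        b = [v for v, g in zip(values, lbls) if g == groups[1]]
        return abs(mean(a) - mean(b))

    observed = stat(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(nsim):
        # Shuffling the labels keeps the group sizes fixed.
        rng.shuffle(shuffled)
        if stat(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (nsim + 1)


values = [1, 2, 3, 10, 11, 12]
labels = ["a", "a", "a", "b", "b", "b"]
p = permutation_p_value(values, labels)
```

With three samples per group there are only 20 distinct label assignments, so the attainable p-values are coarse no matter how large `nsim` is. More samples give a finer null distribution.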
Version | 1.0 |
---|---|
Bundle | microarray |
Categories | Pathway |
Authors | Viljami Aittomaki (viljami.aittomaki@helsinki.fi) |
Issue tracker | View/Report issues |
Requires | R ; sigPathway (R-bioconductor) |
Source files | component.xml SigPathway.r |
Usage | Example with default values |
Name | Type | Mandatory | Description |
---|---|---|---|
expr | LogMatrix | Mandatory | Expression matrix with samples as columns and gene IDs as rows. The IDs must be EntrezGene IDs. |
samplegroup | SampleGroupTable | Mandatory | SampleGroupTable file containing the sample group (e.g. phenotype) of the columns in expr. Not all columns of expr need to be present in the table; columns that are absent are ignored. Multiple groups are allowed. The Type and Description columns are ignored. |
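The input handling described above can be sketched in Python as follows. This is only an illustration under assumptions: the sample group table is modelled as a plain dictionary from column name to group, and the helper names `select_columns` and `first_gene_id` are hypothetical, not part of the component.

```python
def select_columns(expr_columns, sample_groups):
    """Keep the expr columns listed in the sample group table, in expr order.

    Columns missing from the table are dropped, mirroring the documented
    behaviour that such columns are ignored.
    """
    kept = [c for c in expr_columns if c in sample_groups]
    labels = [sample_groups[c] for c in kept]
    return kept, labels


def first_gene_id(id_field):
    """Return the first ID of a comma-separated ID field."""
    return id_field.split(",")[0].strip()


expr_columns = ["s1", "s2", "s3", "s4"]
sample_groups = {"s1": "control", "s3": "case", "s4": "case"}
kept, labels = select_columns(expr_columns, sample_groups)
row_ids = [first_gene_id(x) for x in ["1017", "7157,7158"]]
```

Here `s2` is dropped because it has no entry in the table, and the second row keeps only its first ID.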
Name | Type | Description |
---|---|---|
pathways | CSV | High-ranking pathways according to the statistics used by sigPathway. Contains the columns 'Gene Set Category', 'Pathway', 'Set Size', 'Percent Up', 'NTk Stat', 'NTk q-value', 'NTk Rank', 'NEk Stat', 'NEk q-value', 'NEk Rank' and 'Probes'. |
Name | Type | Default | Description |
---|---|---|---|
allpathways | boolean | false | Indicates whether to include all pathways or just the top npath pathways (sorted by the sum of ranks of both statistics) in the result table. If false, only consistently high ranking pathways are considered. If true, the resulting pathway table can be very long. |
alwaysRandPerm | boolean | false | Indicates whether the algorithm will use random permutations even when nsim is greater than the total number of unique permutations possible with the number of samples and their groups (or phenotypes). If false, complete permutation is used in such cases. |
maxNPS | int | 500 | Maximum number of probe sets a pathway can contain to be included in the analysis. |
minNPS | int | 20 | Minimum number of probe sets a pathway must contain to be included in the analysis. |
npath | int | 25 | Number of top gene sets to consider from each statistic when ranking the top pathways. |
nsim | int | 1000 | Number of permutations used for computing the null distribution. |
seed | int | 0 | Seed for the random number generator (used for permutations). Use a nonzero value to set the seed manually; the default value, zero, leaves the seed unset. |
weightType | string | "constant" | Type of weight to use when calculating NEk statistics, 'constant' or 'variable'. Constant is faster. However, if the histogram of unadjusted p-values (for the genes from a t-test-like hypothesis test) is nearly horizontal, and the top ranked pathways have high NEk q-values (i.e. approaching 1), setting weightType to 'variable' should help lower some of the NEk q-values. |
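The interplay between nsim and alwaysRandPerm described above can be sketched in Python. The sketch assumes that the number of unique permutations is the number of distinct assignments of group labels to samples (a multinomial coefficient); how sigPathway actually counts them is not stated here, and the function names are hypothetical.

```python
from collections import Counter
from math import factorial


def unique_label_permutations(labels):
    """Count distinct label assignments: n! / (n1! * n2! * ...).

    Assumption of this sketch: this is the count that nsim is compared to.
    """
    total = factorial(len(labels))
    for size in Counter(labels).values():
        total //= factorial(size)
    return total


def permutation_mode(labels, nsim, always_rand_perm=False):
    """Return 'complete' when nsim exceeds the unique permutations,
    unless random permutation is forced; otherwise return 'random'."""
    if always_rand_perm or nsim <= unique_label_permutations(labels):
        return "random"
    return "complete"


small = ["a"] * 3 + ["b"] * 3
large = ["a"] * 10 + ["b"] * 10
mode_small = permutation_mode(small, nsim=1000)
mode_forced = permutation_mode(small, nsim=1000, always_rand_perm=True)
mode_large = permutation_mode(large, nsim=1000)
```

With three samples per group and the default nsim of 1000, this sketch selects complete permutation; with ten samples per group it selects random permutation.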
Test case | Parameters | IN expr | IN samplegroup | OUT pathways |
---|---|---|---|---|
case1 | (missing) | expr | samplegroup | (expecting failure) |
case2 | properties: minNPS=3, | expr | samplegroup | pathways |