Applies singular value decomposition (SVD) to gene sets to test whether a set of genes is significantly differentially expressed.
In the SVD method, the expression values of multiple variables (genes) are condensed into a single variable called a metagene. This metagene is the leading component of the SVD and captures most of the variation in the expression matrix.
The main idea of SVDAnalyzer is to summarize the expression values of a sample in a gene set into a single value called the activity level. The activity level is defined for each sample in each pathway by using SVD. It can be regarded as a weighted sum of the expression values, where the weights are given by the first metagene.
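
The weighted-sum view of the activity level can be illustrated with a small sketch. It is written in plain Python rather than the component's R, and it is not the code in analyzer.R. It assumes the gene set's expression matrix has genes in rows and samples in columns, reads "the first metagene" as the leading left singular vector (one weight per gene), and approximates that vector with power iteration instead of a full SVD routine. Any normalization the component may apply is not shown, and the function names are hypothetical.

```python
import math
import random


def first_left_singular_vector(X, iterations=200, seed=0):
    """Approximate the leading left singular vector of X (genes x samples)
    by power iteration on X * X^T. The sign of the result is arbitrary."""
    n_genes, n_samples = len(X), len(X[0])
    rng = random.Random(seed)
    # Positive random start; assumed not orthogonal to the leading vector.
    u = [rng.random() + 0.1 for _ in range(n_genes)]
    for _ in range(iterations):
        # proj = X^T u (one value per sample)
        proj = [sum(u[i] * X[i][j] for i in range(n_genes))
                for j in range(n_samples)]
        # w = X proj = (X X^T) u
        w = [sum(X[i][j] * proj[j] for j in range(n_samples))
             for i in range(n_genes)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm == 0.0:
            break  # X is all zeros; there is no direction to recover
        u = [v / norm for v in w]
    return u


def activity_levels(X):
    """Return one activity level per sample: the sum of that sample's
    expression values weighted by the leading left singular vector."""
    u = first_left_singular_vector(X)
    return [sum(u[i] * X[i][j] for i in range(len(X)))
            for j in range(len(X[0]))]
```

Because singular vectors are only defined up to sign, the sign of the activity levels is arbitrary; a two-sided comparison between groups is unaffected by it.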
The activity levels are then used to compare the sample groups, for example by calculating a t statistic for each pathway.
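
A group comparison on activity levels might look like the following Python sketch. It assumes Welch's form of the t statistic and a permutation scheme that shuffles sample labels; the component's exact statistic and permutation scheme are not described here, so both are assumptions. The `nperm` and `seed` arguments mirror the parameters of the same names, and the function names are hypothetical.

```python
import math
import random


def t_statistic(a, b):
    """Welch's t statistic for two groups of activity levels."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    var_a = sum((x - mean_a) ** 2 for x in a) / (len(a) - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (len(b) - 1)
    se = math.sqrt(var_a / len(a) + var_b / len(b))
    if se == 0.0:
        # Degenerate case: no variation within either group.
        if mean_a == mean_b:
            return 0.0
        return math.copysign(math.inf, mean_a - mean_b)
    return (mean_a - mean_b) / se


def permutation_p_value(a, b, nperm=10000, seed=12345):
    """Two-sided p-value estimated by shuffling sample labels nperm times."""
    rng = random.Random(seed)
    observed = abs(t_statistic(a, b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(nperm):
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[:len(a)], pooled[len(a):]
        if abs(t_statistic(perm_a, perm_b)) >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (hits + 1) / (nperm + 1)
```

Fixing the seed makes the estimated p-value reproducible between runs, which is presumably why the component exposes a seed parameter.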
Required packages
Install package 'genefilter' from R: source('http://bioconductor.org/biocLite.R'); biocLite();
Install package 'csbl.go' from CSBL pages: csbl.go (requires package RUnit)
Install package 'Category' from R: biocLite('Category');
Version | 1.0 |
---|---|
Bundle | microarray |
Categories | Analysis, Pathway |
Authors | Minna Miettinen (Minna.Miettinen@Helsinki.FI) |
Requires | R ; libxml2-dev (DEB) ; MASS (R-package) ; KEGG.db (R-bioconductor) ; genefilter (R-bioconductor) ; csbl.go (R-package) ; Category (R-bioconductor) |
Source files | component.xml, analyzer.R |

Inputs

Name | Type | Mandatory | Description |
---|---|---|---|
annotation | CSV | Mandatory | Gene annotation table. Parameters sourceId and targetId specify the columns containing the gene identifiers and their gene set identifiers, respectively. |
expr | LogMatrix | Mandatory | Expression values of the genes. The first column should contain the same gene identifiers as the sourceId column of the annotation table. The table may contain more rows (genes) than there are sourceIds in the annotation table, but expression values must be present for every sourceId. |
sampleGroupTable | SampleGroupTable | Mandatory | Maps each sample to its group. The table should contain at least two groups. Activity levels are determined by performing SVD on the expression data from all samples, but only two groups can be compared with the t-test. These groups are selected with the parameters group1 and group2. |
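
The requirement that every annotated gene has expression values can be checked before running the analysis. The Python sketch below is only an illustration: it assumes the annotation table has been read into a list of dictionaries keyed by column name, and the function name is hypothetical.

```python
def missing_expression_ids(annotation_rows, expr_ids, source_id="GeneId"):
    """Return the annotated gene identifiers that have no row in expr."""
    available = set(expr_ids)
    annotated = {row[source_id] for row in annotation_rows}
    return sorted(annotated - available)
```

An empty result means the expr table covers every sourceId; extra genes in expr are allowed and are simply ignored by this check.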

Outputs

Name | Type | Description |
---|---|---|
report | Latex | LaTeX report of the SVD results: activity level plots of interesting gene sets and a table of statistics for each gene set. |
resultTable | CSV | Analysis results, sorted in ascending order of permutation p-value. |

Parameters

Name | Type | Default | Description |
---|---|---|---|
database | string | "GO" | The gene set identifiers used. The possible choices are "GO", "KEGG" or "other". |
group1 | string | "" | The group whose expression values will be compared against group2. |
group2 | string | "" | The group whose expression values will be compared against group1. |
nPath | int | 50 | The number of top pathways presented in the LaTeX report. |
nperm | int | 10000 | The number of permutations generated when estimating the p-value for a pathway. |
pPlots | float | 0.01 | The p-value threshold for plotting. Activity levels are plotted for pathways whose p-value is smaller than pPlots. An error might occur if pPlots is set too high. |
pagebreak | boolean | false | Whether the resulting document should start with a page break. |
section | string | "" | Section title for the table container, or an empty string if no section should be generated. |
sectionType | string | "subsection" | Type of LaTeX section, usually one of section, subsection, or subsubsection. No section statement is written if the section title is empty. |
seed | int | 12345 | Seed for the pseudorandom number generator. |
sourceId | string | "GeneId" | Character string specifying the column of gene identifiers in the input annotation. |
targetId | string | "GeneSetId" | Character string specifying the column of gene set identifiers in the input annotation. |
threshold | int | 10 | The minimum number of genes in a gene set. |
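
How sourceId, targetId and threshold interact can be sketched in a few lines of Python. The sketch assumes the annotation table holds one row per gene and gene set pair and that the threshold counts the distinct genes annotated to a set; neither detail is stated above, and the function name is hypothetical.

```python
from collections import defaultdict


def gene_sets_from_annotation(rows, source_id="GeneId",
                              target_id="GeneSetId", threshold=10):
    """Group gene identifiers by gene set and drop sets that have
    fewer than `threshold` distinct genes."""
    sets = defaultdict(set)
    for row in rows:
        sets[row[target_id]].add(row[source_id])
    return {name: sorted(genes)
            for name, genes in sets.items()
            if len(genes) >= threshold}
```

With the default threshold of 10, a gene set annotated with only a handful of genes would be excluded before any SVD is computed.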

Test case | Parameters | IN annotation | IN expr | IN sampleGroupTable | OUT report | OUT resultTable |
---|---|---|---|---|---|---|
case1 | properties: group1 = low, | annotation | expr | sampleGroupTable | report | resultTable |
case2 | properties: pPlots = 0.01, | annotation | expr | sampleGroupTable | report | resultTable |
case3 | properties: pPlots = 0.01, | annotation | expr | sampleGroupTable | report | resultTable |
case4 | properties: database = KEGG, | annotation | expr | sampleGroupTable | report | resultTable |