Encapsulates several algorithms with similar inputs for integrating copy number with RNA expression (see the algorithm parameter for the supported values). All inputs must be in the same order and have matching IDs.
In pint (R Bioconductor), canonical correlation analysis using pSimCCA is applied to detect integrative effects of CNA and gene expression. Inputs must be matched, i.e., dim(exprMatrix) = dim(cnaMatrix), and both rows (samples) and columns (genes/transcripts/exons) must match.
In DRI (R CRAN), either a standard correlation coefficient between the two data sets or a supervised learning method is employed to detect similarly directed abnormalities.
In edira, equally directed deviations from reference samples are detected by a combined modified correlation coefficient followed by a Wilcoxon test.
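The matching requirement and the simplest DRI option (a per-gene correlation coefficient) can be illustrated with a short sketch. This is not the component's R implementation: it is a Python illustration under stated assumptions. The function names `check_matched`, `pearson` and `correlate_genes`, the gene-wise storage of values, and the toy data are all hypothetical.

```python
import math


def check_matched(expr_ids, cna_ids):
    """Raise ValueError unless both ID lists are identical and in the same order."""
    if len(expr_ids) != len(cna_ids):
        raise ValueError("dimension mismatch: %d vs %d" % (len(expr_ids), len(cna_ids)))
    for position, (expr_id, cna_id) in enumerate(zip(expr_ids, cna_ids)):
        if expr_id != cna_id:
            raise ValueError("ID mismatch at position %d: %r vs %r"
                             % (position, expr_id, cna_id))


def pearson(x, y):
    """Plain Pearson correlation; returns NaN if either vector is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def correlate_genes(genes, expr_values, cna_values):
    """Correlate expression with copy number across samples, one gene at a time.

    Assumption: expr_values[i] and cna_values[i] hold the per-sample values
    of genes[i], with samples in the same order in both inputs.
    """
    return {gene: pearson(e, c)
            for gene, e, c in zip(genes, expr_values, cna_values)}


# Hypothetical toy data: two genes measured in four samples.
samples_expr = ["s1", "s2", "s3", "s4"]
samples_cna = ["s1", "s2", "s3", "s4"]
genes = ["geneA", "geneB"]
expr_values = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]]
cna_values = [[0.1, 0.2, 0.3, 0.4], [4.0, 3.0, 2.0, 1.0]]

check_matched(samples_expr, samples_cna)  # passes silently when IDs match
scores = correlate_genes(genes, expr_values, cna_values)
```

A gene whose expression rises with its copy number gets a score near 1, and one that moves in the opposite direction gets a score near -1. Checking the IDs before correlating matters because a silent sample reordering would still produce numbers, only meaningless ones.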
Version | 1.0 |
---|---|
Bundle | microarray |
Categories | Integration Copy Number Analysis |
Authors | Riku Louhimo (Riku.Louhimo@Helsinki.FI) |
Issue tracker | View/Report issues |
Requires | edira (R-package) ; mixOmics (R-package) ; pint (R-bioconductor) ; PREDA (R-package) ; SIM (R-bioconductor) ; iCluster (R-bioconductor) ; tilingArray (R-bioconductor) |
Source files | component.xml CN2GECollection.r |
Usage | Example with default values |
Name | Type | Mandatory | Description |
---|---|---|---|
exprMatrix | LogMatrix | Mandatory | Matrix of expression values. |
cnaMatrix | CSV | Mandatory | Matrix of copy number values. |
exprAnnotation | AnnotationTable | Mandatory | Annotations for rows. Shall include id, chromosome and locus. |
exprRefMatrix | LogMatrix | Optional | Matrix of normalized expression values from reference samples. Only used if algorithm=edira. |
cnaRefMatrix | CSV | Optional | Matrix of normalized copy number values from reference samples. Only used if algorithm=edira. |
labels | CSV | Optional | Sample groupings for DRImethod=SAM. |
Name | Type | Description |
---|---|---|
concomitantGenes | CSV | Genes that show concomitant expression and copy number aberration. |
concomitantGenes2 | CSV | Genes that show concomitant (down) expression and copy number aberration. This output is reserved for SODEGIR. |
plot | Latex | Clustering plot for iCluster algorithm. |
Name | Type | Default | Description |
---|---|---|---|
DRImethod | string | "pearson" | Specify the integration method for DRI. One of "pearson", "spearman", "ttest" or "SAM", where SAM denotes a supervised learning algorithm, and pearson and spearman are variants of the correlation coefficient. |
SGRthold | float | 0.1 | Thresholds for test statistic distribution for SODEGIR. |
algorithm | string | (no default) | Define the algorithm for integration. Must be one of "pint", "edira", "SODEGIR", "SIM", "iCluster", "integrOmics" or "DRI". |
ediraMaxseg | int | 50 | Maximum number of segments for edira segmentation. Only used if algorithm='edira' and ediraSegment='true'. |
ediraSegment | boolean | false | Define whether CNA data is segmented prior to integration. Only used if algorithm=edira. |
endCol | string | "" | Name of the column with probe end locus information. |
fdrLimit | float | 0.1 | FDR cutoff for results. Only used if algorithm=DRI, SODEGIR or SIM. |
iClusterK | int | 3 | Number of k-means clusters (K) for iCluster. |
iClusterOpt | boolean | false | Optimize K for iCluster. See also the 'iClusterK' parameter. Enabling this slows down processing. |
locusCol | string | "" | Name of the column with locus information. |
perm | int | 1000 | Number of permutations for the null distribution. Only used if algorithm=DRI or SODEGIR. |
pintArm | string | "" | Arm for which scores are calculated. Must be one of "p", "q" or empty. Default value indicates both arms to be analyzed. Only used if algorithm=pint. |
pintMethod | string | "pSimCCA" | Specify the dependency model. One of: "pCCA", probabilistic canonical correlation analysis (Bach and Jordan 2005); "pPCA", probabilistic principal component analysis (Tipping and Bishop 1999); "pFA", probabilistic factor analysis (Rubin and Thayer 1982); "pSimCCA", probabilistic similarity constrained canonical correlation analysis (Lahti et al. 2009); "TPriorpSimCCA", probabilistic similarity constrained canonical correlation analysis with the possibility to tune the T prior (Lahti et al. 2009). Only used if algorithm=pint. |
scoreLimit | float | 0.1 | Result cutoff limit. Only used if algorithm=pint. |
transformType | string | "raw" | Type of transformation which is applied when executing drsam algorithm. Must be one of "raw", "rank" or "standardize". Only used if algorithm=DRI and DRImethod=SAM. |
windowSize | int | 10 | Window size for pint algorithm sliding window. Only used if algorithm=pint. |
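The constraints in the parameter table can be summarized as a small validation sketch. It is a Python illustration only, not part of the component: the names `DEFAULTS`, `ALLOWED` and `validate_params` are hypothetical, while the defaults and allowed values are copied from the table above.

```python
# Defaults as listed in the parameter table; "algorithm" has no default.
DEFAULTS = {
    "DRImethod": "pearson",
    "SGRthold": 0.1,
    "ediraMaxseg": 50,
    "ediraSegment": False,
    "endCol": "",
    "fdrLimit": 0.1,
    "iClusterK": 3,
    "iClusterOpt": False,
    "locusCol": "",
    "perm": 1000,
    "pintArm": "",
    "pintMethod": "pSimCCA",
    "scoreLimit": 0.1,
    "transformType": "raw",
    "windowSize": 10,
}

# Allowed values for the enumerated string parameters.
ALLOWED = {
    "algorithm": {"pint", "edira", "SODEGIR", "SIM", "iCluster", "integrOmics", "DRI"},
    "DRImethod": {"pearson", "spearman", "ttest", "SAM"},
    "pintArm": {"p", "q", ""},
    "pintMethod": {"pCCA", "pPCA", "pFA", "pSimCCA", "TPriorpSimCCA"},
    "transformType": {"raw", "rank", "standardize"},
}


def validate_params(params):
    """Merge user parameters with the defaults and check enumerated values.

    Returns the merged dictionary; raises ValueError on a missing
    'algorithm' or on a value outside the allowed set.
    """
    merged = dict(DEFAULTS)
    merged.update(params)
    if "algorithm" not in merged:
        raise ValueError("'algorithm' has no default and must be given")
    for name, allowed in ALLOWED.items():
        if merged[name] not in allowed:
            raise ValueError("%s=%r is not one of %s"
                             % (name, merged[name], sorted(allowed)))
    return merged


# Hypothetical usage: only the algorithm is set, everything else is defaulted.
pint_params = validate_params({"algorithm": "pint"})
```

Validating up front gives a clear error for a misspelled algorithm name instead of a failure deep inside one of the wrapped packages.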
Test case | Parameters | IN exprMatrix | IN cnaMatrix | IN exprAnnotation | IN exprRefMatrix | IN cnaRefMatrix | IN labels | OUT concomitantGenes | OUT concomitantGenes2 | OUT plot |
---|---|---|---|---|---|---|---|---|---|---|
case1 | properties (algorithm=pint) | exprMatrix | cnaMatrix | exprAnnotation | (missing) | (missing) | (missing) | concomitantGenes | (missing) | (missing) |
case3_edira | properties (algorithm=edira) | exprMatrix | cnaMatrix | exprAnnotation | exprRefMatrix | cnaRefMatrix | (missing) | concomitantGenes | (missing) | (missing) |
case4 | properties (algorithm=pint) | exprMatrix | cnaMatrix | exprAnnotation | (missing) | (missing) | (missing) | concomitantGenes | (missing) | (missing) |
case5_SIM | properties (algorithm=SIM) | exprMatrix | cnaMatrix | exprAnnotation | (missing) | (missing) | (missing) | concomitantGenes | (missing) | (missing) |
case6_SODEGIR | properties (algorithm=SODEGIR) | exprMatrix | cnaMatrix | exprAnnotation | (missing) | (missing) | (missing) | (missing) | (missing) | (missing) |
case7_iCluster | properties (algorithm=iCluster) | exprMatrix | cnaMatrix | exprAnnotation | (missing) | (missing) | (missing) | concomitantGenes | (missing) | plot |
case8_int | properties (algorithm=integrOmics) | exprMatrix | cnaMatrix | exprAnnotation | (missing) | (missing) | (missing) | (missing) | (missing) | (missing) |