Segments Array CGH data with the Circular Binary Segmentation (CBS) algorithm and produces hard or soft copy number aberration calls.
The 'geneAnnotation' input file must contain separate columns for the chromosome, the start position, and the end position (Chr, start, and end by default). Chromosome identifiers must not carry the string 'chr' in front; valid identifiers are 1, 2, ..., 22, X, and Y. The names of these columns in the 'geneAnnotation' file are user definable.
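The identifier rule above can be checked before running the component. The following is a minimal sketch, not part of the component itself: the helper name `normalize_chromosome` is hypothetical, and stripping a leading 'chr' (rather than rejecting it) is an assumption made for convenience.

```python
# Valid identifiers as stated above: 1..22, X, Y (without a 'chr' prefix).
VALID_CHROMOSOMES = {str(n) for n in range(1, 23)} | {"X", "Y"}


def normalize_chromosome(value):
    """Return the bare chromosome identifier, or raise ValueError.

    Hypothetical helper: strips an optional 'chr' prefix and upper-cases
    the result so that 'chrx' becomes 'X'.
    """
    text = value.strip()
    if text.lower().startswith("chr"):
        text = text[3:]
    text = text.upper()
    if text not in VALID_CHROMOSOMES:
        raise ValueError(f"unsupported chromosome identifier: {value!r}")
    return text


if __name__ == "__main__":
    for raw in ["chr7", "X", "22"]:
        print(raw, "->", normalize_chromosome(raw))
```

Identifiers outside the listed set (for example a mitochondrial chromosome) raise an error in this sketch instead of being passed through silently.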
Two output modes exist. The first (hard) calls the copy number of a probe aberrated if it lies more than two standard deviations from the mean of the intensities of all the samples under scrutiny. The second (soft) makes copy number aberration calls probabilistically with either the CGHcall or the FastCall (part of the TASSO package) algorithm. Please note that FastCall only works on 32-bit systems.
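The hard mode can be illustrated with a short sketch. It assumes the thresholds sit two sample standard deviations on either side of the mean of all intensities, and that calls are encoded as 1 (gain), -1 (loss), and 0 (no change). The function names and the encoding are illustrative assumptions, not the component's actual implementation.

```python
from statistics import mean, stdev


def hard_thresholds(intensities, n_sd=2.0):
    """Return (lower, upper) thresholds at mean -/+ n_sd standard deviations.

    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    center = mean(intensities)
    spread = stdev(intensities)
    return center - n_sd * spread, center + n_sd * spread


def hard_call(value, lower, upper):
    """Encode a call as 1 (gain), -1 (loss), or 0 (no change)."""
    if value > upper:
        return 1
    if value < lower:
        return -1
    return 0


if __name__ == "__main__":
    pooled = [-1.0, 0.0, 1.0]  # toy intensities pooled over all samples
    lower, upper = hard_thresholds(pooled)
    for segment_mean in [2.5, 0.3, -2.5]:
        print(segment_mean, hard_call(segment_mean, lower, upper))
```

Setting upperLimit or lowerLimit explicitly would correspond to replacing the computed thresholds with fixed values.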
Several plots can be produced based on user parameters. Example images can be found here.
The outputs depend on the analysis procedure. The following matrices can be output: all segments with their respective copy number aberration calls, thresholds for significant copy number aberrations, and segment call probabilities.
Version | 2.0 |
---|---|
Bundle | microarray |
Categories | Agilent Copy Number Analysis |
Authors | Riku Louhimo (Riku.Louhimo@Helsinki.FI) |
Issue tracker | View/Report issues |
Requires | R ; limma (R-bioconductor) ; DNAcopy (R-bioconductor) ; CGHcall (R-bioconductor) ; TASSO (R-package) |
Source files | component.xml ACGHsegment.r |
Usage | Example with default values |
Name | Type | Mandatory | Description |
---|---|---|---|
caseChan | CSV | Mandatory | CSV file containing the normalized probe intensities for the case channel. First row should have the probe names and samples should be columnwise. |
geneAnnotation | AnnotationTable | Mandatory | Probewise annotations as produced by the AgilentReader component. Only probes present in the caseChan CSV may be included. Use CSVFilter if the probes do not otherwise match. |
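
The requirement that geneAnnotation contain only probes present in caseChan can be checked up front. The sketch below is a plain-Python stand-in for that check and does not reproduce CSVFilter; the function name and the `ProbeName` column are hypothetical.

```python
def split_annotation_by_case_probes(case_probes, annotation_rows,
                                    probe_column="ProbeName"):
    """Split annotation rows into those usable with caseChan and the rest.

    case_probes: iterable of probe names found in the caseChan file.
    annotation_rows: list of dicts, one per annotated probe.
    probe_column: hypothetical name of the probe identifier column.
    Returns (kept_rows, dropped_probe_names).
    """
    known = set(case_probes)
    kept = [row for row in annotation_rows if row[probe_column] in known]
    dropped = [row[probe_column] for row in annotation_rows
               if row[probe_column] not in known]
    return kept, dropped


if __name__ == "__main__":
    probes = ["A_1", "A_2"]
    annotation = [
        {"ProbeName": "A_1", "Chr": "1", "start": "100", "end": "160"},
        {"ProbeName": "A_3", "Chr": "X", "start": "500", "end": "560"},
    ]
    kept, dropped = split_annotation_by_case_probes(probes, annotation)
    print(len(kept), "rows kept; dropped:", dropped)
```

A non-empty list of dropped probes indicates that the annotation needs filtering before it is passed to the component.
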
Name | Type | Description |
---|---|---|
report | Latex | LaTeX report for the analysis. |
segments | CSV | Segmented probewise copy number change values. |
tholds | CSV | Thresholds that were used to call a segment aberrated. |
lossProbs | CSV | Probewise probability of a loss having occurred in the segment to which the probe was assigned. |
gainProbs | CSV | Probewise probability of a gain having occurred in the segment to which the probe was assigned. |
normProbs | CSV | Probewise probability of no copy number change in the segment to which the probe was assigned. |
rawSegments | CSV | Probewise segment means. |
frequency | CSV | Probewise copy-number alteration frequencies. |
Name | Type | Default | Description |
---|---|---|---|
CGHCallMaxnumseg | int | 100 | Only used if callProbMethod=CGHCall. Maximum number of segments in a sample to be used for fitting the probability model. |
CGHCallPrior | string | "auto" | Only used if callProbMethod=CGHCall. Method for determining the prior probabilities in the CGHcall algorithm. Must be one of "auto", "all", "not all". |
CGHCallRobustsig | boolean | true | Only used if callProbMethod=CGHCall. Setting this to true enforces a lower bound on the normal segments. |
alpha | float | 0.01 | P-value threshold at which CBS accepts a breakpoint. |
bpEndCol | string | "end" | Column name for chromosome end basepair in geneAnnotation. |
bpStartCol | string | "start" | Column name for chromosome start basepair in geneAnnotation. |
callProbMethod | string | "CGHCall" | Method with which to call CNA segment probabilities. Must be either CGHCall or FastCall. FastCall is significantly (30000 times) faster while CGHCall is more accurate. |
callProbabilities | boolean | false | Enabling this makes the component estimate probabilities for CNA segments via the CGHcall package. Only guaranteed to work when multiple chromosomes are analyzed simultaneously. |
chrColumn | string | "Chr" | Column name for chromosome identifiers in geneAnnotation. |
filterNA | boolean | true | Filter out probes with NA values. This is always true when 'callProbabilities' is true. |
lowerLimit | float | 0.0 | Intensity limit below which a segment is called lost. With the default value, the component computes the mean of the sample set and places the lower threshold for CNA calls two standard deviations below it. |
minWidth | int | 2 | Minimum number of probes for CBS to define a segment. CBS only allows widths between 2 and 5. |
nSegFit | int | 3000 | Maximum number of segments used for fitting the mixture model in CGHcall probability calculations. Disabled if callProbabilities=false. Decreasing this lowers accuracy but can speed computation significantly. |
outputAllSegs | boolean | false | Output all segmentation results even if non-significant or non-aberrated. |
plotChromosomes | string | "0" | Comma-separated list of chromosomes for which additional plots are produced. The whole genome is plotted by default for each sample. Any value other than 0 generates additional plots of the listed chromosomes for each sample. |
plotEverySample | boolean | false | Enabling this will make the component print each sample separately. |
undoSD | int | 3 | Only used if undoSplits=sdundo. Number of SDs by which two adjacent segments must differ; segments that are closer than this are combined. |
undoSplits | string | "sdundo" | A character string specifying how change-points are to be undone, if at all. Undoing change-points decreases the sensitivity of segmentation. Choices are "none", "prune", which uses a sum of squares criterion, and "sdundo" (default), which undoes splits that are not at least undoSD SDs apart (3 by default). |
upperLimit | float | 0.0 | Intensity limit above which a segment is called gained. With the default value, the component computes the mean of the sample set and places the upper threshold for CNA calls two standard deviations above it. |
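
Several parameters in the table accept only a restricted set of values. The sketch below collects those documented constraints into one check. The function `validate_parameters` and the `DEFAULTS` dictionary are hypothetical and cover only the four parameters whose allowed values are stated above; the component may validate differently.

```python
# Allowed values as documented in the parameter table.
ALLOWED_UNDO_SPLITS = {"none", "prune", "sdundo"}
ALLOWED_CALL_METHODS = {"CGHCall", "FastCall"}
ALLOWED_PRIORS = {"auto", "all", "not all"}

# Documented defaults for the parameters checked here.
DEFAULTS = {
    "minWidth": 2,
    "undoSplits": "sdundo",
    "callProbMethod": "CGHCall",
    "CGHCallPrior": "auto",
}


def validate_parameters(overrides=None):
    """Merge overrides into the defaults and raise ValueError on bad values."""
    params = dict(DEFAULTS)
    params.update(overrides or {})
    errors = []
    if not 2 <= params["minWidth"] <= 5:
        errors.append("minWidth must be between 2 and 5")
    if params["undoSplits"] not in ALLOWED_UNDO_SPLITS:
        errors.append("undoSplits must be none, prune, or sdundo")
    if params["callProbMethod"] not in ALLOWED_CALL_METHODS:
        errors.append("callProbMethod must be CGHCall or FastCall")
    if params["CGHCallPrior"] not in ALLOWED_PRIORS:
        errors.append("CGHCallPrior must be auto, all, or not all")
    if errors:
        raise ValueError("; ".join(errors))
    return params


if __name__ == "__main__":
    print(validate_parameters({"undoSplits": "prune"}))
    try:
        validate_parameters({"minWidth": 6})
    except ValueError as error:
        print("rejected:", error)
```

In this sketch, minWidth=6 is rejected because CBS only allows widths between 2 and 5.
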
Test case | Parameters | IN caseChan | IN geneAnnotation | OUT report | OUT segments | OUT tholds | OUT lossProbs | OUT gainProbs | OUT normProbs | OUT rawSegments | OUT frequency |
---|---|---|---|---|---|---|---|---|---|---|---|
case1 | properties: chrColumn=Chromosome, | caseChan | geneAnnotation | report | segments | tholds | lossProbs | gainProbs | normProbs | rawSegments | (missing) |
case2 | properties: plotChromosomes = 6,7, | caseChan | geneAnnotation | report | segments | tholds | lossProbs | gainProbs | normProbs | (missing) | (missing) |
case3 | properties: upperLimit=0.583291228317388, | caseChan | geneAnnotation | report | segments | tholds | (missing) | (missing) | (missing) | (missing) | (missing) |
case4_InvalidParam | properties: minWidth=6, | caseChan | geneAnnotation | (expecting failure) | (expecting failure) | (expecting failure) | (expecting failure) | (expecting failure) | (expecting failure) | (expecting failure) | (expecting failure) |
case5_invalidColumnNames | properties: undoSplits=prune | caseChan | geneAnnotation | (expecting failure) | (expecting failure) | (expecting failure) | (expecting failure) | (expecting failure) | (expecting failure) | (expecting failure) | (expecting failure) |
case6_allSegs | properties: upperLimit=1, | caseChan | geneAnnotation | (missing) | segments | (missing) | (missing) | (missing) | (missing) | (missing) | (missing) |