
RUBIC

RUBIC detects recurrent copy number aberrations using copy number breaks, rather than recurrently amplified or deleted regions. This allows for a much simpler approach, because recursive peak-splitting procedures and repeated re-estimation of the background model are avoided. Furthermore, the false discovery rate is controlled at the level of called regions rather than at the probe level.

Consult the RUBIC webpage for the full documentation.

Version 1.0
Bundle sequencing
Categories Copy Number Analysis
Authors Gabriele Partel (gabrielepartel@gmail.com)
Issue tracker View/Report issues
Requires installer (bash); biomaRt (R-bioconductor); data.table 1.9.4; pracma (R-package); digest (R-package); ggplot2 1.0.1; gtable (R-package)
Source files component.xml main.sh
Usage Example with default values

Inputs

Name Type Mandatory Description
segments Array<CSV> Mandatory Array of segmented copy number files. Every file must have the same columns in the same order. Each CSV can contain an arbitrary number of columns, but four fields are expected for each segment:
  1. Column 1: Chromosome
  2. Column 2: Start
  3. Column 3: End
  4. Column 4: Log R value
If these fields are not in the columns listed above, their column numbers must be specified with the parameters colChr, colStart, colEnd and colLogR.
markers CSV Mandatory The markers file gives the exact locations of the measurement probes (markers) for the given platform. For sequencing data, copy number values are often estimated in fixed-size bins prior to segmentation. In this case each marker should correspond to one bin and be located at the center genomic position of that bin. The file must contain at least 3 columns:
  1. Column 1: probe name
  2. Column 2: chromosome
  3. Column 3: location on the chromosome
genes TextFile Optional Plot only selected genes. Text file (without header) listing, in a single column, the Ensembl IDs of the genes to plot.
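
The column conventions for the segments and markers inputs can be illustrated with a short script. The following is a minimal Python sketch, not part of the component: the function names read_segments and bin_markers, the tab delimiter, the presence of a header row, the marker naming scheme and the example data are all assumptions made for illustration. It reads the four expected fields from a segments file using 1-based column numbers (mirroring colChr, colStart, colEnd and colLogR) and builds bin-center markers for fixed-size bins.

```python
import csv
import io


def read_segments(handle, col_chr=1, col_start=2, col_end=3, col_logr=4,
                  delimiter="\t"):
    """Yield (chromosome, start, end, log_r) tuples from a segments file.

    Column arguments are 1-based, like colChr, colStart, colEnd and colLogR.
    Assumptions: the file has one header row and is tab-delimited.
    """
    reader = csv.reader(handle, delimiter=delimiter)
    next(reader)  # skip the header row (assumed to be present)
    for row in reader:
        if not row:
            continue  # ignore blank lines
        yield (
            row[col_chr - 1],
            int(row[col_start - 1]),
            int(row[col_end - 1]),
            float(row[col_logr - 1]),
        )


def bin_markers(chromosome, chrom_length, bin_size):
    """Return (probe name, chromosome, location) rows, one per fixed-size bin.

    The location is the center position of the bin. The naming scheme
    "<chromosome>_bin<n>" is hypothetical.
    """
    markers = []
    for index, start in enumerate(range(1, chrom_length + 1, bin_size)):
        end = min(start + bin_size - 1, chrom_length)  # last bin may be short
        center = (start + end) // 2
        markers.append((f"{chromosome}_bin{index + 1}", chromosome, center))
    return markers


# Hypothetical segments file whose fields sit in columns 2, 3, 4 and 6.
demo = (
    "sample\tchrom\tstart\tend\tnum_markers\tlogR\n"
    "S1\t1\t1000\t5000\t40\t0.35\n"
)
segments = list(
    read_segments(io.StringIO(demo), col_chr=2, col_start=3, col_end=4,
                  col_logr=6)
)
markers = bin_markers("1", 2500, 1000)
```

With a layout like the one in demo, the component itself would be given colChr=2, colStart=3, colEnd=4 and colLogR=6 instead of the defaults.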

Outputs

Name Type Description
gains CSV Focal gains output file.
losses CSV Focal losses output file.
plots BinaryFolder Folder containing two plots for each chromosome: one showing the gains and one showing the losses. Each plot shows the locations of the genes used to compute the focal events. A different set of genes can be plotted by supplying the genes input.

Parameters

Name Type Default Description
ampLevel float 0.1 A positive number specifying the threshold used for calling amplifications.
assembly string "hg19" Genome assembly used. Possible values: hg19, hg38.
colChr int 1 The number of the column containing the chromosome name in input segments CSV files.
colEnd int 3 The number of the column containing the end position of each segment in input segments CSV files.
colLogR int 4 The number of the column containing the log ratio value in input segments CSV files.
colStart int 2 The number of the column containing the start position of each segment in input segments CSV files.
delLevel float -0.1 A negative number specifying the threshold used for calling deletions.
fdr float 0.25 False discovery rate.
maxMean float 0.0 A number specifying the maximum mean copy number allowed. If 0, segments will not be filtered based on their maximum mean copy number.
minMean float 0.0 A number specifying the minimum mean copy number allowed. If 0, segments will not be filtered based on their minimum mean copy number.
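
The constraints stated in the parameter table can be checked before launching the component. The sketch below is only an illustration in Python: validate_parameters and classify_log_ratio are hypothetical helpers, the requirement that fdr lies in (0, 1] is an assumption, and the inclusive threshold comparison is one plausible reading of ampLevel and delLevel rather than RUBIC's documented behavior.

```python
# Default values copied from the parameter table.
DEFAULTS = {
    "ampLevel": 0.1,
    "assembly": "hg19",
    "colChr": 1,
    "colEnd": 3,
    "colLogR": 4,
    "colStart": 2,
    "delLevel": -0.1,
    "fdr": 0.25,
    "maxMean": 0.0,
    "minMean": 0.0,
}


def validate_parameters(params):
    """Return a list of messages for values that violate the table."""
    errors = []
    if params["ampLevel"] <= 0:
        errors.append("ampLevel must be a positive number")
    if params["delLevel"] >= 0:
        errors.append("delLevel must be a negative number")
    if params["assembly"] not in ("hg19", "hg38"):
        errors.append("assembly must be hg19 or hg38")
    # Assumption: fdr is a proportion, so it should lie in (0, 1].
    if not 0 < params["fdr"] <= 1:
        errors.append("fdr must be in (0, 1]")
    for name in ("colChr", "colStart", "colEnd", "colLogR"):
        if params[name] < 1:
            errors.append(f"{name} must be a 1-based column number")
    return errors


def classify_log_ratio(log_r, amp_level=0.1, del_level=-0.1):
    """Label a log ratio using the two thresholds.

    Assumption: the comparison is inclusive; the component may differ.
    """
    if log_r >= amp_level:
        return "gain"
    if log_r <= del_level:
        return "loss"
    return "neutral"


problems = validate_parameters(dict(DEFAULTS, ampLevel=-0.1, assembly="hg18"))
```

Here problems holds one message for the negative ampLevel and one for the unsupported assembly, while the defaults pass without messages.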

Test cases

Test case Parameters segments (IN) markers (IN) genes (IN) gains (OUT) losses (OUT) plots (OUT)
case1 properties segments markers (missing) (missing) (missing) (missing)

assembly=hg19,
colLogR=6,
colChr=2,
colStart=3,
colEnd=4,
metadata.timeout=0

case2 properties segments markers genes (missing) (missing) (missing)

colLogR=6,
colChr=2,
colStart=3,
colEnd=4,
metadata.timeout=0


Generated 2019-02-08 07:42:12 by Anduril 2.0.0