This component extracts mutational signatures from point mutation data using non-negative matrix factorization (NMF), following the pipeline of the SomaticSignatures R package.
Version | 1.2 |
---|---|
Bundle | sequencing |
Categories | Analysis |
Authors | Amjad Alkodsi (Amjad.Alkodsi@Helsinki.FI) |
Issue tracker | View/Report issues |
Requires | SomaticSignatures (R-bioconductor) ; BSgenome.Hsapiens.UCSC.hg19 (R-bioconductor) ; BSgenome.Hsapiens.UCSC.hg38 (R-bioconductor) ; pheatmap (R-package) ; plyr (R-package) ; cowplot (R-package) ; RColorBrewer (R-package) ; reshape (R-package) ; grid (R-package) ; ggplot2 (R-package) |
Source files | component.xml SignatureExtractor.R |
Usage | Example with default values |
Name | Type | Mandatory | Description |
---|---|---|---|
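in | CSV | Mandatory | Input CSV file containing variant data. The file must have columns for chromosome, position, reference allele, alternative allele and sample ID; the column names are set by the chrColumn, posColumn, refColumn, altColumn and sampleIDcolumn parameters. |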
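
To illustrate the expected input layout, here is a minimal Python sketch that reads a variant table using the default column names and counts point mutations per sample. It is not part of the component (which is implemented in R); the example rows, the `count_point_mutations` helper and the comma delimiter are assumptions made for illustration, so adjust `delimiter` to match your file.

```python
import csv
import io
from collections import Counter

# Hypothetical input rows using the default column names (chr, pos, ref, alt, ID).
EXAMPLE_CSV = """chr,pos,ref,alt,ID
chr1,12345,C,T,sampleA
chr1,67890,G,A,sampleA
chr2,13579,T,C,sampleB
chr3,24680,AT,A,sampleB
"""


def count_point_mutations(handle, chr_col="chr", pos_col="pos", ref_col="ref",
                          alt_col="alt", sample_col="ID", delimiter=","):
    """Count single-base substitutions per sample (illustrative helper)."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    required = [chr_col, pos_col, ref_col, alt_col, sample_col]
    missing = [c for c in required if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError("missing columns: " + ", ".join(missing))

    bases = {"A", "C", "G", "T"}
    counts = Counter()
    for row in reader:
        ref, alt = row[ref_col].upper(), row[alt_col].upper()
        # Keep only point mutations: one reference base changed to another base.
        if ref in bases and alt in bases and ref != alt:
            counts[row[sample_col]] += 1
    return dict(counts)


counts = count_point_mutations(io.StringIO(EXAMPLE_CSV))
print(counts)
```

The last example row is a deletion rather than a point mutation, so the sketch skips it.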
Name | Type | Description |
---|---|---|
outContribution | CSV | CSV file with the percentage contribution of each signature in each sample. |
outNmuation | CSV | CSV file with the estimated number of mutations attributed to each signature in each sample. |
signatureDistance | CSV | CSV file with the Euclidean distance between each extracted signature and each signature in the list of published signatures. |
plots | BinaryFolder | Binary folder containing the generated plots. |
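
The sketch below shows one way the three tabular outputs can relate to each other. It assumes that the percentages are per-sample shares summing to 100 and that the estimated mutation count is the share multiplied by the sample's mutation burden; the component's actual calculation lives in SignatureExtractor.R and may differ. All function names and numbers are hypothetical.

```python
import math


def contribution_percent(weights):
    """Convert raw per-signature weights of one sample into percentages."""
    total = sum(weights.values())
    return {sig: 100.0 * w / total for sig, w in weights.items()}


def estimated_mutations(percent, burden):
    """Assumed relation: share of the sample's total mutation count."""
    return {sig: burden * p / 100.0 for sig, p in percent.items()}


def euclidean_distance(a, b):
    """Euclidean distance between two signature profiles of equal length."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical sample with two signatures and 200 point mutations.
weights = {"S1": 3.0, "S2": 1.0}
percent = contribution_percent(weights)
n_mut = estimated_mutations(percent, burden=200)

# Toy three-channel profiles; real profiles have one value per mutation context.
dist = euclidean_distance([0.5, 0.5, 0.0], [0.5, 0.1, 0.4])
print(percent, n_mut, round(dist, 4))
```

A smaller distance means the extracted signature is closer to the published one.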
Name | Type | Default | Description |
---|---|---|---|
altColumn | string | "alt" | Name of the column containing the alternative allele. |
barplotDimensions | string | "5,5" | Dimensions of the barplot PDF in the format "width,height". |
barplotExtra | string | "" | ggplot R expression to be added to the barplot command, for example 'scale_fill_manual(values=c("red","blue","green"))' to change colors. |
chrColumn | string | "chr" | Name of the column containing the chromosome. |
genomeBuild | string | "hg19" | Reference genome build; either hg19 or hg38. |
n | int | 3 | Final number of signatures to extract. |
nRange | string | "2:10" | R expression giving the range of signature counts to test for explained variance. |
nRep | int | 5 | Number of repetitions for each signature count in the range given by the nRange parameter. |
nrun | int | 1 | Number of repeated runs on which the final signature extraction is based. |
posColumn | string | "pos" | Name of the column containing the chromosomal position of the variant. |
refColumn | string | "ref" | Name of the column containing the reference allele. |
sampleIDcolumn | string | "ID" | Name of the column containing sample IDs. |
signaturePlotDimensions | string | "5,5" | Dimensions of the signature plot PDF in the format "width,height". |
signaturePlotExtra | string | "" | ggplot R expression to be added to the signature plot command, for example 'theme(axis.text.x=element_text(size=5))' to change the x-axis font size. |
sortBy | string | "burden" | Sort samples in the barplot by this parameter. Accepts a signature name (S1, S2, ..., Sn) or "burden" to sort by number of mutations. |
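
Two of the string parameters carry structured values: the plot dimensions ("width,height") and nRange (an R range such as "2:10", which in R includes both endpoints). The Python sketch below shows how such values can be read. The helper names are hypothetical, and the range parser handles only the simple `start:end` form, not arbitrary R expressions.

```python
def parse_dimensions(value):
    """Split a "width,height" string into two numbers."""
    width, height = (float(part) for part in value.split(","))
    return width, height


def parse_simple_range(expr):
    """Expand an R-style "start:end" range; both endpoints are included."""
    start, end = (int(part) for part in expr.split(":"))
    step = 1 if end >= start else -1  # R counts down when start > end
    return list(range(start, end + step, step))


dims = parse_dimensions("5,5")
default_range = parse_simple_range("2:10")
print(dims, default_range)
```

With the defaults, nRange="2:10" and nRep=5 mean that nine candidate signature counts are each tested five times.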
Test case | Parameters | IN in | OUT outContribution | OUT outNmuation | OUT signatureDistance | OUT plots |
---|---|---|---|---|---|---|
case1 | properties | in | (missing) | (missing) | (missing) | (missing) |
nRange=2:3, |