
PyClone

Component for running the PyClone variant clustering tool. The component has been tested with PyClone version 0.13.0.

See https://bitbucket.org/aroth85/pyclone/wiki/Home for more information. If you use PyClone, always cite Roth et al., "PyClone: statistical inference of clonal population structure in cancer" (PMID: 24633410).

A config file (config.yaml) is generated, mutation files are built, and the PyClone analysis is run. Optionally, the clustering results are written out as tables.

Step 1: Mutation files are generated with the command PyClone build_mutations_file. The prior for this command is defined with the mutationPrior parameter.
Step 2: The PyClone analysis is run with the command PyClone run_analysis. The seed value for the run is defined with the seed parameter.
Step 3 (optional): Clustering results are generated in table format if clusterTables is true. The desired output types are given as a comma-separated list in the clusterTableType parameter.
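The three steps above can be sketched as plain command lines. The following Python snippet is only an illustration and is not taken from pyclone_run.sh: the helper names are hypothetical, the output file naming (prefix, hyphen, table type, ".tsv") is an assumption, and the PyClone flags shown (--in_file, --out_file, --prior, --config_file, --seed, --table_type) should be checked against PyClone --help for the installed version. The functions only build argument lists; they do not run PyClone.

```python
def build_mutations_cmd(in_file, out_file, prior="major_copy_number"):
    """Step 1: argument list for building one mutations file (mutationPrior)."""
    return ["PyClone", "build_mutations_file",
            "--in_file", in_file,
            "--out_file", out_file,
            "--prior", prior]


def run_analysis_cmd(config_file, seed="3"):
    """Step 2: argument list for the MCMC analysis (seed parameter)."""
    return ["PyClone", "run_analysis",
            "--config_file", config_file,
            "--seed", str(seed)]


def build_table_cmds(config_file, table_types="cluster,loci,old_style", prefix=""):
    """Step 3: one build_table argument list per requested table type.

    table_types mirrors clusterTableType and prefix mirrors tableName.
    The output file names are hypothetical, chosen only for illustration.
    """
    cmds = []
    for table_type in (t.strip() for t in table_types.split(",")):
        if not table_type:
            continue  # tolerate a trailing comma or empty entry
        out_name = f"{prefix}-{table_type}.tsv" if prefix else f"{table_type}.tsv"
        cmds.append(["PyClone", "build_table",
                     "--config_file", config_file,
                     "--out_file", out_name,
                     "--table_type", table_type])
    return cmds
```

Each returned list could be passed to subprocess.run(cmd, check=True) once the flags have been verified for the PyClone version in use.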

Version 0.1
Bundle sequencing
Categories Analysis
Specialties generic
Authors Mikko Kivikoski (mikko.kivikoski@helsinki.fi)
Issue tracker View/Report issues
Requires PyClone
Source files component.xml pyclone_run.sh
Usage Example with default values

Type parameters (generics)

Inputs

Name Type Mandatory Description
in Array<T1> (generic) Mandatory Array of variant tables in tab-separated format.
purity CSV Optional Tumor purity estimates for each sample in a two-column CSV file. Column 'Key' must match the keys of the input array. Column 'Purity' is the purity estimate for the sample. This input overrides the purityDefault parameter.
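As an illustration of how the purity input relates to purityDefault, the following Python sketch resolves one purity value per sample key. It is not the component's implementation: the function names are hypothetical, the delimiter is an assumption (adjust it to match the actual file), and falling back to the default for keys missing from the file is also an assumption.

```python
import csv
import io


def load_purities(csv_text, delimiter="\t"):
    """Parse a two-column table with 'Key' and 'Purity' columns.

    The tab delimiter is an assumption; pass delimiter="," for
    comma-delimited files.
    """
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    return {row["Key"]: float(row["Purity"]) for row in reader}


def purity_for(key, purities, purity_default=1.0):
    """Return the per-sample purity if present, else purityDefault.

    Falling back for missing keys is an assumption of this sketch.
    """
    return purities.get(key, purity_default)
```

For example, with a file listing only sample1, purity_for("sample1", ...) returns the listed estimate, while any other key receives purity_default.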

Outputs

Name Type Description
config YAML Output port for the config file.
trace BinaryFolder Output folder for trace files.
mutationFiles Array<YAML> Mutation prior files in yaml format.
clusteringResults CSVList Results of clustering, if executed.

Parameters

Name Type Default Description
alpha int 1 Alpha value
beta int 1 Beta value
burnin int 5 Number of MCMC samples to discard from the beginning. Prior to convergence, the MCMC series features an initial transient which is controlled by the initial parameters and not the data. This initial transient should be specified such that it can be discarded from the subsequent analysis.
clusterTableType string "cluster,loci,old_style" Comma-separated list of clustering tables to produce. Possible options: cluster, loci and old_style. Default: all three.
clusterTables boolean false If true, PyClone clustering results are produced with the PyClone build_table command.
concentration float 1.0 Concentration parameter
densityFunction string "pyclone_beta_binomial" Density function
initMethod string "disconnected" Initial clustering. The value "connected" starts with all loci in a single cluster, with subsequent iterations likely to split the clusters, while "disconnected" starts with each locus in a separate cluster, with iterations likely to merge them. The former can be beneficial for large datasets.
iterations int 50 Number of iterations used.
meshSize int 101 Number of mesh points for density estimation. The default of 101 allows subdivision of 1% in cellular prevalences.
mutationPrior string "major_copy_number" Mutation prior for building mutation files
oldFormat string "true" Specifies whether the outputs should be converted to the old format when using PyClone 0.13.1 or newer. Between the two formats, the cluster labels and trace files are off by one.
purityDefault float 1.0 Tumor purity estimate. Parameter's value is used as tumor purity estimate for all samples. Purity input overrides this parameter.
rate float 0.001 Rate parameter
seed string "3" Seed value for analysis.
shape float 1.0 Shape parameter
tableName string "" String to be added as a prefix to the file names of the clustering results. The prefix is separated from the rest of the name by a hyphen.
thin int 1 Thinning ratio for the MCMC series in samples. In an MCMC series, the consecutive samples are correlated. These correlations will skew higher-order statistics such as variance estimates. This can be mitigated by specifying T such that each T-sample string is thinned into a single sample.
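The effect of burnin, thin and meshSize can be illustrated with a few lines of Python. This is a generic sketch of burn-in removal, thinning and an evenly spaced mesh, not PyClone's internal code: the function names are hypothetical, and the assumptions are that thinning keeps every thin-th sample starting from the first post-burn-in sample and that the mesh is evenly spaced over the interval from 0 to 1.

```python
def thin_trace(samples, burnin=5, thin=1):
    """Drop the first `burnin` samples, then keep every `thin`-th one."""
    return samples[burnin::thin]


def prevalence_mesh(mesh_size=101):
    """Evenly spaced cellular prevalence grid from 0 to 1 inclusive.

    With mesh_size=101 the spacing is 1/100, that is 1% steps.
    """
    return [i / (mesh_size - 1) for i in range(mesh_size)]
```

With the component defaults (iterations=50, burnin=5, thin=1), thin_trace applied to a 50-sample trace would keep the last 45 samples.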

Test cases

Test case Parameters IN:in IN:purity OUT:config OUT:trace OUT:mutationFiles OUT:clusteringResults
buildClusterTables seed=7,clusterTables=true,tableName=testCase in (missing) config trace mutationFiles clusteringResults
default (missing) in (missing) config trace mutationFiles (missing)

Generated 2019-02-08 07:42:12 by Anduril 2.0.0