Performs methylation calling on aligned RRBS and WGBS data obtained from the component BSalign.
WARNING: This component requires a large amount of memory (about 26GB or more for the human genome). On systems with limited memory, the user can set the -c/--chr option to process only the specified chromosomes,
and combine the results for all chromosomes afterwards.
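
When chromosomes are processed in separate runs, the per-chromosome outputs have to be merged afterwards. The following is a minimal sketch of that merge step, not part of the component itself. It assumes that each run produced one result file and that every file starts with the same single header line; this should be checked against the actual output. The function name `combine_chromosome_results` and the file names in the usage note are hypothetical.

```python
def combine_chromosome_results(inputs, output):
    """Concatenate per-chromosome result files into one file.

    Assumption of this sketch: every input starts with the same single
    header line. The header is written once, taken from the first file.
    """
    header = None
    with open(output, "w", newline="") as out:
        for path in inputs:
            with open(path, newline="") as src:
                first = src.readline().rstrip("\r\n")
                if not first:
                    continue  # empty file: nothing to merge
                if header is None:
                    header = first
                    out.write(header + "\n")
                elif first != header:
                    raise ValueError(f"header mismatch in {path}")
                # Stream the data rows so large files are never held in memory.
                for raw in src:
                    line = raw.rstrip("\r\n")
                    if line:
                        out.write(line + "\n")
    return output
```

For example, `combine_chromosome_results(["chr1.csv", "chr2.csv"], "all.csv")` would write the rows of both files under a single header, in the order the files are given.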
Version | 1.0 |
---|---|
Bundle | sequencing |
Categories | DNA Methylation |
Authors | Chiara Facciotto (chiara.facciotto@helsinki.fi) |
Issue tracker | View/Report issues |
Requires | python ; samtools |
Source files | component.xml MethylCall.sh |
Usage | Example with default values |
Name | Type | Mandatory | Description |
---|---|---|---|
reference | FASTA | Mandatory | The reference genome file in FASTA format. Gzipped FASTA is also supported. |
alignment | BAM | Mandatory | Aligned reads in bam format. |
Name | Type | Description |
---|---|---|
methylationCalling | CSV | CSV file with the following columns:
1) chromosome
2) coordinate (1-based)
3) strand
4) sequence context (2 nt upstream to 2 nt downstream in Watson strand direction)
5) methylation ratio, calculated as #C_counts / #eff_CT_counts
6) number of effective total C+T counts on this locus (#eff_CT_counts): with ctSNP="no-action", #eff_CT_counts = #CT_counts; with ctSNP="correct", #eff_CT_counts = #CT_counts * (#rev_G_counts / #rev_GA_counts)
7) number of total C counts on this locus (#C_counts)
8) number of total C+T counts on this locus (#CT_counts)
9) number of total G counts on this locus of the reverse strand (#rev_G_counts)
10) number of total G+A counts on this locus of the reverse strand (#rev_GA_counts)
11) lower bound of the 95% confidence interval of the methylation ratio, calculated as the Wilson score interval for a binomial proportion
12) upper bound of the 95% confidence interval of the methylation ratio, calculated as the Wilson score interval for a binomial proportion |
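
To make columns 5, 11 and 12 concrete, the sketch below computes a methylation ratio and its Wilson score interval from #C_counts and #eff_CT_counts. It illustrates the standard Wilson formula and is not taken from the component's source. The function name `methylation_ratio_with_ci`, the value z = 1.96 for the 95% level, and the guard that caps the ratio at 1 are assumptions of this sketch.

```python
import math


def methylation_ratio_with_ci(c_counts, eff_ct_counts, z=1.96):
    """Return (ratio, lower, upper) for one locus.

    ratio is #C_counts / #eff_CT_counts; lower and upper are the bounds of
    the Wilson score interval for a binomial proportion.
    """
    if eff_ct_counts <= 0:
        raise ValueError("eff_ct_counts must be positive")
    n = float(eff_ct_counts)
    # Assumed guard: cap the C count so the ratio never exceeds 1
    # (the corrected #eff_CT_counts can be smaller than #C_counts).
    ratio = min(float(c_counts), n) / n
    z2 = z * z
    denom = 1.0 + z2 / n
    centre = ratio + z2 / (2.0 * n)
    half_width = z * math.sqrt(ratio * (1.0 - ratio) / n + z2 / (4.0 * n * n))
    return ratio, (centre - half_width) / denom, (centre + half_width) / denom
```

Unlike the simple normal approximation, the Wilson interval stays within [0, 1] and remains informative at ratios of exactly 0 or 1, which are common at low coverage.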
Name | Type | Default | Description |
---|---|---|---|
chr | string | "all" | Process only the specified chromosomes. Chromosomes must be listed as comma-separated values without spaces, using full names such as chr1,chrX rather than only the chromosome number or identifier (X, Y or MT).
Example: chr="chr1,chr2" uses ~4.5GB compared with ~26GB for the whole genome. |
ctSNP | string | "correct" | How to handle C/T SNPs when performing the methylation calling. Three modes are available: "no-action", "correct", "skip":
- "correct": correct the methylation ratio according to the C/T SNP information estimated from the G/A counts on the reverse strand
- "skip": do not report loci where a C/T SNP is detected (i.e. A detected on the reverse strand)
- "no-action": do not consider C/T SNPs. |
optionsMethylCall | string | "" | Other options for methylation calling. This parameter is passed as written to the methylation calling command. Example: "-g true" combines CpG methylation ratios from both strands. |
pair | boolean | true | Option to process only properly paired mappings (true: process only properly paired reads; false: process all aligned reads). |
removeDuplicate | boolean | false | Option to remove duplicated mappings to reduce PCR bias (true: remove duplicated mappings; false: process all mappings).
This option should not be used on RRBS data. For WGBS, the high sequencing depth can make it hard to tell whether duplicates are caused by PCR. |
trim | int | 2 | Defines the number of fill-in nucleotides to be trimmed in DNA fragment end-repairing.
This option applies only to paired-end mapping. For RRBS, trim can be determined by the distance between the cutting sites on the forward and reverse strands.
For WGBS, trim is usually between 0 and 3. |
unique | boolean | false | Option to process only unique mappings/pairs (true: process only unique mappings/pairs; false: process all aligned reads). |
zeroMeth | boolean | true | Option to report loci with zero methylation ratios (true: report loci with zero methylation ratios; false: report only loci with non-zero methylation ratios). |
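
The three ctSNP modes can be read as a rule for deriving #eff_CT_counts at each locus. The sketch below is one interpretation of the descriptions above, not the component's actual code. The function name `effective_ct_counts` is hypothetical, and two behaviours are assumptions: a locus counts as having a C/T SNP when #rev_GA_counts exceeds #rev_G_counts (an A was seen on the reverse strand), and "correct" leaves the count unchanged when there is no reverse-strand coverage.

```python
def effective_ct_counts(ct_counts, rev_g_counts, rev_ga_counts, ct_snp="correct"):
    """Return #eff_CT_counts for one locus, or None if the locus is skipped."""
    if ct_snp not in ("correct", "skip", "no-action"):
        raise ValueError(f"unknown ctSNP mode: {ct_snp!r}")
    if ct_snp == "no-action":
        # C/T SNPs are ignored: the raw C+T count is used as is.
        return float(ct_counts)
    if ct_snp == "skip":
        # Assumption: more G+A than G on the reverse strand means an A was seen,
        # which is treated here as a detected C/T SNP.
        return None if rev_ga_counts > rev_g_counts else float(ct_counts)
    # "correct": scale by the fraction of G among G+A on the reverse strand.
    if rev_ga_counts == 0:
        # Assumption: without reverse-strand coverage no correction is possible.
        return float(ct_counts)
    return ct_counts * (rev_g_counts / rev_ga_counts)
```

With 20 C+T reads and 8 G out of 10 G+A reads on the reverse strand, "correct" scales the effective count down by 8/10, "skip" drops the locus, and "no-action" keeps all 20 reads.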
Test case | Parameters | IN reference | IN alignment | OUT methylationCalling |
---|---|---|---|---|
case1_default | (missing) | reference | alignment | methylationCalling |
case2_pair | properties | reference | alignment | methylationCalling |
case3_zeroMeth | properties | reference | alignment | methylationCalling |
case4_ctSNP_skip | properties | reference | alignment | methylationCalling |
case5_ctSNP_no-action | properties | reference | alignment | methylationCalling |