Function that loads the methylation calls data into a methylRaw or methylRawList object (methylKit package data types) and performs normalization and filtering of the data.
Version | 1.0 |
---|---|
Bundle | sequencing |
Categories | DNA Methylation |
Authors | Chiara Facciotto (chiara.facciotto@helsinki.fi) |
Issue tracker | View/Report issues |
Requires | R ; GenomicRanges (R-bioconductor) ; data.table (R-package) ; download (bash) ; methylKit |
Source files | component.xml MethFilterNorm.R |
Usage | Example with default values |
Deprecated | This component is not part of the methylation pipeline anymore. |
Name | Type | Mandatory | Description |
---|---|---|---|
methCallArray | Array<CSV> | Mandatory | An array of CSV files containing the methylation calls. Each file must include the following columns:
- chr, the chromosome name in the format chr1
- pos, the 1-based position of the cytosine on the chromosome
- strand, the strand on which the cytosine is located
- coverage, the number of reads aligned at that position
- freqC, the percentage of methylated cytosines among the reads aligned at that position. |
groupIDs | CSV | Mandatory | CSV file containing two columns:
- Key, the sample IDs matching the elements of the methCallArray array
- Treatment, a binary value (0 or 1) denoting which samples belong to one state (e.g. controls) and which belong to the other (e.g. test). |
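The input contract above can be sketched as a small validation step. This is an illustrative Python sketch, not the component's R code: the function names, the tab delimiter, the sample IDs, and the example rows are all assumptions made for the illustration.

```python
import csv
import io

# Columns every methylation-call file must provide (from the input table).
REQUIRED_COLUMNS = ("chr", "pos", "strand", "coverage", "freqC")


def read_meth_calls(handle, delimiter="\t"):
    """Parse one methylation-call file and check the required columns.

    The delimiter is an assumption; adjust it to the actual file format.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    header = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in header]
    if missing:
        raise ValueError("missing columns: " + ", ".join(missing))
    return [
        {
            "chr": row["chr"],
            "pos": int(row["pos"]),            # 1-based position
            "strand": row["strand"],
            "coverage": int(row["coverage"]),  # reads aligned at this position
            "freqC": float(row["freqC"]),      # percentage of methylated Cs
        }
        for row in reader
    ]


def read_group_ids(handle, delimiter="\t"):
    """Parse the groupIDs file into a {Key: Treatment} mapping."""
    groups = {}
    for row in csv.DictReader(handle, delimiter=delimiter):
        treatment = int(row["Treatment"])
        if treatment not in (0, 1):
            raise ValueError("Treatment must be 0 or 1 for " + row["Key"])
        groups[row["Key"]] = treatment
    return groups


# Made-up example data, only to show the expected shape of the inputs.
calls = read_meth_calls(io.StringIO(
    "chr\tpos\tstrand\tcoverage\tfreqC\n"
    "chr1\t10468\t+\t25\t80.0\n"
))
groups = read_group_ids(io.StringIO(
    "Key\tTreatment\n"
    "sampleA\t0\n"
    "sampleB\t1\n"
))
```

Checking the header before parsing any rows means a malformed file fails with a message naming the missing columns instead of a less helpful error later on.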
Name | Type | Description |
---|---|---|
rawFilteredNormMethyl | BinaryFile | RData file containing the methylRawList object to be used as input for the DiffMeth component. |
statistics | Latex | Directory containing plots related to coverage and methylation statistics before and after normalization and filtering. |
analysis | CSVList | Directory containing CSV files with numCs, coverage and perc.meth information (one file per sample). |
visualization | BinaryFolder | Directory containing the bedgraph files for visualization. Positions are 0-based. Each bedgraph file contains data for every sample; the files differ only in the score value (numCs, coverage or perc.meth). |
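Because the input positions are 1-based and the bedgraph output is 0-based, each cytosine has to be shifted when it is written out. The Python sketch below illustrates that conversion under the standard bedGraph convention (0-based start, exclusive end). The function names and the example record are hypothetical, and computing perc.meth as 100 * numCs / coverage is an assumption of this sketch, not a statement about the component's code.

```python
def to_bedgraph_line(chrom, pos, score):
    """Convert a 1-based cytosine position to a 0-based bedGraph line."""
    start = pos - 1  # bedGraph start coordinates are 0-based
    end = pos        # bedGraph end is exclusive, so one base spans [pos-1, pos)
    return f"{chrom}\t{start}\t{end}\t{score}"


def bedgraph_tracks(records):
    """Build one list of bedGraph lines per score type."""
    tracks = {"numCs": [], "coverage": [], "perc.meth": []}
    for rec in records:
        # Assumed definition of the methylation percentage for this sketch.
        perc_meth = 100.0 * rec["numCs"] / rec["coverage"]
        scores = {
            "numCs": rec["numCs"],
            "coverage": rec["coverage"],
            "perc.meth": perc_meth,
        }
        for name, score in scores.items():
            tracks[name].append(to_bedgraph_line(rec["chr"], rec["pos"], score))
    return tracks


# Made-up record for illustration.
records = [{"chr": "chr1", "pos": 10468, "numCs": 20, "coverage": 25}]
tracks = bedgraph_tracks(records)
```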
Name | Type | Default | Description |
---|---|---|---|
assembly | string | "hg19" | A string naming the genome assembly used during alignment. |
bedgraph | boolean | false | Boolean value indicating whether three bedgraph files should be written to the visualization folder. Each of the three files uses a different parameter as score (respectively numCs, coverage and percMeth). |
bothStrands | boolean | false | Boolean value indicating whether the plots of coverage and methylation statistics should summarize both strands in the same plot (FALSE) or produce two strand-specific plots (TRUE). |
context | string | "" | A string stating the methylation context. Allowed values are CpG, CHG, CHH, or the empty string "" if the context is not relevant to the analysis. |
dataFrame | boolean | false | Boolean value indicating whether CSV files summarizing the methylation information for every sample should be written to the analysis folder. |
hiThr | float | 0.999 | A value for filtering by read count. When the value is between 0 and 1 it is interpreted as a percentage of the aligned reads,
while when it is higher than 1 it is interpreted as a number of reads.
Bases with coverage higher than this value are discarded. If the data do not have to be filtered, both loThr and hiThr must be set to NULL. |
loThr | float | 10 | A value for filtering by read count. When the value is between 0 and 1 it is interpreted as a percentage of the aligned reads,
while when it is higher than 1 it is interpreted as a number of reads.
Bases with coverage lower than this value are discarded. If the data do not have to be filtered, both loThr and hiThr must be set to NULL. |
method | string | "median" | String denoting the method used to calculate the scaling factor during normalization. Allowed values are median or mean . |
normalize | boolean | true | Boolean value stating whether the coverage has to be normalized. |
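The interplay of loThr, hiThr and method can be illustrated with a small Python sketch. It is not the component's R implementation and rests on several assumptions: a threshold strictly between 0 and 1 is read as a nearest-rank percentile of the observed coverage values, any other value as an absolute read count, and normalization scales every sample to the largest per-sample median (or mean). All function names and data are hypothetical.

```python
import math
import statistics


def resolve_threshold(value, coverages):
    """Turn a loThr/hiThr setting into an absolute read count (or None)."""
    if value is None:
        return None  # no filtering on this side
    if 0 < value < 1:
        # Assumption: fractional values are percentiles of observed coverage.
        ordered = sorted(coverages)
        if not ordered:
            return None
        rank = max(1, math.ceil(value * len(ordered)))  # nearest-rank method
        return ordered[rank - 1]
    return value  # treated as a number of reads


def filter_by_coverage(records, lo_thr=10, hi_thr=0.999):
    """Drop bases with coverage below lo_thr or above hi_thr."""
    coverages = [rec["coverage"] for rec in records]
    lo = resolve_threshold(lo_thr, coverages)
    hi = resolve_threshold(hi_thr, coverages)
    return [
        rec for rec in records
        if (lo is None or rec["coverage"] >= lo)
        and (hi is None or rec["coverage"] <= hi)
    ]


def scaling_factors(samples, method="median"):
    """Per-sample factor that brings each sample's center to the largest one."""
    if method not in ("median", "mean"):
        raise ValueError("method must be 'median' or 'mean'")
    center = statistics.median if method == "median" else statistics.mean
    centers = {
        name: center([rec["coverage"] for rec in recs])
        for name, recs in samples.items()
    }
    target = max(centers.values())
    return {name: target / value for name, value in centers.items()}


# Made-up coverage values for two samples.
sample_a = [{"coverage": c} for c in (5, 12, 20, 1000)]
sample_b = [{"coverage": c} for c in (10, 24, 40)]

kept_a = filter_by_coverage(sample_a, lo_thr=10, hi_thr=0.75)
factors = scaling_factors({"A": kept_a, "B": sample_b}, method="median")
normalized_a = [round(rec["coverage"] * factors["A"]) for rec in kept_a]
```

In this example the low-coverage base (5 reads) and the extreme outlier (1000 reads) are removed from sample A before the scaling factors are computed, so the outlier does not distort the normalization.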
Test case | Parameters | IN methCallArray | IN groupIDs | OUT rawFilteredNormMethyl | OUT statistics | OUT analysis | OUT visualization |
---|---|---|---|---|---|---|---|
case1_default | (missing) | methCallArray | groupIDs | rawFilteredNormMethyl | (missing) | (missing) | (missing) |
case2_method | properties | methCallArray | groupIDs | rawFilteredNormMethyl | (missing) | (missing) | (missing) |
case3_loThr_hiThr | properties | methCallArray | groupIDs | rawFilteredNormMethyl | (missing) | (missing) | (missing) |
case4_dataFrame_bedgraph | properties | methCallArray | groupIDs | rawFilteredNormMethyl | (missing) | analysis | visualization |