Takes a GFF3 file downloaded from miRBase and rewrites each feature's genomic location so that it refers to an Ensembl transcript ID (instead of a chromosome) and to start and end positions relative to that transcript. The output is then used as the reference annotation for HTSeq-count.
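The coordinate change described above can be illustrated with a minimal sketch. It is not the component's actual implementation (which relies on biomaRt and the hairpin FASTA file): the `Transcript` record, the `to_transcript_coords` helper, and the `ENST_EXAMPLE` identifier are all hypothetical, and the sketch assumes the feature lies entirely inside a single-exon transcript whose genomic span is already known.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    """Hypothetical transcript record (genomic coordinates, 1-based, inclusive)."""
    transcript_id: str
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"


def to_transcript_coords(gff_line: str, tx: Transcript) -> str:
    """Rewrite one GFF3 feature line so that it is addressed relative to tx."""
    fields = gff_line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("a GFF3 feature line has 9 tab-separated columns")
    chrom, start, end = fields[0], int(fields[3]), int(fields[4])
    if chrom != tx.chrom or start < tx.start or end > tx.end:
        raise ValueError("feature is not contained in the transcript")
    if tx.strand == "+":
        rel_start, rel_end = start - tx.start + 1, end - tx.start + 1
    else:
        # On the minus strand, transcript position 1 is the genomic end.
        rel_start, rel_end = tx.end - end + 1, tx.end - start + 1
    fields[0] = tx.transcript_id  # seqid: transcript ID instead of chromosome
    fields[3], fields[4] = str(rel_start), str(rel_end)
    # Strand is now expressed relative to the transcript's own orientation.
    fields[6] = "+" if fields[6] == tx.strand else "-"
    return "\t".join(fields)


# Made-up example values, not taken from miRBase or Ensembl.
line = "chr1\t.\tmiRNA\t1050\t1071\t.\t+\t.\tID=example"
plus_tx = Transcript("ENST_EXAMPLE", "chr1", 1001, 1100, "+")
print(to_transcript_coords(line, plus_tx))
```

For the plus-strand transcript above, the feature at 1050..1071 becomes 50..71 on `ENST_EXAMPLE`. Multi-exon transcripts would need intron lengths subtracted as well, which this sketch deliberately leaves out.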
Version | 1.0 |
---|---|
Bundle | sequencing |
Categories | Annotation smallRNA |
Authors | Katherine Icay (katherine.icay@helsinki.fi) |
Issue tracker | View/Report issues |
Requires | biomaRt (R-package) |
Source files | component.xml function.scala |
Usage | Example with default values |

Outputs

Name | Type | Description |
---|---|---|
annot | CSV | Key-File CSV file containing the path to the modified GFF3 file. |

Parameters

Name | Type | Default | Description |
---|---|---|---|
annotation_mirbase | string | "" | Path to downloaded miRBase GFF3 file. Required for the function to work! |
ensembl_dataset | string | "hsapiens_gene_ensembl" | biomaRt dataset parameter (i.e. species) to use. |
ensembl_host | string | "feb2014.archive.ensembl.org" | URL of the Ensembl version to use (see Ensembl Archives). To guarantee optimal identification of transcripts, be sure to use the same genome build AND the same genome version as reference_hairpin. |
reference_hairpin | string | "" | Path to the Ensembl FASTA file of known smallRNA sequences. Required for the function to work! |
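Since two of the parameters default to an empty string but are required, a caller may want to check them before running the function. The following is a hedged sketch only: the `AnnotMirbaseParams` class and the `missing_required` helper are hypothetical and not part of the component; only the parameter names and default values are taken from the table above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AnnotMirbaseParams:
    """Hypothetical container mirroring the parameter table and its defaults."""
    annotation_mirbase: str = ""
    ensembl_dataset: str = "hsapiens_gene_ensembl"
    ensembl_host: str = "feb2014.archive.ensembl.org"
    reference_hairpin: str = ""


def missing_required(params: AnnotMirbaseParams) -> List[str]:
    """Return the names of required parameters that are still empty."""
    required = ("annotation_mirbase", "reference_hairpin")
    return [name for name in required if not getattr(params, name)]


# With only defaults, both required paths are reported as missing.
print(missing_required(AnnotMirbaseParams()))
```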
Test case | Parameters | OUT annot |
---|---|---|
case1 | annotation_mirbase=../../../sequencing/functions/AnnotMirbase/testcases/case1/annotation_mirbase.gff3, | annot |