Finds the closest gene, exon, or transcript for the given loci. The search is restricted to a window that extends from the locus independently in both directions along the DNA strand. Input locations can be given either in baseColumn in the format chromosome:start-end, or with the chromosome in chrColumn and the position in baseColumn in the format start.
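The two input layouts described above can be illustrated with a small parser. This is only a sketch, not the component's actual implementation: the names `Locus` and `parse_locus` are hypothetical, and it assumes that a single position in baseColumn is treated as a region whose start and end are equal.

```python
import re
from typing import NamedTuple, Optional


class Locus(NamedTuple):
    """Hypothetical container for one input location."""
    chromosome: str
    start: int
    end: int


# Region format taken from the description: chromosome:start-end
_REGION = re.compile(r"^(?P<chr>[^:]+):(?P<start>\d+)-(?P<end>\d+)$")


def parse_locus(base_value: str, chr_value: Optional[str] = None) -> Locus:
    """Parse one row of the input table.

    If chr_value is given (the chrColumn case), base_value holds a single
    start position. Otherwise base_value must be chromosome:start-end.
    """
    if chr_value:
        # Assumption: a single position is a region of length zero.
        position = int(base_value)
        return Locus(chr_value, position, position)
    match = _REGION.match(base_value.strip())
    if match is None:
        raise ValueError(f"Not in chromosome:start-end format: {base_value!r}")
    return Locus(match["chr"], int(match["start"]), int(match["end"]))
```

For example, `parse_locus("7:100-200")` corresponds to the baseColumn-only layout, while `parse_locus("12345", "X")` corresponds to the layout that uses both chrColumn and baseColumn.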
Version | 1.5.1 |
---|---|
Bundle | microarray |
Categories | Annotation Short-read Sequencing SNP |
Authors | Marko Laakso (Marko.Laakso@Helsinki.FI) |
Issue tracker | View/Report issues |
Requires | mysql-connector-java-5.1.6-bin.jar (jar) |
Source files | component.xml |
Usage | Example with default values |
Name | Type | Mandatory | Description |
---|---|---|---|
sourceKeys | CSV | Mandatory | Locations of interest |
connection | Properties | Optional | The database connection can be defined using this file. The parameters database.url, database.timeout, database.recycle, and database.driver are described in the documentation of Korvasieni. |
Name | Type | Description |
---|---|---|
bioAnnotation | CSV | Table that contains the Ensembl IDs for the loci. |
Name | Type | Default | Description |
---|---|---|---|
baseColumn | string | (no default) | Name of the base pair loci column in input. |
bothStrands | boolean | false | Take the nearest element separately for both DNA strands. The same nextDist limit applies to both strands; that is, the strand with the closest gene determines the distance limit. |
bpAfter | int | 10000 | Number of base pairs searched after the given locus. |
bpBefore | int | 0 | Number of base pairs searched before the given locus. |
chrColumn | string | "" | Name of the chromosome column in input if the base column does not include this information. |
echoColumns | string | "" | A comma-separated list of names of the input columns that will be copied to the output. |
elementType | string | "gene" | Type of elements of interest (gene, exon, or transcript) |
idColumn | string | (no default) | Name of the identifier column in input. |
locationDef | string | "average" | Definition of the reference point if the input is in region format (chr:start-end). Options are: start=start base pair, average=(start+end)/2, and end=end base pair. |
nearestOnly | boolean | true | Report only the gene that is closest to the given locus. |
nextDist | int | -1000 | If this value is greater than zero, the nearest gene is reported separately for both directions along the strand. The value sets the maximum number of base pairs allowed beyond the nearest gene for accepting a second hit from the other side of the site, or hits from the complementary strand (see bothStrands). This parameter has no effect if nearestOnly is false. |
signDist | boolean | false | If true, the distance is gene.start-loci; if false, it is max(0, gene.start-loci, loci-gene.end). |
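The formulas given for locationDef and signDist can be written out as a short sketch. The function names are hypothetical and this is not the component's own code. Integer division for the average is an assumption, since the description does not say how odd sums are rounded.

```python
def reference_point(start: int, end: int, location_def: str = "average") -> int:
    """Pick the reference base pair of a region, following locationDef."""
    if location_def == "start":
        return start
    if location_def == "end":
        return end
    if location_def == "average":
        # Assumption: round down when start + end is odd.
        return (start + end) // 2
    raise ValueError(f"Unknown locationDef: {location_def!r}")


def distance(locus: int, gene_start: int, gene_end: int,
             sign_dist: bool = False) -> int:
    """Distance between a locus and a gene, following signDist."""
    if sign_dist:
        # Signed distance: negative when the gene starts before the locus.
        return gene_start - locus
    # Unsigned distance: zero when the locus lies inside the gene.
    return max(0, gene_start - locus, locus - gene_end)
```

With a gene spanning 300 to 400, a locus at 350 lies inside the gene and gets an unsigned distance of 0, whereas the signed variant yields 300 - 350 = -50.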
Test case | Parameters | IN sourceKeys | IN connection | OUT bioAnnotation |
---|---|---|---|---|
case1 | properties: bpBefore = 1, | sourceKeys | connection | bioAnnotation |
case2 | properties: bpBefore = 1, | sourceKeys | connection | bioAnnotation |
case3 | properties: bpBefore = 1000000, | sourceKeys | connection | bioAnnotation |
case4 | properties: signDist = true, | sourceKeys | connection | bioAnnotation |
case5 | properties: bpBefore = 1, | sourceKeys | connection | bioAnnotation |
case6 | properties: bpAfter = 10000, | sourceKeys | connection | bioAnnotation |