
NextGene

Finds the closest gene, exon, or transcript for the given loci. The search for the closest gene is restricted to a window that extends from the locus independently in both directions along the DNA strand. Input locations can be given either in baseColumn alone, in the format chromosome:start-end, or with the chromosome in chrColumn and the position in baseColumn, in the format start.

Version 1.5.1
Bundle microarray
Categories Annotation, Short-read Sequencing, SNP
Authors Marko Laakso (Marko.Laakso@Helsinki.FI)
Issue tracker View/Report issues
Requires mysql-connector-java-5.1.6-bin.jar (jar)
Source files component.xml
Usage Example with default values

Inputs

Name Type Mandatory Description
sourceKeys CSV Mandatory Locations of interest
connection Properties Optional The database connection can be defined using this file. The parameters database.url, database.timeout, database.recycle, and database.driver are described in the documentation of Korvasieni.

Outputs

Name Type Description
bioAnnotation CSV Table that contains the Ensembl IDs for the loci.

Parameters

Name Type Default Description
baseColumn string (no default) Name of the base pair loci column in input.
bothStrands boolean false Report the nearest gene separately for each DNA strand. The same nextDist limit applies to both strands; that is, the strand with the closest gene determines the distance limit.
bpAfter int 10000 Number of base pairs searched after the given locus.
bpBefore int 0 Number of base pairs searched before the given locus.
chrColumn string "" Name of the chromosome column in input if the base column does not include this information.
echoColumns string "" A comma-separated list of names of the input columns that will be copied to the output.
elementType string "gene" Type of elements of interest (gene, exon, or transcript)
idColumn string (no default) Name of the identifier column in input.
locationDef string "average" Definition of the reference point if the input is in region format (chr:start-end). Options are: start=start base pair, average=(start+end)/2, and end=end base pair.
nearestOnly boolean true Report only the gene that is closest to the given locus.
nextDist int -1000 If this value is greater than zero, the nearest gene is reported separately for both directions along the strand. The value sets the maximum number of base pairs allowed beyond the nearest gene for accepting a second hit from the other side of the site or hits from the complementary strand (see bothStrands). This parameter has no effect if nearestOnly is false.
signDist boolean false True means that the distance is gene.start-loci and false means that it is max(0, gene.start-loci, loci-gene.end).
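
The locationDef and signDist formulas above can be illustrated with a short Python sketch. This is not the component's implementation: the function names and the example coordinates are hypothetical, and the use of a plain (non-rounded) division for the average is an assumption, since the table does not say how the midpoint is rounded.

```python
def parse_region(text):
    """Split a 'chromosome:start-end' string into (chromosome, start, end)."""
    chromosome, span = text.split(":", 1)
    start, end = span.split("-", 1)
    return chromosome, int(start), int(end)


def reference_point(start, end, location_def="average"):
    """Pick the reference base pair of a region, following locationDef."""
    if location_def == "start":
        return start
    if location_def == "end":
        return end
    if location_def == "average":
        # Assumption: no rounding is applied to the midpoint.
        return (start + end) / 2
    raise ValueError("locationDef must be start, average, or end")


def distance(locus, gene_start, gene_end, sign_dist=False):
    """Distance between a locus and a gene, following signDist."""
    if sign_dist:
        # Signed distance: negative when the gene starts before the locus.
        return gene_start - locus
    # Unsigned distance: zero when the locus lies inside the gene.
    return max(0, gene_start - locus, locus - gene_end)


# Hypothetical region and gene coordinates, for illustration only.
chromosome, start, end = parse_region("7:100-200")
locus = reference_point(start, end, "average")
print(chromosome, locus, distance(locus, 300, 400))
```

With signDist=false a locus inside the gene gives a distance of 0, whereas signDist=true keeps the sign and always measures from the start of the gene.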

Test cases

Test case Parameters IN sourceKeys IN connection OUT bioAnnotation
case1 properties sourceKeys connection bioAnnotation

bpBefore = 1,
bpAfter = 10,
idColumn = SNP,
chrColumn = chr,
baseColumn = loci

case2 properties sourceKeys connection bioAnnotation

bpBefore = 1,
bpAfter = 10,
idColumn = SNP,
baseColumn = loci,
echoColumns = reason

case3 properties sourceKeys connection bioAnnotation

bpBefore = 1000000,
bpAfter = 1000000,
bothStrands = true,
nextDist = 200000,
idColumn = SNP,
chrColumn = chr,
baseColumn = loci

case4 properties sourceKeys connection bioAnnotation

signDist = true,
bpBefore = 1000000,
bpAfter = 1000000,
bothStrands = true,
nextDist = 230000,
idColumn = SNP,
chrColumn = chr,
baseColumn = loci

case5 properties sourceKeys connection bioAnnotation

bpBefore = 1,
bpAfter = 10,
idColumn = SNP,
chrColumn = chr,
baseColumn = loci,
elementType = exon

case6 properties sourceKeys connection bioAnnotation

bpAfter = 10000,
idColumn = SNP,
chrColumn = chr,
baseColumn = loci,
elementType = transcript,
nearestOnly = false,
locationDef = start


Generated 2019-02-08 07:42:10 by Anduril 2.0.0