
KorvasieniAnnotator

Converts gene, transcript, and translation identifiers using Korvasieni. This component can be used to annotate findings and to integrate information from various sources.

An example list of supported (source and target) databases includes a snapshot of known Ensembl links. You may prefix these databases with an underscore (_) for real database identifiers.

Version 1.9
Bundle microarray
Categories Annotation GO
Authors Marko Laakso (Marko.Laakso@Helsinki.FI)
Requires mysql-connector-java-5.1.6-bin.jar (jar)
Source files component.xml

Inputs

Name Type Mandatory Description
sourceKeys CSV Mandatory A list of source database keys.
connection Properties Optional Defines the database connection. The parameters database.url, database.user, database.password, database.timeout, database.recycle, and database.driver are described in the documentation of Korvasieni.
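
As a rough illustration of the connection input, the following Python sketch renders a properties file. Only the property names come from the table above. The host, schema name, credentials, and the helper render_properties are hypothetical placeholders, and com.mysql.jdbc.Driver is assumed because it is the driver class of MySQL Connector/J 5.1. database.timeout and database.recycle are left out because their value formats are described only in the documentation of Korvasieni.

```python
# Every value below is a placeholder for illustration, not a Korvasieni default.
EXAMPLE_CONNECTION = {
    "database.url": "jdbc:mysql://localhost:3306/ensembl_example",
    "database.user": "reader",
    "database.password": "change-me",
    "database.driver": "com.mysql.jdbc.Driver",
}


def render_properties(settings):
    """Render a dict as Java-style 'key=value' lines, one per property."""
    return "".join(f"{key}={value}\n" for key, value in settings.items())


if __name__ == "__main__":
    # Print the file content; redirect it to a file to use it as the input.
    print(render_properties(EXAMPLE_CONNECTION), end="")
```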

Outputs

Name Type Description
bioAnnotation AnnotationTable Table that contains a column for each annotation type given in the parameter targetDB. The indicator column is 1 if the input key is found in the database.

Parameters

Name Type Default Description
echoColumns string "" A comma-separated list of the names of the columns that will be copied to the output. An asterisk (*) can be used to denote all columns except the keyColumn.
goFilter string "" A comma-separated list of the Gene Ontology evidence codes that shall be excluded. This parameter is only used for the 'GO' annotations.
indicator boolean true Enables an indicator column that is 1 if the source key matched the database and 0 otherwise.
inputDB string "" Type of the input keys. This must be a database supported by Korvasieni. If the parameter is omitted, the component tries to derive the database from the type of geneID; if this is not possible, an error is returned. If inputDB is .DNARegion, you may define three key columns in the form chromosome:start-end, which is convenient for inputs of the DNARegion datatype. The end column can be left out if the end positions are the same as the start positions (single nucleotides).
inputType string "Any" Ensembl object type for the input keys (Any, Gene, Transcript, Translation)
isListKey boolean false Enables automatic splitting of comma-separated values in the key column.
keyColumn string "" Name of the key column within the sourceKeys file, or an empty string for the first column. See inputDB for further information about the DNA regions.
maxHits int 100000 Maximum number of target identifiers for a single source identifier
primary boolean false Skip secondary identifiers such as LRG identifiers.
rename string "" Comma-separated list of column renaming rules (oldname=newname).
skipLevel string "never" Skip result rows if the source identifier is unknown or target identifiers are not available. Possible values are: never (no filtering), source (skip if the source ID is unknown), target (skip if no target IDs are found), any (skip if any of the target IDs is missing).
targetDB string (no default) Comma-separated list of annotation types. Possible values are all databases supported by Korvasieni.
unique boolean false This flag can be turned on in order to eliminate duplicate annotations.
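
The four skipLevel values can be read as a small decision rule. The Python sketch below is one interpretation of the description above, not the component's actual code; the function should_skip and the shape of its arguments are assumptions made for illustration.

```python
def should_skip(skip_level, source_known, targets):
    """Decide whether one result row would be dropped.

    targets maps each requested target database to the list of identifiers
    found for the source key (an empty list means nothing was found).
    """
    found = [bool(ids) for ids in targets.values()]
    if skip_level == "never":
        return False             # no filtering
    if skip_level == "source":
        return not source_known  # unknown source identifier
    if skip_level == "target":
        return not any(found)    # no target identifiers at all
    if skip_level == "any":
        return not all(found)    # at least one target database is empty
    raise ValueError(f"unknown skipLevel: {skip_level!r}")
```

Under this reading, a row with a known source key and hits in only one of two target databases is kept with skipLevel=target but dropped with skipLevel=any.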

Test cases

Test case Parameters sourceKeys (IN) connection (IN) bioAnnotation (OUT)
case1 properties sourceKeys connection bioAnnotation

inputType=Gene,
inputDB =.GeneName,
targetDB =.GeneId,UniGene

case2 properties sourceKeys connection bioAnnotation

inputType=Gene,
inputDB=.GeneName,
targetDB=.GeneId,UniGene,.DNARegion,.DNABand,
keyColumn=gene name,
isListKey=true,
echoColumns=gene name,
primary=true

case3_DNARegion properties sourceKeys connection bioAnnotation

inputType =Gene,
inputDB =.DNARegion,
targetDB =.GeneName,.DNARegion,
keyColumn =chromosome:start-end,
echoColumns=*

case4_colconflict properties sourceKeys connection bioAnnotation

inputType=Gene,
inputDB =.GeneName,
targetDB =EntrezGene,_EntrezGene,_WikiGene,WikiGene,EntrezGene,_WikiGene,
rename =WikiGene3=TheLastWikiGene,
indicator=false

case5 properties sourceKeys connection bioAnnotation

inputType = Gene,
inputDB = .DNARegion,
targetDB = .GeneName,.DNARegion,
keyColumn = chr:pos

case6_DNARegion properties sourceKeys (missing) bioAnnotation

inputType =Gene,
inputDB =.DNARegion,
targetDB =.GeneName,.DNARegion,
keyColumn =chromosome:start,
echoColumns=*
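
The DNARegion test cases use three forms of keyColumn: chromosome:start-end, chr:pos, and chromosome:start. The Python sketch below shows one way such a value could be split into column names, following the inputDB description. The function parse_region_key is hypothetical and is not part of the component, and column names that themselves contain a colon or a hyphen are not handled.

```python
def parse_region_key(key_column):
    """Split 'chromosome:start-end' into (chromosome, start, end) column names.

    When the end column is omitted, the start column is reused for it,
    which corresponds to single-nucleotide regions.
    """
    spec = key_column.strip()
    chrom_col, colon, positions = spec.partition(":")
    start_col, dash, end_col = positions.partition("-")
    if not colon or not chrom_col or not start_col or (dash and not end_col):
        raise ValueError(f"expected chromosome:start[-end], got {key_column!r}")
    return chrom_col, start_col, end_col or start_col


if __name__ == "__main__":
    for spec in ("chromosome:start-end", "chr:pos", "chromosome:start"):
        print(spec, "->", parse_region_key(spec))
```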


Generated 2019-02-08 07:42:10 by Anduril 2.0.0