Converts gene, transcript, and translation identifiers using Korvasieni. This component can be used to annotate findings and to integrate information between various sources.
An example list of supported (source and target) databases includes a snapshot of known Ensembl links. You may prefix these databases with an underscore (_) for real database identifiers.
Version | 1.9 |
---|---|
Bundle | microarray |
Categories | Annotation GO |
Authors | Marko Laakso (Marko.Laakso@Helsinki.FI) |
Issue tracker | View/Report issues |
Requires | mysql-connector-java-5.1.6-bin.jar (jar) |
Source files | component.xml |
Usage | Example with default values |
Name | Type | Mandatory | Description |
---|---|---|---|
sourceKeys | CSV | Mandatory | A list of source database keys. |
connection | Properties | Optional | The database connection can be defined using this file. The parameters database.url, database.user, database.password, database.timeout, database.recycle, and database.driver are described in the documentation of Korvasieni. |
Name | Type | Description |
---|---|---|
bioAnnotation | AnnotationTable | Table that contains a column for each annotation type given in the parameter targetDB. The indicator column is 1 if the input key is found in the database. |
Name | Type | Default | Description |
---|---|---|---|
echoColumns | string | "" | A comma-separated list of names of the columns that will be copied to the output. An asterisk (*) can be used to denote all columns except the keyColumn. |
goFilter | string | "" | A comma-separated list of the Gene Ontology evidence codes that shall be excluded. This parameter is only used for the 'GO' annotations. |
indicator | boolean | true | Enables an indicator column that is 1 if the source key matched the database and 0 otherwise. |
inputDB | string | "" | Type of the input keys. This must be a database supported by Korvasieni. If the parameter is omitted, the component tries to derive the database from the type of geneID; if that is not possible, an error is returned. When inputDB is .DNARegion, you may define three columns in the form chromosome:start-end, which provides convenient compatibility with the DNARegion datatype. The end positions can be left out if they are the same as the start positions (single nucleotides). |
inputType | string | "Any" | Ensembl object type for the input keys (Any, Gene, Transcript, Translation) |
isListKey | boolean | false | Enables automatic splitting of comma-separated values in the key column. |
keyColumn | string | "" | Name of the key column within the sourceKeys file, or an empty string for the first column. See inputDB for further information about the DNA regions. |
maxHits | int | 100000 | Maximum number of target identifiers for a single source identifier |
primary | boolean | false | Skip secondary identifiers such as LRG identifiers. |
rename | string | "" | Comma-separated list of column renaming rules (oldname=newname). |
skipLevel | string | "never" | Skip result rows if the source identifier is unknown or target identifiers are not available. Possible values are: never (no filtering), source (skip if the source ID is unknown), target (skip if no target IDs are found), any (skip if any of the target IDs is missing). |
targetDB | string | (no default) | Comma-separated list of annotation types. Possible values are all databases supported by Korvasieni. |
unique | boolean | false | This flag can be turned on in order to eliminate duplicate annotations. |
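
The four skipLevel values described above can be read as a simple row filter. The following Python sketch is only an illustration of that reading, not the component's implementation: the function `keep_row`, its arguments, and the database name `GeneName` are hypothetical, and the sketch assumes each result row carries a flag for whether the source ID is known plus a list of target IDs per target database.

```python
from typing import Dict, List

# The values accepted by the skipLevel parameter, as listed in the table.
SKIP_LEVELS = ("never", "source", "target", "any")


def keep_row(source_known: bool,
             targets: Dict[str, List[str]],
             skip_level: str = "never") -> bool:
    """Return True if a result row would survive the given skipLevel.

    source_known: whether the source ID was found (hypothetical flag).
    targets: maps each target database to the IDs found for this row.
    """
    if skip_level not in SKIP_LEVELS:
        raise ValueError(f"unknown skipLevel: {skip_level!r}")
    if skip_level == "never":
        return True                      # no filtering
    if skip_level == "source":
        return source_known              # skip rows with an unknown source ID
    found = [bool(ids) for ids in targets.values()]
    if skip_level == "target":
        return any(found)                # skip rows with no target IDs at all
    return bool(found) and all(found)    # "any": skip if any target is missing


# Hypothetical row: GO annotations missing, a gene name present.
row = {"GO": [], "GeneName": ["TP53"]}
print(keep_row(True, row, "target"))  # kept: one target has IDs
print(keep_row(True, row, "any"))     # skipped: GO is empty
```

Under this reading, `any` is the strictest setting and `never` the most permissive; `source` depends only on whether the input key was recognized.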
Test case | Parameters | IN sourceKeys | IN connection | OUT bioAnnotation |
---|---|---|---|---|
case1 | properties: inputType=Gene, | sourceKeys | connection | bioAnnotation |
case2 | properties: inputType=Gene, | sourceKeys | connection | bioAnnotation |
case3_DNARegion | properties: inputType=Gene, | sourceKeys | connection | bioAnnotation |
case4_colconflict | properties: inputType=Gene, | sourceKeys | connection | bioAnnotation |
case5 | properties: inputType=Gene, | sourceKeys | connection | bioAnnotation |
case6_DNARegion | properties: inputType=Gene, | sourceKeys | (missing) | bioAnnotation |