Extracts one column from each given CSV file and outputs the values without duplicates. NA values are removed. Only one column from each input file can be processed at a time, so taking the union of two columns in the same file requires specifying the file twice as an input table. The result is the union over all the inputs.
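The behaviour described above can be illustrated with a small Python sketch. It is not the component's implementation: the function name `union_of_columns`, the comma delimiter, the set of NA tokens, and the first-seen output order are all assumptions made for illustration.

```python
import csv
import io

# Assumption: missing values appear as the literal string "NA" or as an empty cell.
NA_TOKENS = {"NA", ""}


def union_of_columns(sources):
    """Return the distinct non-NA values of one column per source.

    `sources` is a list of (csv_text, column_name) pairs. An empty
    column_name selects the first column of that file. Values are
    returned in first-seen order (an assumption, not documented behaviour).
    """
    seen = {}
    for text, column in sources:
        reader = csv.DictReader(io.StringIO(text))
        name = column or reader.fieldnames[0]
        for row in reader:
            value = row[name]
            if value not in NA_TOKENS:
                seen.setdefault(value, None)
    return list(seen)


table = "id,alias\nA,D\nB,NA\nC,A\n"

# The same table is given twice to take the union of two of its columns.
ids = union_of_columns([(table, "id"), (table, "alias")])
print(ids)
```

Passing the same table twice mirrors the requirement that a file be specified twice as an input when two of its columns are needed.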
Version | 1.4 |
---|---|
Bundle | tools |
Categories | Convert |
Authors | Marko Laakso (Marko.Laakso@Helsinki.FI) |
Issue tracker | View/Report issues |
Requires | csbl-javatools.jar (jar) ; installer (bash) |
Source files | component.xml |
Usage | Example with default values |
Name | Type | Mandatory | Description |
---|---|---|---|
in1 | CSV | Optional | The first input relation |
in2 | CSV | Optional | The second input relation |
in3 | CSV | Optional | The third input relation |
in4 | CSV | Optional | The fourth input relation |
in5 | CSV | Optional | The fifth input relation |
in6 | CSV | Optional | The sixth input relation |
in7 | CSV | Optional | The seventh input relation |
in8 | CSV | Optional | The eighth input relation |
in9 | CSV | Optional | The ninth input relation |
array | Array<CSV> | Optional | An array of input files |
Name | Type | Description |
---|---|---|
out | IDList | A list of selected IDs |
Name | Type | Default | Description |
---|---|---|---|
acceptMissing | boolean | false | Files with missing columnIn are accepted as empty if this is true. |
columnIn | string | "" | A comma separated list of column names for the IDs of interest in each table input. Empty values refer to the first column of the file. |
columnInArray | string | "" | A comma separated list of array_key=column_name pairs for the IDs of interest in array files. Empty values refer to the first column of the file. |
columnOut | string | "" | Name of the only column of the output list. Empty input refers to the name of the input column. |
constants | string | "" | A comma separated list of values that are always included into the output |
isList | boolean | false | True if the selected column contains a comma-separated list of values to be split |
quotation | boolean | false | Indicator that can be used to disable quotation of the output values |
regexp1 | string | "" | Regular expression for the row filtering in table1. A row is included in the result if this parameter is empty or if the values in the given columns match the given regular expressions. The parameter has the format COLNAME1=EXPRESSION1,COLNAME2=EXPRESSION2 where COLNAMEs are column names in the input file and EXPRESSIONs are regular expressions using Java syntax. For example, "col=a|b" includes rows where the column col has a value of "a" or "b". |
regexp2 | string | "" | Regular expression for the row filtering in table2 |
regexp3 | string | "" | Regular expression for the row filtering in table3 |
regexp4 | string | "" | Regular expression for the row filtering in table4 |
regexp5 | string | "" | Regular expression for the row filtering in table5 |
regexp6 | string | "" | Regular expression for the row filtering in table6 |
regexp7 | string | "" | Regular expression for the row filtering in table7 |
regexp8 | string | "" | Regular expression for the row filtering in table8 |
regexp9 | string | "" | Regular expression for the row filtering in table9 |
regexpArr | string | "" | Regular expression for the row filtering of array files |
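The row filter format used by the regexp parameters can be sketched in Python as follows. This is only an illustration under stated assumptions: `parse_filter` and `row_matches` are hypothetical helper names, the whole cell value is assumed to have to match the expression (consistent with the "col=a|b" example), and Python's `re` module stands in for Java regular expressions, whose syntax differs in some details. Splitting on commas also means that this sketch cannot handle expressions that themselves contain a comma.

```python
import re


def parse_filter(spec):
    """Parse 'COL1=EXPR1,COL2=EXPR2' into a {column: compiled pattern} dict.

    An empty spec yields no rules, so every row is accepted.
    """
    rules = {}
    if not spec:
        return rules
    for part in spec.split(","):
        column, _, expression = part.partition("=")
        rules[column.strip()] = re.compile(expression)
    return rules


def row_matches(row, rules):
    """Return True if every filtered column fully matches its expression."""
    return all(
        pattern.fullmatch(row.get(column, "")) is not None
        for column, pattern in rules.items()
    )


rules = parse_filter("col=a|b")
rows = [{"col": "a"}, {"col": "b"}, {"col": "ab"}, {"col": "c"}]
kept = [row["col"] for row in rows if row_matches(row, rules)]
print(kept)
```

With the filter `col=a|b`, only the rows whose `col` value is exactly "a" or "b" are kept in this sketch.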
Test case | Parameters | IN in1 | IN in2 | IN in3 | IN in4 | IN in5 | IN in6 | IN in7 | IN in8 | IN in9 | IN array | OUT out |
---|---|---|---|---|---|---|---|---|---|---|---|---|
case1 | properties: columnIn=name, | in1 | (missing) | (missing) | (missing) | (missing) | (missing) | (missing) | (missing) | (missing) | (missing) | out |
case2 | properties: isList =true, | in1 | in2 | in3 | (missing) | (missing) | (missing) | (missing) | (missing) | (missing) | (missing) | out |
case3 | properties: acceptMissing=true, | in1 | (missing) | (missing) | in4 | in5 | in6 | in7 | in8 | in9 | (missing) | out |
case4 | properties: isList = true, | (missing) | in2 | (missing) | (missing) | (missing) | (missing) | (missing) | (missing) | (missing) | array | out |