Transforms CSV files using R expressions. This allows applying arithmetic functions to numeric columns and combining columns from different CSV files.
The inputs are one or two CSV files. Each R expression is evaluated and is expected to return an R matrix, data frame or vector; the results are concatenated into the final result. By default, concatenation is done by columns, so each transformation adds columns to the output. Transformations should produce items with the same number of rows. However, an expression may yield a single string or number, which is repeated to fit the number of rows.
In the expressions, "csv1" and "csv2" are R data frames containing the contents of the input files; "csv2" is defined only if the csv2 input is given. The numeric columns of csv1 and csv2 are available as R matrices "matrix1" and "matrix2". If an input has no numeric columns, its matrix is empty.
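The evaluation model described above can be sketched in plain R. This is a minimal illustration, not the component's actual source: the sample data, the way matrix1 is derived, and the use of eval(parse(...)) with do.call are all assumptions made for the example.

```r
# Hypothetical stand-in for the csv1 input file (column names are illustrative).
csv1 <- data.frame(
  C1 = c("a", "b", "c"),
  N1 = c(1, 2, 3),
  N2 = c(10, 20, 30),
  stringsAsFactors = FALSE
)

# Assumed construction of matrix1: keep only the numeric columns of csv1.
numeric_cols <- vapply(csv1, is.numeric, logical(1))
matrix1 <- as.matrix(csv1[, numeric_cols, drop = FALSE])

# Three transformation expressions, given as strings like the transformN parameters.
transforms <- c('csv1[, "C1"]', "matrix1 * 2", '"constant"')

# Evaluate each expression, then join the results by columns.
# cbind repeats the length-one result so that it fills every row.
parts <- lapply(transforms, function(expr) eval(parse(text = expr)))
result <- do.call(cbind, parts)
print(dim(result))
```

Because one of the parts is a character vector, cbind coerces the combined matrix to character; an expression that returns a data frame would instead produce a data frame.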
Version | 1.1 |
---|---|
Bundle | tools |
Categories | Preprocessing |
Specialties | generic |
Authors | Kristian Ovaska (kristian.ovaska@helsinki.fi) |
Requires | R |
Source files | component.xml CSVTransformer.r |
Name | Type | Mandatory | Description |
---|---|---|---|
csv1 | CSV | Optional | Input file 1. |
csv2 | CSV | Optional | Input file 2. |
columnNamesFile | IDList | Optional | Column names for output. Overridden by the 'columnNames' parameter. |
array | Array<CSV> | Optional | Input CSV array. Each element is available in the expressions as the data frame csv.[key] and the numeric matrix matrix.[key], where [key] is the element's key. |
Name | Type | Description |
---|---|---|
out | T (generic) | Transformed output. The first column(s) are created using transform1, the next column(s) using transform2, and so on. |
Name | Type | Default | Description |
---|---|---|---|
columnNames | string | "" | R expression that evaluates to the column names of the result CSV file. The evaluated vector must have the same number of items as there are columns in the output. If empty, column names are taken from the input CSV files; depending on the transforms, some column names may be automatically generated. |
combineFunction | string | "cbind" | R expression that combines the transformations. Defaults to cbind, which joins by columns. Use rbind to join by rows. |
transform1 | string | (no default) | R expression that evaluates to a matrix, data frame, vector or constant. The expression may refer to data frames "csv1" and "csv2" (only if csv2 is given) and matrices "matrix1" and "matrix2" (only if csv2 is given). |
transform2 | string | "" | Transformation expression 2. If empty, no transformation is done. |
transform3 | string | "" | Transformation expression 3. If empty, no transformation is done. |
transform4 | string | "" | Transformation expression 4. If empty, no transformation is done. |
transform5 | string | "" | Transformation expression 5. If empty, no transformation is done. |
transform6 | string | "" | Transformation expression 6. If empty, no transformation is done. |
transform7 | string | "" | Transformation expression 7. If empty, no transformation is done. |
transform8 | string | "" | Transformation expression 8. If empty, no transformation is done. |
transform9 | string | "" | Transformation expression 9. If empty, no transformation is done. |
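As a rough sketch of how combineFunction and columnNames interact, the following R snippet combines two hypothetical transformation results by columns and by rows, then applies column names from an expression string. Evaluating the parameter strings with eval(parse(...)) is an assumption for illustration; the helper name combine and the sample matrices are invented for this example.

```r
# Two hypothetical transformation results with matching shapes.
part1 <- matrix(1:4, nrow = 2, dimnames = list(NULL, c("A", "B")))
part2 <- matrix(5:8, nrow = 2, dimnames = list(NULL, c("A", "B")))

# Assumption: the combineFunction string is evaluated to obtain an R function.
combine <- function(fun_expr, parts) {
  do.call(eval(parse(text = fun_expr)), parts)
}

by_cols <- combine("cbind", list(part1, part2))  # 2 rows, 4 columns
by_rows <- combine("rbind", list(part1, part2))  # 4 rows, 2 columns

# columnNames is an R expression that yields one name per output column.
column_names_expr <- 'c("A1", "B1", "A2", "B2")'
colnames(by_cols) <- eval(parse(text = column_names_expr))

print(by_cols)
print(by_rows)
```

With cbind the row counts of the parts must agree, while with rbind the column counts must agree, which is why the choice of combineFunction also constrains the shape of each transform.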
Test case | Parameters | IN csv1 | IN csv2 | IN columnNamesFile | IN array | OUT out |
---|---|---|---|---|---|---|
case1 | transform1=csv1[,c("C1","C2")], | csv1 | csv2 | (missing) | (missing) | out |
case2_colnames | transform1=csv1[,c("C1","C2")], | csv1 | csv2 | (missing) | (missing) | out |
case3_matrix | transform1=(matrix1+matrix2)/2 | csv1 | csv2 | (missing) | (missing) | out |
case4_reusing_transformed | transform1=cbind(csv1[,c("C1","N1")],csv2[,"N3"],100*csv1[,"N1"]/csv2[,"N3"]), | csv1 | csv2 | (missing) | (missing) | out |
case5_colInput | transform1=csv1[,c("C1","C2")], | csv1 | csv2 | columnNamesFile | (missing) | out |
case6_join_by_row | transform1=t(csv1[,c("C2")]), | csv1 | csv2 | (missing) | (missing) | out |
case7_array | transform1=csv.key1[,c("C1","C2")], | (missing) | (missing) | (missing) | array | out |
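To make the transform expressions of case3_matrix and case4_reusing_transformed more concrete, the sketch below evaluates them against small made-up inputs. The data values are hypothetical and are not the test case files; only the column names C1, N1 and N3 and the two expressions are taken from the test cases, and the way matrix1 and matrix2 are built is an assumption.

```r
# Hypothetical inputs; only the column names C1, N1 and N3 come from the test cases.
csv1 <- data.frame(C1 = c("a", "b", "c"), N1 = c(1, 2, 3), N2 = c(10, 20, 30),
                   stringsAsFactors = FALSE)
csv2 <- data.frame(C1 = c("a", "b", "c"), N3 = c(4, 4, 4), N4 = c(30, 40, 50),
                   stringsAsFactors = FALSE)

# Assumed helper: numeric columns of a data frame as a matrix.
numeric_matrix <- function(df) {
  as.matrix(df[, vapply(df, is.numeric, logical(1)), drop = FALSE])
}
matrix1 <- numeric_matrix(csv1)
matrix2 <- numeric_matrix(csv2)

# case3_matrix: element-wise mean of the two numeric matrices.
# Both matrices must have the same dimensions for this to work.
case3 <- (matrix1 + matrix2) / 2

# case4_reusing_transformed: selected columns plus a derived percentage column.
case4 <- cbind(csv1[, c("C1", "N1")], csv2[, "N3"],
               100 * csv1[, "N1"] / csv2[, "N3"])

print(case3)
print(case4)
```

In the second expression the derived columns get automatically generated names, which is the situation in which the columnNames parameter or the columnNamesFile input is useful.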