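Divides the content of the given CSV file into an array of multiple CSV files. The possible partition rules are: split into N files, either as consecutive blocks of rows or as interleaved rows (see order), or split by the labels found in labelCol, optionally matched with a regular expression (see regexp).
The component acts as the inverse operation to CSVListJoin.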
Version | 1.0 |
---|---|
Bundle | tools |
Categories | Convert |
Authors | Ville Rantanen (ville.rantanen@helsinki.fi) |
Issue tracker | View/Report issues |
Requires | python |
Source files | component.xml CSVSplit.py |
Usage | Example with default values |
Name | Type | Mandatory | Description |
---|---|---|---|
in | CSV | Mandatory | A CSV file. |
Name | Type | Description |
---|---|---|
out | Array<CSV> | The split CSV files. |
Name | Type | Default | Description |
---|---|---|---|
N | int | 2 | Divide CSV into N CSV files. |
includeLabelCol | boolean | true | Whether to include the column used as the label in the output files. |
labelCol | string | "" | Name of the column that provides the unique labels, or the values to match with regexp. Defining labelCol overrides the N parameter. |
order | string | "first" | 'first': First file contains M first lines of the source. 'sparse': First file contains every Mth row of the source. Valid only when labelCol is not used. |
regexp | string | "(.*)" | Regular expression matched against the values of labelCol. |
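
The parameters above can be illustrated with a minimal Python sketch. It is not the code in CSVSplit.py. The function names `split_by_count` and `split_by_label` are hypothetical, and several behaviours are assumptions: that 'first' uses consecutive blocks of ceil(rows / N) rows, that 'sparse' distributes rows round-robin, that the first capture group of regexp is used as the label, and that rows whose label does not match are skipped.

```python
import math
import re


def split_by_count(rows, n=2, order="first"):
    """Split data rows (header excluded) into n lists of rows."""
    if order == "first":
        # Assumption: consecutive blocks of ceil(len(rows) / n) rows each.
        m = max(1, math.ceil(len(rows) / n))
        return [rows[i * m:(i + 1) * m] for i in range(n)]
    if order == "sparse":
        # Assumption: round-robin, so part i takes rows i, i + n, i + 2n, ...
        return [rows[i::n] for i in range(n)]
    raise ValueError("order must be 'first' or 'sparse'")


def split_by_label(header, rows, label_col, regexp="(.*)",
                   include_label_col=True):
    """Group rows by the first capture group of regexp found in label_col."""
    idx = header.index(label_col)
    pattern = re.compile(regexp)
    groups = {}
    for row in rows:
        match = pattern.search(row[idx])
        if match is None:
            continue  # Assumption: rows with a non-matching label are skipped.
        key = match.group(1)
        if not include_label_col:
            # Drop the label column from the output row.
            row = row[:idx] + row[idx + 1:]
        groups.setdefault(key, []).append(row)
    return groups


header = ["Well", "Value"]
rows = [["A1", "1"], ["A2", "2"], ["B1", "3"], ["B2", "4"], ["A3", "5"]]

first_parts = split_by_count(rows, n=2, order="first")
sparse_parts = split_by_count(rows, n=2, order="sparse")
by_plate_row = split_by_label(header, rows, "Well", regexp="([A-Z])")
```

Here `first_parts` holds rows 1 to 3 and rows 4 to 5, `sparse_parts` holds the odd-numbered and the even-numbered rows, and `by_plate_row` groups the rows under the keys "A" and "B". Each output file would repeat the header line, which is left out of the sketch for brevity.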
Test case | Parameters | IN in | OUT out |
---|---|---|---|
case1_split_first | (missing) | in | out |
case2_split_sparse | properties: order=sparse | in | out |
case3_split_by_label | properties: labelCol=Well | in | out |
case4_split_by_regexp | properties: labelCol=File, | in | out |