A very basic clustering-like method that bins each column into N bins. Bins are based on z-scores: the first bin starts at MEAN - 3*SD and the last bin ends at MEAN + 3*SD, with the bin breaks spaced linearly in between.
ClusterId is a unique numeric representation of a row's bin in multi-dimensional space (one dimension per binned column). The columns in the input data are assumed to be in most-meaningful-first order.
Non-numeric columns are skipped.
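
The binning rule above can be illustrated with a minimal Python sketch. This is not the component's `binning.r` source. The helper names `bin_edges` and `assign_bin` are hypothetical, the use of the sample standard deviation is an assumption, and so are the 1-based bin numbering and the clamping of values that fall outside MEAN ± 3*SD, since the description does not say how those cases are handled.

```python
import statistics


def bin_edges(values, n_bins):
    """Return n_bins + 1 linearly spaced breaks from mean - 3*sd to mean + 3*sd."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (assumption)
    low, high = mean - 3 * sd, mean + 3 * sd
    step = (high - low) / n_bins
    return [low + i * step for i in range(n_bins + 1)]


def assign_bin(value, edges):
    """Return a 1-based bin index for value.

    Values outside the outer breaks are clamped into the first or
    last bin (assumption, not confirmed by the component docs).
    """
    n_bins = len(edges) - 1
    width = (edges[-1] - edges[0]) / n_bins
    if width == 0:  # constant column: everything lands in one bin
        return 1
    index = int((value - edges[0]) // width) + 1
    return min(max(index, 1), n_bins)


# One numeric column, binned with the default N = 2.
column = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]
edges = bin_edges(column, 2)
bins = [assign_bin(v, edges) for v in column]
print(edges)
print(bins)
```

With N = 2 the single inner break coincides with the column mean, so values below the mean fall into bin 1 and values above it into bin 2.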
Version | 1.0 |
---|---|
Bundle | tools |
Categories | Clustering |
Authors | Ville Rantanen (ville.rantanen@helsinki.fi) |
Issue tracker | View/Report issues |
Requires | R |
Source files | component.xml binning.r |
Usage | Example with default values |

Inputs

Name | Type | Mandatory | Description |
---|---|---|---|
in | CSV | Mandatory | Matrix data. |

Outputs

Name | Type | Description |
---|---|---|
binned | CSV | Bin ID for each column. |
cluster | CSV | Cluster ID for each row. |
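
The page does not say how the per-column bin IDs are combined into the cluster ID. One encoding that would be consistent with "unique" and "most-meaningful-first" is to read the bin IDs of a row as digits of a base-N number, with the first column as the most significant digit. The Python sketch below shows that idea only. The function name `cluster_id` and the 0-based bin indices are assumptions, and the actual `binning.r` may compute the ID differently.

```python
def cluster_id(bin_indices, n_bins):
    """Combine 0-based per-column bin indices into a single integer.

    The first column is treated as the most significant digit of a
    base-n_bins number (an assumed encoding, not taken from the source).
    """
    cid = 0
    for b in bin_indices:
        if not 0 <= b < n_bins:
            raise ValueError(f"bin index {b} is outside 0..{n_bins - 1}")
        cid = cid * n_bins + b
    return cid


# Two binned columns with N = 3 bins each.
rows = [(0, 0), (0, 2), (1, 0), (2, 2)]
ids = [cluster_id(r, 3) for r in rows]
print(ids)  # [0, 2, 3, 8]
```

Under this encoding every combination of bins maps to a distinct ID, and rows that differ only in later columns receive neighbouring IDs, which is one way the column order could matter.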

Parameters

Name | Type | Default | Description |
---|---|---|---|
N | int | 2 | Number of bins. |

Test cases

Test case | Parameters | IN in | OUT binned | OUT cluster |
---|---|---|---|---|
case1 | properties: N=3 | in | binned | cluster |