Re-orders cluster ID numbers based on a given data vector. The component is typically used to re-order the cluster IDs of K-means results, where the IDs are assigned in arbitrary order.
By default, the new order is decided by the mean of the sorting-column values in each cluster, but other summary functions can be used as well.
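The re-ordering described above can be sketched in base R. This is a minimal illustration, not the contents of ClusterReorder.r: the helper name `reorder_clusters` and the example data are hypothetical, and the sketch assumes that clusters are ranked by ascending summary value and that new IDs start at 1, neither of which this page confirms.

```r
# Hypothetical helper for illustration; not the component's actual source.
# Assumption: clusters are numbered 1..k by ascending summary value.
reorder_clusters <- function(data, sortCol, clusterCol = "clusterId",
                             summaryFunction = "mean") {
  # Resolve the summary function from its name, e.g. "mean" -> mean.
  fun <- match.fun(summaryFunction)
  # One summary value per original cluster ID.
  summaries <- tapply(data[[sortCol]], data[[clusterCol]], fun)
  # Old cluster IDs, sorted by their summary value.
  old_ids <- names(summaries)[order(summaries)]
  # New ID = position of the old ID in the sorted list.
  data[[clusterCol]] <- match(as.character(data[[clusterCol]]), old_ids)
  data
}

# Made-up example data: cluster 1 has the largest mean, cluster 2 the smallest.
example <- data.frame(
  Mean = c(9.0, 8.5, 1.0, 1.2, 5.0, 5.5),
  clusterId = c(1, 1, 2, 2, 3, 3)
)
result <- reorder_clusters(example, sortCol = "Mean")
print(result$clusterId)  # 3 3 1 1 2 2
```

In this sketch the rows keep their original order and only the values in the cluster ID column change, so the output still contains all of the input data.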
Version | 1.0 |
---|---|
Bundle | tools |
Categories | Clustering |
Authors | Ville Rantanen (ville.rantanen@helsinki.fi) |
Issue tracker | View/Report issues |
Requires | R |
Source files | component.xml ClusterReorder.r |
Usage | Example with default values |
Name | Type | Mandatory | Description |
---|---|---|---|
in | CSV | Mandatory | Data to sort, containing the sorting column and the cluster ID column. |
Name | Type | Description |
---|---|---|
out | CSV | Re-ordered cluster IDs and input data |
Name | Type | Default | Description |
---|---|---|---|
clusterCol | string | "clusterId" | Cluster ID column name |
sortCol | string | (no default) | Sorting column to use. |
summaryFunction | string | "mean" | Function used to summarize a cluster: min, max, mean, median, sd, or any other R function. |
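As a rough illustration of how the choice of `summaryFunction` can change the resulting order, the base R sketch below ranks the same clusters by different summaries. The data, the cluster labels, and the helper `cluster_order` are made up for this example, and ascending order is an assumption rather than documented behaviour.

```r
# Made-up data: three clusters with different centres and spreads.
values   <- c(2, 10, 4, 5, 6, 7, 7, 7)
clusters <- c("a", "a", "b", "b", "b", "c", "c", "c")

# Hypothetical helper: cluster labels ordered by a summary function
# that is given by name, as in the summaryFunction parameter.
cluster_order <- function(fun_name) {
  summaries <- tapply(values, clusters, match.fun(fun_name))
  # Assumption: ascending order of the summary value.
  names(summaries)[order(summaries)]
}

by_mean <- cluster_order("mean")  # means: a = 6, b = 5, c = 7
by_sd   <- cluster_order("sd")    # sds: c = 0, b = 1, a is about 5.66
by_max  <- cluster_order("max")   # maxima: a = 10, b = 6, c = 7

print(by_mean)  # "b" "a" "c"
print(by_sd)    # "c" "b" "a"
print(by_max)   # "b" "c" "a"
```

Because the function is looked up by name with `match.fun`, any R function that reduces a numeric vector to a single value could be passed the same way.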
Test case | Parameters | IN in | OUT out |
---|---|---|---|
case1 | properties: sortCol=Mean | in | out |
case2 | properties: sortCol=Mean, | in | out |