Clusters the rows of CSV files. The component is a wrapper for the R package flowMeans. Clustering is based on k-means, but the spherical clusters that k-means produces are merged to recover non-spherical clusters.
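The over-cluster-then-merge idea can be illustrated with a minimal sketch. This is not the flowMeans implementation: it assumes Euclidean distance between centroids as the merge criterion and a fixed target number of clusters, whereas the package can use Mahalanobis distance and estimate the number of clusters itself. All function names here (`kmeans`, `merge_closest`, and so on) are hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    return tuple(sum(col) / len(points) for col in zip(*points))

def kmeans(points, k, iter_max=50):
    """Plain Lloyd's k-means. Seeds are evenly spaced input rows (a simplification)."""
    step = len(points) // k
    centers = [points[i * step] for i in range(k)]
    labels = None
    for _ in range(iter_max):
        new_labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = centroid(members)
    return labels

def merge_closest(points, labels, target):
    """Merge the two clusters with the closest centroids until `target` remain."""
    groups = {}
    for p, l in zip(points, labels):
        groups.setdefault(l, []).append(p)
    clusters = list(groups.values())
    while len(clusters) > target:
        pairs = ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: dist(centroid(clusters[ij[0]]),
                                              centroid(clusters[ij[1]])))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# An elongated (non-spherical) group plus a compact group.
elongated = [(float(x), 0.0) for x in range(8)]
blob = [(3.0, 20.0), (4.0, 20.0), (3.0, 21.0), (4.0, 21.0)]
points = elongated + blob

labels = kmeans(points, k=4)                      # over-cluster: the long group is split
merged = merge_closest(points, labels, target=2)  # merging reassembles it
```

In this toy data, k-means with four clusters cuts the elongated group into three roughly spherical pieces; merging the closest pieces puts it back together while the distant compact group stays separate.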
Version | 1.0 |
---|---|
Bundle | flowand |
Categories | FlowCytometry |
Authors | Erkka Valo (erkka.valo@helsinki.fi) |
Requires | flowMeans (R-package) |
Source files | component.xml FlowMeans.r |
Usage | Example with default values |

Inputs

Name | Type | Mandatory | Description |
---|---|---|---|
inList | CSVList | Optional | CSV files to cluster. |
in | Array<CSV> | Optional | CSV files to cluster. |

Outputs

Name | Type | Description |
---|---|---|
clustersList | CSVList | Clustered CSV files. |
clusters | Array<CSV> | Clustered CSV files. |
report | Latex | Report of the clustering. |

Parameters

Name | Type | Default | Description |
---|---|---|---|
channelsToCluster | string | "*" | Comma-separated list of the columns used to cluster the data. The default value "*" uses all columns. If some of the listed columns are missing from an input CSV file, clustering uses the subset of columns that are present. |
clusterIDColName | string | "cluster" | The name of the column in the clusters output that contains the cluster IDs of the rows. |
iterMax | int | 50 | The maximum number of iterations allowed. |
mahalanobis | boolean | true | If TRUE (default), Mahalanobis distance will be used. Otherwise, Euclidean distance will be used. |
maxN | int | -1 | Maximum number of clusters. If set to a negative value (default), the value will be estimated automatically from the data. |
nStart | int | 10 | The number of random sets used for initialization. |
numC | int | -1 | Number of clusters. If set to a negative value (default), the value will be estimated automatically. |
standardize | boolean | true | If TRUE, the data will be transformed to the [0,1] interval for clustering. The transformed values are not put into the output files. |
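How several of these parameters interact can be sketched in a few lines of Python. This is an illustration only, not the component's R code: it assumes that "transformed to the [0,1] interval" means per-column min-max scaling, that a constant column maps to 0.0, and that a negative `numC` or `maxN` simply signals "estimate automatically". The helper names are hypothetical.

```python
def select_channels(header, channels_to_cluster="*"):
    """Columns to cluster on; requested columns missing from the file are skipped."""
    if channels_to_cluster == "*":
        return list(header)
    wanted = [name.strip() for name in channels_to_cluster.split(",")]
    return [name for name in wanted if name in header]

def standardize(column):
    """Min-max scale one column to [0, 1] (assumed reading of `standardize`)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # assumption: constant columns collapse to 0.0
    return [(v - lo) / (hi - lo) for v in column]

def resolve_count(value):
    """Negative numC/maxN means 'estimate from the data', represented here as None."""
    return None if value < 0 else value

def attach_cluster_ids(rows, cluster_ids, cluster_id_col_name="cluster"):
    """Copy the original, untransformed rows and add the cluster ID column."""
    return [dict(row, **{cluster_id_col_name: cid})
            for row, cid in zip(rows, cluster_ids)]

header = ["FSC", "SSC", "CD3"]
channels = select_channels(header, "FSC,CD4, CD3")  # CD4 is absent from this file
scaled = standardize([2.0, 4.0, 6.0])
out_rows = attach_cluster_ids([{"FSC": 2.0}, {"FSC": 6.0}], [1, 2])
```

Scaling is applied only to the values passed to the clustering step; the output rows keep the original values and gain the column named by `clusterIDColName`, matching the description of `standardize` above.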