Trains a classifier based on the given sample data, or predicts with a classifier trained earlier.
MATLAB code: Kerstin Bunte (kerstin.bunte@googlemail.com), modified from the code of Marc Strickert (http://www.mloss.org/software/view/323/) and Petra Schneider. It uses the Fast Limited Memory Optimizer fminlbfgs.m, written by Dirk-Jan Kroon and available at MATLAB Central.
Classifier methods: GMLVQ, LGMLVQ, GRLVQ.
Version | 1.1 |
---|---|
Bundle | tools |
Categories | Classification |
Authors | Ville Rantanen (ville.rantanen@helsinki.fi) |
Requires | MATLAB |
Source files | component.xml, gmlvq_confusion.m, gmlvq_evaluation.m, gmlvq_predict_start.m, gmlvq_train_start.m, gmlvq_preprocess.m, gmlvq_call.m |
Name | Type | Mandatory | Description |
---|---|---|---|
data | CSV | Optional | Sample data for the supervised learning. |
testData | CSV | Optional | Validation data used to estimate the accuracy of the new classifier. If not given, the input 'data' is used. |
classifyData | CSV | Optional | Data for which classes are predicted. NOTE: This data is not used in training or in validation. Weka requires a class column for this data set as well, so add a column named after the parameter 'classColumn'. A useful trick is to use the ID column as 'classColumn'; it is then also included in the 'predictedClasses' data set. |
inClassifier | MatlabBinary | Optional | A classifier object that is used instead of building a new classifier from training data. NOTE: If this is set, the parameter 'method' and the input 'data' are not used, but you should still provide them (as empty values). |
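
The class-column requirement for 'classifyData' can be met with a small preprocessing step. The following is a minimal Python sketch, not part of the component: the helper name `add_class_column`, the `?` placeholder, and the comma delimiter are assumptions made for illustration, and the column names are borrowed from the test cases.

```python
import csv
import io


def add_class_column(csv_text, class_column, placeholder="?"):
    """Return CSV text that is guaranteed to contain `class_column`.

    Hypothetical helper: if the column is missing, it is appended and
    filled with `placeholder` in every row. Assumes comma-delimited input.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames)
    if class_column in fieldnames:
        return csv_text  # column already present, nothing to add

    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=fieldnames + [class_column], lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        row[class_column] = placeholder
        writer.writerow(row)
    return out.getvalue()


example = "File,f1,f2\na.txt,0.1,0.2\nb.txt,0.3,0.4\n"
print(add_class_column(example, "Diagnosis"))
```

The alternative described in the table needs no preprocessing: set 'classColumn' to the name of an existing ID column (here that would be `File`).
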
Name | Type | Description |
---|---|---|
outClassifier | MatlabBinary | A new classifier that has been produced. |
confusion | Matrix | Confusion matrix with the class prediction frequencies as columns. |
importances | CSV | Importances of the features, as determined during training. Produced only in training mode. |
evaluation | CSV | Evaluation |
predictedClasses | CSV | If the input 'classifyData' is provided, classes are predicted for that data and written to this output. Otherwise this output is an empty file. |
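
As an illustration of the 'confusion' output layout, the sketch below counts predictions so that each column corresponds to a predicted class. It is written in Python and is not the component's gmlvq_confusion.m; placing the reference classes on the rows is an assumption, as is the function name `confusion_matrix`.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels):
    """Count (reference, predicted) pairs.

    Rows are reference classes (assumed), columns are predicted classes.
    Returns the sorted label list and the matrix as a list of rows.
    """
    labels = sorted(set(true_labels) | set(predicted_labels))
    counts = Counter(zip(true_labels, predicted_labels))
    matrix = [[counts[(t, p)] for p in labels] for t in labels]
    return labels, matrix


labels, matrix = confusion_matrix(
    ["a", "a", "b", "b", "b"],  # reference classes
    ["a", "b", "b", "b", "a"],  # predicted classes
)
print(labels)
for label, row in zip(labels, matrix):
    print(label, row)
```
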
Name | Type | Default | Description |
---|---|---|---|
classColumn | string | "" | Column name for the column that contains the reference class. |
columnsToRemove | string | "" | Comma-separated list of names of columns not to be used in classification. Useful if you want to ignore some attributes in the data while training the classifier. |
iterations | int | 1 | Number of training runs. The classifier with the minimum error on the validation data is returned. |
method | string | "GMLVQ" | Choose from GMLVQ, LGMLVQ, GRLVQ. |
parameters | string | "" | A space-separated list of parameters passed to the selected method. |
prototypes | int | 1 | Number of prototypes per class. |
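
The behaviour of 'iterations' amounts to a best-of-n selection. The sketch below shows that idea in Python under stated assumptions: `train_once` and `validation_error` are hypothetical callables standing in for a training run and for the error on the validation data, and the toy "model" is a single threshold rather than an LVQ classifier.

```python
def train_best_of(train_once, validation_error, iterations):
    """Run training `iterations` times and keep the model with the
    lowest validation error (ties keep the earliest model)."""
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    best_model, best_error = None, float("inf")
    for _ in range(iterations):
        model = train_once()
        error = validation_error(model)
        if error < best_error:
            best_model, best_error = model, error
    return best_model, best_error


# Toy stand-ins: each "training run" yields the next candidate threshold.
validation = [(0.2, 0), (0.4, 0), (0.6, 1), (0.8, 1)]
candidates = iter([0.9, 0.5, 0.1])


def toy_train():
    return next(candidates)


def toy_error(threshold):
    wrong = sum(1 for x, label in validation if (x > threshold) != bool(label))
    return wrong / len(validation)


best_model, best_error = train_best_of(toy_train, toy_error, iterations=3)
print(best_model, best_error)
```

With 'iterations' left at its default of 1, a single training run is returned as is.
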
Test case | Parameters | IN data | IN testData | IN classifyData | IN inClassifier | OUT outClassifier | OUT confusion | OUT importances | OUT evaluation | OUT predictedClasses |
---|---|---|---|---|---|---|---|---|---|---|
case1_train | classColumn = Diagnosis | data | (missing) | (missing) | (missing) | (missing) | confusion | importances | evaluation | predictedClasses |
case2_classifying | columnsToRemove = File,Diagnosis | (missing) | (missing) | classifyData | inClassifier | (missing) | confusion | (missing) | evaluation | predictedClasses |
case3_classify_with_known | classColumn = Diagnosis | (missing) | (missing) | classifyData | inClassifier | (missing) | confusion | (missing) | evaluation | predictedClasses |
case4_train_and_predict | classColumn = Diagnosis | data | (missing) | classifyData | (missing) | (missing) | confusion | importances | evaluation | predictedClasses |