.

Polynomial Cascades for Data Mining - Classification and Regression for large, high dimensional datasets

Polynomial Cascades were originally developed by Greg Grudic et.al. for high-dimensional regression problems. An extension for classification, Polynomial MPMC Cascades, were developed by Sander Bohte, Greg Grudic and myself. On this page I list some of the features of the regression and the classification method.

Features (Regression)

• Fast - can handle large, high-dimensional problems.
• Accurate models - minimizes mean-squared error
• Can fit non-linear functions without the use of kernels.
• No parameter tuning required - can be run by non-experts.
• Interpretable model - the learned hypothesis is one high-degree, multi-variate polynomial. The hypothesis is not an ensemble of classifiers.

Features (Classification)

• Fast - Runtime for using the features only is O(L x N x d) with L being the number of levels (data dependant, but usually <400), N being the number of training examples and d being the dimensionality of the problem. The method scales to very large problems with millions of examples (read: millions of rows in terms of SQL Databases).
• No parameter tuning required - You can just run the algorithm and it will give you a good model. This means no cross validation is needed to determine a suitable kernel and kernel parameters.
• Can classify non linearly separable classification problems without fine tuning of parameters
• Competitive Performance on benchmarks with the Minimax Probability Machine for Classification (MPMC) and Support Vector Machines (SVM)
• Incremental building of the model - You can stop the learning at any point in time and have a usable model. This can be useful if you have time constraints and a classifier must be learned in real time.
• Bounded misclassification error - The PMC generates a worst case bound on the misclassification error for future, unseen data.
• The underlying assumptions are minimal - No specific distributions are assumed.
• Interpretable model - the learned hypothesis is one high-degree, multi-variate polynomial. The hypothesis is not an ensemble of classifiers.
• Feature Selection - One feature per level is used, chosen by a theoretically well founded metric.
• Kernels - the algorithm can be easily extended by kernels.

Future Work - Possible extensions

• Handling multiple classes - currently the PMC only handles binary classification problems.
• Handling missing values by not accounting for them during the computation of mean and covariance.

Code

You can download Matlab code for the classification cascade from here: Download Polynomial MPMC Cascades sourcecode.

Papers

This page is Copyright © Markus Breitenbach 2010. All rights reserved. Any opinions expressed here are my own and might not reflect my employers opinion.
[This page: http://cervisia.org/cascades.php was last modified: July 08 2006 15:41:43.]   [Home].   Email me   Visit Markus Breitenbach's other homepage.

classification regression data mining machine learning scalable ensemble interpretable