Klimasauskas Group™
Breakthrough in Small Data Set Classifiers

June 18, 2003, Sewickley, PA - Klimasauskas Group announced today a breakthrough in small data set classifiers.

There are thousands of situations where data is expensive to collect, yet where the ability to model this data could have tremendous value. Examples range from pharmacological research, to classifying oceanic strata, to predicting behavior. In many of these cases, not only is the data expensive to collect, but there are potentially dozens or even hundreds of possible inputs. The challenge for both statistics and neural technologies has been developing meaningful models that predict effectively in new situations.

Klimasauskas Group has developed an approach that is statistically sound, and produces highly effective models. The basic insights that led to this development were:
This approach was developed under a project funded by the Defense Advanced Research Projects Agency (DARPA). In the DARPA research, the typical problem had 25-100 cases and 40-90 input variables.

A genetic algorithm is used to find synergistic subsets of variables that are highly predictive of the target outcomes. To address the issue of small data set sizes, solutions are ranked based on a full n-fold cross-validation of a linear, sigmoid, or softmax single-layer neural network. Special genetic operators were developed that act not only on individuals but also on all possible nearby (in a Hamming sense) solutions, along with trans-generational operators capable of performing hybrid crossover-like operations on ensembles of individuals across multiple generations.

In an approach similar to bootstrapping or bagging, multiple networks are collected into ensembles to produce a final prediction. However, unlike bootstrapping or bagging, solutions are selected based on minimally overlapping (maximally spanning) subsets of variables.

"From our research, this approach represents a real breakthrough in building effective classifiers using small data sets," according to Casey Klimasauskas, President. "We are looking forward to working with several companies in applying this to their data-modeling problems in the future."

Contact:
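As a technical aside, the variable-subset search described in this release can be illustrated with a small Python sketch. It is not Klimasauskas Group's implementation, and it rests on several assumptions: "full n-fold cross-validation" is read as leave-one-out, only the binary sigmoid case is shown, and a plain genetic algorithm (truncation selection, uniform crossover, bit-flip mutation) stands in for the special Hamming-neighborhood and trans-generational operators, which the release does not specify. Majority voting as the way to combine ensemble members is also an assumption. All function names (`train_sigmoid`, `loo_accuracy`, `evolve_subsets`, `pick_ensemble`, `ensemble_predict`) are hypothetical.

```python
import math
import random


def train_sigmoid(X, y, epochs=40, lr=0.5):
    """Fit one sigmoid unit (weights plus bias) by batch gradient descent."""
    w = [0.0] * (len(X[0]) + 1)  # last entry is the bias
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for row, target in zip(X, y):
            z = w[-1] + sum(wi * xi for wi, xi in zip(w, row))
            z = max(-30.0, min(30.0, z))  # keep exp() in a safe range
            err = 1.0 / (1.0 + math.exp(-z)) - target
            for j, xi in enumerate(row):
                grad[j] += err * xi
            grad[-1] += err
        w = [wi - lr * g / len(X) for wi, g in zip(w, grad)]
    return w


def predict(w, row):
    """Return class 1 when the unit's net input is positive, else class 0."""
    return 1 if w[-1] + sum(wi * xi for wi, xi in zip(w, row)) > 0 else 0


def loo_accuracy(X, y, subset):
    """Leave-one-out accuracy of a sigmoid unit restricted to `subset` columns."""
    if not subset:
        return 0.0
    cols = [[row[j] for j in subset] for row in X]
    hits = 0
    for i in range(len(cols)):
        w = train_sigmoid(cols[:i] + cols[i + 1:], y[:i] + y[i + 1:])
        hits += predict(w, cols[i]) == y[i]
    return hits / len(cols)


def evolve_subsets(X, y, n_vars, pop_size=10, generations=5, seed=0):
    """Plain GA over variable subsets; returns (subset, score) pairs, best first."""
    rng = random.Random(seed)
    cache = {}

    def fitness(subset):
        if subset not in cache:
            cache[subset] = loo_accuracy(X, y, sorted(subset))
        return cache[subset]

    population = [frozenset(j for j in range(n_vars) if rng.random() < 0.3)
                  for _ in range(pop_size)]
    for _ in range(generations):
        parents = sorted(population, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each variable is inherited from a or b.
            child = {j for j in range(n_vars)
                     if j in (a if rng.random() < 0.5 else b)}
            # Bit-flip mutation: toggle each variable with probability 1/n_vars.
            for j in range(n_vars):
                if rng.random() < 1.0 / n_vars:
                    child ^= {j}
            children.append(frozenset(child))
        population = parents + children
    return sorted(cache.items(), key=lambda kv: kv[1], reverse=True)


def pick_ensemble(ranked, size=3, max_overlap=0):
    """Greedily keep top subsets sharing at most `max_overlap` variables."""
    chosen = []
    for subset, score in ranked:
        if subset and all(len(subset & other) <= max_overlap
                          for other, _ in chosen):
            chosen.append((subset, score))
        if len(chosen) == size:
            break
    return chosen


def ensemble_predict(X, y, ensemble, row):
    """Majority vote of one sigmoid unit per chosen subset, trained on all cases."""
    votes = 0
    for subset, _ in ensemble:
        cols = sorted(subset)
        w = train_sigmoid([[r[j] for j in cols] for r in X], y)
        votes += predict(w, [row[j] for j in cols])
    return 1 if 2 * votes > len(ensemble) else 0


if __name__ == "__main__":
    rng = random.Random(1)
    y = [i % 2 for i in range(20)]
    # Synthetic data: variables 0 and 3 carry signal, the rest are noise.
    X = [[(2 * t - 1) + rng.gauss(0, 0.3), rng.gauss(0, 1), rng.gauss(0, 1),
          (2 * t - 1) + rng.gauss(0, 0.5), rng.gauss(0, 1), rng.gauss(0, 1)]
         for t in y]
    ensemble = pick_ensemble(evolve_subsets(X, y, n_vars=6))
    for subset, score in ensemble:
        print(sorted(subset), round(score, 2))
```

The greedy overlap filter in `pick_ensemble` is one simple way to favor "minimally overlapping" subsets; raising `max_overlap` trades diversity for a larger pool of high-scoring candidates.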
Updated: 02/25/2007.