Classification problems in high-dimensional data with a small number of observations are becoming increasingly common, especially with microarray data. Over the last two decades, many efficient classification models and feature selection (FS) algorithms have been proposed to improve prediction accuracy. However, the feature subset chosen by an FS algorithm that is driven by prediction accuracy alone can be unstable under variations in the training set, especially in high-dimensional data. This paper proposes a new evaluation measure, the Q-statistic, that incorporates the stability of the selected feature subset in addition to the prediction accuracy. We then propose the Booster of an FS algorithm, which boosts the value of the Q-statistic of the algorithm to which it is applied. Empirical studies based on synthetic data and 14 microarray data sets show that Booster improves not only the value of the Q-statistic but also the prediction accuracy of the algorithm applied, unless the data set is intrinsically difficult to predict with the given algorithm.
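To make the notion of selection instability concrete, the following minimal sketch reruns a toy feature selector on resampled training sets and reports how much the selected subsets agree. Everything in it is an assumption made for illustration: the mean-gap filter, the stratified bootstrap, the mean pairwise Jaccard score, and all function names are hypothetical, and the score is not the Q-statistic defined in the paper.

```python
import random
from itertools import combinations


def top_k_by_mean_gap(X, y, k):
    """Toy filter (assumption): rank features by |mean(class 1) - mean(class 0)|."""
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(abs(sum(pos) / len(pos) - sum(neg) / len(neg)))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return frozenset(ranked[:k])


def jaccard(a, b):
    """Jaccard similarity of two feature subsets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def selection_stability(subsets):
    """Mean pairwise Jaccard similarity (illustrative, not the Q-statistic)."""
    pairs = list(combinations(subsets, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def resampled_selections(X, y, k, n_rounds, rng):
    """Select features on stratified bootstrap resamples of the training set."""
    idx0 = [i for i, label in enumerate(y) if label == 0]
    idx1 = [i for i, label in enumerate(y) if label == 1]
    subsets = []
    for _ in range(n_rounds):
        # Resample within each class so both classes are always present.
        sample = [rng.choice(idx0) for _ in idx0] + [rng.choice(idx1) for _ in idx1]
        subsets.append(top_k_by_mean_gap([X[i] for i in sample],
                                         [y[i] for i in sample], k))
    return subsets


# Synthetic "small n, large p" data: 20 samples, 200 features,
# of which only features 0 and 1 carry class signal.
rng = random.Random(0)
y = [0] * 10 + [1] * 10
X = [[rng.gauss(0.0, 1.0) + (2.0 * label if j < 2 else 0.0) for j in range(200)]
     for label in y]

subsets = resampled_selections(X, y, k=5, n_rounds=10, rng=rng)
stability = selection_stability(subsets)
print(f"mean pairwise Jaccard over {len(subsets)} resamples: {stability:.2f}")
```

A score near 1 means the selector returns almost the same features on every resample; a score near 0 means the subsets barely overlap. An evaluation measure in the spirit of the abstract would combine such a stability term with prediction accuracy.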
Hardware – Pentium
Speed – 1.1 GHz
RAM – 1 GB
Hard Disk – 20 GB
Keyboard – Standard Windows Keyboard
Booster in High Dimensional Data Classification