See also: Spectral Descriptor Editor, Random Decision Forest, Random Forest Model Validation

Variable Importance in Random Forests

The success of a classifier largely depends on selecting suitable spectral descriptors. While descriptors can be selected by many techniques, random forests provide built-in support for identifying the relevant variables. When a random forest is trained, the algorithm tracks how often each descriptor is used by the trees of the forest and how many of the training data points are affected by the corresponding decisions within a tree. This information is compiled into a single number that reflects the importance of the variable.

The variable importance is calculated for each class separately. In addition, an overall importance is calculated by taking, for each descriptor, the maximum over all classes. The results are displayed in both tabular and graphical form and can be used to prune the list of descriptors.

Please note that the variable importance is a relative measure that is scaled to a maximum of 1.0 for each class. It therefore does not indicate how well a class can actually be classified, and it has to be judged in combination with the classification results.

The following example shows the importance of the variables used to detect an apple. One can clearly see that only 10 of the 111 descriptors are actually necessary to detect apples successfully.
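
The scaling and pruning steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the software's actual implementation: the function names, the `splits` data layout, the rule of weighting each use of a descriptor by the number of training points reaching the decision, and the threshold of 0.2 are all hypothetical, and the numbers are made up.

```python
# Sketch only: names, data layout, and the weighting rule are assumptions,
# not the actual implementation of the software described here.

def accumulate_raw_scores(splits, n_descriptors, classes):
    """Sum, per class and descriptor, the training points reaching each decision.

    `splits` is a hypothetical list of (descriptor_index, {class: n_points})
    records, one per decision node across all trees of the forest.
    """
    raw = {c: [0.0] * n_descriptors for c in classes}
    for descriptor, points_per_class in splits:
        for c, n_points in points_per_class.items():
            raw[c][descriptor] += n_points
    return raw


def scale_per_class(raw):
    """Scale the scores of each class so that its largest value becomes 1.0."""
    scaled = {}
    for c, scores in raw.items():
        top = max(scores)
        scaled[c] = [s / top if top > 0 else 0.0 for s in scores]
    return scaled


def overall_importance(scaled):
    """Overall importance: for each descriptor, the maximum over all classes."""
    return [max(values) for values in zip(*scaled.values())]


def prune(overall, threshold):
    """Return the indices of descriptors whose overall importance reaches the threshold."""
    return [i for i, v in enumerate(overall) if v >= threshold]


# Made-up data: 4 descriptors, 2 classes, 3 decision nodes.
splits = [
    (0, {"apple": 40, "background": 10}),
    (2, {"apple": 10, "background": 50}),
    (0, {"apple": 40, "background": 15}),
]
raw = accumulate_raw_scores(splits, 4, ["apple", "background"])
scaled = scale_per_class(raw)        # per-class importance, maximum 1.0 each
overall = overall_importance(scaled)  # maximum over classes per descriptor
kept = prune(overall, threshold=0.2)  # candidate descriptors to keep
```

Because the values are relative within each class, any threshold is a judgment call; after pruning, the classifier should be retrained and the classification results checked again.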