CreateRndForestModel

Declaration: CreateRndForestModel (FNameSpdc, FNameTrnData, FNameRFModel: string; NTrees: integer; ResampFact: double; var YHat, ModelQuality: TDouble2DArray; var VarImp: TDouble2DArray): integer;
The function CreateRndForestModel creates a random forest-based model and stores it in the file FNameRFModel. The data used for the training of the random forest must be loaded in Epina ImageLab before performing the training (use, for example, LoadILabFile to load the data). The spectral descriptors used for the training are loaded from the file FNameSpdc. The file FNameTrnData must contain the training dataset.

The parameter NTrees controls the number of trees in the random forest; the parameter ResampFact specifies the resampling factor of the trees (valid range: 0.0 to 0.666). The function returns the estimated data in YHat, the model quality estimates in ModelQuality, and the variable importance in the array VarImp.
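A call might look like the following minimal sketch. The three file names are hypothetical placeholders, and the values for NTrees and ResampFact are arbitrary choices within the documented ranges. This section does not state whether the output arrays have to be sized beforehand, so the sketch passes them as declared.

```pascal
program TrainRandomForest;

const
  // Hypothetical file names, chosen only for illustration.
  SPDC_FILE  = 'my_descriptors.dat';
  TRN_FILE   = 'my_training_data.dat';
  MODEL_FILE = 'my_rf_model.dat';

var
  YHat, ModelQuality, VarImp: TDouble2DArray;
  ErrCode: integer;

begin
  // The image data must already be loaded in ImageLab at this point,
  // for example via LoadILabFile. That call is omitted here because its
  // signature is not part of this section.
  ErrCode := CreateRndForestModel(SPDC_FILE, TRN_FILE, MODEL_FILE,
                                  100,   // NTrees
                                  0.5,   // ResampFact
                                  YHat, ModelQuality, VarImp);
  if ErrCode = 0 then
  begin
    // The model has been stored in MODEL_FILE; YHat, ModelQuality and
    // VarImp can be evaluated here.
  end
  else
  begin
    // ErrCode is one of the negative codes listed further below.
  end;
end.
```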

The ModelQuality array is a two-dimensional array with 10 rows. The number of columns is determined by the number of classes contained in the training dataset. The rows hold the following characteristic parameters for each class:
0 Training Set - Relative classification error (percent of incorrectly classified cases)
1 Training Set - Average cross-entropy (in bits per element)
2 Training Set - Root mean square error when estimating posterior probabilities
3 Training Set - Average error when estimating posterior probabilities
4 Training Set - Average relative error when estimating posterior probability of belonging to the correct class
5 OOB Test - Relative classification error (percent of incorrectly classified cases)
6 OOB Test - Average cross-entropy (in bits per element)
7 OOB Test - Root mean square error when estimating posterior probabilities
8 OOB Test - Average error when estimating posterior probabilities
9 OOB Test - Average relative error when estimating posterior probability of belonging to the correct class
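The row layout above is regular: rows 5 to 9 hold the same five measures as rows 0 to 4, evaluated on the OOB test instead of the training set. A minimal sketch in plain Pascal makes this explicit; the constant and function names are hypothetical and not part of ILabPascal.

```pascal
program ModelQualityRows;

const
  // Row indices of ModelQuality; the constant names are hypothetical.
  MQ_TRN_CLASS_ERR   = 0;  // relative classification error
  MQ_TRN_CROSS_ENT   = 1;  // average cross-entropy
  MQ_TRN_RMS_ERR     = 2;  // root mean square error
  MQ_TRN_AVG_ERR     = 3;  // average error
  MQ_TRN_AVG_REL_ERR = 4;  // average relative error
  MQ_OOB_OFFSET      = 5;  // OOB rows repeat the training rows, shifted by 5

// Returns the OOB row that holds the same measure as the given training row.
function OOBRowOf(TrnRow: integer): integer;
begin
  Result := TrnRow + MQ_OOB_OFFSET;
end;

var
  Row: integer;

begin
  Row := OOBRowOf(MQ_TRN_RMS_ERR);  // row 7: OOB root mean square error
end.
```

Named row indices keep later accesses to ModelQuality readable and make it harder to confuse training-set values with OOB values.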

The VarImp array contains the variable importance for each class. The rows of the matrix correspond to the classes, the columns to the spectral descriptors.

The function returns the following error codes:

 0 ... everything is OK
-1 ... invalid ResampFact (valid range 0.05 to 0.66)
-2 ... invalid number of trees (valid range: 1..500)
-3 ... cannot find spectral descriptors
-4 ... cannot find training data
-5 ... type mismatch of spectral descriptors; descriptors are loaded anyway (after confirmation by the user)
-6 ... no descriptors loaded because the user declined to load them
-7 ... no descriptors loaded because the file does not contain descriptors
-8 ... no descriptors loaded, file is empty
-9 ... cannot load training data
-10 ... training data and spectral data do not match (warning only)
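For logging or user feedback, the codes listed above can be translated into text. The following is a minimal sketch in plain Pascal; RFErrorText and RFIsWarningOnly are hypothetical helper names, not ILabPascal functions. Only code -10 is treated as a warning, because it is the only one marked as such in the list.

```pascal
program RFErrorCodes;

// Translates a return code of CreateRndForestModel into a short text.
function RFErrorText(Code: integer): string;
begin
  case Code of
      0: Result := 'everything is OK';
     -1: Result := 'invalid ResampFact';
     -2: Result := 'invalid number of trees';
     -3: Result := 'cannot find spectral descriptors';
     -4: Result := 'cannot find training data';
     -5: Result := 'type mismatch of spectral descriptors (loaded anyway)';
     -6: Result := 'no descriptors loaded (declined by user)';
     -7: Result := 'no descriptors loaded (file contains no descriptors)';
     -8: Result := 'no descriptors loaded (file is empty)';
     -9: Result := 'cannot load training data';
    -10: Result := 'training data and spectral data do not match';
  else
    Result := 'unknown error code';
  end;
end;

// True only for the code that the list marks as a warning.
function RFIsWarningOnly(Code: integer): boolean;
begin
  Result := (Code = -10);
end;

var
  Msg: string;

begin
  Msg := RFErrorText(-4);  // 'cannot find training data'
end.
```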

Hint 1: The spectral descriptor file can be set up either interactively, using the spectral descriptor editor, or programmatically, using ILabPascal (see Spectral & Image Processing - section Spectral Descriptors for a list of available functions).

Hint 2: The training dataset can be compiled either by using the dataset editor or programmatically using the class TIlabTrnDataSet.