TrainOrganicModel

Trains an OrganicModel that has been previously initialized with LoadOrganicTrainingData.

Name	Type	Range	Description
inIterationCount	Integer		Number of iterations of the internal learning process.
inLearningRate	Real		How aggressive the learning should be.
inMomentum	Real		Factor that helps to move out of local minima.
inModelCapacity	Integer		Internal size of the model.
inPreprocessing	DataPreprocessing		Method of cleaning the data before learning.
inVarianceToLeave	Real *	0.0 - 1.0	Fraction of variance to retain when PCA preprocessing is used.
ioOrganicModel	OrganicModel		Resulting model.

Description

This filter trains an OrganicModel previously initialized with LoadOrganicTrainingData. Training is an iterative process that establishes the internal parameters of a machine-learning-based classification method. The most important external parameter is inModelCapacity, which should be adjusted to obtain the best results. Recommended values are close to the number of features used, but the value should generally be tuned experimentally. If PCA preprocessing is used, the feature space can be drastically reduced, so inModelCapacity needs to be adjusted accordingly. For large models with a lot of samples, values of inIterationCount, inLearningRate and inMomentum greater than the defaults may be needed.
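The text does not state which learning algorithm the filter uses internally. Purely as an illustration of how an iteration count, a learning rate and a momentum factor typically interact, the sketch below assumes classical gradient descent with momentum on a toy one-dimensional function. The function name and the toy objective are hypothetical and are not part of the filter's API.

```python
def minimize_with_momentum(gradient, start, learning_rate, momentum, iteration_count):
    """Classical momentum gradient descent on a one-dimensional function.

    Assumption for illustration only: the filter's real training method
    is not documented here and may differ.
    """
    position = start
    velocity = 0.0
    for _ in range(iteration_count):
        # The velocity keeps a share of the previous step (momentum) and
        # adds a new step against the gradient, scaled by the learning rate.
        velocity = momentum * velocity - learning_rate * gradient(position)
        position += velocity
    return position


# Toy objective f(x) = x**2 with gradient 2*x; its minimum is at x = 0.
result = minimize_with_momentum(
    lambda x: 2.0 * x,
    start=5.0,
    learning_rate=0.1,
    momentum=0.9,
    iteration_count=300,
)
```

In this kind of scheme a larger learning rate takes bigger steps, and a larger momentum carries more of the previous step forward, which is the usual reason it is said to help in leaving shallow local minima.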

Remarks

Organic object classification is divided into several steps: preparing training data, loading it with LoadOrganicTrainingData, processing and training with the TrainOrganicModel filter, and finally classification using RecognizeOrganicObject. Details of those steps are described in this section. It is advised to train the model during the development phase, then save the trained OrganicModel and import it into the production system, but it is also feasible to re-train such a model within a deployed system. In terms of execution time, LoadOrganicTrainingData is the most time-consuming step, so it is advised to load the data once and then save the initialized model. TrainOrganicModel is quite fast (although this depends on the parameters used and the size of the input data), which makes it feasible to adjust the model directly at runtime.

The classification of organic objects is based on numerous transformations of the given sample images. The training images should show the most representative examples of a given class, so that the model is not fitted to bad examples; i.e. the set of images associated with a class should not contain partially visible objects, multiple objects or objects of a different class.

Because of the variety of organic objects being classified, there is no universal method for segmenting them from the image. An extraction method has to be devised for each new type of object. To accommodate this difficulty, the model is trained with user-provided images and regions. Each region should cover the whole object in the image. The best approach is to prepare a macro which extracts the object ROI from an image. This macro should be saved during the training phase and then used again when classification is performed.

Regions extracted from the training images should be saved in the same directory as the images. This can be done with the SaveObject filter. The resulting region file name has to be the same as the name of the image file from which the region has been extracted. The names should differ only in extension: for a region, it has to be .avdata.

Each directory, filled with images and regions, should represent one class of objects. This is crucial - mixing images between training directories will result in a faulty model.
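The naming rule described above can be checked before the training data is loaded. The following is a minimal sketch, assuming one directory per class and a single image extension. The helper names and the default ".png" extension are hypothetical; only the .avdata extension comes from the text.

```python
from pathlib import Path

REGION_EXT = ".avdata"  # region file extension required by the naming rule


def images_without_region(file_names, image_ext):
    """Return the image file names that have no same-named region file."""
    names = set(file_names)
    missing = []
    for name in sorted(names):
        path = Path(name)
        if path.suffix.lower() != image_ext.lower():
            continue  # not an image of the requested extension
        if path.with_suffix(REGION_EXT).name not in names:
            missing.append(name)
    return missing


def check_class_directory(class_dir, image_ext=".png"):
    """Hypothetical helper: list images in one class directory that lack a region."""
    file_names = [p.name for p in Path(class_dir).iterdir() if p.is_file()]
    return images_without_region(file_names, image_ext)
```

Running check_class_directory on every class directory and confirming that it returns an empty list is one way to catch missing or misnamed region files early.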

LoadOrganicTrainingData scans the provided directories and loads images with the given extension together with the corresponding region files. Then it applies some transformations to the images, building a coarse model of the object classes. This process is time-consuming, but it has no parameters. It is advised to build a good set of training images (a few hundred images per class should be sufficient) and execute LoadOrganicTrainingData only once. A saved OrganicModel takes up much less space than the whole set of images with accompanying regions. The initialized model can later be loaded and trained.

The training process is performed with the TrainOrganicModel filter. Internally it is an iterative process which transforms the coarse model of the provided object classes into a more specific, better model. Before training, the data can be preprocessed. It is recommended to use raw data only for easy classification tasks - it makes training a little faster, but raw data can be full of noise, which prevents the classifier from obtaining a good fit. A popular and fast preprocessing method is Normalization - using it often results in a good fit. The PCA method transforms the coarse model using Principal Component Analysis, effectively removing the parts of the model which do not give much insight into the data. This, however, reduces the internal model size, and has to be reflected by changing the TrainOrganicModel.inModelCapacity parameter.
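How much PCA shrinks the feature space depends on the fraction of variance that is kept. The sketch below shows the textbook interpretation of such a fraction: keep the smallest number of leading components whose combined variance reaches the target. It is an assumption that inVarianceToLeave behaves this way inside the filter, and the function name and the sample variances are hypothetical.

```python
def components_to_keep(component_variances, variance_to_leave):
    """Smallest number of leading components whose variance share reaches the target."""
    if not 0.0 <= variance_to_leave <= 1.0:
        raise ValueError("variance_to_leave must be between 0.0 and 1.0")
    ordered = sorted(component_variances, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("component variances must sum to a positive value")
    kept = 0.0
    for count, variance in enumerate(ordered, start=1):
        kept += variance
        if kept / total >= variance_to_leave:
            return count
    return len(ordered)


# Hypothetical variances of four principal components.
variances = [4.0, 2.0, 1.0, 1.0]
reduced_feature_count = components_to_keep(variances, 0.75)
```

Since the recommended inModelCapacity is close to the number of features used, the reduced feature count may serve as a starting point for that parameter, to be refined experimentally.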

After training, the model should be assessed. For the assessment, another set of images and regions should be prepared - it is called the "validation set". The validation set should comprise images that are not included in the previously used training set - this is crucial for avoiding so-called overfitting. Model assessment consists of performing classification on the validation set with the trained model and comparing the resulting assignments to the real classes of the provided objects. To classify an object with a trained model, the RecognizeOrganicObject filter can be used. To calculate performance metrics of the classification, MeasureClassificationQuality_Multiclass or MeasureClassificationQuality_Binary can be used. These filters calculate a few scores and the so-called confusion matrix, which clearly shows how well the classifier is doing its job. After assessment, the trained model can be used in the production system, or - if the chosen metric is not good enough - it can be trained again with different parameters.
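To make the idea of a confusion matrix concrete, the sketch below builds one from the real classes and the assigned classes of a validation set and derives the overall accuracy. It is a generic illustration, not the implementation of the MeasureClassificationQuality filters; the function names and class labels are hypothetical.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels):
    """Rows are real classes, columns are assigned classes, both in sorted order."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("both label lists must have the same length")
    classes = sorted(set(true_labels) | set(predicted_labels))
    counts = Counter(zip(true_labels, predicted_labels))
    matrix = [[counts[(real, assigned)] for assigned in classes] for real in classes]
    return classes, matrix


def accuracy(matrix):
    """Share of validation samples that lie on the diagonal of the matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Hypothetical validation results for a two-class problem.
real = ["apple", "apple", "pear", "pear"]
assigned = ["apple", "pear", "pear", "pear"]
classes, matrix = confusion_matrix(real, assigned)
score = accuracy(matrix)
```

Off-diagonal entries show which classes are confused with each other, which is often more informative than a single score when deciding how to change the training parameters.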

Errors

This filter can throw an exception to report an error. Read how to deal with errors in Error Handling.

List of possible exceptions:

Error type	Description
DomainError	Trying to train already trained model!
DomainError	Trying to train uninitialized model!

Complexity Level

This filter is available on Advanced Complexity Level.
