TrainOcr_MLP


Header: AVL.h
Namespace: avl
Module: OCR

Trains an OCR multilayer perceptron classifier.


void avl::TrainOcr_MLP
(
	const atl::Array<avl::CharacterSample>& inCharacterSamples,
	const avl::Size& inNormalizationSize,
	atl::Optional<const atl::Array<int>& > inHiddenLayerSizes,
	atl::Optional<int> inRandomSeed,
	const avl::CharacterFeatures& inCharacterFeatures,
	float inLearningRate,
	float inMomentum,
	int inIterationCount,
	atl::Optional<const avl::Size&> inCharacterSize,
	avl::OcrModel& outOcrModel,
	float& outTrainingAccuracy,
	avl::Profile& diagError,
	atl::Array<avl::Image>& diagNormalizedCharacters
)


| Kind | Name | Type | Range | Default | Description |
|---|---|---|---|---|---|
| Input value | inCharacterSamples | const Array<CharacterSample>& | | | Training font created from sample regions |
| Input value | inNormalizationSize | const Size& | | (Width: 16, Height: 16) | The character size after normalization |
| Input value | inHiddenLayerSizes | Optional<const Array<int>& > | | NIL | Internal structure of neuron layers used in classifier |
| Input value | inRandomSeed | Optional<int> | 0 - +∞ | NIL | Random seed used by MLP classifier |
| Input value | inCharacterFeatures | const CharacterFeatures& | | (Pixels: True) | Character features used to distinguish characters from each other |
| Input value | inLearningRate | float | 0.01 - 1.0 | 0.6f | Suppression level of changes during learning process |
| Input value | inMomentum | float | 0.0 - 1.0 | 0.75f | Value of classifier learning momentum |
| Input value | inIterationCount | int | 1 - +∞ | 100 | Learning iteration count |
| Input value | inCharacterSize | Optional<const Size&> | | NIL | Size of fixed width font |
| Output value | outOcrModel | OcrModel& | | | Trained OcrMlpModel used to recognize characters |
| Output value | outTrainingAccuracy | float& | | | The overall training score |
| Diagnostic input | diagError | Profile& | | | Changes of mean error level progress during learning process |
| Diagnostic input | diagNormalizedCharacters | Array<Image>& | | | Images of normalized characters used to train classifier |


This filter trains an MLP classifier for use in subsequent OCR operations.

The filter requires a set of prepared CharacterSample objects, which can be created using MakeCharacterSamples.

The inCharacterSize parameter defines the size of the character cropping box. It is especially useful when characters are much bigger than the normalization size. When it is set to NIL, each character is cropped to its bounding box.

The inHiddenLayerSizes and inRandomSeed parameters are used when training a newly created MLP classifier. For a further description of these parameters, please refer to the documentation of the MLP_Init filter.
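
The roles of inLearningRate and inMomentum can be illustrated with the classical momentum update used by many MLP trainers. The sketch below shows that general technique only; whether the library applies exactly this rule is an assumption, and MomentumStep and its arguments are hypothetical names that are not part of AVL.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of AVL: one gradient-descent step with
// classical momentum. All three vectors must have the same length.
void MomentumStep(std::vector<float>& weights,
                  std::vector<float>& velocity,
                  const std::vector<float>& gradient,
                  float learningRate,
                  float momentum)
{
    assert(weights.size() == velocity.size());
    assert(weights.size() == gradient.size());

    for (std::size_t i = 0; i < weights.size(); ++i)
    {
        // Keep a fraction of the previous change (momentum) and add the
        // new gradient step scaled by the learning rate.
        velocity[i] = momentum * velocity[i] - learningRate * gradient[i];
        weights[i] += velocity[i];
    }
}
```

With a learning rate of 0.5 and a momentum of 0.75, two steps on a constant gradient of 0.5 move a weight from 1.0 to 0.75 and then to 0.3125. The second step is larger because it carries over part of the first one.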

All input regions in the RecognizeCharacters and TrainOcr_MLP filters are resized to the size specified by the inNormalizationSize parameter. Classification is then performed on the normalized regions, so it is important to select an appropriate normalization size.
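
To make the effect of inNormalizationSize concrete, the following sketch resizes a binary character mask to a target size using nearest-neighbour sampling. The Mask type and the ResizeNearest function are hypothetical and independent of AVL, and the interpolation that the library actually applies is not specified here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a character region: a row-major binary mask,
// where 1 marks a foreground pixel.
struct Mask
{
    int width = 0;
    int height = 0;
    std::vector<unsigned char> data; // width * height values

    unsigned char at(int x, int y) const
    {
        return data[static_cast<std::size_t>(y) * static_cast<std::size_t>(width)
                    + static_cast<std::size_t>(x)];
    }
};

// Resizes a mask to normWidth x normHeight with nearest-neighbour sampling.
Mask ResizeNearest(const Mask& src, int normWidth, int normHeight)
{
    assert(src.width > 0 && src.height > 0);
    assert(normWidth > 0 && normHeight > 0);
    assert(src.data.size() == static_cast<std::size_t>(src.width)
                                  * static_cast<std::size_t>(src.height));

    Mask dst;
    dst.width = normWidth;
    dst.height = normHeight;
    dst.data.resize(static_cast<std::size_t>(normWidth)
                    * static_cast<std::size_t>(normHeight));

    for (int y = 0; y < normHeight; ++y)
    {
        // Map the destination row back to the nearest source row.
        const int srcY = y * src.height / normHeight;
        for (int x = 0; x < normWidth; ++x)
        {
            const int srcX = x * src.width / normWidth;
            dst.data[static_cast<std::size_t>(y) * static_cast<std::size_t>(normWidth)
                     + static_cast<std::size_t>(x)] = src.at(srcX, srcY);
        }
    }
    return dst;
}
```

In this sketch a 16x16 normalization size yields 256 values per character, which is presumably the length of the Pixels feature vector. Doubling both dimensions quadruples that length, which is one reason larger normalization sizes slow down training.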

Selecting too small a normalization size may result in a loss of character detail, whereas too large a normalization size increases the classifier learning time. The best recognition results are obtained when the character size is nearly the same as the normalization size.

Character classification depends on the character features selected in the inCharacterFeatures parameter. At least one feature must be selected. By default, the Pixels feature is selected.

The table below contains the description of each available character feature:

| Feature name | Description | Filter origin | Normalized |
|---|---|---|---|
| Pixels | Values of the image pixels after normalization. | | False |
| NormalizedPixels | Values of the image pixels after normalization normalized to range <0, 1.0>. | | True |
| Convexity | Ratio of the input region area to area of its convex hull. | RegionConvexity | True |
| Circularity | Ratio of the region area to area of its bounding circle. | RegionCircularity | True |
| NumberOfHoles | Number of holes found in the input region. | RegionHoles | True |
| AspectRatio | Ratio of input region width to its height. | RegionBoundingBox | False |
| Width | Region bounding box width. | RegionBoundingBox | False |
| Height | Region bounding box height. | RegionBoundingBox | False |
| AreaRatio | Ratio of the input region area to area of its bounding box. | | True |
| DiameterRatio | Ratio of the input region diameter to diameter of its bounding box. | RegionDiameter | True |
| Elongation | Ratio of longer axis of the approximating ellipse to the shorter one. | RegionElongation | False |
| Orientation | Further details in the filter RegionOrientation documentation. | RegionOrientation | True |
| Zoning4x4 | Normalized pixel values of region reduced to size 4x4 pixels. | | True |
| HorizontalProjection | Values of normalized image projection normalized by region height. | ImageProjection | True |
| VerticalProjection | Values of normalized image projection normalized by region height. | ImageProjection | True |
| HoughCircles | Count of circles found in the normalized image. | | True |
| Moment_11 | Character geometric moment type M11. | RegionMoment | False |
| Moment_20 | Character geometric moment type M20. | RegionMoment | False |
| Moment_02 | Character geometric moment type M02. | RegionMoment | False |
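
As an illustration of the bounding-box features in the table (Width, Height, AspectRatio and AreaRatio), the sketch below computes them for a small binary region, following the descriptions given above. Grid, BoxFeatures and ComputeBoxFeatures are hypothetical names introduced for this example; this is not the library's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a character region: rows of 0/1 values,
// where 1 marks a foreground pixel.
using Grid = std::vector<std::vector<int>>;

struct BoxFeatures
{
    int width = 0;            // bounding box width in pixels
    int height = 0;           // bounding box height in pixels
    float aspectRatio = 0.0f; // width / height
    float areaRatio = 0.0f;   // region area / bounding box area
};

// Computes bounding-box based features of a non-empty region.
BoxFeatures ComputeBoxFeatures(const Grid& grid)
{
    int minX = 0, maxX = 0, minY = 0, maxY = 0, area = 0;
    for (std::size_t y = 0; y < grid.size(); ++y)
    {
        for (std::size_t x = 0; x < grid[y].size(); ++x)
        {
            if (grid[y][x] == 0)
                continue; // background pixel
            const int ix = static_cast<int>(x);
            const int iy = static_cast<int>(y);
            if (area == 0)
            {
                // First foreground pixel initializes the bounding box.
                minX = maxX = ix;
                minY = maxY = iy;
            }
            else
            {
                minX = std::min(minX, ix);
                maxX = std::max(maxX, ix);
                minY = std::min(minY, iy);
                maxY = std::max(maxY, iy);
            }
            ++area;
        }
    }
    assert(area > 0); // the region must contain at least one pixel

    BoxFeatures f;
    f.width = maxX - minX + 1;
    f.height = maxY - minY + 1;
    f.aspectRatio = static_cast<float>(f.width) / static_cast<float>(f.height);
    f.areaRatio = static_cast<float>(area) / static_cast<float>(f.width * f.height);
    return f;
}
```

For an L-shaped region of four pixels that spans a 2x3 bounding box, this gives a Width of 2, a Height of 3, an AspectRatio of 2/3 and an AreaRatio of 4/6. AreaRatio never exceeds 1, which is consistent with its Normalized flag in the table, whereas Width and Height are raw pixel counts.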


It is recommended to keep the normalization size at or below 50 pixels in each dimension; larger sizes could make the learning time too long.
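
One possible way to apply this guidance is to derive the normalization size from the training characters themselves. The sketch below takes the median character width and height and clamps each to 50 pixels. The median heuristic, the Size2D type and the SuggestNormalizationSize function are assumptions made for this example and are not part of AVL.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for avl::Size.
struct Size2D
{
    int width;
    int height;
};

// Returns the median of a non-empty list (upper median for even counts).
static int Median(std::vector<int> values)
{
    assert(!values.empty());
    std::sort(values.begin(), values.end());
    return values[values.size() / 2];
}

// Suggests a normalization size close to the typical character size,
// limited to maxSide pixels in each dimension.
Size2D SuggestNormalizationSize(const std::vector<Size2D>& characterSizes,
                                int maxSide = 50)
{
    assert(!characterSizes.empty());
    assert(maxSide >= 1);

    std::vector<int> widths;
    std::vector<int> heights;
    for (const Size2D& s : characterSizes)
    {
        widths.push_back(s.width);
        heights.push_back(s.height);
    }

    Size2D result;
    result.width = std::min(std::max(Median(widths), 1), maxSide);
    result.height = std::min(std::max(Median(heights), 1), maxSide);
    return result;
}
```

Using the median rather than the mean keeps a few unusually large samples from inflating the suggested size. The result is only a starting point and should be checked against recognition accuracy and training time.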

For more remarks on using the MLP classifier, please refer to the documentation of the MLP_Init filter.

To read more about how to use the OCR technique, refer to the Machine Vision Guide: Optical Character Recognition.


List of possible exceptions:

| Error type | Description |
|---|---|
| DomainError | At least a single feature must be selected in inCharacterFeatures in TrainOcr_MLP. |
| DomainError | Hidden layer should have at least a single hidden layer in TrainOcr_MLP. |
| DomainError | Invalid character sample in TrainOcr_MLP. |
| DomainError | Invalid normalization size in InitOcr_MLP. |

See Also

  • TrainOcr_SVM – Trains an OCR support vector machines classifier.
  • RecognizeCharacters – Classifies input regions into characters. Based on the Multi-Layer Perceptron model.