RecognizeCharacters

Header:	AVL.h
Namespace:	avl

Classifies input regions into characters. Based on the Multi-Layer Perceptron model.

Syntax

void avl::RecognizeCharacters
(
	const atl::Array<avl::Region>& inCharacterRegions,
	const avl::OcrModel& inOcrModel,
	atl::Optional<const avl::Size&> inCharacterSize,
	const bool inDotPrint,
	const avl::CharacterSortingOrder::Type inCharacterSorting,
	atl::Optional<float> inMinScore,
	atl::Optional<int> inMinSpaceWidth,
	atl::String& outCharacters,
	atl::Array<float>& outScores,
	atl::Array<atl::Array<avl::OcrCandidate> >& outCandidates,
	atl::Array<avl::Image>& diagNormalizedCharacters,
	atl::Array<avl::Box>& diagCharactersBoxes
)

Parameters

Name	Type	Range	Default	Description
inCharacterRegions	const Array<Region>&			Array of character regions to recognize
inOcrModel	const OcrModel&			Trained OcrMlpModel used to recognize characters
inCharacterSize	Optional<const Size&>		NIL	Size of a single monospaced character, if needed
inDotPrint	const bool			Dot-printed characters preprocessing
inCharacterSorting	const CharacterSortingOrder::Type		LeftToRight	Sorting order of input characters
inMinScore	Optional<float>	0.0 - 1.0	NIL	Minimum accepted score. Characters with a lower score are replaced with '*'.
inMinSpaceWidth	Optional<int>	0 -	NIL	Minimum distance between characters at which a space character is inserted
outCharacters	String&			Result of characters recognition
outScores	Array<float>&			Classification result score
outCandidates	Array<Array<OcrCandidate> >&			Classification candidates and their scores for each character
diagNormalizedCharacters	Array<Image>&			Images of normalized characters used in character recognition
diagCharactersBoxes	Array<Box>&			Bounding boxes of characters

Description

This operation reads text from a set of regions. It requires a trained OCR classifier, which can be created using the TrainOcr_MLP or TrainOcr_SVM filter.

The filter requires the regions specified in inCharacterRegions. Each input region must contain a single character. To get regions from an image, use the ExtractText filter.

The inCharacterSorting parameter defines the sorting order of the input characters provided in inCharacterRegions.
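
For the default LeftToRight order, the effect can be pictured as ordering the characters by horizontal position. The following is a minimal sketch only, assuming that characters are ordered by the left edge of their bounding box. GlyphBox, leftEdgeLess and sortLeftToRight are hypothetical names that are not part of AVL, and the criterion the library actually uses may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a character bounding box (not an AVL type).
struct GlyphBox { int x; int y; int width; int height; };

// Assumed ordering criterion: compare the left edges of the boxes.
inline bool leftEdgeLess(const GlyphBox& a, const GlyphBox& b)
{
    return a.x < b.x;
}

// Returns a copy of the boxes ordered from left to right.
// stable_sort keeps the input order of boxes that share the same left edge.
std::vector<GlyphBox> sortLeftToRight(std::vector<GlyphBox> boxes)
{
    std::stable_sort(boxes.begin(), boxes.end(), leftEdgeLess);
    // Debug-only sanity check of the postcondition.
    assert(std::is_sorted(boxes.begin(), boxes.end(), leftEdgeLess));
    return boxes;
}
```

Boxes given at x = 30, 5 and 18 would come back in the order 5, 18, 30.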

The inDotPrint parameter turns on dedicated smoothing for characters printed in dot-matrix form, for example by a jet printer.

The inCharacterSize parameter defines the character size of a monospaced (fixed-width) font. This size determines the bounding box used to cut out each character.

Figure: extracted character cutting boxes for different cutting sizes.

If inCharacterSize is set to Auto (NIL), the characters are recognized as a proportional font. For further information about font types, refer to the documentation of the TrainOcr_SVM or TrainOcr_MLP filters.

Figure: character boxes with inCharacterSize set to Auto.

The inMinSpaceWidth value specifies the minimum distance, in pixels, between characters at which a space character is inserted into the result string. When the value is set to Auto (NIL), no spaces are inserted.
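
The space-insertion rule can be illustrated with a small standalone sketch (C++17). It is not the AVL implementation. It assumes that the distance is measured between the right edge of one character box and the left edge of the next, and that a gap equal to the threshold already produces a space. PlacedChar and joinWithSpaces are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical record: a recognized symbol with the horizontal extent of its box.
struct PlacedChar { int x; int width; char symbol; };

// Builds the result string from characters already sorted left to right.
// An empty minSpaceWidth plays the role of Auto: no spaces are inserted.
std::string joinWithSpaces(const std::vector<PlacedChar>& chars,
                           std::optional<int> minSpaceWidth)
{
    std::string text;
    for (std::size_t i = 0; i < chars.size(); ++i) {
        if (i > 0) {
            // The sketch relies on left-to-right input order.
            assert(chars[i].x >= chars[i - 1].x);
            // Assumed gap definition: left edge minus previous right edge, in pixels.
            const int gap = chars[i].x - (chars[i - 1].x + chars[i - 1].width);
            if (minSpaceWidth && gap >= *minSpaceWidth)
                text += ' ';
        }
        text += chars[i].symbol;
    }
    return text;
}
```

With boxes at x = 0, 12 and 40, each 10 pixels wide, the gaps are 2 and 18 pixels, so a threshold of 8 separates only the last character.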

If inMinScore is set, characters with a lower score are replaced with '*'.

The outCharacters output contains the recognized characters. The recognition score of each recognized character is stored in outScores.
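
The effect of the score threshold can be sketched outside the library as follows (C++17). This is an illustration under stated assumptions, not AVL code: it assumes one score per character and that a score equal to the threshold is accepted. maskLowScores is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper: replaces every character whose score is below
// minScore with '*'. An empty minScore leaves the text unchanged.
std::string maskLowScores(const std::string& characters,
                          const std::vector<float>& scores,
                          std::optional<float> minScore)
{
    // Assumption of this sketch: exactly one score per character.
    assert(characters.size() == scores.size());
    if (!minScore)
        return characters;

    std::string result = characters;
    for (std::size_t i = 0; i < result.size(); ++i) {
        if (scores[i] < *minScore)
            result[i] = '*';
    }
    return result;
}
```

For the text "AB1" with scores 0.9, 0.4 and 0.8 and a threshold of 0.5, only the middle character is masked.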

Remarks

To read more about the OCR technique, refer to the Machine Vision Guide: Optical Character Recognition.

Errors

List of possible exceptions: