ExtractText


Header: AVL.h
Namespace: avl
Module: OCR

Ready-to-use tool for extracting text from an image and splitting it into single characters.

Syntax

C++
 
void avl::ExtractText
(
	const avl::Image& inImage,
	const avl::Rectangle2D& inRoi,
	const avl::CoordinateSystem2D& inRoiAlignment,
	const avl::TextSegmentation& inSegmentationModel,
	atl::Array<avl::Region>& outCharacters,
	avl::Region& diagTextRegion,
	atl::Array<avl::Region>& diagAlignedCharacters,
	avl::Rectangle2D& diagAlignedRoi
)

Parameters

Kind              Name                   Type                       Default  Description
Input value       inImage                const Image&                        An input image with text
Input value       inRoi                  const Rectangle2D&                  Location of the text
Input value       inRoiAlignment         const CoordinateSystem2D&           Adjusts the region of interest to the position of the inspected object
Input value       inSegmentationModel    const TextSegmentation&             Model used for separating text from background
Output value      outCharacters          Array<Region>&                      Split characters aligned to ROI
Diagnostic input  diagTextRegion         Region&                             Region of text after extraction
Diagnostic input  diagAlignedCharacters  Array<Region>&                      Character regions preserving original image orientation
Diagnostic input  diagAlignedRoi         Rectangle2D&                        ROI rectangle after alignment

Description

This filter distinguishes characters from the background in two steps: text extraction and text segmentation.

  • Text extraction – the filter applies basic thresholding methods to obtain the region of characters from the image.
  • Text segmentation – the previously found region is split into separate regions, each containing a single character.

The inRoi input should contain only a single line of text.
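
The two steps can be illustrated with a minimal, library-independent sketch. It is not the algorithm used by ExtractText: the GrayImage struct, the fixed threshold and the split at empty pixel columns are all assumptions chosen for illustration, and the sketch assumes dark text on a bright background within a single line of text.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical minimal grayscale image (not an AVL type): row-major, 8-bit.
struct GrayImage {
    int width;
    int height;
    std::vector<std::uint8_t> pixels; // expected size: width * height
};

// Step 1 (text extraction): fixed-threshold binarization.
// Assumption: pixels darker than `threshold` belong to the text.
std::vector<bool> extractTextMask(const GrayImage& img, std::uint8_t threshold)
{
    assert(img.width >= 0 && img.height >= 0);
    assert(img.pixels.size() ==
           static_cast<std::size_t>(img.width) * static_cast<std::size_t>(img.height));

    std::vector<bool> mask(img.pixels.size());
    for (std::size_t i = 0; i < img.pixels.size(); ++i)
        mask[i] = img.pixels[i] < threshold;
    return mask;
}

// Step 2 (text segmentation): split the mask wherever a column has no text.
// Returns half-open column ranges [first, last), one per character.
std::vector<std::pair<int, int>> splitCharacters(const std::vector<bool>& mask,
                                                 int width, int height)
{
    std::vector<std::pair<int, int>> characters;
    int start = -1; // first column of the character being scanned, or -1

    for (int x = 0; x < width; ++x) {
        bool hasText = false;
        for (int y = 0; y < height && !hasText; ++y)
            hasText = mask[static_cast<std::size_t>(y) * width + x];

        if (hasText && start < 0) {
            start = x;                         // a character begins here
        } else if (!hasText && start >= 0) {
            characters.emplace_back(start, x); // a character ended before x
            start = -1;
        }
    }
    if (start >= 0)
        characters.emplace_back(start, width); // character touches the right edge
    return characters;
}
```

Splitting at empty columns only works when characters do not touch or overlap horizontally, which is one reason a configurable segmentation model is useful in practice.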

Hints

  • If the object location is variable, pass an appropriate local coordinate system to inRoiAlignment.
  • Before defining inSegmentationModel, first specify inRoi – the rectangle from which characters will be extracted.
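
To make the role of inRoiAlignment more concrete, the sketch below shows one plausible way a ROI defined relative to an object can be mapped into image coordinates. The Point, Frame and OrientedRect structs are hypothetical stand-ins, not AVL types, and the conventions used (angles in degrees, scale applied before rotation) are assumptions rather than documented library behavior.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-ins for illustration only (not AVL types).
struct Point { double x; double y; };
struct Frame { Point origin; double angleDeg; double scale; };      // local coordinate system
struct OrientedRect { Point origin; double angleDeg; double width; double height; };

// Maps a ROI expressed in the local frame into image coordinates:
// scale the ROI origin, rotate it by the frame angle, then translate
// by the frame origin. The ROI angle and size are adjusted accordingly.
OrientedRect alignRect(const OrientedRect& roi, const Frame& frame)
{
    assert(frame.scale > 0.0);

    const double kPi = 3.14159265358979323846;
    const double a = frame.angleDeg * kPi / 180.0;
    const double c = std::cos(a);
    const double s = std::sin(a);

    const double sx = roi.origin.x * frame.scale;
    const double sy = roi.origin.y * frame.scale;

    OrientedRect out;
    out.origin.x = frame.origin.x + sx * c - sy * s;
    out.origin.y = frame.origin.y + sx * s + sy * c;
    out.angleDeg = roi.angleDeg + frame.angleDeg;
    out.width    = roi.width * frame.scale;
    out.height   = roi.height * frame.scale;
    return out;
}
```

With this model, a ROI defined once relative to the object follows the object as it moves or rotates between images, provided the local coordinate system is re-estimated for each image.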

Examples

Result of extracting text using ExtractText.

Remarks

To read more about how to use the OCR technique, refer to the Machine Vision Guide: Optical Character Recognition.

Errors

List of possible exceptions:

Error type   Description
DomainError  Invalid segmentation algorithm in ExtractText.
DomainError  Invalid thresholding algorithm in ExtractText.

See Also

  • ReadText – Ready-to-use tool for reading text from images using the OCR technique.
  • RecognizeCharacters – Classifies input regions into characters. Based on the Multi-Layer Perceptron model.