
ExtractText

Module:	OCR

Ready-to-use tool for extracting text from an image and splitting it into single characters.

Name	Type	Description
inImage	Image	An input image with text
inRoi	Rectangle2D	Location of the text
inRoiAlignment	CoordinateSystem2D	Adjusts the region of interest to the position of the inspected object
inSegmentationModel	TextSegmentation	Model used for separating text from background
outCharacters	Region Array	Split characters aligned to ROI
diagTextRegion	Region	Region of text after extraction
diagAlignedCharacters	Region Array	Character regions preserving original image orientation
diagAlignedRoi	Rectangle2D	ROI rectangle after alignment

Description

This filter separates characters from the background in two steps: text extraction and text segmentation.

Text extraction – the filter applies basic thresholding methods to obtain a single region containing the characters.
Text segmentation – the region found in the previous step is split into separate regions, each containing one character.

The inRoi input should contain only a single line of text.
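
The two steps can be pictured with a minimal Python sketch. It is not the filter's actual implementation: the function names `threshold` and `split_characters` are hypothetical, and the sketch assumes dark text on a light background, a fixed threshold, and characters separated by at least one empty pixel column. The real filter chooses its algorithms through inSegmentationModel.

```python
def threshold(image, level):
    """Text extraction: mark pixels darker than `level` as text.

    Assumes dark text on a light background (an assumption of this sketch).
    """
    return [[pixel < level for pixel in row] for row in image]


def split_characters(mask):
    """Text segmentation: split the text mask at empty pixel columns.

    Returns a list of (first_column, last_column) pairs, one per character.
    """
    width = len(mask[0]) if mask else 0
    # A column belongs to a character if any pixel in it is text.
    occupied = [any(row[x] for row in mask) for x in range(width)]

    characters = []
    start = None
    for x, filled in enumerate(occupied):
        if filled and start is None:
            start = x                          # a character begins here
        elif not filled and start is not None:
            characters.append((start, x - 1))  # the character ended one column back
            start = None
    if start is not None:                      # character touching the right edge
        characters.append((start, width - 1))
    return characters


# A tiny 3x8 grayscale "image": two dark strokes separated by a light gap.
image = [
    [255,  10,  20, 255, 255,  15, 255, 255],
    [255,  12, 255, 255, 255,  18,  25, 255],
    [255,  11,  22, 255, 255,  14, 255, 255],
]

mask = threshold(image, level=128)
characters = split_characters(mask)
print(characters)  # [(1, 2), (5, 6)]
```

Splitting at empty columns only works for a single line of text, which is one way to see why inRoi should not cover more than one line.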

Hints

Connect the inImage input with an appropriate image source. Make sure that this image is available, that is, the program has been run at least once.
If the object location is variable, connect an appropriate local coordinate system to inRoiAlignment.
Before defining inSegmentationModel, first specify inRoi – the rectangle from which characters will be extracted.
After inRoi is defined, enter the graphical editor for the inSegmentationModel. Answer a few simple questions to choose one of the predefined sets of parameters.
Within the graphical editor select parameters for Text Extraction and for Text Segmentation.
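
The effect of connecting a local coordinate system to inRoiAlignment can be sketched as follows. This is only an illustration under stated assumptions: the `CoordinateSystem` and `Rectangle` classes and the `align_rectangle` function are hypothetical stand-ins, not the library's types, and the alignment is assumed to be a plain scale, rotate and translate transform. In the filter itself, the aligned rectangle is reported on diagAlignedRoi.

```python
import math
from dataclasses import dataclass


@dataclass
class CoordinateSystem:   # hypothetical stand-in, not the library type
    origin_x: float
    origin_y: float
    angle_deg: float
    scale: float = 1.0


@dataclass
class Rectangle:          # hypothetical stand-in, not the library type
    x: float              # corner position
    y: float
    angle_deg: float
    width: float
    height: float


def align_rectangle(roi, system):
    """Map a ROI defined in the object's local frame into image coordinates."""
    a = math.radians(system.angle_deg)
    # Scale and rotate the ROI corner, then translate it by the system origin.
    x = system.origin_x + system.scale * (roi.x * math.cos(a) - roi.y * math.sin(a))
    y = system.origin_y + system.scale * (roi.x * math.sin(a) + roi.y * math.cos(a))
    return Rectangle(
        x, y,
        roi.angle_deg + system.angle_deg,  # the ROI rotates with the object
        roi.width * system.scale,
        roi.height * system.scale,
    )


# ROI drawn relative to the object; the object was found rotated by 90 degrees.
roi = Rectangle(x=10, y=0, angle_deg=0, width=80, height=20)
found = CoordinateSystem(origin_x=100, origin_y=50, angle_deg=90)
aligned = align_rectangle(roi, found)
```

Defining the ROI once in the object's own frame and transforming it per image keeps the text rectangle on the text even when the object moves or rotates between images.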

Examples

The usage of this filter is described in the following examples and tutorials: OCR Read Number, Reading Numbers from Images, Typical OCR application.

Result of reading text using ExtractText.

Remarks

To read more about the traditional OCR technique, refer to the Machine Vision Guide: Optical Character Recognition - traditional method.

Errors

This filter can throw an exception to report an error. Read how to deal with errors in Error Handling.

List of possible exceptions:

Error type	Description
DomainError	Invalid segmentation algorithm in ExtractText.
DomainError	Invalid thresholding algorithm in ExtractText.

Complexity Level

This filter is available on Basic Complexity Level.