ExtractText


Module: OCR

Ready-to-use tool for extracting text from an image and splitting it into single characters.

Kind              Name                   Type                Description
Input value       inImage                Image               An input image with text
Input value       inRoi                  Rectangle2D         Location of the text
Input value       inRoiAlignment         CoordinateSystem2D  Adjusts the region of interest to the position of the inspected object
Input value       inSegmentationModel    TextSegmentation    Model used for separating text from background
Output value      outCharacters          RegionArray         Split characters aligned to ROI
Diagnostic input  diagTextRegion         Region              Region of text after extraction
Diagnostic input  diagAlignedCharacters  RegionArray         Character regions preserving original image orientation
Diagnostic input  diagAlignedRoi         Rectangle2D         ROI rectangle after alignment

Description

This filter distinguishes characters from the background using a set of algorithms. It performs two steps, text extraction and text segmentation:

  • Text extraction – in this step the filter uses basic thresholding methods to obtain a region of characters from the image.
  • Text segmentation – the previously found region is split into separate regions, each containing a single character.

The inRoi input should contain only a single line of text.
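
The two steps described above can be illustrated with a minimal sketch. This is not the filter's actual implementation: it assumes dark text on a light background, a fixed global threshold, and characters separated by empty pixel columns. The function names `extract_text_mask` and `split_characters` are hypothetical and exist only for this illustration.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of grayscale pixel values in the range 0..255


def extract_text_mask(image: Image, threshold: int = 128) -> List[List[bool]]:
    """Text extraction (assumed): pixels darker than `threshold` count as text."""
    return [[pixel < threshold for pixel in row] for row in image]


def split_characters(mask: List[List[bool]]) -> List[Tuple[int, int]]:
    """Text segmentation (assumed): return (first_col, last_col) for every run
    of consecutive columns that contain at least one text pixel."""
    if not mask:
        return []
    width = len(mask[0])
    column_has_text = [any(row[x] for row in mask) for x in range(width)]

    spans = []
    start = None  # first column of the run currently being tracked
    for x, has_text in enumerate(column_has_text):
        if has_text and start is None:
            start = x
        elif not has_text and start is not None:
            spans.append((start, x - 1))
            start = None
    if start is not None:  # a run that reaches the right edge of the ROI
        spans.append((start, width - 1))
    return spans


# Dark text (0) on a light background (255): two "characters" separated by a gap.
image = [
    [255, 0, 0, 255, 255, 0, 255],
    [255, 0, 255, 255, 255, 0, 255],
    [255, 0, 0, 255, 255, 0, 255],
]
mask = extract_text_mask(image)
print(split_characters(mask))  # [(1, 2), (5, 5)]
```

Splitting on empty columns only works when the ROI holds a single line of text, which is one way to see why inRoi is restricted to one line.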

Hints

  • Connect the inImage input with an appropriate image source. Make sure that this image is available (that is, the program has been run at least once).
  • If the object location is variable, connect an appropriate local coordinate system to inRoiAlignment.
  • Before defining inSegmentationModel, first specify inRoi – the rectangle in which characters will be extracted.
  • After inRoi is defined, enter the graphical editor for the inSegmentationModel. Answer a few simple questions to choose one of the predefined sets of parameters.
  • Within the graphical editor select parameters for Text Extraction and for Text Segmentation.
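
The role of inRoiAlignment in the hints above can be sketched as a plain rigid transform of the ROI. The following is a hedged illustration only: the field layout of the two classes, the use of degrees, and the formula itself are assumptions made for this example, not the documented behavior of Aurora Vision Studio. The class names simply mirror the type names in the table, and `align_rectangle` is a hypothetical helper.

```python
import math
from dataclasses import dataclass


@dataclass
class CoordinateSystem2D:  # assumed fields, for illustration only
    origin_x: float
    origin_y: float
    angle_deg: float
    scale: float = 1.0


@dataclass
class Rectangle2D:  # assumed fields: one corner, rotation, and size
    x: float
    y: float
    angle_deg: float
    width: float
    height: float


def align_rectangle(roi: Rectangle2D, cs: CoordinateSystem2D) -> Rectangle2D:
    """Map a ROI defined relative to `cs` into image coordinates."""
    a = math.radians(cs.angle_deg)
    # Rotate the ROI corner about the local origin, scale it, then translate.
    x = cs.origin_x + cs.scale * (roi.x * math.cos(a) - roi.y * math.sin(a))
    y = cs.origin_y + cs.scale * (roi.x * math.sin(a) + roi.y * math.cos(a))
    return Rectangle2D(
        x=x,
        y=y,
        angle_deg=roi.angle_deg + cs.angle_deg,  # rotations add up
        width=roi.width * cs.scale,
        height=roi.height * cs.scale,
    )


# The inspected object was found at (100, 50), rotated by 90 degrees.
object_pose = CoordinateSystem2D(origin_x=100.0, origin_y=50.0, angle_deg=90.0)
roi = Rectangle2D(x=10.0, y=0.0, angle_deg=0.0, width=40.0, height=12.0)
aligned = align_rectangle(roi, object_pose)
print(round(aligned.x, 6), round(aligned.y, 6), aligned.angle_deg)
```

With such a transform the ROI follows the object from image to image, so the text stays inside the rectangle even when the object moves or rotates.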

Examples

The usage of this filter is described in the following examples and tutorials: OCR Read Number, Reading Numbers from Images, Typical OCR application.

Result of reading text using ExtractText.

Remarks

To read more about how to use the OCR technique, refer to the Machine Vision Guide: Optical Character Recognition.

Errors

This filter can throw an exception to report an error. Read how to deal with errors in Error Handling.

List of possible exceptions:

Error type Description
DomainError Invalid segmentation algorithm in ExtractText.
DomainError Invalid thresholding algorithm in ExtractText.

Complexity Level

This filter is available on Basic Complexity Level.

See Also

  • ReadText – Ready-to-use tool for reading text from images using the OCR technique.
  • RecognizeCharacters – Classifies input regions into characters. Based on the Multi-Layer Perceptron model.