Represents a text recognizer that can extract text from scanned PDF files
and from PNG, JPEG, BMP and TIFF (single-page) images using Optical Character Recognition (OCR).
Inheritance Hierarchy
ByteScout.TextRecognition.BaseRecognizer
ByteScout.TextRecognition.TextRecognizer
Namespace: ByteScout.TextRecognition
Assembly: ByteScout.TextRecognition (in ByteScout.TextRecognition.dll) Version: 2.6.1.323-master
The TextRecognizer type exposes the following members.
Constructors
Name | Description
---|---
TextRecognizer | Initializes a new instance of the TextRecognizer class.
TextRecognizer(String, String) | Initializes a new instance of the TextRecognizer class.
Properties
Name | Description
---|---
AutoDetectPageRotation | Gets or sets a value indicating whether the TextRecognizer will try to automatically detect the rotation of a scanned page. Default is false.
BlackList | A set of characters that must not be recognized in the scanned document. The resulting text will contain only characters that are not in this list. This helps improve uncertain recognition.
ComHelpers | Set of helper methods for use from COM/ActiveX.
Corrections | Collection of corrections automatically applied to recognized text to fix recurring recognition errors.
ImagePreprocessingFilters | Collection of image preprocessing filters.
IsDocumentLoaded | Gets whether a document is loaded. (Inherited from BaseRecognizer.)
KeepTextFormatting | Gets or sets whether to try to keep the text formatting.
LicenseInfo | Gets license information. (Inherited from BaseRecognizer.)
MaximizeCPUUtilization | Gets or sets whether to maximize OCR performance using Intel OpenMP (if available), for an acceleration of approximately 30%. Default is false. (Inherited from BaseRecognizer.)
OCRLanguage | Language for Optical Character Recognition (OCR). More languages can be downloaded at https://github.com/bytescout/ocrdata. (Inherited from BaseRecognizer.)
OCRLanguageDataFolder | Folder containing OCR language data files. (Inherited from BaseRecognizer.)
PageSeparator | Gets or sets the page separator character or string. Default is "\r\n".
PDFRenderingOptions | Gets or sets PDF rendering options. (Inherited from BaseRecognizer.)
PDFRenderingResolution | Gets or sets PDF rendering resolution. Default is 300 DPI. (Inherited from BaseRecognizer.)
RecognitionAreas | Collection of page areas intended for text recognition.
RegistrationKey | Gets or sets the key part of the registration information. (Inherited from BaseRecognizer.)
RegistrationName | Gets or sets the name part of the registration information. (Inherited from BaseRecognizer.)
TrimLeadingSpaces | Gets or sets whether to trim redundant leading spaces. Default is false. Works only if KeepTextFormatting is true.
UnwrapParagraphs | Gets or sets whether to unwrap paragraph text. Default is false. Works only if KeepTextFormatting is true.
Version | Gets the version of the component. (Inherited from BaseRecognizer.)
WhiteList | A set of characters allowed to be recognized in the scanned document. Only characters from this list will appear in the resulting text. This helps improve uncertain recognition.
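
As a rough illustration of how some of the properties above might be set before a document is loaded, here is a minimal C# sketch. It is not taken from the vendor's samples. The property types are assumptions inferred from the descriptions (strings for RegistrationName, RegistrationKey, OCRLanguage, OCRLanguageDataFolder and PageSeparator; Boolean for the flags), and the "demo" registration values, the "eng" language code and the folder path are placeholders rather than values confirmed by this page.

```csharp
using System;
using ByteScout.TextRecognition;

class ConfigureRecognizerSketch
{
    static void Main()
    {
        TextRecognizer recognizer = new TextRecognizer();
        try
        {
            // Placeholder registration values (assumption: both are strings).
            recognizer.RegistrationName = "demo";
            recognizer.RegistrationKey = "demo";

            // Assumption: the language is given as a string code and the
            // data folder as a path; both values here are hypothetical.
            recognizer.OCRLanguage = "eng";
            recognizer.OCRLanguageDataFolder = @"C:\ocrdata";

            // TrimLeadingSpaces works only if KeepTextFormatting is true,
            // so both flags are enabled together.
            recognizer.KeepTextFormatting = true;
            recognizer.TrimLeadingSpaces = true;

            // Replace the default "\r\n" page separator with a visible marker.
            recognizer.PageSeparator = "\r\n----\r\n";

            Console.WriteLine(recognizer.Version);
        }
        finally
        {
            // Dispose is listed among the methods inherited from BaseRecognizer.
            recognizer.Dispose();
        }
    }
}
```

The try/finally form is used here only because this page lists a Dispose method without stating which interfaces the class implements.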
Methods
Name | Description
---|---
CheckOCRComponents | (Inherited from BaseRecognizer.)
Clear | Releases the loaded document and allocated resources. (Inherited from BaseRecognizer.)
Dispose | Releases managed resources of the component. (Inherited from BaseRecognizer.)
Equals | (Inherited from Object.)
Finalize | (Inherited from Object.)
GetHashCode | (Inherited from Object.)
GetOCRObjects | Performs the recognition and returns the list of recognized text objects at the specified level of discretization.
GetOCRObjectsAsJSON | Performs the recognition and returns the list of recognized text objects at the specified level of discretization as a JSON string.
GetOCRObjectsAsXML | Performs the recognition and returns the list of recognized text objects at the specified level of discretization as an XML string.
GetPageCount | Returns the number of pages in the loaded document. (Inherited from BaseRecognizer.)
GetPageHeight | Returns the document page height in pixels. (Inherited from BaseRecognizer.)
GetPageSize | Returns the document page dimensions in pixels. (Inherited from BaseRecognizer.)
GetPageWidth | Returns the document page width in pixels. (Inherited from BaseRecognizer.)
GetPreprocessedPageBitmap | Returns a preview image of the document page with preprocessing filters applied.
GetText | Reads text from the specified document page range.
GetType | (Inherited from Object.)
LoadDocument(Byte[]) | Loads a document from a byte array. (Inherited from BaseRecognizer.)
LoadDocument(Image) | Loads a document from an Image object. (Inherited from BaseRecognizer.)
LoadDocument(Int64) | Loads a document from a Win32 HBITMAP structure. (Inherited from BaseRecognizer.)
LoadDocument(Stream) | Loads a document from a stream. (Inherited from BaseRecognizer.)
LoadDocument(String) | Loads a document from a file. (Inherited from BaseRecognizer.)
LoadDocument(ScreenshotMaker) | Loads a screenshot of the main display. Use SetScreenshotArea(Int32, Int32, Int32, Int32) to select the portion of the screen to capture. (Inherited from BaseRecognizer.)
MemberwiseClone | (Inherited from Object.)
OnPasswordRequired | (Inherited from BaseRecognizer.)
SaveOCRObjectsAsJSON | Performs the recognition and saves the list of recognized text objects at the specified level of discretization to a JSON file.
SaveOCRObjectsAsXML | Performs the recognition and saves the list of recognized text objects at the specified level of discretization to an XML file.
SavePreprocessedPageBitmap | Saves a bitmap of the document page with preprocessing filters applied. The image is saved in PNG format.
SaveText(Stream, Int32, Int32, Encoding) | Saves text from the specified page range to a Stream.
SaveText(String, Int32, Int32, Encoding) | Saves text from the specified page range to a file.
ToString | (Inherited from Object.)
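
The methods above suggest a typical flow of loading a document, querying its page count, and saving the recognized text. The following C# sketch shows one way this might look, using only the LoadDocument(String), GetPageCount, SaveText(String, Int32, Int32, Encoding) and Dispose members listed on this page. The file names are hypothetical. It is also an assumption that GetPageCount returns an Int32 and that the two Int32 arguments of SaveText are a zero-based, inclusive start and end page index; this page does not state the parameter meanings, so check them against the member documentation before relying on them.

```csharp
using System.Text;
using ByteScout.TextRecognition;

class ExtractTextSketch
{
    static void Main()
    {
        TextRecognizer recognizer = new TextRecognizer();
        try
        {
            // LoadDocument(String): load a scanned PDF or image from a file.
            // "scanned.pdf" is a hypothetical input path.
            recognizer.LoadDocument("scanned.pdf");

            // Assumption: the page count is returned as an Int32.
            int pageCount = recognizer.GetPageCount();

            // Assumption: the Int32 arguments are zero-based, inclusive
            // start and end page indexes covering the whole document.
            recognizer.SaveText("result.txt", 0, pageCount - 1, Encoding.UTF8);
        }
        finally
        {
            // Release resources held by the component.
            recognizer.Dispose();
        }
    }
}
```

If the page range turns out to be one-based or end-exclusive, only the two index arguments of the SaveText call need to change.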
Events
Name | Description
---|---
PasswordRequired | Occurs when a password is required to open a PDF document. (Inherited from BaseRecognizer.)
See Also