ITextExtractor Properties
ByteScout PDF Extractor SDK

The ITextExtractor type exposes the following members.

Properties

  Name    Description
Public property FoundText
Public property FuzzySearch
Sets whether to use the "fuzzy" text search algorithm, which finds "approximately equal" strings. For example, the search string "fox" will also find "fix" and "fax". This can help compensate for common OCR errors, such as "paralle1" or "paralle|" in place of "parallel".
Public property FuzzySearchPermissibleErrors
Sets the degree of approximation for the fuzzy search algorithm, that is, the number of permissible errors in the search string. A value of 1 or 2 is usually acceptable, 3 is questionable, and 4 gives poor matches. The default is 1.
Public property PageSeparator
Sets the page separator character or string. Default is '\f' (Form Feed).
Public property RegexSearch
Sets whether to search the text using regular expressions.
Public property WordMatchingMode
Sets the word matching mode (used in text search and automatic removal of hyphens). This option is ignored when regular expressions are enabled (when RegexSearch is true). With regular expressions, use the '\b' metacharacter to specify word boundaries.
Public property WordMatchingPunctuationMarks
Sets the punctuation marks used by word matching. These marks are treated as part of a word. The defaults are: ."'“”
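
The behaviors described for FuzzySearch, FuzzySearchPermissibleErrors, PageSeparator, and regular-expression word boundaries can be illustrated outside the SDK. The following is a minimal standalone Python sketch of the concepts only. It does not call the ByteScout API, and it assumes that a "permissible error" corresponds to one character insertion, deletion, or substitution (Levenshtein distance), which the SDK documentation does not confirm. The function names edit_distance, fuzzy_find, split_pages, and find_whole_word are hypothetical helpers invented for this illustration.

```python
import re


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, or substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca -> cb
        prev = curr
    return prev[-1]


def fuzzy_find(text: str, needle: str, permissible_errors: int = 1) -> list:
    """Return the words of text that are within permissible_errors edits
    of needle (assumed meaning of a 'permissible error', see lead-in)."""
    # Split on whitespace and common sentence punctuation; '|' and digits
    # stay inside words so OCR slips like "paralle|" remain one token.
    words = re.findall(r"[^\s.,;:!?]+", text)
    return [w for w in words if edit_distance(w, needle) <= permissible_errors]


def split_pages(extracted_text: str, page_separator: str = "\f") -> list:
    """Split extracted text into pages on the separator (form feed by default)."""
    return extracted_text.split(page_separator)


def find_whole_word(text: str, word: str) -> list:
    """Return start offsets of whole-word matches, using the '\\b' metacharacter."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return [m.start() for m in re.finditer(pattern, text)]
```

In this sketch, fuzzy_find("The fix came by fax to the fox.", "fox") returns "fix", "fax", and "fox" with the default of one permissible error, and raising the limit to 2 also admits "to". That is one way to see why larger values produce progressively weaker matches.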