Defines the PDF to Text extractor interface.
Namespace: Bytescout.PDFExtractor
Assembly: Bytescout.PDFExtractor (in Bytescout.PDFExtractor.dll) Version: 12.0.0.4062-master
The ITextExtractor type exposes the following members.
Properties
Name | Description
---|---
FoundText | Contains the search result of the Find(Int32, String, Boolean) or FindNext methods.
PageSeparator | Gets or sets the page separator character or string. Default is '\f' (Form Feed).
RegexSearch | Gets or sets a value indicating whether to search the text using regular expressions.
WordMatchingMode | Gets or sets the word matching mode (used in text search and automatic removal of hyphens). This option is ignored when regular expressions are enabled (RegexSearch set to True); in that case, use the \b metacharacter to specify word boundaries.
WordMatchingPunctuationMarks | Punctuation marks used by word matching. These marks are treated as part of a word. Defaults are: ."'“”
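
The note on WordMatchingMode says that whole-word matching has to be expressed with the \b metacharacter once regular expressions are enabled. The following is a minimal sketch of that idea only. It does not call the SDK: it uses Python's `re` module purely for illustration (\b marks a word boundary there as it does in .NET regular expressions), and the helper name `find_whole_word` and the sample string are hypothetical.

```python
import re


def find_whole_word(text, word, ignore_case=True):
    """Return (start, end) spans where `word` occurs as a whole word.

    Hypothetical helper for illustration; not part of the SDK.
    """
    flags = re.IGNORECASE if ignore_case else 0
    # \b anchors the match at word boundaries, so "cat" does not match
    # inside "Concatenate". re.escape keeps the search term literal.
    pattern = r"\b" + re.escape(word) + r"\b"
    return [m.span() for m in re.finditer(pattern, text, flags)]


sample = "The cat sat. Concatenate the CAT files."
spans = find_whole_word(sample, "cat")
```

Without the two \b anchors the same pattern would also match the "cat" inside "Concatenate", which is the behaviour WordMatchingMode would otherwise control in non-regex searches.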
Methods
Name | Description
---|---
Find(Int32, String, Boolean) | Searches the document page for specified text.
Find(Int32, String, RegexOptions) | Searches the document page for specified text in Regex mode with specified options.
FindAll | Searches for all occurrences of specified text in specified document page or in entire document.
FindAllToJSON | Searches for all occurrences of specified text in specified document page or in entire document and returns the result as a JSON string.
FindNext | Continues the text search started by the Find(Int32, String, Boolean) method.
GetText | Extracts text from whole document.
GetText(Int32, Int32) | Extracts text from specified page range.
GetTextFromPage | Extracts text from specified document page.
SavePageTextToFile(Int32, String) | Saves page text to file.
SavePageTextToFile(Int32, String, Encoding) | Saves page text to file in specified encoding.
SavePageTextToStream(Int32, Stream) | Saves page text to stream.
SavePageTextToStream(Int32, Stream, Encoding) | Saves page text to stream in specified encoding.
SaveTextToFile(String) | Saves document text to file.
SaveTextToFile(String, Encoding) | Saves document text to file in specified encoding.
SaveTextToFile(Int32, Int32, String) | Saves text from specified page range to file.
SaveTextToFile(Int32, Int32, String, Encoding) | Saves text from specified page range to file in specified encoding.
SaveTextToStream(Stream) | Saves document text to stream.
SaveTextToStream(Stream, Encoding) | Saves document text to stream in specified encoding.
SaveTextToStream(Int32, Int32, Stream) | Saves text from specified page range to stream.
SaveTextToStream(Int32, Int32, Stream, Encoding) | Saves text from specified page range to stream in specified encoding.