XMLExtractor Methods
ByteScout PDF Extractor SDK

The XMLExtractor type exposes the following members.

Methods

Name / Description
Public method AddFilter(String, Boolean, Boolean)
Adds a filter to remove text from extracted data.
(Inherited from BaseTextExtractor.)
Public method AddFilter(String, Int32, Boolean)
Adds a filter to exclude text objects with the specified attributes.
(Inherited from BaseTextExtractor.)
Public method AddFilter(String, Int32, Color, Boolean)
Adds a filter to exclude text objects with the specified attributes.
(Inherited from BaseTextExtractor.)
Public method AddFilter(String, String, Boolean, Boolean)
Adds a filter to replace text in extracted data.
(Inherited from BaseTextExtractor.)
Public method AddFilter(String, Int32, Int32, Int32, Int32, Boolean)
Adds a filter to exclude text objects with the specified attributes.
(Inherited from BaseTextExtractor.)
Public method CreateProfile(String, Boolean) (Inherited from BaseExtractor.)
Public method CreateProfile(String, String, Boolean) (Inherited from BaseExtractor.)
Public method Dispose
Releases the unmanaged resources used by the instance and optionally releases the managed resources.
(Inherited from BaseExtractor.)
Public method DisposePage
Disposes the page object. Use this method carefully to destroy a page object that will not be used again. Useful for freeing allocated memory when processing huge PDF documents.
(Inherited from BaseTextExtractor.)
Public method Equals (Inherited from Object.)
Protected method Finalize (Inherited from Object.)
Protected method FireParsingError (Inherited from BaseExtractor.)
Protected method FireProgressChanged (Inherited from BaseExtractor.)
Public method GetHashCode (Inherited from Object.)
Public method GetPageCount
Returns document page count.
(Inherited from BaseExtractor.)
Public method GetPageRect_Height
Gets the specified page height.
(Inherited from BaseExtractor.)
Public method GetPageRect_Left
Gets the specified page left coordinate.
(Inherited from BaseExtractor.)
Public method GetPageRect_Top
Gets the specified page top coordinate.
(Inherited from BaseExtractor.)
Public method GetPageRect_Width
Gets the specified page width.
(Inherited from BaseExtractor.)
Public method GetPageRectangle(Int32)
Gets the page rectangle in PDF Points (1 Point = 1/72 in.).
(Inherited from BaseExtractor.)
Public method GetPageRectangle(Int32, Boolean)
Gets the page rectangle in PDF Points (1 Point = 1/72 in.).
(Inherited from BaseExtractor.)
Public method GetPageRotationAngle
Returns the rotation angle of specified page.
(Inherited from BaseExtractor.)
Public method GetPreprocessedPagePreview
Returns preview image of document page with preprocessing filters applied.
(Inherited from BaseTextExtractor.)
Public method GetType (Inherited from Object.)
Public method GetXML
Extracts XML data from whole document as string.
Public method GetXML(Int32, Int32)
Extracts XML data from specified page range as string.
Public method GetXMLDocument
Extracts XML data from whole document as XmlDocument.
Public method GetXMLDocument(Int32, Int32)
Extracts XML data from specified page range as XmlDocument.
Public method GetXMLDocumentFromPage
Extracts XML data from specified document page as XmlDocument.
Public method GetXMLFromPage
Extracts XML data from specified document page as string.
Public method IsEncrypted
Gets the document encrypted state.
(Inherited from BaseExtractor.)
Public method IsOCRRecommendedForPage
Detects whether OCR is recommended for specified page. OCR (Optical Character Recognition) is recommended when a page has no text objects but has an image that might contain text.
(Inherited from BaseTextExtractor.)
Public method LoadAndApplyProfiles
Loads profiles from JSON string and automatically applies them. Note that profiles containing detection keywords will be deferred until the extraction.
(Inherited from BaseExtractor.)
Public method LoadDocumentFromFile
Loads PDF document from specified file.
(Inherited from BaseExtractor.)
Public method LoadDocumentFromStream
Loads PDF document from provided stream.
(Inherited from BaseExtractor.)
Public method LoadProfiles
Loads profiles from JSON file.
(Inherited from BaseExtractor.)
Public method LoadProfilesFromString
Loads profiles from JSON string.
(Inherited from BaseExtractor.)
Protected method MemberwiseClone (Inherited from Object.)
Protected method PerformTextAnalysis (Inherited from BaseTextExtractor.)
Public method Reset (Overrides BaseTextExtractor.Reset.)
Protected method ResetBaseExtractionData (Inherited from BaseTextExtractor.)
Public method ResetExtractionArea
Resets the extraction area to the full page.
(Inherited from BaseExtractor.)
Public method ResetFilters
Resets text filters.
(Inherited from BaseTextExtractor.)
Public method SavePageXMLToFile
Saves page XML data to file.
Public method SavePageXMLToStream
Saves page XML data to stream.
Public method SavePreprocessedPagePreview
Saves preview image of document page with preprocessing filters applied. Image is saved in PNG format.
(Inherited from BaseTextExtractor.)
Public method SaveXMLToFile(String)
Saves XML data to file.
Public method SaveXMLToFile(Int32, Int32, String)
Saves XML data from specified page range to file.
Public method SaveXMLToStream(Stream)
Saves XML data to stream.
Public method SaveXMLToStream(Int32, Int32, Stream)
Saves XML data from specified page range to stream.
Public method SetCustomExtractionColumns
Helper method to set CustomExtractionColumns property when using the extractor through COM from VC++, VB, VBA, VBScript, or Delphi.
(Inherited from BaseTextExtractor.)
Public method SetExtractionArea(RectangleF)
Sets the extraction area by rectangle.
(Inherited from BaseExtractor.)
Public method SetExtractionArea(Double, Double, Double, Double)
Sets the extraction area by coordinates and dimensions.
(Inherited from BaseExtractor.)
Public method SetExtractionArea(Single, Single, Single, Single)
Sets the extraction area by coordinates and dimensions.
(Inherited from BaseExtractor.)
Public method ToString (Inherited from Object.)
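
The GetPageRectangle entries in the table state that page rectangles are reported in PDF Points (1 Point = 1/72 in.). As a minimal sketch of that unit conversion, here is a small Python helper that is independent of the SDK. The function names are hypothetical and are not SDK members; the sample values assume a US Letter page of 612 x 792 points.

```python
# Unit conversion for PDF points, based on 1 point = 1/72 inch.
# These helpers are illustrative only and are not part of the SDK.

POINTS_PER_INCH = 72.0
MM_PER_INCH = 25.4


def points_to_inches(points: float) -> float:
    """Convert a length in PDF points to inches."""
    return points / POINTS_PER_INCH


def points_to_mm(points: float) -> float:
    """Convert a length in PDF points to millimetres."""
    return points_to_inches(points) * MM_PER_INCH


# Example: a US Letter page measures 612 x 792 points.
letter_width_in = points_to_inches(612)   # 8.5
letter_height_in = points_to_inches(792)  # 11.0
```

The same arithmetic applies to values returned by the GetPageRect_* methods, assuming they use the same point units as GetPageRectangle.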