BaseExtractor Class
ByteScout PDF Extractor SDK
Defines a base class for PDF extractors.
Inheritance Hierarchy

System.Object
  Bytescout.PDFExtractor.BaseExtractor

Namespace:  Bytescout.PDFExtractor
Assembly:  Bytescout.PDFExtractor (in Bytescout.PDFExtractor.dll) Version: 12.0.0.4062-master
Syntax

public abstract class BaseExtractor : IBaseExtractor, 
	IDisposable, IExtractionArea, IProfiles

The BaseExtractor type exposes the following members.

Constructors

  Name / Description
Protected method: BaseExtractor
Default constructor.
Protected method: BaseExtractor(String, String)
Initializes a new instance of the extractor class.
Properties

  Name / Description
Public property: CheckPermissions
Defines whether to respect permissions set by the document owner. If True, the extractor throws an exception when extraction is prohibited. IMPORTANT: THIS OPTION HAS TO BE ENABLED AND SET TO "TRUE" TO RESPECT OWNERS OF PDF DOCUMENTS. IF YOU SET IT TO FALSE TO IGNORE PERMISSIONS THAT ARE SET IN THE PDF DOCUMENT, THEN YOU ARE SOLELY LIABLE FOR THIS ACTION AND ANY COPYRIGHT OR OTHER VIOLATIONS, AT YOUR OWN RISK. BYTESCOUT IS NOT LIABLE FOR ANY DAMAGES, LOSSES, COPYRIGHT INFRINGEMENTS OR ANY OTHER CONSEQUENCES CAUSED BY IGNORING PERMISSIONS OF A PDF DOCUMENT. BY CHANGING THIS OPTION YOU CONFIRM THAT YOU UNDERSTAND EVERYTHING WRITTEN ABOVE AND ARE DOING IT AT YOUR OWN RISK.
Public property: ComHelpers
Set of utility functions and properties to use from COM/ActiveX.
Public property: ContentType
Returns the content type of the PDF document: normal document, portfolio, or XFA form. To extract files from a PDF portfolio, use the AttachmentExtractor class. To extract XFA form content, use the XFAFormExtractor class.
Public property: EmbeddedFileCount (Obsolete)
This property is disabled to speed up document loading. Use AttachmentExtractor to work with attachments.
Public property: Encrypted
Gets whether the document is encrypted.
Public property: ExtractionArea
Sets the extraction area by coordinates and dimensions.
Public property: ExtractionAreaRect
Sets the extraction area by rectangle.
Public property: ExtractionAreaUsageMode
Controls how the extraction area (if one is defined) is used during a text search: whether to search within any objects intersecting the area, or only within objects completely inside it.
Public property: IsDocumentLoaded
Gets the document loaded state.
Public property: LicenseInfo
Gets license information.
Public property: PageDataCaching
Controls page data caching behavior.
Public property: Password
PDF document password.
Public property: Profiles
Comma-separated list of profiles to apply to the extractor. Profiles must be previously loaded.
Public property: RegistrationKey
Registration key.
Public property: RegistrationName
Registration name.
Public property: Version
Gets the component version number.
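
As a rough illustration of how the properties above might be set before a document is loaded, the following C# sketch configures an extractor instance. BaseExtractor is abstract, so the sketch assumes the SDK's TextExtractor class as the concrete type (it is not listed on this page). The "demo" registration values, the password, and the file name are placeholders, not values taken from this reference.

```csharp
using System;
using Bytescout.PDFExtractor;

class ConfigureExtractorSketch
{
    static void Main()
    {
        // Assumption: TextExtractor is a concrete subclass of BaseExtractor
        // with a public parameterless constructor.
        using (var extractor = new TextExtractor())
        {
            // Placeholder registration values; substitute your own.
            extractor.RegistrationName = "demo";
            extractor.RegistrationKey = "demo";

            // Respect the permissions set by the document owner.
            extractor.CheckPermissions = true;

            // Placeholder password, only needed for protected documents.
            extractor.Password = "document-password";

            // Hypothetical input file.
            extractor.LoadDocumentFromFile("sample.pdf");

            Console.WriteLine("Loaded: " + extractor.IsDocumentLoaded);
            Console.WriteLine("Encrypted: " + extractor.Encrypted);
            Console.WriteLine("SDK version: " + extractor.Version);
        }
    }
}
```

The using block calls Dispose when the scope ends, which releases the file and internal resources even if loading throws.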
Methods

  Name / Description
Public method: CreateProfile(String, Boolean)
Public method: CreateProfile(String, String, Boolean)
Public method: Dispose
Releases the unmanaged resources used by the instance and optionally releases the managed resources.
Public method: Equals (Inherited from Object.)
Protected method: Finalize (Inherited from Object.)
Protected method: FireParsingError
Protected method: FireProgressChanged
Public method: GetHashCode (Inherited from Object.)
Public method: GetPageCount
Returns the document page count.
Public method: GetPageRect_Height
Gets the height of the specified page.
Public method: GetPageRect_Left
Gets the left coordinate of the specified page.
Public method: GetPageRect_Top
Gets the top coordinate of the specified page.
Public method: GetPageRect_Width
Gets the width of the specified page.
Public method: GetPageRectangle(Int32)
Gets the page rectangle in PDF Points (1 Point = 1/72 in.).
Public method: GetPageRectangle(Int32, Boolean)
Gets the page rectangle in PDF Points (1 Point = 1/72 in.).
Public method: GetPageRotationAngle
Returns the rotation angle of the specified page.
Public method: GetType (Inherited from Object.)
Public method: IsEncrypted
Gets the document encrypted state.
Public method: LoadAndApplyProfiles
Loads profiles from a JSON string and automatically applies them. Note that profiles containing detection keywords will be deferred until the extraction.
Public method: LoadDocumentFromFile
Loads a PDF document from the specified file.
Public method: LoadDocumentFromStream
Loads a PDF document from the provided stream.
Public method: LoadProfiles
Loads profiles from a JSON file.
Public method: LoadProfilesFromString
Loads profiles from a JSON string.
Protected method: MemberwiseClone (Inherited from Object.)
Public method: Reset
Resets the instance, disposes internal resources and releases the file. Use this method before loading another PDF file.
Public method: ResetExtractionArea
Resets the extraction area to the full page.
Public method: SetExtractionArea(RectangleF)
Sets the extraction area by rectangle.
Public method: SetExtractionArea(Double, Double, Double, Double)
Sets the extraction area by coordinates and dimensions.
Public method: SetExtractionArea(Single, Single, Single, Single)
Sets the extraction area by coordinates and dimensions.
Public method: ToString (Inherited from Object.)
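
The methods above suggest a load, inspect, restrict, reset cycle. The C# sketch below is one possible way to combine them, not a definitive usage pattern. It assumes TextExtractor as the concrete subclass (not listed on this page), zero-based page indexes, a RectangleF-like return value from GetPageRectangle, and a (left, top, width, height) argument order for SetExtractionArea; the file names are hypothetical.

```csharp
using System;
using Bytescout.PDFExtractor;

class PageGeometrySketch
{
    // 1 PDF point = 1/72 inch, as stated in the GetPageRectangle description.
    static double PointsToInches(double points) => points / 72.0;

    static void Main()
    {
        // Assumption: TextExtractor is a concrete subclass of BaseExtractor.
        using (var extractor = new TextExtractor())
        {
            extractor.LoadDocumentFromFile("first.pdf"); // hypothetical file

            int pageCount = extractor.GetPageCount();

            // Assumption: page indexes are zero-based.
            for (int i = 0; i < pageCount; i++)
            {
                // Assumption: the result exposes Width and Height in points.
                var rect = extractor.GetPageRectangle(i);
                Console.WriteLine(
                    "Page " + i + ": " +
                    PointsToInches(rect.Width) + " x " +
                    PointsToInches(rect.Height) + " in");
            }

            if (pageCount > 0)
            {
                // Limit extraction to an area spanning half the page height.
                // Assumption: arguments are left, top, width, height.
                var page = extractor.GetPageRectangle(0);
                extractor.SetExtractionArea(
                    page.Left, page.Top, page.Width, page.Height / 2);

                // ... run the subclass-specific extraction here ...

                // Go back to the full page.
                extractor.ResetExtractionArea();
            }

            // Release the first file before loading another document.
            extractor.Reset();
            extractor.LoadDocumentFromFile("second.pdf"); // hypothetical file
        }
    }
}
```

Calling Reset between documents follows the note in the Reset description; reusing one instance avoids repeating the registration setup for every file.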
Events

  Name / Description
Public event: ParsingError
Raised on PDF document parsing errors. This usually indicates a damaged document.
Public event: PasswordRequired
Occurs when a password is required to decrypt the document.
Public event: ProgressChanged
Raised for each reported progress value. Allows the processing to be canceled.
Fields

  Name / Description
Protected field: ExtractionAreaInternal

Inheritance Hierarchy

System.Object
  Bytescout.PDFExtractor.BaseExtractor
    Bytescout.PDFExtractor.AnnotationExtractor
    Bytescout.PDFExtractor.AttachmentExtractor
    Bytescout.PDFExtractor.BaseTextExtractor
    Bytescout.PDFExtractor.ImageExtractor
    Bytescout.PDFExtractor.LineDetector
    Bytescout.PDFExtractor.MultimediaExtractor
    Bytescout.PDFExtractor.OCRAnalyzer
    Bytescout.PDFExtractor.PDFAValidator
    Bytescout.PDFExtractor.SearchablePDFMaker
    Bytescout.PDFExtractor.TableDetector2
    Bytescout.PDFExtractor.UnsearchablePDFMaker
    Bytescout.PDFExtractor.XFAFormExtractor