Bytescout.PDFExtractor Namespace
ByteScout PDF Extractor SDK
 
Classes

  Class    Description
Public classAnnotationExtractor
Extracts annotations from PDF file.
Public classAnnotationInfo
Defines various annotation information.
Public classAttachmentExtractor
Extracts attachments from PDF file.
Public classAttachmentInfo
Defines various attachment information.
Public classBaseExtractor
Defines a base class for PDF extractors.
Public classBaseTextExtractor
Defines a base class for text-related PDF extractors.
Public classBookmarkRemover
Represents a tool that removes bookmarks from a PDF document.
Public classComHelpers
Contains helper methods for using the SDK as an ActiveX object from VBScript, VBA, VB6, Delphi, and Visual C++.
Public classCSVExtractor
Represents PDF to CSV extractor. Also able to extract data from PNG, JPEG, BMP and TIFF (single-page) images using Optical Character Recognition (OCR).
Public classDetectedSensitiveItem
Represents results of sensitive data detection. See SensitiveDataDetector.
Public classDetectedTable
Represents a table detected by TableDetector2.
Public classDetectedTableList
Represents list of tables detected by TableDetector2.
Public classDocumentMerger
Represents PDF document merger.
Public classDocumentOptimizer
Represents PDF document optimizer.
Public classDocumentRotator
Represents PDF document rotator.
Public classDocumentRotatorPageAndAngle
Public classDocumentSplitter
Represents PDF and TIFF document splitter.
Public classDocumentSplitter2
Represents PDF document splitter that splits a document by pages containing specific text.
Public classFoundLine
Represents a line object found by LineDetector.
Public classFoundLinesCollection
Represents collection of lines found by LineDetector.
Public classImageExtractor
Extracts images from PDF document.
Public classImagePreprocessingFiltersCollection
Collection of image preprocessing filters.
Public classInfoExtractor
Provides information about PDF document.
Public classJSONExtractor
Represents PDF to JSON extractor.
Public classLicenseInfo
License information.
Public classLineDetector
Represents line detector.
Public classLogger
Public classMultimediaExtractor
Extracts videos from PDF document.
Public classOCRAnalysisResults
Represents results of the analysis performed by OCRAnalyzer.
Public classOCRAnalyzer
Represents OCR Analyzer. It is designed for analysis of scanned documents in PDF or raster image formats to find best parameters for Optical Character Recognition (OCR) that provide highest recognition quality.
Public classOCRCell
Represents OCR cell (word) data.
Public classOCRCorrection
Represents a correction automatically applied to recognized text to fix repeating recognition errors.
Public classOCRCorrectionList
Represents collection of corrections automatically applied to recognized text to fix repeating recognition errors.
Public classOptimizationOptions
Represents options for PDF document optimization. See DocumentOptimizer.
Public classPDFAValidator
Represents PDF/A validator.
Public classPDFExtractorCancellationException
Cancellation exception.
Public classPDFExtractorDamagedDocumentException
Damaged document exception.
Public classPDFExtractorException
Represents errors that occur during PDF extraction process.
Public classPDFExtractorInvalidPasswordException
Invalid password exception.
Public classPDFExtractorPermissionsException
Permissions exception.
Public classPDFExtractorProfileException
Public classRemover
Utility class to remove objects from PDF document.
Public classRemover2
Utility class to remove text and image objects from PDF document. Improved version of Remover class.
Public classSearchablePDFMaker
Represents Searchable PDF Maker tool.
Public classSearchResult
Defines result of the text search.
Public classSearchResultElement
Defines the search result element.
Public classSensitiveDataDetectionResults
Represents results of sensitive data detection. See SensitiveDataDetector.
Public classSensitiveDataDetector
Class that detects sensitive data in PDF documents.
Public classSensitiveDataPolicy
Public classStamper
Stamps PDF document with an image.
Public classStructuredExtractor
Represents the table structure extractor.
Public classTableDetector
Represents PDF tables detector.
Public classTableDetector2
Represents experimental detector of tables in PDF documents.
Public classTextAnalysisResults
Text analysis results.
Public classTextComparer
Represents PDF text comparer.
Public classTextComparerDiffPiece
Public classTextExtractor
Represents PDF to Text extractor. Also able to extract text from PNG, JPEG, BMP and TIFF (single-page) images using Optical Character Recognition (OCR).
Public classUnsearchablePDFMaker
Represents Unsearchable PDF Maker tool.
Public classXFAFormExtractor
Extracts XFA Form attachments from PDF file.
Public classXFDFExtractor
Represents forms data extractor in XFDF (XML Forms Data Format) format.
Public classXLSExtractor
Represents PDF to XLS extractor. Also able to extract data from PNG, JPEG, BMP and TIFF (single-page) images using Optical Character Recognition (OCR).
Public classXMLExtractor
Represents PDF to XML extractor. Also able to extract data from PNG, JPEG, BMP and TIFF (single-page) images using Optical Character Recognition (OCR).
Interfaces

  Interface    Description
Public interfaceIAnnotationExtractor
Defines annotation extractor.
Public interfaceIAttachmentExtractor
Defines the PDF attachment extractor interface.
Public interfaceIAttachmentInfo
Defines various attachment information.
Public interfaceIBaseExtractor
Defines a base interface for PDF extractors.
Public interfaceIBaseOCRExtractor
Defines a base interface for PDF extractors with OCR support.
Public interfaceIBaseTextExtractor
Defines a base interface for PDF text extractors.
Public interfaceIBookmarkRemover
Represents a tool that removes bookmarks from a PDF document.
Public interfaceICSVExtractor
Defines the PDF to CSV extractor interface.
Public interfaceIDocumentMerger
Represents PDF document merger.
Public interfaceIDocumentOptimizer
Represents PDF document optimizer.
Public interfaceIDocumentRotator
Represents PDF document rotator.
Public interfaceIDocumentSplitter
Represents PDF document splitter.
Public interfaceIDocumentSplitter2
Represents PDF document splitter that splits a document by pages containing specific text.
Public interfaceIExtractionArea
Defines extraction area support for extractors.
Public interfaceIFoundLine
Represents a line object found by LineDetector.
Public interfaceIFoundLinesCollection
Represents collection of lines found by LineDetector.
Public interfaceIImageExtractor
Defines the image extractor interface.
Public interfaceIImagePreprocessingFiltersCollection
Public interfaceIInfoExtractor
Defines the PDF info extractor interface.
Public interfaceIJSONExtractor
Defines the PDF to JSON extractor interface.
Public interfaceILineDetector
Represents line detector.
Public interfaceIMultimediaExtractor
Defines the video extractor interface.
Public interfaceIOCRAnalyzer
Public interfaceIOptimizationOptions
Represents PDF document optimizer.
Public interfaceIProfiles
Defines profiles support.
Public interfaceIRemover
Defines the interface of the utility that removes objects from PDF document.
Public interfaceIRemover2
Defines the interface of the utility that removes text and image objects from PDF document.
Public interfaceISearchablePDFMaker
Public interfaceISearchResult
Defines search result interface.
Public interfaceISearchResultElement
Defines the search result element interface.
Public interfaceISensitiveDataDetector
Defines the interface of the sensitive data detector for PDF documents.
Public interfaceIStamper
Interface of Stamper utility class. Allows you to add a stamp or sign picture to PDF document pages.
Public interfaceIStructuredExtractor
Defines the table structure extractor interface.
Public interfaceITableDetector
Represents PDF tables detector.
Public interfaceITextExtractor
Defines the PDF to Text extractor interface.
Public interfaceIUnsearchablePDFMaker
Public interfaceIXFAFormExtractor
Defines the XFA Form attachments extractor interface.
Public interfaceIXFDFExtractor
Defines the interface of the forms data extractor in XFDF (XML Forms Data Format) format.
Public interfaceIXLSExtractor
Defines XLS extractor interface.
Public interfaceIXMLExtractor
Defines the PDF to XML extractor interface.
Delegates

  Delegate    Description
Public delegateBaseExtractorParsingErrorEventHandler
Defines ParsingError event parameters.
Public delegateBaseExtractorProgressEventHandler
Defines Progress event parameters.
Public delegateDocumentOptimizerProgressEventHandler
Defines progress event parameters.
Public delegateDocumentSplitterProgressEventHandler
Defines progress event parameters.
Public delegateOCRAnalyzerProgressEventHandler
Defines Progress event parameters.
Public delegatePasswordEventHandler
Represents parameters for PasswordRequired event.
Enumerations

  Enumeration    Description
Public enumerationAudioType
Defines embedded audio resource types.
Public enumerationColumnDetectionByTextAlignment
Defines text alignments for detection of table column. See ColumnDetectionByTextAlignment property.
Public enumerationColumnDetectionMode
Defines how columns are detected on the document page.
Public enumerationEmbeddedImageFormat
Image format to convert PDF pages to.
Public enumerationExtractionAreaUsageMode
Defines how extraction area (if any) is treated when doing text extraction or text search.
Public enumerationGraphics3DType
Defines embedded 3D graphics resource types.
Public enumerationImageHandling
Defines the image handling way during the XML extraction.
Public enumerationImageOptimizationFormat
Defines image compression types used for image optimization in PDF.
Public enumerationInfoExtractorPDFEncryptionAlgorithm
PDF encryption algorithm.
Public enumerationLineGroupingMode
Defines whether lines are left unmerged, merged by rows, or merged inside columns.
Public enumerationLineOrientation
Represents line types.
Public enumerationLineOrientationsToFind
Defines which line orientations LineDetector should find.
Public enumerationOCRCacheMode
OCR results caching behavior. Turned off by default (no cache is used). The "WholePage" caching mode can save processing time: the SDK checks whether it needs to re-run OCR on a page or can reuse previously cached OCR results.
Public enumerationOCRMode
OCR (Optical Character Recognition) usage mode.
Public enumerationOngoingOperation
The ongoing operation for ProgressChanged event.
Public enumerationOutputImageFormat
Defines format for output images.
Public enumerationOutputStructure
Public enumerationPageDataCaching
Page data caching behaviour.
Public enumerationPDFContentType
Defines PDF content types.
Public enumerationRotationAngle
Represents angle for document rotation.
Public enumerationSensitiveDataReportFormat
Defines formats of sensitive data detection report.
Public enumerationSpreadseetOutputFormat
Defines spreadsheet output formats.
Public enumerationTextAnalysisStatus
Defines statuses of the text analysis. See TextAnalysisResults.
Public enumerationTextComparerChangeType
Public enumerationVideoType
Defines embedded video resource types.
Public enumerationWordMatchingMode
Word matching mode (for search).
Public enumerationXFAFormContentType
Specifies XFA Form content part types.
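
Note on OCRCacheMode: the "WholePage" caching idea described for that enumeration can be pictured as a lookup keyed by page content, so that OCR runs only for pages not seen before. The following Python sketch only illustrates that idea; it is not the SDK's implementation or API, and the names WholePageOcrCache, recognize and fake_ocr are hypothetical.

```python
import hashlib
from typing import Callable, Dict


class WholePageOcrCache:
    """Hypothetical whole-page OCR cache (illustration only, not part of the SDK)."""

    def __init__(self, run_ocr: Callable[[bytes], str]) -> None:
        self._run_ocr = run_ocr              # the expensive recognition step
        self._results: Dict[str, str] = {}   # page digest -> recognized text
        self.ocr_runs = 0                    # counts how often OCR actually ran

    def recognize(self, page_image: bytes) -> str:
        # Identical page bytes produce the same key, so the page is recognized once.
        key = hashlib.sha256(page_image).hexdigest()
        if key not in self._results:
            self.ocr_runs += 1
            self._results[key] = self._run_ocr(page_image)
        return self._results[key]


def fake_ocr(page_image: bytes) -> str:
    # Stand-in for a real OCR engine: "recognizes" ASCII bytes as upper-case text.
    return page_image.decode("ascii").upper()


if __name__ == "__main__":
    cache = WholePageOcrCache(fake_ocr)
    cache.recognize(b"page one")
    cache.recognize(b"page one")   # served from the cache, OCR is not re-run
    print(cache.ocr_runs)
```

With caching turned off (the default noted for OCRCacheMode), every call would go straight to the recognition step instead.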