Classifier Class (ByteScout Document Parser SDK)
Represents a class that uses a list of rules to classify a document (PDF or image).
Inheritance Hierarchy

System.Object
  ByteScout.DocumentParser.Classifier

Namespace:  ByteScout.DocumentParser
Assembly:  ByteScout.DocumentParser (in ByteScout.DocumentParser.dll) Version: 6.4.1.617-master
Syntax

public class Classifier : IDisposable

The Classifier type exposes the following members.

Constructors

  Name    Description
Public method: Classifier
Initializes a new instance of the Classifier class.
Public method: Classifier(String, String)
Initializes a new instance of the Classifier class.
Properties

  Name    Description
Public property: IgnorePDFPermissions
Specifies whether the SDK should ignore the permissions set in the PDF document and not throw ParserPermissionsException when the document does not grant permission for the requested action.

Default is false.

IMPORTANT: OUT OF RESPECT FOR THE OWNERS OF PDF DOCUMENTS, THIS OPTION SHOULD NOT BE ENABLED. IF YOU SET IT TO TRUE TO IGNORE THE PERMISSIONS SET IN A PDF DOCUMENT, YOU ARE SOLELY LIABLE FOR THIS ACTION AND FOR ANY COPYRIGHT OR OTHER VIOLATIONS, AT YOUR OWN RISK. BYTESCOUT IS NOT LIABLE FOR ANY DAMAGES, LOSSES, COPYRIGHT INFRINGEMENTS OR ANY OTHER CONSEQUENCES CAUSED BY IGNORING THE PERMISSIONS OF A PDF DOCUMENT. BY CHANGING THIS OPTION YOU CONFIRM THAT YOU UNDERSTAND ALL OF THE ABOVE AND ARE DOING SO AT YOUR OWN RISK.

Public property: OCRDetectPageRotation
Detects the rotation of scanned pages.
Public property: OCRLanguage
The default language for Optical Character Recognition (OCR). It can be overridden by the template option "ocrLanguage". The valid values are:
  • "eng" - English (default)
  • "deu" - German
  • "fra" - French
  • "spa" - Spanish
  • "nld" - Dutch

Download more languages at https://github.com/bytescout/ocrdata.

Public property: OCRLanguageDataFolder
Folder containing OCR language data files.
Public property: OCRMaximizeCPUUtilization
Gets or sets whether to maximize OCR performance using Intel OpenMP (if available), which can speed up recognition by approximately 30%. Default is false.
Public property: OCRMode
Recognizes text from embedded images using Optical Character Recognition (OCR).

This option requires the appropriate language files in the OCRLanguageDataFolder folder. The SDK ships with language files for English, French, German and Spanish. You can download more at https://github.com/bytescout/ocrdata.

Public property: OCRResolution
Resolution used for Optical Character Recognition (OCR). Default is 300 DPI.
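
The OCR-related properties listed above could be set together before classifying scanned documents. The following is only a minimal sketch, assuming that OCRLanguage and OCRLanguageDataFolder are string properties and that OCRMaximizeCPUUtilization is a Boolean, as their descriptions suggest. The folder path is a hypothetical placeholder. OCRMode is left untouched because its value type is not shown on this page.

```csharp
using ByteScout.DocumentParser;

class OcrSetupSketch
{
    static void Main()
    {
        // Classifier implements IDisposable, so a using block releases it.
        using (var classifier = new Classifier())
        {
            // Assumption: string-typed property. "deu" (German) is one of
            // the values listed for OCRLanguage.
            classifier.OCRLanguage = "deu";

            // Hypothetical folder holding the downloaded language data files.
            classifier.OCRLanguageDataFolder = @"C:\ocrdata";

            // Assumption: Boolean property (the documented default is false).
            classifier.OCRMaximizeCPUUtilization = true;
        }
    }
}
```

According to the OCRLanguage description, a template option "ocrLanguage" overrides the value set here.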
Public property: RegistrationKey
Gets or sets the key number part of the registration information.
Public property: RegistrationName
Gets or sets the name part of the registration information.
Methods

  Name    Description
Public method: AddRule
Adds a rule for the detection of a document by its content.
Public method: AddRulesFromSpreadsheet(Stream, Boolean)
Adds rules and key phrases from a spreadsheet file stream (XLS, XLSX, CSV, ODS).
Public method: AddRulesFromSpreadsheet(String, Boolean)
Adds rules and key phrases from a spreadsheet file (XLS, XLSX, CSV, ODS).
Public method: ClassifyDocument(Stream)
Public method: ClassifyDocument(String)
Public method: Dispose
Releases managed resources of the component.
Public method: Equals (Inherited from Object.)
Protected method: Finalize (Inherited from Object.)
Public method: GetHashCode (Inherited from Object.)
Public method: GetType (Inherited from Object.)
Protected method: MemberwiseClone (Inherited from Object.)
Public method: Reset
Resets the Classifier.
Public method: ToString (Inherited from Object.)
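
The members listed above suggest a simple workflow: create a Classifier, load rules, classify a file, then dispose of the instance. The sketch below illustrates that flow under several assumptions that this page does not confirm. The registration values and file names are hypothetical placeholders. The meaning of the Boolean argument of AddRulesFromSpreadsheet is not documented here, so false is only a placeholder. The return type of ClassifyDocument is not shown, so the result is held in a var and printed as-is.

```csharp
using System;
using ByteScout.DocumentParser;

class ClassifySketch
{
    static void Main()
    {
        // Classifier implements IDisposable, so a using block calls Dispose.
        using (var classifier = new Classifier())
        {
            // Placeholder registration values (assumed to be strings).
            classifier.RegistrationName = "demo";
            classifier.RegistrationKey = "demo";

            // Hypothetical rules file. The Boolean argument is undocumented
            // on this page, so false is a placeholder, not a recommendation.
            classifier.AddRulesFromSpreadsheet("rules.csv", false);

            // Hypothetical input file. The return type is not shown on this
            // page, so the result is simply printed.
            var result = classifier.ClassifyDocument("sample.pdf");
            Console.WriteLine(result);
        }
    }
}
```

The Classifier(String, String) constructor may offer a shorter way to supply registration details, but its parameters are not described on this page, so the sketch sets the two registration properties instead.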
Events

  Name    Description
Public event: PasswordRequired
Occurs when a password is required to open a PDF document.