Get Word Coordinates in XML | VBScript | ByteScout PDF Extractor SDK

PdfToXml.vbs:

' Create Bytescout.PDFExtractor.XMLExtractor object
Set extractor = CreateObject("Bytescout.PDFExtractor.XMLExtractor")
extractor.RegistrationName = "demo"
extractor.RegistrationKey = "demo"

' Load sample PDF document
extractor.LoadDocumentFromFile "sample3.pdf"

' Set these properties to get clean output that contains word nodes only:
extractor.DetectNewColumnBySpacesRatio = 0.1 ' splits all text into individual words
extractor.PreserveFormattingOnTextExtraction = False ' removes empty nodes

' Save the extracted words and their coordinates as XML
extractor.SaveXMLToFile "output.xml"

WScript.Echo "Extracted data saved to 'output.xml' file."
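
Once output.xml exists, the word coordinates can be read back with the MSXML parser that ships with Windows. The following is a minimal sketch, not part of the SDK sample: it assumes each word is written as a `text` element with `x`, `y`, `width` and `height` attributes and that the file uses no XML namespace. Open your own output.xml to confirm the element and attribute names, and adjust the XPath if they differ. The script name ReadWords.vbs is hypothetical.

```vbscript
' ReadWords.vbs (hypothetical name): list each word and its bounding box.
' ASSUMPTION: words are stored as <text> elements with x, y, width and
' height attributes; verify this against your own output.xml.
Dim xmlDoc, wordNodes, node

Set xmlDoc = CreateObject("MSXML2.DOMDocument.6.0")
xmlDoc.async = False ' load synchronously so the document is ready on return

If Not xmlDoc.load("output.xml") Then
    WScript.Echo "Could not parse output.xml: " & xmlDoc.parseError.reason
    WScript.Quit 1
End If

' Select every <text> element at any depth of the document
Set wordNodes = xmlDoc.selectNodes("//text")

For Each node In wordNodes
    ' A missing attribute comes back as Null, which & joins as an empty string
    WScript.Echo node.text & _
        "  x=" & node.getAttribute("x") & _
        "  y=" & node.getAttribute("y") & _
        "  width=" & node.getAttribute("width") & _
        "  height=" & node.getAttribute("height")
Next
```

Run it with `cscript //nologo ReadWords.vbs` so each word is printed to the console rather than shown in a separate message box.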