Advanced Conversion Options | VB.NET
Module1.vb:
Imports System.IO
Imports System.Net
Imports Newtonsoft.Json
Imports Newtonsoft.Json.Linq

Module Module1

    ' The authentication key (API Key).
    ' Get your own by registering at https://app.pdf.co
    Const API_KEY As String = "***********************************"

    ' Direct URL of source PDF file.
    ' You can also upload your own file into PDF.co and use it as url.
    ' Check "Upload File" samples for code snippets:
    ' https://github.com/bytescout/pdf-co-api-samples/tree/master/File%20Upload/
    Const SourceFileUrl As String = "https://bytescout-com.s3.amazonaws.com/files/demo-files/cloud-api/pdf-to-text/sample.pdf"
    ' Comma-separated list of page indices (or ranges) to process. Leave empty for all pages. Example: '0,2-5,7-'.
    Const Pages As String = ""
    ' PDF document password. Leave empty for unprotected documents.
    Const Password As String = ""
    ' Destination TXT file name
    Const DestinationFile As String = ".\result.txt"

    ' Some of the advanced options available through profiles:
    ' (JSON can be single/double-quoted and contain comments.)
    ' {
    '     "profiles": [
    '         {
    '             "profile1": {
    '                 "ExtractInvisibleText": true, // Invisible text extraction. Values: true / false
    '                 "ExtractShadowLikeText": true, // Shadow-like text extraction. Values: true / false
    '                 "ExtractAnnotations": true, // Whether to extract PDF annotations.
    '                 "CheckPermissions": true, // Ignore document permissions. Values: true / false
    '                 "DetectNewColumnBySpacesRatio": 1.2, // A ratio affecting number of spaces between words.
    '             }
    '         }
    '     ]
    ' }

    ' Sample profile that sets advanced conversion options.
    ' Advanced options are properties of TextExtractor class from ByteScout Text Extractor SDK used in the back-end:
    ' https://cdn.bytescout.com/help/BytescoutPDFExtractorSDK/html/8a2bae5a-346f-8338-b5aa-6f3522dca0d4.htm
    ' (Declared As String explicitly so the field is not typed as Object.)
    ReadOnly Profiles As String = File.ReadAllText("profile.json")

    Sub Main()

        ' Create standard .NET web client instance
        Dim webClient As WebClient = New WebClient()

        ' Set API Key
        webClient.Headers.Add("x-api-key", API_KEY)

        ' Set JSON content type
        webClient.Headers.Add("Content-Type", "application/json")

        ' Prepare URL for `PDF To TXT` API call
        Dim url As String = "https://api.pdf.co/v1/pdf/convert/to/text"

        ' Prepare request params as JSON
        ' See documentation: https://apidocs.pdf.co
        Dim parameters As New Dictionary(Of String, Object)
        parameters.Add("name", Path.GetFileName(DestinationFile))
        parameters.Add("password", Password)
        parameters.Add("pages", Pages)
        parameters.Add("url", SourceFileUrl)
        parameters.Add("profiles", Profiles)

        ' Convert dictionary of params to JSON
        Dim jsonPayload As String = JsonConvert.SerializeObject(parameters)

        Try
            ' Execute POST request with JSON payload
            Dim response As String = webClient.UploadString(url, jsonPayload)

            ' Parse JSON response
            Dim json As JObject = JObject.Parse(response)

            If json("error").ToObject(Of Boolean) = False Then

                ' Get URL of generated TXT file
                Dim resultFileUrl As String = json("url").ToString()

                ' Download TXT file
                webClient.DownloadFile(resultFileUrl, DestinationFile)

                Console.WriteLine("Generated TXT file saved as ""{0}"" file.", DestinationFile)

            Else
                ' The API reported an error; print its message
                Console.WriteLine(json("message").ToString())
            End If

        Catch ex As WebException
            ' Network or HTTP-level failure
            Console.WriteLine(ex.ToString())
        End Try

        webClient.Dispose()

        Console.WriteLine()
        Console.WriteLine("Press any key...")
        Console.ReadKey()

    End Sub

End Module
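
profile.json:

The sample reads its advanced options from a `profile.json` file in the working directory, but the page does not show that file. The sketch below is one plausible version, assuming the file mirrors the structure in the code comments above. It uses only the option names listed there, and the values are the illustrative ones from those comments, not recommended settings. The option list in the comments appears to be cut short, so the real file may contain more properties.

```json
{
  "profiles": [
    {
      "profile1": {
        "ExtractInvisibleText": true,
        "ExtractShadowLikeText": true,
        "ExtractAnnotations": true,
        "CheckPermissions": true,
        "DetectNewColumnBySpacesRatio": 1.2
      }
    }
  ]
}
```

The sample passes the file content to the API as a string in the `profiles` request parameter, so the file needs to be deployed next to the executable (or the path in `File.ReadAllText` adjusted) for the program to start.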