Extract Data from PDF Online

Extract Tables, Images and Text from PDF using the PDFix SDK.

Try Online

  Choose the file you want to convert
  Enter your email address to receive the converted file
  Click the Convert button and check your email inbox

Extracting logical data from unstructured PDFs is not an easy task, and the quality depends on the original PDF layout. No algorithm works perfectly under all circumstances, and our online engine uses only a general configuration file, which should work for the majority of cases. We are always trying to improve our recognition and conversion engines. If you’re not satisfied with the output, feel free to contact us. We can customize the settings of the automated extraction tool for your document set and improve the quality of the extracted data.

Command Line usage

Add PDF data extraction to your own workflow with our command-line utility. Basic usage examples:

Extract Text:
pdfix_app {EMAIL} {LICENSEKEY} -pdf2txt input.pdf output.txt
Extract Tables:
pdfix_app {EMAIL} {LICENSEKEY} -pdf2csv input.pdf output
Extract Images:
pdfix_app {EMAIL} {LICENSEKEY} -pdf2image input.pdf output -page_width 1200 -image_format 1 -image_quality 75
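
The commands above can also be driven from a script, for example to convert a whole folder of PDFs. The following is a minimal Python sketch, not an official PDFix API: the helper names build_command and extract_text_batch are hypothetical, and it assumes pdfix_app is on your PATH and exits with a non-zero status when a conversion fails. It uses only the switches shown above.

```python
import subprocess
from pathlib import Path

# Switches taken from the usage examples above; nothing else is assumed.
SUPPORTED_MODES = ("pdf2txt", "pdf2csv", "pdf2image")


def build_command(mode, email, license_key, src, dst, extra=()):
    """Build the argument list for a single pdfix_app call (hypothetical helper)."""
    if mode not in SUPPORTED_MODES:
        raise ValueError(f"unsupported mode: {mode}")
    return ["pdfix_app", email, license_key, f"-{mode}", str(src), str(dst), *extra]


def extract_text_batch(email, license_key, input_dir, output_dir, run=subprocess.run):
    """Run -pdf2txt on every PDF in input_dir and write .txt files to output_dir."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    commands = []
    for pdf in sorted(Path(input_dir).glob("*.pdf")):
        target = output_dir / (pdf.stem + ".txt")
        cmd = build_command("pdf2txt", email, license_key, pdf, target)
        # Assumption: a failed conversion returns a non-zero exit code,
        # which check=True turns into a CalledProcessError.
        run(cmd, check=True)
        commands.append(cmd)
    return commands
```

Calling extract_text_batch("{EMAIL}", "{LICENSEKEY}", "pdfs", "texts") with your own credentials would process each file in pdfs one by one. Passing the arguments as a list, rather than as a single shell string, avoids quoting problems with file names that contain spaces.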

Integrate into your solution!

Everything you need to quickly build a prototype. Check real examples, based on common tasks that many users want to accomplish.

The PDFix GitHub repository contains samples that show how to run PDFix technology in your environment.
Check the Extract Text, Extract Tables and Extract Images samples in our documentation.

Windows, macOS, Linux

Java, Python, C#, C++