Getting Started

PDFix SDK is a cross-platform PDF processing tool that offers multiple integration options. The Command-Line Interface provides a quick and easy way to add PDF processing functionality to various workflows without writing code.

The SDK is also available for several programming languages, giving you full control over PDF documents.

Before getting started, check the system requirements to ensure compatibility with your system.

Integration Using the CLI

PDFix provides a simple, fast, and automated way to process PDFs via the command-line interface:

$ ./pdfix_app make-accessible -i test.pdf -o output.pdf

For additional CLI options, refer to the PDFix SDK Command-Line Interface documentation. The CLI application is included in the ZIP package available on the download page.
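When the CLI is called from a script rather than typed by hand, a thin wrapper keeps the invocation in one place. The following is a minimal sketch in Python that builds and runs the same make-accessible command shown above. The helper names, the location of the binary, and the assumption that the CLI signals failure with a non-zero exit code are illustrative and not taken from the PDFix documentation.

```python
import subprocess
from pathlib import Path

# Assumption: the CLI binary unpacked from the ZIP package is in the working directory.
PDFIX_APP = "./pdfix_app"


def build_make_accessible_command(input_pdf, output_pdf, app=PDFIX_APP):
    """Return the argument list for the make-accessible command (hypothetical helper)."""
    return [app, "make-accessible", "-i", str(input_pdf), "-o", str(output_pdf)]


def make_accessible(input_pdf, output_pdf, app=PDFIX_APP):
    """Run the CLI once and return the output path.

    Assumption: a failed run exits with a non-zero code, in which case
    subprocess raises CalledProcessError.
    """
    command = build_make_accessible_command(input_pdf, output_pdf, app)
    subprocess.run(command, check=True)
    return Path(output_pdf)


if __name__ == "__main__":
    make_accessible("test.pdf", "output.pdf")
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.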

Integrating the SDK Programmatically

To integrate the PDFix SDK programmatically through its API, refer to the code examples on GitHub for your preferred programming language.

Code Examples:

  • C++ – Native applications
  • .NET – For .NET Framework, .NET Core, and .NET 5+
  • Java – For Maven or Gradle projects
  • Python – For Python applications
  • JavaScript – For frameworks like Node.js, React.js, Angular, and similar

Practical Use Cases

The SDK offers many ways to edit a PDF document. The most typical use cases are described in the sections below.

For other use cases, please refer to the documentation and code examples, or contact us.

Fix Accessibility Issues

To fix PDF/UA compliance issues in a PDF document, use either of the following:

PDFix Actions for Accessibility – flexible PDF manipulation without coding, using workflows configured in a JSON file.

SDK API methods to access and edit:

  • Structure Tree and its Elements
  • Page Objects and their Content Marks
  • Annotations
  • Document Metadata

See code examples to address PDF accessibility using the API.

Extract PDF Content

To extract data from a PDF document, convert it to JSON. The data extraction methods can provide:

  • Raw Data Extraction to access
    • Document Metadata, Form Fields, and classification such as tagged, signed, secured
    • Page Size, Rotation, Annotations, Content including text, images, positions, colors
  • Layout Recognition to access the logical structure in non-tagged documents such as:
    • Paragraphs
    • Headings
    • Figures
    • Tables
    • Headers, Footers
  • Document Structure from a tagged document to access
    • Complete document structure tree and elements with
      • Properties, Attributes
      • Position, Content, Style, etc.

See the CLI or code examples to convert a PDF to JSON.

Convert PDF to HTML

PDFix SDK can convert a PDF to HTML using three customizable conversion methods:

  • Original Layout converts the PDF retaining the original layout and visual appearance
  • Responsive Layout uses Layout Recognition to create a responsive web page adjusted for small screens
  • Layout defined by PDF Tags creates a true web presentation of a tagged document

See the CLI or code examples to convert a PDF to HTML.

Multi-Threaded Environments

PDFix SDK operates in a single-threaded manner, allowing only one API method to run at a time within a single process. Any additional method calls from other threads will be queued until the current operation finishes.

For parallel processing, use separate processes rather than threads.
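One way to follow this advice without writing SDK code is to launch several CLI processes at once. The sketch below, in Python, starts each command as its own operating-system process and collects the exit codes. The function name, the worker count, and the output naming scheme are hypothetical; the threads in the pool only wait on child processes and never call the SDK themselves.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_in_separate_processes(commands, max_workers=4):
    """Run each argument list as its own OS process.

    Returns the exit codes in the same order as the input commands.
    """
    def run_one(command):
        return subprocess.run(command).returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, commands))


if __name__ == "__main__":
    # Usage: python parallel_pdfix.py a.pdf b.pdf ...
    # Assumption: ./pdfix_app is the CLI binary from the ZIP package.
    commands = []
    for name in sys.argv[1:]:
        source = Path(name)
        target = source.with_name("accessible_" + source.name)
        commands.append(
            ["./pdfix_app", "make-accessible", "-i", str(source), "-o", str(target)]
        )
    for command, code in zip(commands, run_in_separate_processes(commands)):
        print(command[3], "exit code", code)
```

How many processes may run at the same time can depend on the license plan, so the worker count should be chosen with that in mind.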

Licensing

We offer flexible licensing plans based on:

  • Volume Licensing – Allows users to create a plan based on the number of processed pages. For details and recommendations on page counting, click here.
  • Flat Licensing – Based on the number of concurrently executed threads, making it ideal for high-volume processing. Contact us for more information.

For inquiries about special license types, please reach out to us.

License Management

The SDK allows full control of the license from code. See the PDFix SDK License Management documentation for more details.

Trial License and Limitations

A free Trial license for all platforms and programming languages is available with the following limitations:

  • Extracted text may have random characters replaced with “*”.
  • Saved PDFs may contain redacted content with a watermark.

System Requirements