
PDF Accessibility SDK – Getting Started Guide

Automate PDF/UA & WCAG Compliance

Introduction to PDFix SDK

PDFix SDK is a cross-platform PDF processing toolkit that lets developers automate PDF accessibility, extract data, and convert documents programmatically. You can use it through the command-line interface (CLI) or integrate it through the API in several programming languages.

System Requirements

Windows: Windows 7+, Server 2016+ (Requires latest Microsoft Visual C++ Redistributable for Visual Studio)
macOS: macOS 10.15+
Linux: Ubuntu 16+, Debian 10+, CentOS 8+, RHEL 8+

Trial License and Limitations

Test PDFix SDK with a free trial license. The trial has the following limitations:

Extracted text may include “*” placeholders
Saved PDFs may contain a PDFix watermark

How to Integrate PDFix SDK

Quickly process PDFs without coding using the PDFix CLI. Example command for automated PDF accessibility:

$ ./pdfix_app make-accessible -i test.pdf -o output.pdf

For additional CLI options, refer to the PDFix SDK CLI Documentation.
The CLI application is included in the ZIP package available on the Download page.
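
The command above can also be driven from a script. The following is a minimal sketch, assuming the `pdfix_app` binary sits in the current working directory and that it signals failure through a non-zero exit status (an assumption, not something stated here). The helper names `build_command` and `make_accessible` are hypothetical; only the `make-accessible` subcommand and the `-i`/`-o` options come from the example command.

```python
import subprocess


def build_command(input_pdf, output_pdf, cli="./pdfix_app"):
    """Build the argument list for the make-accessible command shown above."""
    return [str(cli), "make-accessible", "-i", str(input_pdf), "-o", str(output_pdf)]


def make_accessible(input_pdf, output_pdf, cli="./pdfix_app"):
    """Run the CLI once for a single document.

    check=True raises subprocess.CalledProcessError when the process exits
    with a non-zero status (assumed here to indicate failure).
    """
    subprocess.run(build_command(input_pdf, output_pdf, cli), check=True)
```

Calling `make_accessible("test.pdf", "output.pdf")` would run the same command as the shell example. Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.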

Programmatic Integration

To integrate the PDFix SDK programmatically using the API, refer to the code examples on GitHub in your preferred programming language:

Code Examples:

C++ – Native applications
.NET – For .NET Framework, .NET Core, and .NET 5+
Java – For Maven or Gradle projects
Python – Applications
JavaScript – For frameworks like Node.js, React.js, Angular, and similar

Practical Use Cases for PDF Automation

1. Fix PDF Accessibility Issues

Make documents WCAG and PDF/UA compliant using:

PDFix Actions for Accessibility – No-code JSON workflows
Template Language for PDF Auto-Tagging – Fine-tune document structure
SDK API Methods – Edit the structure tree, page objects, annotations, and document metadata

See PDF Accessibility Code Examples using the API.

2. Extract PDF Content

To extract data from a PDF document, convert it to JSON. The data extraction methods can provide:

Raw Data Extraction to access:
- Document Metadata, Form Fields, and classification such as tagged, signed, secured
- Page Size, Rotation, Annotations, Content including text, images, positions, colors
Layout Recognition to access the logical structure in non-tagged documents such as:
- Paragraphs
- Headings
- Figures
- Tables
- Headers, Footers
Document Structure from a tagged document to access:
- Complete document structure tree and elements with
  - Properties, Attributes
  - Position, Content, Style, etc.

See the CLI or Code Examples to convert a PDF to JSON.

3. Convert PDF to HTML (3 Methods)

Original Layout: Preserves the original layout and exact design
Responsive Layout: Adapts for mobile screens (uses Layout Recognition)
Tags-Based Conversion: True web presentation of a tagged document

See the CLI or Code Examples to convert a PDF to HTML.

Multi-Threaded Environments

PDFix SDK operates in a single-threaded manner, allowing only one API method to run at a time within a single process. Any additional method calls from other threads will be queued until the current operation finishes.

For parallel processing, use separate processes rather than threads.
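
One way to follow this advice is to start one CLI process per document. The sketch below is an illustration under assumptions, not a recommended configuration: it reuses the `make-accessible` command from the CLI example, assumes `pdfix_app` is in the current directory, and uses the hypothetical helpers `plan_jobs`, `run_job`, and `process_all`. The worker count of 4 is arbitrary, and the number of concurrent processes your license permits may differ.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def plan_jobs(inputs, output_dir):
    """Pair each input PDF with an output path of the same name in output_dir."""
    return [(Path(p), Path(output_dir) / Path(p).name) for p in inputs]


def run_job(job, cli="./pdfix_app"):
    """Start one CLI process for one document and return its exit code."""
    src, dst = job
    cmd = [cli, "make-accessible", "-i", str(src), "-o", str(dst)]
    return subprocess.run(cmd).returncode


def process_all(inputs, output_dir, workers=4):
    """Run up to `workers` CLI processes at a time and collect their exit codes.

    The Python threads only wait on child processes. Each document is handled
    by its own pdfix_app process, so no SDK call is shared between threads.
    """
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    jobs = plan_jobs(inputs, output_dir)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_job, jobs))
```

For example, `process_all(sorted(Path("in").glob("*.pdf")), "out")` would process every PDF in a folder named `in` and return one exit code per document, in input order.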

Flexible Licensing

We offer flexible licensing plans based on:

Volume Licensing: Plan based on the number of processed pages. For details on page counting, click here.
Flat Licensing: Based on the number of concurrently executed threads, ideal for high-volume processing.

For more details, refer to PDFix SDK License Management.

Need a custom license? Contact us.