PDF Accessibility SDK – Getting Started Guide
Automate PDF/UA & WCAG Compliance
Introduction to PDFix SDK
PDFix SDK is a cross-platform PDF processing tool that enables developers to automate PDF accessibility, extract data, and convert documents programmatically. Choose between the Command-Line Interface (CLI) and API integration in multiple programming languages.
System Requirements
- Windows: Windows 7+, Server 2016+ (Requires latest Microsoft Visual C++ Redistributable for Visual Studio)
- macOS: macOS 10.15+
- Linux: Ubuntu 16+, Debian 10+, CentOS 8+, RHEL 8+
Trial License and Limitations
Test PDFix SDK with a free trial license. The trial has the following limitations:
- Extracted text may include “*” placeholders
- Saved PDFs may contain a PDFix watermark
How to Integrate PDFix SDK
Quickly process PDFs without coding using the PDFix CLI. Example command for automated PDF accessibility:
$ ./pdfix_app make-accessible -i test.pdf -o output.pdf
- For additional CLI options, refer to the PDFix SDK CLI Documentation.
- The CLI application is included in the ZIP package available on the Download page.
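The single-file command above can be repeated over a folder of PDFs with a small wrapper script. The following is a minimal sketch, not part of the PDFix distribution: it assumes `pdfix_app` is reachable at the path you pass in and uses only the `make-accessible -i ... -o ...` form shown above. The function names `build_command` and `make_accessible_batch`, and the `run` parameter, are hypothetical helpers introduced for illustration.

```python
import subprocess
from pathlib import Path


def build_command(app, src, dst):
    """Return the argument list for one make-accessible run (same form as the CLI example)."""
    return [str(app), "make-accessible", "-i", str(src), "-o", str(dst)]


def make_accessible_batch(app, in_dir, out_dir, run=subprocess.run):
    """Run the CLI once per *.pdf file in in_dir and return the output paths.

    `run` defaults to subprocess.run; it is a parameter only so the
    function can be exercised without the real executable.
    """
    in_dir, out_dir = Path(in_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    outputs = []
    for src in sorted(in_dir.glob("*.pdf")):
        dst = out_dir / src.name
        # check=True raises CalledProcessError if the CLI exits with a non-zero code
        run(build_command(app, src, dst), check=True)
        outputs.append(dst)
    return outputs
```

A call such as `make_accessible_batch("./pdfix_app", "input", "output")` would process the files one after another; the folder names here are placeholders.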
Programmatic Integration
To integrate the PDFix SDK programmatically using the API, refer to the code examples on GitHub in your preferred programming language:
Code Examples:
- C++ – Native applications
- .NET – For .NET Framework, .NET Core, and .NET 5+
- Java – For Maven or Gradle projects
- Python – For Python applications
- JavaScript – For frameworks like Node.js, React.js, Angular, and similar
Practical Use Cases for PDF Automation
1. Fix PDF Accessibility Issues
Make documents WCAG and PDF/UA compliant using:
- PDFix Actions for Accessibility – No-code JSON workflows
- Template Language for PDF Auto-Tagging – Fine-tune document structure
- SDK API Methods – Edit structure tree, page objects, annotations and document metadata
See PDF Accessibility Code Examples using the API.
2. Extract PDF Content
To extract data from a PDF document, convert it to JSON. The data extraction methods can provide:
- Raw Data Extraction to access:
  - Document Metadata, Form Fields, and classification such as tagged, signed, secured
  - Page Size, Rotation, Annotations, Content including text, images, positions, colors
- Layout Recognition to access the logical structure in non-tagged documents such as:
  - Paragraphs
  - Headings
  - Figures
  - Tables
  - Headers, Footers
- Document Structure from a tagged document to access:
  - Complete document structure tree and elements with
    - Properties, Attributes
    - Position, Content, Style, etc.
See the CLI or Code Examples to convert a PDF to JSON.
3. Convert PDF to HTML (3 Methods)
- Original Layout: Preserves original layout and exact design
- Responsive Layout: Adapts for mobile screens (uses Layout Recognition)
- Tags-Based Conversion: True web presentation of a tagged document
See the CLI or Code Examples to convert a PDF to HTML.
Multi-Threaded Environments
PDFix SDK operates in a single-threaded manner, allowing only one API method to run at a time within a single process. Any additional method calls from other threads will be queued until the current operation finishes.
For parallel processing, use separate processes rather than threads.
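One way to follow this advice is to launch a separate CLI process per document. The following is a minimal sketch under stated assumptions: it reuses the `make-accessible` command from the CLI example, and the names `run_job`, `process_in_parallel`, `app`, and `run` are hypothetical helpers, not PDFix API. The Python threads in the pool never call the SDK; each one only waits for its own child process, so every conversion runs in its own process.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_job(job, app="./pdfix_app", run=subprocess.run):
    """Start one CLI process for a (source, destination) pair and return (source, exit code)."""
    src, dst = job
    cmd = [app, "make-accessible", "-i", src, "-o", dst]
    result = run(cmd)  # blocks this worker thread until the child process exits
    return src, result.returncode


def process_in_parallel(jobs, workers=4, **kwargs):
    """Run up to `workers` CLI processes at once; results keep the order of `jobs`."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda job: run_job(job, **kwargs), jobs))
```

For example, `process_in_parallel([("a.pdf", "a_out.pdf"), ("b.pdf", "b_out.pdf")], workers=2)` would start two independent `pdfix_app` processes; the file names are placeholders. Check your license terms before choosing a value for `workers`.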
Flexible Licensing
We offer flexible licensing plans based on:
- Volume Licensing: Plan based on the number of processed pages. For details on page counting, click here.
- Flat Licensing: Based on the number of concurrently executed threads, ideal for high-volume processing.
For more details, refer to PDFix SDK License Management.
Need a custom license? Contact us.