Ensuring PDF accessibility is essential for both compliance and usability. However, manual tagging is time-consuming, inconsistent, and nearly impossible at scale. With PDFix SDK, you can automate the process – giving developers and organizations control to automatically tag PDFs and convert raw documents into structured, standards-compliant PDFs.
This tutorial walks you through the main methods of auto-tagging PDFs using PDFix SDK – from quick fixes to advanced AI-driven templates.
You’ll explore the four main approaches to PDF auto-tagging with PDFix SDK:
- Basic Auto-Tagging without Layout Template – achieve a basic accessibility pass across mixed or high-volume PDFs
- Auto-Tagging with Preflight – Auto-Generated Layout Template – improve tagging accuracy with automated analysis
- Auto-Tagging with an AI-Generated Layout Template – use AI models for intelligent, adaptive tagging
- Auto-Tagging with a Pre-Defined Layout Template – ensure consistent tag structures for repetitive or print-ready documents
Each section includes:
- SDK code examples
- Command-Line Tool usage
- Documentation references
- Video from PDFix Desktop
By the end, you’ll understand how to integrate auto-tagging into your workflows for reliable, repeatable accessibility results – and you can download the sample documents to test it yourself.
We selected some of our test files for this demonstration.
You can find them on our GitHub in the Weekly_Market_Commentary repository.
Four Smart Auto-Tagging Methods in PDFix SDK
1. Basic Auto-Tagging without Layout Template – Fast Results, No Setup
- Use case: Fastest way to achieve a structured, tagged document
- How it works: The SDK detects text blocks, reading order, tables, and lists automatically
- Pros: Zero setup, immediate results
- Cons: Less predictable structure, may require refinements

SDK Code Example [Python]
from pdfixsdk import *
pdfix = GetPdfix()
# Open the source PDF (the empty string means no password)
doc = pdfix.OpenDoc("weekly_market_commentary_2025_0505.pdf", "")
if not doc:
    raise RuntimeError(pdfix.GetError())
# Auto-tag with default parameters; no layout template is loaded
if not doc.AddTags(PdfTagsParams()):
    raise RuntimeError(pdfix.GetError())
doc.Save("weekly_market_commentary_2025_0505_tagged.pdf", kSaveFull)
doc.Close()

Command-Line Tool Example
#!/bin/bash
INPUT_PDF=weekly_market_commentary_2025_0505.pdf
OUTPUT_PDF=weekly_market_commentary_2025_0505_tagged.pdf
# auto-tag without a layout template
pdfix_app add-tags -i $INPUT_PDF -o $OUTPUT_PDF

Documentation
- https://github.com/pdfix/pdfix_sdk_example_python
- https://pdfix.github.io/pdfix_sdk_builds
- https://pdfix.net/support/pdfix-command-line

PDFix Desktop
2. Auto-Tagging with Preflight – Auto-Generated Layout Template
- Use case: Improve document structure by identifying headings and removing headers or footers automatically.
- How it works: Preflight analyzes the document and generates a template to enhance auto-tagging.
- Pros: Better results than plain auto-tagging.
- Cons: Results vary based on document complexity.

SDK Code Example [Python]
from pdfixsdk import *
pdfix = GetPdfix()
doc = pdfix.OpenDoc("weekly_market_commentary_2025_0505.pdf", "")
if not doc:
    raise RuntimeError(pdfix.GetError())
# Preflight: feed every page to the template, then rebuild it
tmpl = doc.GetTemplate()
for i in range(doc.GetNumPages()):  # include the last page
    tmpl.AddPage(i)
tmpl.Update()
# Auto-tag using the preflight-generated template
if not doc.AddTags(PdfTagsParams()):
    raise RuntimeError(pdfix.GetError())
doc.Save("weekly_market_commentary_2025_0505_tagged.pdf", kSaveFull)
doc.Close()

Command-Line Tool Example
#!/bin/bash
INPUT_PDF=weekly_market_commentary_2025_0505.pdf
OUTPUT_PDF=weekly_market_commentary_2025_0505_tagged.pdf
# auto-tag with a preflight-generated template
pdfix_app add-tags -i $INPUT_PDF -o $OUTPUT_PDF --preflight

Documentation
- https://github.com/pdfix/pdfix_sdk_example_python
- https://pdfix.github.io/pdfix_sdk_builds
- https://pdfix.net/support/pdfix-command-line

PDFix Desktop
3. Auto-Tagging with an AI-Generated Layout Template
External AI actions let you integrate AI models and advanced tools directly into your workflow.
For document layout detection, we currently support three external AI models – but you’re not limited to just these. The external actions system allows you to connect any other model, including your own, giving you full flexibility to tailor automation to your needs.
When dealing with complex documents or diverse layouts, an AI-generated template can deliver the most intelligent, adaptive tagging results.
- Use case: Documents too complex for pre-defined templates.
- How it works: AI models create JSON templates that guide PDFix SDK in tagging.
- Pros: Smarter detection, flexible structure recognition.
- Cons: Requires AI integration setup.

SDK Code Example [Python]
from pdfixsdk import *
import os, subprocess
def create_pdfix_ai_template(input_pdf: str, output_json: str, zoom=4.0):
    # Mount the PDF's folder into the container and write the template next to it
    workdir = os.path.abspath(os.path.dirname(input_pdf))
    cmd = [
        "docker", "run", "--rm",
        "-v", f"{workdir}:/data", "-w", "/data",
        "pdfix/pdf-accessibility-docling:v1.0.4",
        "template", "-i", f"/data/{os.path.basename(input_pdf)}",
        "-o", f"/data/{os.path.basename(output_json)}",
        "--zoom", str(zoom)
    ]
    subprocess.run(cmd, check=True)
pdfix = GetPdfix()
# Open the same PDF that the AI template is generated from
doc = pdfix.OpenDoc("weekly_market_commentary_2025_0505.pdf", "")
if not doc:
    raise RuntimeError(pdfix.GetError())
create_pdfix_ai_template("weekly_market_commentary_2025_0505.pdf", "ai-template.json")
# Load the AI-generated JSON template into the document
tmpl = doc.GetTemplate()
stm = pdfix.CreateFileStream("ai-template.json", kPsReadOnly)
tmpl.LoadFromStream(stm, kDataFormatJson)
stm.Destroy()
if not doc.AddTags(PdfTagsParams()):
    raise RuntimeError(pdfix.GetError())
doc.Save("weekly_market_commentary_2025_0505_tagged.pdf", kSaveFull)
doc.Close()
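
Because the template is produced by an external container, it can be worth confirming that the file is well-formed JSON before passing it to LoadFromStream. The following is a minimal sketch of such a check; the helper name check_template_json is hypothetical and not part of PDFix SDK, and it only verifies that the file parses as JSON, not that it follows the template schema.

```python
import json


def check_template_json(path):
    """Parse the template file and return the decoded JSON value.

    Raises ValueError if the file is empty or is not valid JSON.
    Assumption: this checks JSON syntax only, not the template schema.
    """
    with open(path, "r", encoding="utf-8") as fh:
        text = fh.read()
    if not text.strip():
        raise ValueError(f"template file is empty: {path}")
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"template is not valid JSON: {path}: {exc}") from exc
```

Calling check_template_json("ai-template.json") right after the Docker step stops the script early if the container wrote nothing usable.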

Command-Line Tool Example
#!/bin/bash
INPUT_PDF=weekly_market_commentary_2025_0505.pdf
OUTPUT_PDF=weekly_market_commentary_2025_0505_tagged.pdf
AI_TEMPLATE=ai_template.json
docker run -v $(pwd):/data -w /data --rm pdfix/pdf-accessibility-docling:v1.0.4 template -i $INPUT_PDF -o $AI_TEMPLATE --zoom 4.0
pdfix_app add-tags -i $INPUT_PDF -o $OUTPUT_PDF -c $AI_TEMPLATE

Documentation
- https://github.com/pdfix/pdfix_sdk_example_python
- https://pdfix.github.io/pdfix_sdk_builds
- https://pdfix.net/support/pdfix-command-line
- https://pdfix.net/products/actions-marketplace/

PDFix Desktop
4. Auto-Tagging with a Pre-Defined Layout Template
- Use case: Consistent tag structures for repetitive or print-ready documents.

SDK Code Example [Python]
from pdfixsdk import *
pdfix = GetPdfix()
doc = pdfix.OpenDoc("weekly_market_commentary_2025_0505.pdf", "")
if not doc:
    raise RuntimeError(pdfix.GetError())
# Load the pre-defined JSON template into the document
tmpl = doc.GetTemplate()
stm = pdfix.CreateFileStream("template.json", kPsReadOnly)
tmpl.LoadFromStream(stm, kDataFormatJson)
stm.Destroy()
if not doc.AddTags(PdfTagsParams()):
    raise RuntimeError(pdfix.GetError())
doc.Save("weekly_market_commentary_2025_0505_tagged.pdf", kSaveFull)
doc.Close()

Command-Line Tool Example
#!/bin/bash
INPUT_PDF=weekly_market_commentary_2025_0505.pdf
OUTPUT_PDF=weekly_market_commentary_2025_0505_tagged.pdf
CUSTOM_TEMPLATE=template.json
pdfix_app add-tags -i $INPUT_PDF -o $OUTPUT_PDF -c $CUSTOM_TEMPLATE
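
When several recurring document types each have their own template, the right template can be picked from the file name. The sketch below builds the same add-tags command line shown above and appends -c only when a rule matches. The rule table, the pattern, and the function name are hypothetical, chosen for illustration.

```python
import os
from fnmatch import fnmatch

# Hypothetical mapping from file-name pattern to template file.
TEMPLATE_RULES = [
    ("weekly_market_commentary_*.pdf", "template.json"),
]


def build_add_tags_command(input_pdf, output_pdf, rules=TEMPLATE_RULES):
    """Build an add-tags call, adding -c when a rule matches the file name."""
    cmd = ["pdfix_app", "add-tags", "-i", input_pdf, "-o", output_pdf]
    name = os.path.basename(input_pdf)
    for pattern, template in rules:
        if fnmatch(name, pattern):
            return cmd + ["-c", template]
    return cmd  # no rule matched: fall back to basic auto-tagging
```

The returned list can be passed to subprocess.run as-is, so file names with spaces need no extra quoting.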
Batch Auto-Tagging – Scale Accessibility Across Documents
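One way to scale the command-line workflow is to loop over a folder and call pdfix_app add-tags once per file. The following is a minimal sketch under a few assumptions: pdfix_app is on the PATH, a non-zero exit code signals a failed run, and the helper names build_tag_commands and run_batch are hypothetical.

```python
import subprocess
from pathlib import Path


def build_tag_commands(input_dir, output_dir, app="pdfix_app"):
    """Return one add-tags argument list per *.pdf file found in input_dir."""
    in_dir, out_dir = Path(input_dir), Path(output_dir)
    commands = []
    for pdf in sorted(in_dir.glob("*.pdf")):
        tagged = out_dir / f"{pdf.stem}_tagged.pdf"
        commands.append([app, "add-tags", "-i", str(pdf), "-o", str(tagged)])
    return commands


def run_batch(input_dir, output_dir):
    """Tag every PDF and return the input paths whose command failed."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    failed = []
    for cmd in build_tag_commands(input_dir, output_dir):
        # Assumption: a non-zero exit code means tagging did not succeed.
        if subprocess.run(cmd).returncode != 0:
            failed.append(cmd[3])  # cmd[3] is the input path
    return failed
```

Keeping command construction separate from execution makes it easy to add --preflight or -c to every call, and to review the commands before running them.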
Review and Refinement: How to Reach Full PDF/UA Compliance
Auto-tagging dramatically accelerates accessibility workflows – but validation and refinement are key to full PDF/UA compliance. After tagging, run validation to check compliance with PDF/UA or WCAG, address detected issues, and perform manual checks like reading order.
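The validation step can also be scripted so that it runs right after tagging. The sketch below wraps the veraPDF command line in Python. It assumes verapdf is installed and on the PATH; the function names are hypothetical, and the report is returned unparsed because how to interpret the exit code and report format should be taken from the veraPDF documentation.

```python
import subprocess


def build_verapdf_command(pdf_path, flavour="ua1", app="verapdf"):
    """Build the veraPDF call; "ua1" selects the PDF/UA-1 profile."""
    return [app, "-f", flavour, pdf_path]


def run_validation(pdf_path):
    """Run veraPDF and return its exit code and raw report text."""
    result = subprocess.run(build_verapdf_command(pdf_path),
                            capture_output=True, text=True)
    return result.returncode, result.stdout
```

For a single file, the equivalent shell call is: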
verapdf -f ua1 weekly_market_commentary_2025_0505_tagged.pdf
Summary and Next Steps
With PDFix SDK, you can:
- Run quick auto-tagging for unknown or mixed PDFs.
- Apply pre-defined templates for consistent layouts.
- Leverage AI for adaptive, high-quality tagging.
- Manage metadata, alt text, and other accessibility attributes for full compliance.
Watch the Webinar and Keep Learning
Watch our webinar to see how template-based workflows improve PDF accessibility or read the related Guide.
Download Sample Files from this webinar.









