PDFix Batch Actions provide a no-code way to process PDF documents, driven by a simple JSON configuration file.

Each action is fully customizable and can be applied to specific objects, tags, selected elements, the entire document, or a batch of documents. With a few clicks you can resolve most accessibility issues, which streamlines compliance work and improves PDF usability at scale.

PDFix Batch Actions

Custom PDF actions automate the editing of PDFs and resolve various accessibility issues, streamlining document editing and remediation. Because no programming skills are required, this approach makes these tasks simpler and more efficient. By chaining actions into a custom sequence, you can process PDF documents according to your specific requirements. The following example clears the existing structure, re-tags the document, and sets the PDF/UA-1 identifier:

{
    "title": "PDFix Batch Action Example",
    "desc": "Custom action sequence to re-tag the PDF document and set the PDF/UA-1 identifier",
    "actions": [
        {
            "name": "clear_structure",
            "params": [
                {
                    "name": "clear_tags",
                    "value": true
                },
                {
                    "name": "clear_struct_tree",
                    "value": true
                },
                {
                    "name": "clear_bookmarks",
                    "value": false
                }
            ]
        },
        {
            "name": "add_tags",
            "params": [
                {
                    "name": "standard_attrs",
                    "value": false
                },
                {
                    "name": "sequential_headings",
                    "value": true
                }
            ]
        },
        {
            "name": "set_pdf_ua_standard",
            "params": [
                {
                    "name": "part_number",
                    "value": "1"
                }
            ]
        }
    ]
}
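
Because the configuration is plain JSON, it can also be generated by a script. The Python sketch below mirrors the sequence above. The helper make_action and the file name batch_action.json are hypothetical names chosen for illustration and are not part of PDFix; part_number is passed as a string, matching the type documented for set_pdf_ua_standard.

```python
import json


def make_action(name, **params):
    """Build one action entry in the name/params layout used by the example above."""
    action = {"name": name}
    if params:
        action["params"] = [{"name": key, "value": value} for key, value in params.items()]
    return action


config = {
    "title": "PDFix Batch Action Example",
    "desc": "Custom action sequence to re-tag the PDF document and set the PDF/UA-1 identifier",
    "actions": [
        make_action("clear_structure", clear_tags=True, clear_struct_tree=True, clear_bookmarks=False),
        make_action("add_tags", standard_attrs=False, sequential_headings=True),
        make_action("set_pdf_ua_standard", part_number="1"),
    ],
}

# Serialize with the same 4-space indentation as the examples in this guide.
config_text = json.dumps(config, indent=4)

# Writing the file is left to the caller, for example:
# with open("batch_action.json", "w", encoding="utf-8") as fh:
#     fh.write(config_text)
```

Actions without parameters, such as fix_id_tree, come out as a bare name entry because the params list is only added when keyword arguments are given.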

Accessibility

Fix Role Mapping

remove_standard_tags_mapping

Resolve issues in the document’s Role Map to ensure correct structure type mappings

params:

  • standard_role_mapping (bool) Remove standard tags mapping – Remove role mapping of standard structure types

  • circular_role_mapping (bool) Remove circular role mapping – Detect and remove circular role mappings, which are not permitted

  • clear_rolemap (bool) Clear Role Map – Clear the role map specified in the structure tree root

example:

{
    "name": "remove_standard_tags_mapping",
    "params": [
        {
            "name": "standard_role_mapping",
            "value": true
        },
        {
            "name": "circular_role_mapping",
            "value": true
        },
        {
            "name": "clear_rolemap",
            "value": true
        }
    ]
}

AutoTag

add_tags

Automatically add accessibility tags to an untagged document

params:

  • template (file_path) Template – Load the layout template from the file as the current template. If no file is specified, the default template will be applied

  • preflight (bool) Preflight – Preflight the document and combine the preflight values with the current template

  • standard_attrs (bool) Add Layout Attributes – Add all detected layout attributes

  • sequential_headings (bool) Sequential Heading Levels – Keep headings in sequentially descending order

example:

{
    "name": "add_tags",
    "params": [
        {
            "name": "template",
            "value": ""
        },
        {
            "name": "preflight",
            "value": false
        },
        {
            "name": "standard_attrs",
            "value": false
        },
        {
            "name": "sequential_headings",
            "value": false
        }
    ]
}

Clear Document Structure

clear_structure

Clear the document structure

params:

  • clear_tags (bool) Clear Content Marks – Clear content marks

  • clear_struct_tree (bool) Clear Structure Tree – Clear the structure tree

  • clear_bookmarks (bool) Clear Bookmarks – Clear bookmarks

example:

{
    "name": "clear_structure",
    "params": [
        {
            "name": "clear_tags",
            "value": true
        },
        {
            "name": "clear_struct_tree",
            "value": true
        },
        {
            "name": "clear_bookmarks",
            "value": true
        }
    ]
}

Fix ID Tree

fix_id_tree

Fix the ID tree

example:

{
    "name": "fix_id_tree"
}

Fix Parent Tree

fix_parent_tree

Fix the parent tree

example:

{
    "name": "fix_parent_tree"
}

Fix Spaces

fix_structure_spaces

Add missing spaces or resolve duplicate spaces within a structure element

params:

  • add_missing_spaces (bool) Add Missing Spaces – Identify words in the structure and add missing spaces

  • remove_unnecessary_spaces (bool) Remove Unnecessary Spaces – Remove duplicate spaces after each word

  • artifact_unnecessary_spaces (bool) Artifact Unnecessary Spaces – Mark duplicate spaces as artifacts

example:

{
    "name": "fix_structure_spaces",
    "params": [
        {
            "name": "add_missing_spaces",
            "value": true
        },
        {
            "name": "remove_unnecessary_spaces",
            "value": false
        },
        {
            "name": "artifact_unnecessary_spaces",
            "value": true
        }
    ]
}

Fix Headings

fix_headings

Correct an invalid heading structure to maintain sequentially descending order

params:

  • renumber_headings (int) Renumber Headings – Renumber all headings

    • 0 – Change headings to the level set by change_headings_to
    • 1 – Move headings up a level
    • 2 – Add empty headings
  • change_headings_to (string) Change Headings to – Change all headings to a specified level

    • H – H
    • H1 – H1
    • H2 – H2
    • H3 – H3
    • H4 – H4

example:

{
    "name": "fix_headings",
    "params": [
        {
            "name": "renumber_headings",
            "value": 2
        },
        {
            "name": "change_headings_to",
            "value": "H"
        }
    ]
}
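
To illustrate what an invalid heading structure looks like, the sketch below flags headings that skip a level, for example H1 followed directly by H3. The function name and the rule that a heading may be at most one level deeper than its predecessor are assumptions made for this example; they are not taken from the PDFix implementation.

```python
import re


def skipped_heading_levels(tags):
    """Return indexes of headings that are more than one level deeper than the previous heading."""
    problems = []
    previous_level = 0  # no heading seen yet, so the first heading is expected to be H1
    for index, tag in enumerate(tags):
        match = re.fullmatch(r"H([1-6])", tag)
        if not match:
            continue  # ignore non-heading tags such as P or Table
        level = int(match.group(1))
        if level > previous_level + 1:
            problems.append(index)
        previous_level = level
    return problems
```

A tag sequence such as H1, P, H3 would be reported at the H3 entry, which is the kind of structure fix_headings is meant to repair.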

Annotations

Fix Media Clip

fix_media_clip_keys

Define a MIME type for the media clip annotation file

params:

  • ct_key (string) Media Clip – CT key

    • text/plain – text/plain
    • text/html – text/html
    • image/jpeg – image/jpeg
    • audio/mp3 – audio/mp3
    • video/mp4 – video/mp4

example:

{
    "name": "fix_media_clip_keys",
    "params": [
        {
            "name": "ct_key",
            "value": "text/plain"
        }
    ]
}

Set Tab Order

set_tabs_key

Set the tab order key for every page. Every page containing an annotation must have the Tabs key set to S

params:

  • tabs_key (string) Tabs Key – Specify the tab order key

  • overwrite (bool) Overwrite – Replace the current tab order key if it already exists

example:

{
    "name": "set_tabs_key",
    "params": [
        {
            "name": "tabs_key",
            "value": "S"
        },
        {
            "name": "overwrite",
            "value": true
        }
    ]
}

Tag Annotations

tag_annot

Tag untagged annotations by placing them in the closest matching tag

params:

  • annot_types (annot) Annotations – Specify annotation types using an ECMAScript regular expression or define them with the annot_update template

example:

{
    "name": "tag_annot",
    "params": [
        {
            "name": "annot_types",
            "value": "^(?!.*Popup).*$"
        }
    ]
}
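
The pattern in this example uses a negative lookahead to select every annotation type whose name does not contain Popup. The quick check below uses Python's re module, which treats this particular pattern the same way an ECMAScript engine does, although the two dialects differ in other areas. The list of subtype names is sample input only.

```python
import re

# Pattern copied from the tag_annot example above.
ANNOT_TYPES = re.compile(r"^(?!.*Popup).*$")

# Sample annotation subtype names used only for this check.
sample_subtypes = ["Link", "Widget", "Popup", "Highlight", "Text"]

# Keep every subtype the pattern accepts.
selected = [name for name in sample_subtypes if ANNOT_TYPES.search(name)]
print(selected)
```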

Set Annotation Contents

set_annot_contents

Set an alternative description for an annotation using the Contents key, or the TU key for widget annotations

params:

  • annot_types (annot) Select Annotations – Specify annotation types using an ECMAScript regular expression or define them with the annot_update template

  • alt_type (int) Contents – Define the source for detecting alternative text

    • 0 – Custom text
    • 1 – Text from annotation bounding box
    • 2 – Action destination
    • 3 – Auto generated text
  • custom_text (string) Custom – Enter custom text for the Contents key

  • bbox_padding_x (float) Left BBox Padding – Adjust horizontal padding (X axis) for the left edge of the BBox

  • bbox_padding_x_right (float) Right BBox Padding – Adjust horizontal padding (X axis) for the right edge of the BBox

  • bbox_padding_y_top (float) Top BBox Padding – Adjust vertical padding (Y axis) for the top edge of the BBox

  • bbox_padding_y (float) Bottom BBox Padding – Adjust vertical padding (Y axis) for the bottom edge of the BBox

  • overwrite (bool) Overwrite – Replace the current alternative description if it already exists

example:

{
    "name": "set_annot_contents",
    "params": [
        {
            "name": "annot_types",
            "value": ".*"
        },
        {
            "name": "alt_type",
            "value": 1
        },
        {
            "name": "custom_text",
            "value": "Decorative"
        },
        {
            "name": "bbox_padding_x",
            "value": 4
        },
        {
            "name": "bbox_padding_x_right",
            "value": 4
        },
        {
            "name": "bbox_padding_y_top",
            "value": 4
        },
        {
            "name": "bbox_padding_y",
            "value": 4
        },
        {
            "name": "overwrite",
            "value": false
        }
    ]
}

Remove Annotation Properties

remove_annot_data

Remove properties from the annotations

params:

  • annot_types (annot) Annotations – Specify annotation types using an ECMAScript regular expression or define them with the annot_update template

  • remove_contents (bool) Remove Contents – Remove the Contents key

example:

{
    "name": "remove_annot_data",
    "params": [
        {
            "name": "annot_types",
            "value": ".*"
        },
        {
            "name": "remove_contents",
            "value": true
        }
    ]
}

Flatten Annotations

flatten_annot

Flatten the visual representation of annotations into the content layer. This prevents issues with annotation tagging when interactivity is not needed in a PDF/UA-compliant document

params:

  • annot_types (annot) Annotations – Specify annotation types using an ECMAScript regular expression or define them with the annot_update template

example:

{
    "name": "flatten_annot",
    "params": [
        {
            "name": "annot_types",
            "value": "^(?!.*Link|.*Widget|.*Popup).*$"
        }
    ]
}

Create Web Links

create_web_links

Create link annotations from web addresses and email patterns found in the page content

params:

  • url_regex (string) URL Pattern – Specify an ECMAScript regular expression to identify web or email links in the content. The matched text will be used as the link target unless overridden by the URL Address or modified by adding the URL Prefix

    • ^(((http(s)?|ftp):\/\/)|(mailto:)|www.)[^\s\/$.?#].[^\s]*
    • ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}
  • url_prefix (string) URL Prefix – Prepend this prefix to the detected URL or email if it does not already begin with a known scheme (e.g., http, mailto). This is useful for ensuring that URLs are correctly formatted


    • http://
    • https://
    • ftp://
    • file://
    • mailto:
    • tel:
    • data:
    • ws://
    • wss://
  • url (string) URL Address – Set the destination URL. If this is set, it overrides the matched text and any prefix added by the URL Prefix

example:

{
    "name": "create_web_links",
    "params": [
        {
            "name": "url_regex",
            "value": "^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}"
        },
        {
            "name": "url_prefix",
            "value": "mailto:"
        },
        {
            "name": "url",
            "value": ""
        }
    ]
}
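
The precedence between the matched text, the URL Prefix, and the URL Address can be sketched as follows. The function name resolve_link_target is hypothetical, and treating the url_prefix option list as the set of known schemes is an assumption; the actual PDFix logic may differ.

```python
import re

# Scheme prefixes taken from the url_prefix option list above (assumed to be the "known schemes").
KNOWN_SCHEMES = ("http://", "https://", "ftp://", "file://", "mailto:", "tel:", "data:", "ws://", "wss://")

# Email pattern from the example above, written as a plain (non-JSON-escaped) regular expression.
EMAIL_PATTERN = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")


def resolve_link_target(matched_text, url_prefix="", url=""):
    """Pick the link target following the precedence described for create_web_links."""
    if url:
        return url  # an explicit URL Address overrides everything else
    if url_prefix and not matched_text.lower().startswith(KNOWN_SCHEMES):
        return url_prefix + matched_text  # add the prefix only when no scheme is present
    return matched_text
```

Under this reading, an email address matched by the pattern and combined with the mailto: prefix yields a mailto: link, while text that already starts with a scheme is left unchanged.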

Delete Annotations

delete_annot

Completely remove an annotation from the PDF. Use this action when certain annotation types (e.g., TrapNet) are not permitted in a PDF/UA-compliant document

params:

  • annot_types (annot) Annotations – Specify annotation types using an ECMAScript regular expression or define them with the annot_update template

example:

{
    "name": "delete_annot",
    "params": [
        {
            "name": "annot_types",
            "value": "^TrapNet$"
        }
    ]
}

Bookmarks

Create Bookmarks

create_bookmarks

Create bookmarks from the tag tree hierarchy

params:

  • tag_1 (tag) Level 1 – Define the tag that represents the first level, using an ECMAScript regular expression or a template

  • tag_2 (tag) Level 2 – Define the tag that represents second level

  • tag_3 (tag) Level 3 – Define the tag that represents third level

  • tag_4 (tag) Level 4 – Define the tag that represents fourth level

  • tag_5 (tag) Level 5 – Define the tag that represents fifth level

  • tag_6 (tag) Level 6 – Define the tag that represents sixth level

  • overwrite (bool) Overwrite – Replace existing bookmarks if they already exist

example:

{
    "name": "create_bookmarks",
    "params": [
        {
            "name": "tag_1",
            "value": "^H1$"
        },
        {
            "name": "tag_2",
            "value": "^H2$"
        },
        {
            "name": "tag_3",
            "value": "^H3$"
        },
        {
            "name": "tag_4",
            "value": "^H4$"
        },
        {
            "name": "tag_5",
            "value": "^H5$"
        },
        {
            "name": "tag_6",
            "value": "^H6$"
        },
        {
            "name": "overwrite",
            "value": false
        }
    ]
}
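
As a rough illustration of how the level patterns map tags to bookmark levels, the sketch below walks a flat list of tags in document order. The function bookmark_outline and its input format are hypothetical; PDFix works on the real tag tree rather than on a list of pairs.

```python
import re

# Level patterns copied from the create_bookmarks example above.
LEVEL_PATTERNS = [r"^H1$", r"^H2$", r"^H3$", r"^H4$", r"^H5$", r"^H6$"]


def bookmark_outline(tags):
    """Turn (tag, text) pairs in document order into (level, text) bookmark entries."""
    outline = []
    for tag, text in tags:
        for level, pattern in enumerate(LEVEL_PATTERNS, start=1):
            if re.search(pattern, tag):
                outline.append((level, text))
                break  # the first matching level wins
    return outline
```

Tags that match none of the six patterns, such as P, produce no bookmark.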

Content

Set Content Language

set_content_language

Set the content language

params:

  • object_types (object) Objects – Define the objects with the object_update template

  • lang (lang) Language – Content language

  • overwrite (bool) Overwrite – Replace the current language if it already exists

example:

{
    "name": "set_content_language",
    "params": [
        {
            "name": "object_types",
            "value": ".*"
        },
        {
            "name": "lang",
            "value": "en-US"
        },
        {
            "name": "overwrite",
            "value": false
        }
    ]
}

Delete Content

delete_content

Completely remove content from the PDF

params:

  • object_types (object) Objects – Define the page content objects to be deleted

example:

{
    "name": "delete_content",
    "params": [
        {
            "name": "object_types",
            "value": ""
        }
    ]
}

Artifact Content

artifact_content

Mark defined content as an artifact

params:

  • object_types (object) Objects – Specify the objects using the object_update template

  • artifact_type (int) Mark as – Mark the content as an artifact, header, or footer

    • 0 – Artifact
    • 1 – Header
    • 2 – Footer

example:

{
    "name": "artifact_content",
    "params": [
        {
            "name": "object_types",
            "value": {
                "template": {
                    "object_update": [
                        {
                            "query": {
                                "$and": [
                                    {
                                        "$0_artifact": "false"
                                    },
                                    {
                                        "$0_mcid": "-1"
                                    }
                                ],
                                "param": [
                                    "pds_object"
                                ]
                            },
                            "statement": "$if"
                        }
                    ]
                }
            }
        },
        {
            "name": "artifact_type",
            "value": 0
        }
    ]
}

Flatten Form XObjects

flatten_xobject

Flatten Form XObjects

params:

  • object_types (object) Objects – Define the objects with the object_update template

example:

{
    "name": "flatten_xobject",
    "params": [
        {
            "name": "object_types",
            "value": "^pds_form$"
        }
    ]
}

Clone Form XObjects

clone_xobject

Clone Form XObjects

params:

  • object_types (object) Objects – Define the objects with the object_update template

example:

{
    "name": "clone_xobject",
    "params": [
        {
            "name": "object_types",
            "value": "^pds_form$"
        }
    ]
}

Remove Content Marks

remove_content_marks

Remove artifacts, MCIDs, or any custom tags from page content objects. Remove All Containers by checking all flags.

params:

  • object_types (object) Objects – Define the objects with the object_update template

  • flags (flag) Remove – Specify types of marked content to be removed

    • 8 – Remove Invalid MCID
    • 4 – Remove Custom Content Mark
    • 1 – Remove MCID
    • 2 – Remove Artifact

example:

{
    "name": "remove_content_marks",
    "params": [
        {
            "name": "object_types",
            "value": ".*"
        },
        {
            "name": "flags",
            "value": 8
        }
    ]
}
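
The flag values are powers of two, which suggests they are combined with a bitwise OR (equivalently, by adding them). That combination rule is an assumption drawn from the values, not something stated above. Under it, checking all four flags to remove all containers gives the value computed below; the class name RemoveMark is made up for this sketch.

```python
from enum import IntFlag


class RemoveMark(IntFlag):
    """Flag values listed for the flags parameter of remove_content_marks."""
    MCID = 1
    ARTIFACT = 2
    CUSTOM_CONTENT_MARK = 4
    INVALID_MCID = 8


# Checking all four flags corresponds to removing all containers.
all_flags = (
    RemoveMark.MCID
    | RemoveMark.ARTIFACT
    | RemoveMark.CUSTOM_CONTENT_MARK
    | RemoveMark.INVALID_MCID
)

action = {
    "name": "remove_content_marks",
    "params": [
        {"name": "object_types", "value": ".*"},
        {"name": "flags", "value": int(all_flags)},  # plain integer for the JSON file
    ],
}
```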

Set Content Color

set_content_color

Change the fill and/or stroke color of specified content objects

params:

  • object_types (object) Objects – Define the page objects using the object_update template

  • fill_color (string) Fill Color – Specify a new fill color using the format RGB(127,255,0) or CMYK(25,84,50,100). Leave empty to keep the current color unchanged

  • stroke_color (string) Stroke Color – Specify a new stroke color using the format RGB(127,255,0) or CMYK(25,84,50,100). Leave empty to keep the current color unchanged

example:

{
    "name": "set_content_color",
    "params": [
        {
            "name": "object_types",
            "value": ".*"
        },
        {
            "name": "fill_color",
            "value": "RGB(0,0,0)"
        },
        {
            "name": "stroke_color",
            "value": "RGB(0,0,0)"
        }
    ]
}
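
Before a color string goes into the configuration, it can be checked against the two documented formats. The sketch below accepts exactly the forms shown above, RGB(r,g,b) and CMYK(c,m,y,k), plus the empty string. Whether PDFix also tolerates spaces or lowercase names is not stated, so this check is deliberately strict; parse_color is a hypothetical helper.

```python
import re

# Model name followed by a comma-separated list of non-negative integers.
COLOR_PATTERN = re.compile(r"^(RGB|CMYK)\((\d+(?:,\d+)*)\)$")


def parse_color(text):
    """Parse 'RGB(r,g,b)' or 'CMYK(c,m,y,k)'; return None for an empty string (keep current color)."""
    if text == "":
        return None
    match = COLOR_PATTERN.match(text)
    if not match:
        raise ValueError(f"unrecognized color format: {text!r}")
    model = match.group(1)
    components = [int(part) for part in match.group(2).split(",")]
    expected = 3 if model == "RGB" else 4
    if len(components) != expected:
        raise ValueError(f"{model} needs {expected} components, got {len(components)}")
    return model, components
```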

Conversion

PDF to HTML

pdf_to_html

Convert a PDF to HTML

params:

  • input_pdf (file_path) Input PDF – Specify the input PDF file path

  • output_html (file_path) Output HTML – Specify the output HTML file path

  • html_type (int) HTML Layout – Choose the HTML layout type

    • 0 – Original layout
    • 1 – Responsive layout
    • 2 – Layout defined by PDF Tags
  • template (file_path) Template – Load the template from a file as the current template. If no file is specified, the default template will be applied

  • preflight (bool) Preflight – Preflight the document and merge its preflight values with the current template

  • flags (flag) Conversion Flags – Define conversion flags

    • 1 – Export JavaScript
    • 2 – Export fonts
    • 4 – Use default font sizes
    • 8 – Retain text color
    • 32 – Inline CSS styles
    • 64 – Inline JavaScript code
    • 128 – Embed images within the document
    • 256 – Embed fonts within the document
    • 512 – Apply gray padding

example:

{
    "name": "pdf_to_html"
}
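
The example above omits all parameters. A filled-in variant might look like the sketch below, written in Python so that the flag arithmetic is visible. The file paths are placeholders, and adding flag values to combine them is an assumption based on their power-of-two values.

```python
import json

# Placeholder paths; replace them with real locations.
action = {
    "name": "pdf_to_html",
    "params": [
        {"name": "input_pdf", "value": "input.pdf"},
        {"name": "output_html", "value": "output.html"},
        {"name": "html_type", "value": 1},     # 1 = Responsive layout
        {"name": "template", "value": ""},     # empty, so the default template applies
        {"name": "preflight", "value": False},
        {"name": "flags", "value": 32 + 128},  # Inline CSS styles + Embed images within the document
    ],
}

# Print the action in the same JSON layout used throughout this guide.
print(json.dumps(action, indent=4))
```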

PDF to JSON

pdf_to_json

Convert a PDF to JSON

params:

  • input_pdf (file_path) Input PDF – Specify the input PDF file path

  • output_json (file_path) Output JSON – Specify the output JSON file path

  • flags (flag) Conversion Flags – Specify flags for the extracted content

    • 1 – Include document metadata
    • 2 – Include page information
    • 16 – Extract page content
    • 32 – Extract document structure tree
    • 64 – Extract layout recognition
    • 256 – Include bounding box data
    • 512 – Include content marks
    • 4096 – Include text content
    • 8192 – Include text style
    • 16384 – Include text state
    • 65536 – Extract images as base64
    • 131072 – Extract annotations

example:

{
    "name": "pdf_to_json"
}
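
When reading an existing configuration, it helps to translate a numeric flags value back into the options it stands for. The sketch below does this for pdf_to_json, assuming the flags are combined as a bitmask; describe_flags is a hypothetical helper.

```python
# Flag values and labels copied from the pdf_to_json parameter list above.
PDF_TO_JSON_FLAGS = {
    1: "Include document metadata",
    2: "Include page information",
    16: "Extract page content",
    32: "Extract document structure tree",
    64: "Extract layout recognition",
    256: "Include bounding box data",
    512: "Include content marks",
    4096: "Include text content",
    8192: "Include text style",
    16384: "Include text state",
    65536: "Extract images as base64",
    131072: "Extract annotations",
}


def describe_flags(value):
    """List the labels of all flags whose bit is set in value."""
    return [label for bit, label in PDF_TO_JSON_FLAGS.items() if value & bit]
```

For instance, a value of 4129 (1 + 32 + 4096) decodes to document metadata, the structure tree, and text content.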

Fonts

Embed Fonts

embed_font

Embed fonts in the document

example:

{
    "name": "embed_font"
}

Replace Font

replace_font

Replace a font

params:

  • font_name (string) Font Name – Specify the PDF font name to be replaced. ECMAScript regular expressions are supported

  • font_family (system_font) Font Family – Specify the font family name to be used as a replacement

example:

{
    "name": "replace_font",
    "params": [
        {
            "name": "font_name",
            "value": ""
        },
        {
            "name": "font_family",
            "value": ""
        }
    ]
}

Add Missing Unicodes

add_missing_unicode

Add missing Unicode mappings

example:

{
    "name": "add_missing_unicode"
}

Metadata

Set Document Properties

set_doc_info

Set document metadata and properties

params:

  • set_author (bool) Set Author – Set the document author

  • author (string) Author – Specify the author

  • set_title (bool) Set Title – Set the document title

  • title (string) Title – Specify the title

  • set_subject (bool) Set Subject – Set the document subject

  • subject (string) Subject – Specify the subject

  • set_keywords (bool) Set Keywords – Set the document keywords

  • keywords (string) Keywords – Specify the keywords

  • set_producer (bool) Set Producer – Set the document producer

  • producer (string) Producer – Specify the producer name

  • set_creator (bool) Set Creator – Set the document creator

  • creator (string) Creator – Specify the creator

example:

{
    "name": "set_doc_info",
    "params": [
        {
            "name": "set_author",
            "value": true
        },
        {
            "name": "author",
            "value": ""
        },
        {
            "name": "set_title",
            "value": true
        },
        {
            "name": "title",
            "value": ""
        },
        {
            "name": "set_subject",
            "value": true
        },
        {
            "name": "subject",
            "value": ""
        },
        {
            "name": "set_keywords",
            "value": true
        },
        {
            "name": "keywords",
            "value": ""
        },
        {
            "name": "set_producer",
            "value": true
        },
        {
            "name": "producer",
            "value": ""
        },
        {
            "name": "set_creator",
            "value": true
        },
        {
            "name": "creator",
            "value": ""
        }
    ]
}
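
Each property is controlled by a pair of parameters: a set_ flag and the value itself. The sketch below generates those pairs from keyword arguments. It assumes that a set_ flag of false leaves the corresponding property untouched, which the parameter list above does not state explicitly. The helper doc_info_params and the sample values are illustrative only.

```python
def doc_info_params(**fields):
    """Expand keyword arguments such as author='Jane Doe' into set_<field>/<field> parameter pairs."""
    allowed = ("author", "title", "subject", "keywords", "producer", "creator")
    unknown = set(fields) - set(allowed)
    if unknown:
        raise ValueError(f"unknown document properties: {sorted(unknown)}")
    params = []
    for field in allowed:
        # Enable the set_ flag only for properties the caller supplied.
        params.append({"name": f"set_{field}", "value": field in fields})
        params.append({"name": field, "value": fields.get(field, "")})
    return params


action = {
    "name": "set_doc_info",
    "params": doc_info_params(title="Annual Report", author="Jane Doe"),
}
```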

Set PDF Version

set_pdf_version

Set the PDF version

params:

  • version_number (int) PDF Version – Choose the PDF version designation

    • 14 – PDF 1.4
    • 15 – PDF 1.5
    • 16 – PDF 1.6
    • 17 – PDF 1.7
    • 20 – PDF 2.0

example:

{
    "name": "set_pdf_version",
    "params": [
        {
            "name": "version_number",
            "value": 17
        }
    ]
}

Set PDF/UA Standard

set_pdf_ua_standard

Set the PDF/UA identifier

params:

  • part_number (string) Part Identifier – Specify the part number of the International Standard to which the file conforms

    • (empty string) – Remove PDF/UA Part Number
    • 1 – Set PDF/UA-1
    • 2 – Set PDF/UA-2
  • rev_number (string) Rev Number – Specify the four-digit year of publication or revision (ignored for part 1)

example:

{
    "name": "set_pdf_ua_standard",
    "params": [
        {
            "name": "part_number",
            "value": "1"
        },
        {
            "name": "rev_number",
            "value": "2023"
        }
    ]
}

Set Suspect Value

set_suspect_value

Fix the document MarkInfo dictionary and Suspects entry

example:

{
    "name": "set_suspect_value"
}

Fix Optional Content

fix_oc_name

Fix the optional content configuration dictionary

example:

{
    "name": "fix_oc_name"
}

Fix Display Document Title

set_display_doc_title

Fix the ViewerPreferences dictionary

example:

{
    "name": "set_display_doc_title"
}

Set Document Language

set_language

Set the document language

params:

  • lang (lang) Language – Document language

  • overwrite (bool) Overwrite – Replace the current language if it already exists

example:

{
    "name": "set_language",
    "params": [
        {
            "name": "lang",
            "value": "en-US"
        },
        {
            "name": "overwrite",
            "value": false
        }
    ]
}

Set Title

set_title

Set the document title

params:

  • title_type (int) Title – Define a source for detecting the document title

    • 0 – Define a custom title
    • 1 – Pick text from first tag
    • 2 – Get title from the file name
  • custom_text (string) Custom Title – Custom title

  • description_tag (string) Pick Text From Tag – Define the tag type whose content is used for the title text

  • overwrite (bool) Overwrite – Replace the current title if it already exists

example:

{
    "name": "set_title",
    "params": [
        {
            "name": "title_type",
            "value": 2
        },
        {
            "name": "custom_text",
            "value": ""
        },
        {
            "name": "description_tag",
            "value": "Caption"
        },
        {
            "name": "overwrite",
            "value": false
        }
    ]
}

Pages

Rotate Page

rotate_pages

Rotate pages

params:

  • object_types (object) Pages – Select pages using an ECMAScript regular expression or by defining anchors in a template

  • rotation_type (int) Rotation Type – Specify the type of rotation

    • 0 – Set rotation angle
    • 1 – Rotate by angle
  • rotation_angle (int) Rotation Angle – Specify the rotation angle

    • 0 – 0 degrees
    • 90 – 90 degrees
    • 180 – 180 degrees
    • 270 – 270 degrees

example:

{
    "name": "rotate_pages",
    "params": [
        {
            "name": "object_types",
            "value": ".*"
        },
        {
            "name": "rotation_type",
            "value": 1
        },
        {
            "name": "rotation_angle",
            "value": 0
        }
    ]
}
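
One plausible reading of the two rotation types is that type 0 replaces the page rotation with the given angle, while type 1 adds the angle to the current rotation. The sketch below encodes that reading. It is an assumption based on the option labels, and resulting_rotation is a hypothetical function, not a PDFix API.

```python
def resulting_rotation(current, rotation_type, rotation_angle):
    """Return the page rotation after rotate_pages, under the assumed semantics described above."""
    if rotation_angle not in (0, 90, 180, 270):
        raise ValueError("rotation_angle must be 0, 90, 180 or 270")
    if rotation_type == 0:
        return rotation_angle                    # set rotation angle
    if rotation_type == 1:
        return (current + rotation_angle) % 360  # rotate by angle
    raise ValueError("rotation_type must be 0 or 1")
```

Under this reading, the example above (rotation_type 1 with rotation_angle 0) leaves every page as it is.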

Fix Page Orientation

fix_page_orientation

Correct the orientation of selected pages and normalize their transformation matrix and bounding box

params:

  • object_types (object) Pages – Select pages using an ECMAScript regular expression or by defining anchors in a template

example:

{
    "name": "fix_page_orientation",
    "params": [
        {
            "name": "object_types",
            "value": ".*"
        }
    ]
}

Split Pages

split_pages

Split a PDF into multiple documents based on defined page rules or template anchors

params:

  • base_pdf (file_path) Output Path – Specify the output folder based on the input file path

  • object_types (object) Splitters – Specify split rules using an ECMAScript regular expression for page numbers or template-defined anchors

example:

{
    "name": "split_pages",
    "params": [
        {
            "name": "base_pdf",
            "value": ""
        },
        {
            "name": "object_types",
            "value": ".*"
        }
    ]
}

Table

Fix Table Cells

set_table_header

Fix table header and data cells

params:

  • tag_names (tag) Tags – Specify the tags using an ECMAScript regular expression or define them with the tag_update template

  • scope (string) Scope – Specify whether the header cell applies to a row, column, or both

    • None – None
    • Row – Row
    • Column – Column
    • Both – Both
  • row_span (int) RowSpan – Set the row span for the cell. Use -1 to keep the existing RowSpan

  • col_span (int) ColSpan – Set the column span for the cell. Use -1 to keep the existing ColSpan

  • tag_name (string) Change to – Specify a new tag type (TD or TH) for the cell. Leave empty to retain the existing type

  • overwrite (bool) Overwrite – Replace current properties if they already exist

example:

{
    "name": "set_table_header",
    "params": [
        {
            "name": "tag_names",
            "value": "^TD$"
        },
        {
            "name": "scope",
            "value": "None"
        },
        {
            "name": "row_span",
            "value": -1
        },
        {
            "name": "col_span",
            "value": -1
        },
        {
            "name": "tag_name",
            "value": "TH"
        },
        {
            "name": "overwrite",
            "value": false
        }
    ]
}

Set Table Summary

set_table_summary

Provide a summary of the table. Only applicable to Table tags

params:

  • tag_names (tag) Tags – Specify the tags using an ECMAScript regular expression or define them with the tag_update template

  • summary_type (int) Summary – Define a source for detecting the summary

    • 0 – Define the Custom Summary
    • 1 – Use the associated TH content
    • 2 – Use the associated tag content
  • custom_text (string) Custom Summary – Enter custom text as the table summary

  • overwrite (bool) Overwrite – Replace the table summary if it already exists

example:

{
    "name": "set_table_summary",
    "params": [
        {
            "name": "tag_names",
            "value": "^Table$"
        },
        {
            "name": "summary_type",
            "value": 2
        },
        {
            "name": "custom_text",
            "value": "Summary"
        },
        {
            "name": "overwrite",
            "value": false
        }
    ]
}

Tags

Apply Standard Tags

apply_standard_tags

Change all non-standard tags to standard tags according to their role mapping

params:

  • tag_names (tag) Tags – Specify the tags using an ECMAScript regular expression or define them with the tag_update template

example:

{
    "name": "apply_standard_tags",
    "params": [
        {
            "name": "tag_names",
            "value": ".*"
        }
    ]
}

Set Role Mapping

set_role_mapping

Map the names of structure types used in the document to the selected standard structure type

params:

  • tag_names (tag) Tags – Specify the tag types using an ECMAScript regular expression or define them with the tag_update template

  • standard_tag_name (string) Standard tag type – Specify the standard tag name to which the selected tag will be role-mapped

  • overwrite (bool) Overwrite – Replace the current tag mapping

example:

{
    "name": "set_role_mapping",
    "params": [
        {
            "name": "tag_names",
            "value": ""
        },
        {
            "name": "standard_tag_name",
            "value": "P"
        },
        {
            "name": "overwrite",
            "value": false
        }
    ]
}

Import Tags

import_tags

Import a tag tree with predefined values and templates

params:

  • json_path (file_path) Json – Load a JSON file that represents the tag tree in an expected format

example:

{
    "name": "import_tags",
    "params": [
        {
            "name": "json_path",
            "value": ""
        }
    ]
}

Delete Tags

delete_tags

Delete defined tags

params:

  • tag_names (tag) Tags – Specify the tags using an ECMAScript regular expression or define them with the tag_update template

  • tag_content (string) Content – Handle the content of deleted tags

    • none – Leave content untagged
    • move – Move nested tags to the parent tag
    • artifact – Mark content as an artifact

example:

{
    "name": "delete_tags",
    "params": [
        {
            "name": "tag_names",
            "value": ".*"
        },
        {
            "name": "tag_content",
            "value": "none"
        }
    ]
}

Rename Tags

rename_tags

Rename tags

params:

  • tag_names (tag) Tags – Specify the tags using an ECMAScript regular expression or define them with the tag_update template

  • tag_name (string) Replace with – Specify a new tag name

example:

{
    "name": "rename_tags",
    "params": [
        {
            "name": "tag_names",
            "value": "^P$"
        },
        {
            "name": "tag_name",
            "value": "P"
        }
    ]
}

Clone Tag XObjects

clone_tag_xobject

Clone Form XObjects in tags

example:

{
    "name": "clone_tag_xobject"
}

Set Tag Language

set_tag_language

Set the tag language

params:

  • tag_names (tag) Tags – Specify the tags using an ECMAScript regular expression or define them with the tag_update template

  • lang (lang) Language – Tag language

  • overwrite (bool) Overwrite – Replace the current language if it already exists

example:

{
    "name": "set_tag_language",
    "params": [
        {
            "name": "tag_names",
            "value": ".*"
        },
        {
            "name": "lang",
            "value": "en-US"
        },
        {
            "name": "overwrite",
            "value": false
        }
    ]
}

Set Tag ID

set_tag_id

Generate a unique ID key for specific tags, such as the Note tags that PDF/UA-1 requires to have an ID

params:

  • tag_names (tag) Tags – Specify the tags using an ECMAScript regular expression or define them by template tag_update

  • overwrite (bool) Overwrite – Replace the current tag ID if it already exists

example:

{
    "name": "set_tag_id",
    "params": [
        {
            "name": "tag_names",
            "value": "^Note$|^TH$"
        },
        {
            "name": "overwrite",
            "value": false
        }
    ]
}

Set Tag BBox

set_tag_bbox

Calculate the bounding box from the tag content and set it in the Layout attributes

params:

  • tag_names (tag) Tags – Specify the tags using an ECMAScript regular expression or define them by template tag_update

  • overwrite (bool) Overwrite – Replace the current bounding box if it already exists

example:

{
    "name": "set_tag_bbox",
    "params": [
        {
            "name": "tag_names",
            "value": "^Figure$|^Formula$|^Form$|^Table$"
        },
        {
            "name": "overwrite",
            "value": false
        }
    ]
}
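
The per-action examples in this section can be combined into a single batch file. A minimal sketch, assuming the top-level title, desc, and actions keys shown in the example at the top of this page and using only parameters documented above. The title text and the order of the actions are illustrative choices, not a recommended sequence.

```json
{
    "title": "Tag properties example",
    "desc": "Set the language, IDs, and bounding boxes on existing tags",
    "actions": [
        {
            "name": "set_tag_language",
            "params": [
                { "name": "tag_names", "value": ".*" },
                { "name": "lang", "value": "en-US" },
                { "name": "overwrite", "value": false }
            ]
        },
        {
            "name": "set_tag_id",
            "params": [
                { "name": "tag_names", "value": "^Note$|^TH$" },
                { "name": "overwrite", "value": false }
            ]
        },
        {
            "name": "set_tag_bbox",
            "params": [
                { "name": "tag_names", "value": "^Figure$|^Formula$|^Form$|^Table$" },
                { "name": "overwrite", "value": false }
            ]
        }
    ]
}
```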

Set Alternate Description

set_alt

Set an alternative description for the tag. These text alternatives are crucial for accessibility, helping users with vision impairments understand the content

params:

  • tag_names (tag) Tags – Specify the tags using an ECMAScript regular expression or define them by template tag_update

  • alt_type (int) Alternative Description – Define a source for detecting the alternative text

    • 0 – Use the custom alternative text defined in custom_text
    • 1 – Use the first Description Tag above
    • 2 – Use the first Description Tag below
    • 3 – Use the first Description Tag from children
    • 4 – Use the associated tag content. If there is an Annotation among the children, its Contents key is used
  • custom_text (string) Custom Alternative – Enter custom text for the alternative description

  • description_tag (string) Description Tag – Define tags whose content is used for the alternative description

  • overwrite (bool) Overwrite – Replace the alternative description if it already exists

example:

{
    "name": "set_alt",
    "params": [
        {
            "name": "tag_names",
            "value": "^Figure$|^Formula$"
        },
        {
            "name": "alt_type",
            "value": 4
        },
        {
            "name": "custom_text",
            "value": "Decorative"
        },
        {
            "name": "description_tag",
            "value": "Caption"
        },
        {
            "name": "overwrite",
            "value": false
        }
    ]
}
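
A sketch of a caption-based setup, assuming each Figure is followed by a Caption tag: alt_type 2 takes the first description tag below the Figure, and description_tag names Caption as that tag. custom_text is left empty on the assumption that it is only read when alt_type is 0.

```json
{
    "name": "set_alt",
    "params": [
        {
            "name": "tag_names",
            "value": "^Figure$"
        },
        {
            "name": "alt_type",
            "value": 2
        },
        {
            "name": "custom_text",
            "value": ""
        },
        {
            "name": "description_tag",
            "value": "Caption"
        },
        {
            "name": "overwrite",
            "value": false
        }
    ]
}
```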

Set Actual Text

set_actual

Set a replacement text for the content, providing an equivalent text representation

params:

  • tag_names (tag) Tags – Specify the tags using an ECMAScript regular expression or define them by template tag_update

  • actual_type (int) Actual Text – Define a source for detecting the replacement text

    • 0 – Custom actual text
    • 1 – Use the associated tag content
  • custom_text (string) Custom – Enter custom actual text as the replacement text

  • overwrite (bool) Overwrite – Replace the actual text if it already exists

example:

{
    "name": "set_actual",
    "params": [
        {
            "name": "tag_names",
            "value": "^Span$"
        },
        {
            "name": "actual_type",
            "value": 0
        },
        {
            "name": "custom_text",
            "value": ""
        },
        {
            "name": "overwrite",
            "value": false
        }
    ]
}
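
A sketch of the other mode, in which the replacement text is taken from the associated tag content (actual_type 1) and any existing value is replaced. custom_text is kept empty on the assumption that it is only used with actual_type 0.

```json
{
    "name": "set_actual",
    "params": [
        {
            "name": "tag_names",
            "value": "^Span$"
        },
        {
            "name": "actual_type",
            "value": 1
        },
        {
            "name": "custom_text",
            "value": ""
        },
        {
            "name": "overwrite",
            "value": true
        }
    ]
}
```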

Fix Placement

fix_placement

Fix incorrect placement attributes for specified tags

params:

  • tag_names (tag) Tags – Specify the tags using an ECMAScript regular expression or define them by template tag_update

example:

{
    "name": "fix_placement",
    "params": [
        {
            "name": "tag_names",
            "value": "^Figure$|^Formula$|^Form$|^Note$"
        }
    ]
}

Fix Document Tag

fix_document_tag

Fix the document tag

example:

{
    "name": "fix_document_tag"
}

Fix List Tag

fix_list_tag

Fix list (L) tag errors

params:

  • tag_names (tag) Tags – Define the tags by the template tag_update

example:

{
    "name": "fix_list_tag",
    "params": [
        {
            "name": "tag_names",
            "value": "^L$"
        }
    ]
}

Fix Link Tag

fix_link_tag

Fix link (Link) tag errors

params:

  • tag_names (tag) Tags – Define the link tags by the template tag_update

  • zoom (float) Destination Zoom – The zoom factor to set for the destination. If zero, use the predefined value to indicate a NULL zoom factor

  • bbox_padding_x (float) Left Padding – Adjust horizontal padding (X axis) for the left edge of the destination

  • bbox_padding_y_top (float) Top Padding – Adjust vertical padding (Y axis) for the top edge of the destination

  • overwrite (bool) Overwrite – Replace the current destination if it already exists

example:

{
    "name": "fix_link_tag",
    "params": [
        {
            "name": "tag_names",
            "value": "^Link$"
        },
        {
            "name": "zoom",
            "value": 0
        },
        {
            "name": "bbox_padding_x",
            "value": 4
        },
        {
            "name": "bbox_padding_y_top",
            "value": 4
        },
        {
            "name": "overwrite",
            "value": false
        }
    ]
}

Remove Tag Properties

remove_tag_data

Remove properties from the defined tags

params:

  • tag_names (tag) Tags – Specify the tags using an ECMAScript regular expression or define them by template tag_update

  • accept_alternate_desc (bool) Remove Alternate Description – Remove the Alt key

  • accept_actual_text (bool) Remove Actual Text – Remove the replacement text (the ActualText key)

  • accept_expansion_text (bool) Remove Expansion Text – Remove the E key

  • accept_id (bool) Remove ID – Remove the ID key

  • accept_lang (bool) Remove Language – Remove the Lang key

  • accept_title (bool) Remove Title – Remove the T key

  • owner (string) Remove Attribute Owner – Remove standard attribute owners. If no attribute name is specified, all attributes in the group will be removed

    • None – None
    • Layout – Layout Attributes governing the layout of content
    • List – List Attributes governing the numbering of lists
    • PrintField – PrintField Attributes governing Form structure elements for non-interactive form fields
    • Table – Table Attributes governing the organisation of cells in tables
  • name (string) Remove Attribute Name – Remove a specific attribute by name, for example Headers, from the Attribute Owner selected above

example:

{
    "name": "remove_tag_data",
    "params": [
        {
            "name": "tag_names",
            "value": ".*"
        },
        {
            "name": "accept_alternate_desc",
            "value": false
        },
        {
            "name": "accept_actual_text",
            "value": false
        },
        {
            "name": "accept_expansion_text",
            "value": false
        },
        {
            "name": "accept_id",
            "value": false
        },
        {
            "name": "accept_lang",
            "value": false
        },
        {
            "name": "accept_title",
            "value": false
        },
        {
            "name": "owner",
            "value": "PrintField"
        },
        {
            "name": "name",
            "value": ""
        }
    ]
}
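
A sketch of a narrower removal, following the Headers example mentioned for the name parameter: it targets only table cells, selects the Table owner, and names the single attribute to remove. All the key-removal flags stay false so that nothing else is touched. The tag selection is an illustrative choice.

```json
{
    "name": "remove_tag_data",
    "params": [
        { "name": "tag_names", "value": "^TD$|^TH$" },
        { "name": "accept_alternate_desc", "value": false },
        { "name": "accept_actual_text", "value": false },
        { "name": "accept_expansion_text", "value": false },
        { "name": "accept_id", "value": false },
        { "name": "accept_lang", "value": false },
        { "name": "accept_title", "value": false },
        { "name": "owner", "value": "Table" },
        { "name": "name", "value": "Headers" }
    ]
}
```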

Set Tag Attributes

set_structure_attribute

Set standard structure attributes for tags. Each attribute object has an owner

params:

  • tag_names (tag) Tags – Specify the tags using an ECMAScript regular expression or define them by template tag_update

  • overwrite (bool) Overwrite – Replace the current attribute if it already exists

  • owner (string) Owner – Specify the standard attribute owner

    • Layout – Layout Attributes governing the layout of content
    • List – List Attributes governing the numbering of lists
    • PrintField – PrintField Attributes governing Form structure elements for non-interactive form fields
    • Table – Table Attributes governing the organisation of cells in tables
  • name (string) Name – Specify the attribute name

  • value (string) Value – Specify the attribute value

  • value_type (string) Value Type – Specify the attribute value type

    • string – string
    • name – name
    • array – array
    • number – number

example:

{
    "name": "set_structure_attribute",
    "params": [
        {
            "name": "tag_names",
            "value": ".*"
        },
        {
            "name": "overwrite",
            "value": false
        },
        {
            "name": "owner",
            "value": "PrintField"
        },
        {
            "name": "name",
            "value": ""
        },
        {
            "name": "value",
            "value": ""
        },
        {
            "name": "value_type",
            "value": "name"
        }
    ]
}
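
The default example above leaves name and value empty. A filled-in sketch, assuming the goal is to mark table header cells as column headers: Scope is a standard Table attribute in the PDF specification whose value is a name (Row, Column, or Both), so value_type is set to name. The tag selection and the choice of Column are illustrative.

```json
{
    "name": "set_structure_attribute",
    "params": [
        {
            "name": "tag_names",
            "value": "^TH$"
        },
        {
            "name": "overwrite",
            "value": false
        },
        {
            "name": "owner",
            "value": "Table"
        },
        {
            "name": "name",
            "value": "Scope"
        },
        {
            "name": "value",
            "value": "Column"
        },
        {
            "name": "value_type",
            "value": "name"
        }
    ]
}
```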