How to Define Tags

To define the tags to which a specific action will be applied, use the Tags combo box. This provides several options for selecting the desired tags:

SCREENSHOT of the set alternate description on the document dialog

In the combo box, you can easily select the tag types you wish to process by checking the appropriate options.

SCREENSHOT of different tag types in selection dialog

Each action comes with predefined subsets of tags, which are defined by regular expressions or templates. For example, the Empty Tags Without Content action processes tags that do not contain any page content.

SCREENSHOT of different predefined tag sets in selection dialog

Tag types can be defined quickly using regular expressions. You can either enter a custom regular expression directly into the Tags combo box or select one of the predefined or previously saved regex patterns.

To save a frequently used regex as a favorite, use the Add Regex option. You can access it by right-clicking inside the combo box or by clicking the Menu icon.

For example:

All tags:

.*

Tags with the bbox attribute:

^Figure$|^Formula$|^Form$|^Table$

Note and TH tags:

^Note$|^TH$

Figure and Formula tags:

^Figure$|^Formula$

Test the regex on https://regex101.com/
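
The example patterns can also be checked offline against a list of tag names. The following is a minimal Python sketch, assuming that the regex engine behind the Tags combo box treats these anchored patterns the same way Python's re module does (this page does not confirm that). The sample tag names and the matching_tags helper are illustrative only.

```python
import re

# Patterns copied from the examples above.
PATTERNS = {
    "all": r".*",
    "bbox": r"^Figure$|^Formula$|^Form$|^Table$",
    "note_th": r"^Note$|^TH$",
    "figure_formula": r"^Figure$|^Formula$",
}

# Illustrative tag names, not an exhaustive list of PDF tag types.
SAMPLE_TAGS = ["P", "H1", "Figure", "Formula", "Form", "Table", "TH", "TD", "Note"]


def matching_tags(pattern, tags):
    """Return the tag names that the pattern matches (hypothetical helper)."""
    compiled = re.compile(pattern)
    return [tag for tag in tags if compiled.search(tag)]


if __name__ == "__main__":
    for name, pattern in PATTERNS.items():
        print(f"{name}: {matching_tags(pattern, SAMPLE_TAGS)}")
```

Because every alternative is anchored with ^ and $, a pattern such as ^Form$ selects only Form and does not also select Formula.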


The Define by Template option allows you to create more advanced tag queries. To use it, select Add Template from the combo box menu.

In the Edit Template dialog, use the tag_update node to define the conditions that determine which tags are processed. Conditions are defined in the same way as in the Template panel.

Below, you’ll find examples of commonly used templates. To apply one, select the Plain Template option in the Edit Template dialog and replace the default code with the example.

Example 1: Empty tags without content

{
    "template": {
        "tag_update": [
            {
                "query": {
                    "$and": [
                        {
                            "$0_tag_type": {
                                "$regex": "^(?!H$|H\\d$|TH$|TD$|TR$|LBody$).*"
                            }
                        },
                        {
                            "$0_has_content": "false"
                        }
                    ],
                    "param": [
                        "pds_struct_elem"
                    ]
                },
                "statement": "$if"
            }
        ]
    }
}
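
The negative lookahead in the tag type filter of Example 1 is easier to follow when tried on a few tag names. The sketch below is written in Python and assumes the template's regex engine handles lookaheads the way Python's re module does, which this page does not state. It also shows that the JSON escape \\d becomes the regex token \d once the template is parsed. The is_candidate helper and the sample names are hypothetical.

```python
import json
import re

# The "$regex" value exactly as written in the template (JSON-escaped).
JSON_FRAGMENT = r'{"$regex": "^(?!H$|H\\d$|TH$|TD$|TR$|LBody$).*"}'

# json.loads turns the JSON escape "\\d" into the regex token "\d".
PATTERN = json.loads(JSON_FRAGMENT)["$regex"]
TAG_FILTER = re.compile(PATTERN)


def is_candidate(tag_type):
    """True if the tag type passes the Example 1 tag type filter."""
    return TAG_FILTER.match(tag_type) is not None


if __name__ == "__main__":
    for tag in ["P", "Span", "Table", "THead", "H", "H1", "TH", "TD", "TR", "LBody"]:
        print(tag, is_candidate(tag))
```

Under these assumptions, H, single-digit headings such as H1, TH, TD, TR, and LBody are excluded, while every other tag type (including THead, because TH$ requires the name to end after TH) passes on to the has_content check.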

Example 2: Tags whose text is empty or contains only spaces

{
    "template": {
        "tag_update": [
            {
                "query": {
                    "$and": [
                        {
                            "$0_text": {
                                "$regex": "^ *$"
                            }
                        }
                    ],
                    "param": [
                        "pds_struct_elem"
                    ]
                },
                "statement": "$if"
            }
        ]
    }
}
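
To see which text values the pattern in Example 2 selects, it can be tried on a few strings. This is a minimal Python sketch, assuming the whole text of a tag is compared against the pattern. How PDFix applies the pattern internally is not described here, and is_blank is a hypothetical helper.

```python
import re

# Pattern from Example 2: zero or more space characters and nothing else.
ONLY_SPACES = re.compile(r"^ *$")


def is_blank(text):
    """True if the whole text is empty or consists only of space characters."""
    return ONLY_SPACES.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["", "   ", " a ", "\t", "\u00a0"]:
        print(repr(sample), is_blank(sample))
```

Note that the pattern accepts an empty string as well as runs of plain spaces, but not tabs or non-breaking spaces. If those should count too, the pattern would need to be broadened.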

Example 3: TD cells with specific font name and size

{
    "template": {
        "tag_update": [
            {
                "query": {
                    "$and": [
                        {
                            "$0_tag_type": {
                                "$regex": "^TD$"
                            }
                        },
                        {
                            "$0_font_name": {
                                "$regex": "AvenirNextLTPro-Demi"
                            }
                        },
                        {
                            "$0_font_size": "9"
                        }
                    ],
                    "param": [
                        "pds_struct_elem"
                    ]
                },
                "statement": "$if"
            }
        ]
    }
}
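
The three examples share the same outer structure and differ only in the conditions listed under $and. As a sketch of how such templates could be generated rather than typed by hand, the Python helpers below rebuild Example 3. The function names build_tag_query and regex_condition are hypothetical and not part of PDFix. The key names are copied from the examples above.

```python
import json


def regex_condition(field, pattern):
    """Build one condition that compares a field against a regex."""
    return {field: {"$regex": pattern}}


def build_tag_query(conditions):
    """Wrap a list of conditions in the tag_update structure used above."""
    return {
        "template": {
            "tag_update": [
                {
                    "query": {
                        "$and": list(conditions),
                        "param": ["pds_struct_elem"],
                    },
                    "statement": "$if",
                }
            ]
        }
    }


# Example 3 rebuilt from its three conditions.
EXAMPLE_3 = build_tag_query([
    regex_condition("$0_tag_type", "^TD$"),
    regex_condition("$0_font_name", "AvenirNextLTPro-Demi"),
    {"$0_font_size": "9"},
])

if __name__ == "__main__":
    # The printed JSON can be pasted into the Plain Template editor.
    print(json.dumps(EXAMPLE_3, indent=4))
```

One benefit of generating the JSON this way is that json.dumps takes care of escaping, so a regex containing \d is written out as \\d, as in Example 1.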