How to Define PDF Tags in PDFix Desktop
To define the tags to which a specific action will be applied, use the Tags combo box. This provides several options for selecting the desired tags:
- Check Tag Types
- Select Predefined Set of Tags
- Define tag by Regular Expression Format
- Define tag by Template

Check Tag Types
In the combo box, you can easily select the tag types you wish to process by checking the appropriate options.

Predefined Tag Sets
Each action comes with predefined sets of tags, which are defined by regular expressions or templates. For example, the Empty Tags Without Content set selects tags that do not contain any page content.

Define Tags by Regex
Tag types can be defined quickly using regular expressions. You can either enter a custom regular expression directly into the Tags combo box or select one of the predefined or previously saved regex patterns.
To save a frequently used regex as a favorite, use the Add Regex option. This can be accessed by right-clicking inside the combo box or by clicking the Menu icon.

Examples
All tags:
.*
Tags with the bbox attribute:
^Figure$|^Formula$|^Form$|^Table$
Note and TH tags:
^Note$|^TH$
Figure and Formula tags:
^Figure$|^Formula$
You can test a regex at https://regex101.com/

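The patterns above can also be checked offline before they are pasted into the combo box. The following is a minimal sketch in Python, not part of PDFix: the sample tag list and the helper name matching_tags are made up for illustration, and it assumes the pattern is searched within the tag type (the anchored patterns above behave the same under search and full-match semantics).

```python
import re

# Patterns copied from the examples above.
PATTERNS = {
    "all": r".*",
    "bbox": r"^Figure$|^Formula$|^Form$|^Table$",
    "note_th": r"^Note$|^TH$",
    "figure_formula": r"^Figure$|^Formula$",
}

# Sample tag types chosen for illustration only.
SAMPLE_TAGS = ["P", "H1", "Figure", "Formula", "Form", "Table", "Note", "TH", "TD"]


def matching_tags(pattern, tags):
    """Return the tags matched by the pattern (hypothetical helper, search semantics assumed)."""
    compiled = re.compile(pattern)
    return [tag for tag in tags if compiled.search(tag)]


for name, pattern in PATTERNS.items():
    print(name, matching_tags(pattern, SAMPLE_TAGS))
```

Because every alternative is anchored with ^ and $, a tag type such as Figures would not be selected by ^Figure$|^Formula$.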
Define Tags by Template
The Define by Template option allows you to create more advanced tag queries. To use it, select Add Template from the combo box menu.

In the Edit Template dialog, you define the conditions that select the tags to process inside the tag_update node. Conditions are written the same way as in the Template panel.
Below, you’ll find examples of commonly used templates. To apply these, select the Plain Template option in the Edit Template dialog and replace the default code with one of the examples provided.

Example 1: Empty tags without content
{
  "template": {
    "tag_update": [
      {
        "query": {
          "$and": [
            {
              "$0_tag_type": {
                "$regex": "^(?!H$|H\\d$|TH$|TD$|TR$|LBody$).*"
              }
            },
            {
              "$0_has_content": "false"
            }
          ],
          "param": [
            "pds_struct_elem"
          ]
        },
        "statement": "$if"
      }
    ]
  }
}
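The tag type condition in Example 1 relies on a negative lookahead, which can be hard to read at a glance. The sketch below checks which tag types it lets through, assuming the regex engine behaves like Python's re module for this pattern; the helper name is_candidate and the sample tags are illustrative only. Note that the doubled backslash in the JSON text becomes a single \d once the JSON is parsed.

```python
import re

# Same pattern as in Example 1, written as a Python raw string.
EXCLUDE_PATTERN = re.compile(r"^(?!H$|H\d$|TH$|TD$|TR$|LBody$).*")


def is_candidate(tag_type):
    """True if the tag type passes the exclusion filter (hypothetical helper)."""
    return EXCLUDE_PATTERN.search(tag_type) is not None


for tag in ["P", "Span", "Table", "H", "H1", "H10", "TH", "TD", "TR", "LBody"]:
    print(tag, is_candidate(tag))
```

Under these assumptions, H, H1 through H9, TH, TD, TR and LBody are skipped, while a two-digit heading type such as H10 would still be considered because H\d$ only covers a single digit.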
Example 2: Tags whose text is empty or contains only spaces
{
  "template": {
    "tag_update": [
      {
        "query": {
          "$and": [
            {
              "$0_text": {
                "$regex": "^ *$"
              }
            }
          ],
          "param": [
            "pds_struct_elem"
          ]
        },
        "statement": "$if"
      }
    ]
  }
}
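To see exactly which texts the ^ *$ pattern covers, here is a small check in Python. It is a sketch under the assumption that the whole tag text must match the pattern; the helper name only_spaces and the sample strings are hypothetical.

```python
import re

# Same pattern as in Example 2.
ONLY_SPACES = re.compile(r"^ *$")


def only_spaces(text):
    """True if the text is empty or consists solely of space characters (hypothetical helper)."""
    return ONLY_SPACES.fullmatch(text) is not None


for sample in ["", " ", "   ", "\t", " a ", "\u00a0"]:
    print(repr(sample), only_spaces(sample))
```

The pattern accepts the empty string and runs of plain spaces, but not tabs or non-breaking spaces; a pattern such as ^\s*$ might be a closer fit if those should be included as well.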
Example 3: TD cells with specific font name and size
{
  "template": {
    "tag_update": [
      {
        "query": {
          "$and": [
            {
              "$0_tag_type": {
                "$regex": "^TD$"
              }
            },
            {
              "$0_font_name": {
                "$regex": "AvenirNextLTPro-Demi"
              }
            },
            {
              "$0_font_size": "9"
            }
          ],
          "param": [
            "pds_struct_elem"
          ]
        },
        "statement": "$if"
      }
    ]
  }
}
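All three examples share the same outer structure and differ only in the list of conditions under $and. If you maintain several such templates, generating them from a script can help avoid bracket and quoting mistakes. The following Python sketch reproduces Example 3; the function names build_tag_template and regex_condition are hypothetical, and the key names are simply copied from the examples above rather than taken from a formal schema.

```python
import json


def regex_condition(key, pattern):
    """Build one condition that matches the given key against a regex."""
    return {key: {"$regex": pattern}}


def build_tag_template(conditions):
    """Wrap the conditions in the tag_update structure used by the examples above."""
    return {
        "template": {
            "tag_update": [
                {
                    "query": {
                        "$and": conditions,
                        "param": ["pds_struct_elem"],
                    },
                    "statement": "$if",
                }
            ]
        }
    }


# Conditions from Example 3: TD cells with a specific font name and size.
conditions = [
    regex_condition("$0_tag_type", "^TD$"),
    regex_condition("$0_font_name", "AvenirNextLTPro-Demi"),
    {"$0_font_size": "9"},
]

template_text = json.dumps(build_tag_template(conditions), indent=2)
print(template_text)
```

The printed text can then be pasted into the Edit Template dialog with the Plain Template option selected. json.dumps handles the escaping, so a pattern containing \d is written out as \\d automatically.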