The Conversion panel converts PDF documents to other formats, such as HTML or JSON.
Basic Conversion Actions
Basic Conversion actions provide various options for converting to HTML format and extracting data to JSON.
There are three types of HTML conversion:
Derivation
Tagged PDF to Responsive HTML
The Derivation algorithm converts a tagged PDF to HTML. For a well-tagged PDF document with a fixed layout, it derives the HTML layout from the tags and uses that layout to quickly create fully responsive HTML.
Convert to HTML
Untagged PDF to HTML
The vast majority of PDFs are unstructured and not tagged at all. Therefore, our team created a conversion that uses the Layout Recognition Tool, powered by AI and machine learning, to insert the correct structure into the PDF and then convert it to HTML.
Convert to Fixed HTML
The most straightforward option is Convert to Fixed HTML. This conversion creates traditional fixed (non-responsive) HTML that keeps the original content layout and formatting. However, responsive HTML is now strongly preferred because it gives the document a consistent look on all devices.
Data Extraction
For data extraction, you can use the Convert to JSON action, which exports data based on your requirements.
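The structure of the exported JSON depends on the extraction settings and is not described here, so a first step after exporting is often just to inspect what the file contains. The following is a minimal Python sketch that makes no assumption about the schema: it walks any nested JSON value and counts how often each structural path occurs. The function name `summarize_json` and the sample data are hypothetical and stand in for a real exported file.

```python
import json
from collections import Counter


def summarize_json(node, path="$", counts=None):
    """Count how often each leaf path occurs in a nested JSON value."""
    if counts is None:
        counts = Counter()
    if isinstance(node, dict):
        # Descend into every key of an object.
        for key, value in node.items():
            summarize_json(value, f"{path}.{key}", counts)
    elif isinstance(node, list):
        # All items of an array share one path, marked with [].
        for item in node:
            summarize_json(item, f"{path}[]", counts)
    else:
        # Leaf value (string, number, boolean, or null).
        counts[path] += 1
    return counts


# Hypothetical sample; replace with json.load() on the exported file.
sample = json.loads('{"pages": [{"text": "A"}, {"text": "B", "number": 2}]}')

for path, count in sorted(summarize_json(sample).items()):
    print(path, count)
```

The printed paths give a quick overview of which fields are present and how often, which helps when deciding what to read from the file in later processing.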
Export Selection
With PDFix Desktop, you can export only the current selection, which is useful if you need to export specific data, such as tables.
To export a selection, use the Default Tool or Object Tool. Select the desired objects or area, then right-click and choose Convert to HTML from the menu.
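If the exported selection is a table, the resulting HTML can be processed further outside PDFix Desktop. The following is a minimal Python sketch using only the standard library. It assumes the exported HTML contains an ordinary table built from `tr`, `th`, and `td` elements; the actual markup produced by the export may differ, and nested tables are not handled. The names `TableRows` and `extract_rows` and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of table cells, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_rows(html_text):
    """Return the table content as a list of rows, each a list of cell strings."""
    parser = TableRows()
    parser.feed(html_text)
    parser.close()
    return parser.rows


# Hypothetical sample; replace with the contents of the exported HTML file.
sample_html = (
    "<table><tr><th>Item</th><th>Qty</th></tr>"
    "<tr><td>Pen</td><td> 3 </td></tr></table>"
)
rows = extract_rows(sample_html)
print(rows)
```

Each row comes back as a plain list of strings, so it can be passed directly to `csv.writer` or a similar tool.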
Copy with Formatting
Copy with Formatting performs the same function as exporting a selection to HTML, but copies the output to the clipboard instead. You can then paste the data into applications such as Excel or Google Docs, preserving tables, lists, and text formatting.
Snapshot
Snapshot copies the selected area to the clipboard as an image.