PDF Header Detection

Detects H1–H6 headings from font sizes, bold text, and the table of contents, preserving your document's full structure as clean Markdown.

How It Works

The font size, weight, and position of every text element in the PDF are measured and recorded.

Text elements are ranked by relative size and style to assign H1 through H6 levels.

Headings are written as # through ###### with correct nesting in the final Markdown.
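The three steps above can be sketched roughly as follows. This is an illustration only, not the product's actual implementation: the `Span` type, the `assign_levels` and `to_markdown` helpers, and the rule that the most frequent font size is body text are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Span:
    """One measured text element (hypothetical structure for this sketch)."""
    text: str
    size: float
    bold: bool = False


def assign_levels(spans):
    """Return (level, text) pairs; level 0 means body text."""
    # Assumption: the most frequent font size in the document is body text.
    body_size = Counter(s.size for s in spans).most_common(1)[0][0]
    # Distinct sizes larger than body text, largest first, capped at six levels.
    heading_sizes = sorted(
        {s.size for s in spans if s.size > body_size}, reverse=True
    )[:6]
    level_of = {size: i + 1 for i, size in enumerate(heading_sizes)}
    return [(level_of.get(s.size, 0), s.text) for s in spans]


def to_markdown(spans):
    """Write headings as # through ###### and leave body text unchanged."""
    lines = []
    for level, text in assign_levels(spans):
        lines.append(("#" * level + " " + text) if level else text)
    return "\n\n".join(lines)
```

Ranking by relative size rather than by fixed point values means a document whose largest heading is 16 pt is treated the same way as one whose largest heading is 32 pt.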

Compares font metrics across the document to identify the heading hierarchy.

Uses a table of contents, when present, to confirm and refine heading levels.

Identifies section titles that rely on bold styling rather than font size alone.

Maintains the exact H1 → H2 → H3 nesting structure from your original document.
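As a rough sketch of how a table of contents and bold styling might refine size-based levels, consider the example below. Everything in it is hypothetical: the `refine_levels` helper, the `(level, text, bold)` tuple layout, the `toc` mapping of titles to levels, and the heuristic that a short bold line not ending in a period is a section title are assumptions for illustration, not a description of the actual detection logic.

```python
def refine_levels(items, toc=None):
    """Refine heading levels using an optional table of contents and bold text.

    items: list of (level, text, bold) tuples, where level 0 means body text.
    toc:   optional dict mapping a title to its level (hypothetical format).
    """
    toc = {title.strip().lower(): lvl for title, lvl in (toc or {}).items()}

    # Pass 1: when a title appears in the table of contents, trust its level.
    confirmed = [
        (toc.get(text.strip().lower(), level), text, bold)
        for level, text, bold in items
    ]

    # Pass 2: promote bold-only titles to one level below the deepest
    # confirmed heading, never going past H6.
    deepest = max((lvl for lvl, _, _ in confirmed), default=0)
    bold_level = min(deepest + 1, 6)

    refined = []
    for level, text, bold in confirmed:
        # Assumed heuristic: short, bold, and not a full sentence.
        looks_like_title = (
            bold
            and len(text.split()) <= 12
            and not text.rstrip().endswith(".")
        )
        if level == 0 and looks_like_title:
            level = bold_level
        refined.append((level, text))
    return refined
```

Applying the table of contents first means bold-only titles are placed relative to levels that have already been confirmed, rather than relative to a possibly wrong size-based guess.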

Section headings, subsections, and sub-subsections are detected cleanly.

Chapter and section structure is preserved automatically. No manual cleanup needed.

Full heading hierarchy from title to sub-section comes through intact.


Upload a PDF and watch heading levels get identified and preserved as clean Markdown.