A PDF carries two layers: the visible page you see, and an invisible
tag tree underneath that tells assistive technology what each piece of the
page actually is — a heading, a paragraph, a list, a figure, a table cell — and in what
order to read them. The visual layer is for sighted users; the tag tree is the document
for everyone else. PDF/UA (ISO 14289) is the standard that defines
what that tag tree must contain, and it lines up with the same WCAG criteria you apply
to web pages. When the tags are missing, wrong, or describe a picture instead of text,
the document falls apart for anyone not reading it with their eyes.
This lesson works through the four defects behind the majority of real-world document
failures: a PDF with no tags at all, images with no alternative text, a long document
with no real heading structure, and a scanned page that is only a picture of text. Each
one is fixed not by patching the PDF by hand but by authoring the source
correctly — in Word, InDesign, or PowerPoint — so a proper tagged PDF exports from it.
What you’ll learn
How to export a fully tagged PDF with a correct reading
order instead of an untagged jumble; how to give every informative image alt text
and mark decorative ones as artifacts; how to build real H1–H6
heading tags and bookmarks from heading styles in the source; and why a scanned,
image-only PDF needs OCR plus tagging — or an accessible alternative — before anyone
using assistive technology can read it.
Standards this lesson maps to
| Standard | Criterion | Level | What it requires |
| --- | --- | --- | --- |
| PDF/UA-1 | ISO 14289-1 | n/a | A PDF must be fully tagged, with a logical structure tree and reading order, real text, and described images. |
| PDF/UA-2 | ISO 14289-2 (PDF 2.0) | n/a | The updated edition for PDF 2.0, with refined tag semantics, namespaces, and associated files. |
| WCAG 2.2 | 1.1.1 Non-text Content | A | Informative images carry a text alternative; decorative images are marked as artifacts so they’re ignored. |
| WCAG 2.2 | 1.3.1 Info and Relationships | A | Headings, heading levels, lists, tables, and reading order are conveyed by tags, not by visual layout alone. |
| WCAG 2.2 | 2.4.1 Bypass Blocks | A | Heading tags and bookmarks let users skip ahead and navigate a long document by structure. |
| WCAG 2.2 | 2.4.6 Headings and Labels | AA | Headings and labels describe the topic or purpose of their section. |
| WCAG 2.2 | 1.4.5 Images of Text | AA | Text is real, selectable text, not a picture of text, so it can be read, resized, and reflowed. |
| EN 301 549 | 10 Non-web documents (incorporates WCAG) | n/a | European harmonised standard; clause 10 applies the WCAG A/AA set to documents such as PDF. |
| Section 508 | 502 / 504 (incorporates WCAG A & AA) | n/a | US federal electronic documents must meet WCAG 2.0 Level A and AA, including tagging and alt text. |
| ADA Title II | WCAG 2.1 AA (DOJ rule) | AA | US state/local government web content and documents must conform to WCAG 2.1 AA. |
The four problems we’ll fix
Each card below isolates one common document defect. Because a PDF isn’t HTML, the
Bad and Good examples show the document’s underlying
structure — the PDF logical tag tree, or the source markup that exports to a
tagged PDF — written as escaped, non-running code so it can’t affect this page. For
every issue you get a plain-language statement of the problem, those examples, the
copyable Code, and an ordered fix.
Untagged PDF with no structure
PDF/UA-1 · WCAG 2.2 1.3.1 (A) · EN 301 549 · Section 508
An untagged PDF has no logical structure tree at all.
The text is still there visually, but nothing tells assistive technology which run
of characters is a heading, which is a paragraph, where a list starts, or what order
to read the page in. A screen reader falls back to guessing reading order from the
position of content on the page, which in a multi-column or boxed layout produces an
unordered jumble — a footer read before a heading, two columns interleaved
line-by-line. PDF/UA-1 requires a complete tag tree, and the fix is to export a
tagged PDF from the source rather than printing a flat one.
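To see why position-based guessing fails, here is a minimal Python sketch of a naive "top-to-bottom, then left-to-right" ordering applied to a two-column page. The `TextRun` type, the coordinates, and the sorting rule are all illustrative assumptions; real assistive technology uses more elaborate heuristics, but the failure mode is the same.

```python
from typing import NamedTuple


class TextRun(NamedTuple):
    x: float   # left edge, in points
    y: float   # baseline height, in points (PDF origin is bottom-left)
    text: str


def naive_reading_order(runs):
    """Order runs top-to-bottom, then left-to-right, ignoring columns."""
    return [r.text for r in sorted(runs, key=lambda r: (-r.y, r.x))]


# Hypothetical two-column page: left column at x=50, right column at x=320.
page = [
    TextRun(50, 700, "Left col, line 1"),
    TextRun(50, 680, "Left col, line 2"),
    TextRun(320, 700, "Right col, line 1"),
    TextRun(320, 680, "Right col, line 2"),
]

for line in naive_reading_order(page):
    print(line)
```

The two columns come out interleaved line by line, because nothing in the page content says that the left column should be finished before the right one starts. A tag tree supplies exactly that missing information.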
Bad
There is no structure tree — the page is a bag of positioned text and graphics
with no roles and no reading order. This is what “Print to PDF” or an untagged
export produces.
untagged-tag-tree.txt
Tags panel: (No Tags Available)
Document content is only loose page objects:
/Page
  BT … "Quarterly Report" … ET <!-- looks like a title, but no tag -->
  BT … "Revenue rose 8%…" … ET <!-- looks like a paragraph, no tag -->
  BT … "Page 1 of 12" … ET <!-- footer, may be read first -->
Reading order: inferred from x/y position → unreliable
Good
A tagged export builds a logical structure tree: a Document root
with real H1, P, and L tags in the
intended reading order. Now a screen reader reads heading, then body, then list —
in order.
tagged-tag-tree.txt
<Document>
  <H1>Quarterly Report</H1>
  <P>Revenue rose 8% over the previous quarter.</P>
  <L>
    <LI><LBody>North region: up 12%</LBody></LI>
    <LI><LBody>South region: flat</LBody></LI>
  </L>
</Document>
<!-- "Page 1 of 12" is an Artifact, outside the reading order -->
Code
You almost never write tags by hand — you author the source so the export tags
it. Use real styles in Word and turn on tagged export; the structure carries
across automatically.
source-to-tagged-pdf.txt
Word source (real styles, not manual formatting):
  Heading 1 → "Quarterly Report"
  Normal → "Revenue rose 8%…"
  List Bullet → "North region…", "South region…"
Export → "Best for electronic distribution and accessibility"
  ☑ Document structure tags for accessibility (tagged PDF)
  ☑ Create bookmarks using: Headings
InDesign: set the Articles panel order, then
  File ▸ Export ▸ Adobe PDF (Print) ▸ ☑ Create Tagged PDF
How to fix
1. Author the source with real paragraph and list styles, never with manual
   spacing and font changes that carry no meaning.
2. Export a tagged PDF: in Word choose the accessibility-preserving
   option; in InDesign tick “Create Tagged PDF”. Avoid plain “Print to PDF”.
3. Open the Tags panel (or run a checker) and confirm the document is tagged and
   not “No Tags Available”.
4. Verify the reading order in the tag tree matches the intended order, and that
   page numbers, headers, and footers are marked as artifacts.
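The "is it tagged at all?" check can also be scripted. The sketch below is a hedged illustration in Python: it assumes you have already read the PDF's document catalog into a plain dictionary (how you obtain it depends on your PDF library and is not shown), and it looks at two catalog entries defined by the PDF format, `/StructTreeRoot` and `/MarkInfo` with its `/Marked` flag. The function name and the three status labels are made up for this example.

```python
def tagging_status(catalog: dict) -> str:
    """Classify a PDF from its document catalog entries.

    `catalog` is assumed to be the catalog dictionary already loaded
    into a plain dict by whatever PDF library you use.
    """
    has_tree = catalog.get("/StructTreeRoot") is not None
    mark_info = catalog.get("/MarkInfo") or {}
    marked = bool(mark_info.get("/Marked", False))

    if has_tree and marked:
        return "tagged"
    if has_tree or marked:
        # One signal without the other usually means a broken export.
        return "inconsistent"
    return "untagged"


# Hypothetical catalogs for a tagged export and a "Print to PDF" file.
tagged_export = {
    "/StructTreeRoot": {"/Type": "/StructTreeRoot"},
    "/MarkInfo": {"/Marked": True},
}
printed_pdf = {}

print(tagging_status(tagged_export))
print(tagging_status(printed_pdf))
```

A "tagged" result only tells you that a structure tree exists. It says nothing about whether the tags are the right ones or in the right order, so the reading-order check in the last step still has to be done by a person.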
Images in the document with no alternative text
PDF/UA-1 · WCAG 2.2 1.1.1 (A) · EN 301 549 · ADA Title II
A figure tag with no alternative text is silent. When a
chart, logo, photo, or infographic is placed in a document and exported as a
Figure tag without an /Alt entry, a screen reader either
skips it or announces a bare “graphic” — and any information carried only by that
image is lost. Two cases need handling separately: an informative image
needs alt text that conveys its meaning, while a decorative image (a divider,
a background flourish) should be marked as an artifact so it is removed from the
reading order entirely rather than announced as empty.
Bad
The image is tagged as a figure but has no alternative text. The chart’s data is
conveyed only visually, so it is unavailable to anyone using a screen reader.
figure-no-alt.txt
<Figure>
<!-- placed image of a bar chart, no /Alt entry -->
</Figure>
Screen reader announces: "graphic" <!-- no meaning conveyed -->
Good
The figure carries an Alt attribute that conveys what the image
communicates. For a data chart, the alt summarises the takeaway; the full data
can also be offered as a real table nearby.
figure-with-alt.txt
<Figure Alt="Bar chart: revenue rose from $1.2M in Q1
to $1.6M in Q4, up 33% across the year.">
<!-- placed image of the bar chart -->
</Figure>
Screen reader announces the full alt text.
Code
A purely decorative image must be taken out of the reading order, not given empty
alt. Tag it as an Artifact so assistive technology ignores it. In the
source you set both from the image’s alt text options: a typed description for an
informative image, or “Mark as decorative” for a decorative one.
decorative-artifact.txt
Decorative divider → Artifact (removed from reading order):
  <Artifact> … decorative rule … </Artifact>
In the source (Word / PowerPoint):
  Right-click image ▸ "View Alt Text"
    • Informative → type a concise description
    • Decorative → ☑ "Mark as decorative"
InDesign: Object ▸ Object Export Options ▸ Alt Text
    • Source: Custom → description, or
    • Set the image as an artifact for decoration
How to fix
1. Decide for each image whether it is informative or decorative; that single
   choice drives everything else.
2. Give every informative figure alt text that conveys its meaning, not its file
   name; summarise charts and offer the underlying data as real text or a table.
3. Mark decorative images as artifacts (or “decorative” in the source) so they’re
   removed from the reading order rather than announced as empty.
4. Don’t leave alt text as the image’s filename or “image”; that’s noise, not
   information.
5. Run an accessibility check and confirm no figure is reported as missing
   alternate text.
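The checks in that list can be sketched as a small audit over a tag tree. This Python example assumes the tree has already been loaded into nested dictionaries with `tag`, `alt`, and `kids` keys; that shape, the function name, and the list of placeholder words are illustrative assumptions, not the output format of any particular tool.

```python
import re

# Alt text that ends in an image extension is almost certainly a file name.
FILENAME_LIKE = re.compile(r"\.(png|jpe?g|gif|svg|tiff?|bmp)$", re.IGNORECASE)
# Hypothetical list of placeholder words that convey nothing.
PLACEHOLDERS = {"image", "picture", "graphic", "photo", "figure"}


def audit_figures(node, path="Document"):
    """Yield (path, problem) for each Figure whose alt text is missing or noise."""
    if node.get("tag") == "Figure":
        alt = (node.get("alt") or "").strip()
        if not alt:
            yield path, "missing alt text"
        elif FILENAME_LIKE.search(alt):
            yield path, "alt text looks like a file name"
        elif alt.lower() in PLACEHOLDERS:
            yield path, "alt text is a placeholder"
    for i, kid in enumerate(node.get("kids", [])):
        yield from audit_figures(kid, f"{path}/{kid.get('tag', '?')}[{i}]")


tree = {"tag": "Document", "kids": [
    {"tag": "H1"},
    {"tag": "Figure", "alt": "Bar chart: revenue rose from $1.2M in Q1 to $1.6M in Q4."},
    {"tag": "Figure"},
    {"tag": "Figure", "alt": "chart_final_v2.png"},
]}

for where, problem in audit_figures(tree):
    print(where, "->", problem)
```

A script like this can only catch alt text that is absent or obviously junk. Whether a description actually conveys what the image communicates is an editorial judgement that still needs a human reader.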
No real heading structure or bookmarks
PDF/UA-1 · WCAG 2.2 1.3.1 (A), 2.4.1 (A), 2.4.6 (AA) · EN 301 549
In a long document, headings are how everyone navigates —
but only if they are real headings. Text that is merely made big and bold
looks like a heading yet exports as an ordinary P tag, so a screen
reader user can’t pull up a list of headings or jump between sections, and there are
no bookmarks to move through the document. The page becomes one undifferentiated
scroll. The fix is to apply real heading styles in the source so they export as
nested H1–H6 tags and generate a bookmark tree — and to
keep the levels in order without skipping.
Bad
What looks like a section heading is just a large, bold paragraph. It exports as
a plain P, so it isn’t in the headings list and produces no
bookmark.
Good
Real heading styles export as nested heading tags, in order, with no skipped
levels. The structure now drives both the headings list and the bookmark
tree.
real-headings.txt
<H1>Annual Accessibility Report</H1>
<H2>1. Introduction</H2>
<P>Body text…</P>
<H2>2. Methodology</H2>
<H3>2.1 Sampling</H3>
<H3>2.2 Tools</H3>
<!-- One H1 per document; no jump from H1 to H3 -->
Code
You get this from the source by using its heading styles and enabling
bookmark generation on export — never by manually enlarging text.
headings-and-bookmarks.txt
Word source:
  Apply "Heading 1", "Heading 2", "Heading 3" styles
  (Home ▸ Styles) — not bold + bigger font.
Export to PDF:
  ☑ Create bookmarks using: Headings
  ☑ Document structure tags for accessibility
Result in the PDF:
  • Heading styles → H1…H6 structure tags
  • Bookmarks panel mirrors the heading outline (2.4.1)
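How a flat run of headings turns into a nested bookmark outline can be sketched in a few lines of Python. This is an illustration of the idea only, not how Word or Acrobat implement it; the `build_outline` name and the dictionary shape are assumptions made for the example.

```python
def build_outline(headings):
    """Nest (level, title) pairs into a bookmark tree of dicts."""
    root = {"title": None, "level": 0, "children": []}
    stack = [root]
    for level, title in headings:
        node = {"title": title, "level": level, "children": []}
        # Pop back to the nearest ancestor with a shallower level.
        while stack[-1]["level"] >= level:
            stack.pop()
        stack[-1]["children"].append(node)
        stack.append(node)
    return root["children"]


def print_outline(nodes, depth=0):
    """Print the bookmark tree with two spaces of indent per level."""
    for node in nodes:
        print("  " * depth + node["title"])
        print_outline(node["children"], depth + 1)


headings = [
    (1, "Annual Accessibility Report"),
    (2, "1. Introduction"),
    (2, "2. Methodology"),
    (3, "2.1 Sampling"),
    (3, "2.2 Tools"),
]

print_outline(build_outline(headings))
```

The outline exists only because each heading carries a level. Big bold paragraphs carry no level, so there is nothing to build a bookmark tree from.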
How to fix
1. Apply real heading styles in the source for every section title; don’t fake a
   heading with large bold text.
2. Keep the levels nested and in order: one H1 for the document
   title, then H2, H3 without skipping a level (1.3.1).
3. On export, enable “Create bookmarks using headings” so a long document gains a
   navigable bookmark tree (2.4.1).
4. Write headings that actually describe their section (2.4.6), then check the tag
   tree: the heading list and bookmarks should match the visible outline.
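The level rules are mechanical enough to check with a script. The following Python sketch takes the sequence of tags in reading order and reports skipped levels, a missing leading H1, and more than one H1. Treating "one H1 per document" as a rule follows this lesson's convention; the function name and message wording are assumptions for the example.

```python
def check_heading_levels(tags):
    """Return a list of problems found in a sequence of structure tags."""
    problems = []
    # Keep only H1..H6 and convert them to integer levels.
    levels = [
        int(t[1]) for t in tags
        if len(t) == 2 and t[0] == "H" and t[1] in "123456"
    ]
    if levels and levels[0] != 1:
        problems.append(f"first heading is H{levels[0]}, expected H1")
    if levels.count(1) > 1:
        problems.append(f"{levels.count(1)} H1 tags, expected one")
    for prev, cur in zip(levels, levels[1:]):
        # Going deeper by more than one level skips a heading level.
        if cur > prev + 1:
            problems.append(f"H{prev} followed by H{cur} skips a level")
    return problems


print(check_heading_levels(["H1", "H2", "P", "H2", "H3", "H3"]))
print(check_heading_levels(["H1", "P", "H3"]))
```

Moving back up the outline by several levels at once (an H3 followed by an H2 or H1) is fine and is not flagged; only jumps downward that skip a level are reported.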
Scanned, image-only PDF
WCAG 2.2 1.4.5 (AA), 1.1.1 (A) · PDF/UA-1 · EN 301 549
A scanned document is a photograph of paper: each page is
one big image, and the “text” on it is just pixels. There is no selectable,
searchable, taggable text underneath at all, so a screen reader finds nothing to read,
the content can’t be resized or reflowed without blurring, and it fails both Images of
Text (1.4.5) and Non-text Content (1.1.1). Tagging alone can’t rescue it because there
is no text to tag. The fix is to recover real text with OCR and then tag the result —
or, where the scan is poor, to provide an accessible HTML or properly tagged
alternative.
Bad
The whole page is a single scanned image. Selecting text selects nothing; a
screen reader has only an undescribed graphic, so the entire document is
unreadable.
image-only-scan.txt
/Page
/XObject /Image (full-page scan, e.g. 2480×3508 px)
No text layer. Select-all selects nothing.
Tag tree (if any): <Figure> with no /Alt → "graphic"
Reflow: unavailable (it's a picture)
Good
OCR recognises the characters and adds a real text layer, which is then tagged
into a proper structure. Now the text is selectable, searchable, reflowable, and
read aloud in order.
ocr-then-tagged.txt
After OCR (recognised text layer) + tagging:
<Document>
  <H1>Notice of Public Meeting</H1>
  <P>The council will meet on 14 March at 7 p.m.</P>
</Document>
Text is now selectable, searchable, and reflowable.
Code
Run recognition, fix what OCR got wrong, then tag — or give people a clean
accessible alternative. Where the original source still exists, re-exporting a
tagged PDF from it beats OCR every time.
recover-real-text.txt
Acrobat: Scan & OCR ▸ "Recognize Text"
  → adds a searchable text layer to the scan
Then: All tools ▸ Prepare for accessibility ▸ Autotag,
  and fix reading order + alt text by hand.
Always proofread OCR output (it misreads characters).
Better alternative when available:
  • Re-export a tagged PDF from the original source, or
  • Publish an accessible HTML version of the content.
How to fix
1. If the original source still exists, re-export a tagged PDF from it instead of
   working from the scan; that gives the cleanest real text.
2. Otherwise run OCR to add a real text layer, then proofread it: OCR routinely
   misreads characters and merges columns.
3. Tag the recognised document and correct the reading order, headings, and alt
   text by hand.
4. Where a scan is too poor to OCR reliably, publish an accessible HTML or
   freshly tagged alternative and link people to it.
5. Confirm the result has selectable text and resizes without turning to mush
   (1.4.5).
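When you have a large archive to triage, a rough heuristic can flag likely scans before anyone opens them. The Python sketch below assumes you already have, for each page, the number of characters a text extractor returned and the fraction of the page covered by images; the `PageSummary` type and both thresholds are assumptions chosen for illustration and would need tuning against real files.

```python
from dataclasses import dataclass


@dataclass
class PageSummary:
    number: int
    text_chars: int        # characters a text extractor returned for the page
    image_coverage: float  # fraction of the page area covered by images, 0.0-1.0


def image_only_pages(pages, min_chars=20, min_coverage=0.9):
    """Return page numbers that look like scans with no text layer."""
    return [
        p.number
        for p in pages
        if p.text_chars < min_chars and p.image_coverage >= min_coverage
    ]


# Hypothetical three-page document: pages 1 and 3 are scans, page 2 is real text.
pages = [
    PageSummary(number=1, text_chars=0, image_coverage=1.0),
    PageSummary(number=2, text_chars=1800, image_coverage=0.25),
    PageSummary(number=3, text_chars=5, image_coverage=0.95),
]

print(image_only_pages(pages))
```

A page that passes this check is not necessarily accessible: a scan that has been through OCR has plenty of characters but may still be untagged and full of misread text, so the proofreading and tagging steps above still apply.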
Recap
Tagged — export a fully tagged PDF from the source, so a logical
structure tree describes every paragraph, list, and table (PDF/UA-1, 1.3.1).
Reading order — check that the tag order matches the intended
reading order, not the order shapes happen to sit on the page (1.3.1).
Alt — give every informative figure a text alternative and mark
decorative images as artifacts (1.1.1).
Headings — use real heading styles so they become
H1–H6 tags and bookmarks; never fake them with big bold
text (1.3.1, 2.4.1, 2.4.6).
Real text — ship selectable, taggable text; OCR and tag any
scan, or provide an accessible alternative (1.4.5, 1.1.1).
The same structural fixes address the requirements of PDF/UA, WCAG, EN 301 549,
Section 508, and ADA Title II at once: author the source correctly, export a tagged
PDF, and confirm the result with a checker and a look at the tag tree.