Add accessibilty tagging by stuartwilkes · Pull Request #2 · wellcometrust/pdfmake

stuartwilkes · 2026-03-23T14:38:35Z

What does this change?

How to test

How can we measure success?

Have we considered potential risks?

Copilot

Pull request overview

This PR introduces initial support for generating tagged/accessible PDFs by building a PDF structure tree during rendering, with layout-time metadata to drive tagging.

Changes:

Add an AccessibilityTagger and integrate it into the render pipeline when docDefinition.tagged is enabled.
Annotate layout output (lines, images, SVGs, canvas vectors) with accessibility context (roles + table/list context) to drive structure tags.
Update unbreakable-block committing logic to iterate multiple pages (instead of only the first page).

Reviewed changes

Copilot reviewed 8 out of 13 changed files in this pull request and generated 6 comments.

Show a summary per file

File	Description
`src/printer.js`	Wires accessibility tagging into page rendering; marks artifacts; handles table/list/link tagging during line rendering.
`src/layoutBuilder.js`	Produces `_accessibilityContext` metadata for lines and image/SVG/vector nodes; tracks list/table context while laying out content.
`src/accessibilityTagger.js`	New module that builds PDF structure elements (Document/Sect/P/H/L/Table/Figure/Link/Artifact) using PDFKit structure APIs.
`src/accessibilityMetadata.js`	New helper for determining accessibility roles and whether tables should be tagged/treated as TOC.
`src/pageElementWriter.js`	Changes unbreakable block commit logic to process all pages in the transaction context.
`src/elementWriter.js`	Fixes pushContext height calculation fallback to use `pageSize.height` when `page.height` is absent.
`webpack.config.js`	Keeps console statements in minified output by setting `drop_console: false`.
`package.json`	Renames package to `@wellcometrust/pdfmake` and bumps version to `2.1.0`.
`src/3rd-party/svg-to-pdfkit/source.js`	Vendor source update (appears to be a sync/copy of upstream file).

Comments suppressed due to low confidence (1)

src/pageElementWriter.js:126

commitUnbreakableBlock now iterates all unbreakableContext.pages, but the height calculation uses unbreakableContext.getCurrentPage().pageSize.height inside the loop. getCurrentPage() returns whatever unbreakableContext.page is currently set to, so for multi-page unbreakable blocks this can read the wrong page size for most fragments. Use fragment.pageSize.height (or set unbreakableContext.page = currentPage before reading) so each fragment uses its own page size.

		for (let currentPage = 0; currentPage < nbPages; currentPage++) {
			var fragment = unbreakableContext.pages[currentPage];
			fragment.xOffset = forcedX;
			fragment.yOffset = forcedY;

			//TODO: vectors can influence height in some situations
			if (nbPages > 1) {
				// on out-of-context blocs (headers, footers, background) height should be the whole DocumentContext height
				if (forcedX !== undefined || forcedY !== undefined) {
					fragment.height = unbreakableContext.getCurrentPage().pageSize.height - unbreakableContext.pageMargins.top - unbreakableContext.pageMargins.bottom;
				} else {
					fragment.height = this.writer.context.getCurrentPage().pageSize.height - this.writer.context.pageMargins.top - this.writer.context.pageMargins.bottom;
					for (var i = 0, l = this.repeatables.length; i < l; i++) {

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-23T14:45:08Z

src/printer.js

+	if (listCtx) {
+		if (!prevListCtx) {
+			// Entering a list
+			tagger.beginList();
+		} else if (listCtx.depth > prevListCtx.depth) {
+			// Entering a nested list
+			tagger.beginList();
+		} else if (listCtx.depth < prevListCtx.depth) {
+			// Leaving a nested list
+			tagger.endList();
+		}
+
+		// New list item?
+		if (listCtx.isFirstInItem) {
+			tagger.beginListItem();
+		}
+	} else if (prevListCtx) {
+		// Leaving list context entirely
+		tagger.endList();
+	}


List structure transitions only call tagger.endList() once when the list nesting depth decreases or when leaving list context entirely. If rendering jumps from a nested list depth >1 directly to depth 0 (e.g., the last list item contains only a nested list), this will leave the outer list(s) open and produce an invalid/incorrect structure tree. Update the logic to close lists in a loop until the open list depth matches the new listCtx.depth (or 0 when listCtx is null).

Copilot · 2026-03-23T14:45:09Z

src/pageElementWriter.js

-		if (nbPages > 0) {
-			// no support for multi-page unbreakableBlocks
-			var fragment = unbreakableContext.pages[0];
+		for (let currentPage = 0; currentPage < nbPages; currentPage++) {


This file is included in the browser bundle, but the webpack Babel rule only transpiles a limited set of files (see webpack.config.js include regex) targeting IE11. Introducing let here means the output may contain untranspiled ES2015 syntax and break IE11 builds. Either avoid let/const in non-transpiled files or update the Babel include list to cover this module.

Suggested change

for (let currentPage = 0; currentPage < nbPages; currentPage++) {

for (var currentPage = 0; currentPage < nbPages; currentPage++) {

Copilot · 2026-03-23T14:45:09Z

webpack.config.js

 					},
 					compress: {
-						drop_console: true
+						drop_console: false


Setting drop_console: false will retain console.* calls in the minified browser build. If this is intended, it should be documented; otherwise it increases bundle size and can leak internal/debug output in production. Consider restoring drop_console: true (or gating this by build mode).

Suggested change

drop_console: false

drop_console: true

Copilot · 2026-03-23T14:45:10Z

package.json

 {
-  "name": "pdfmake",
-  "version": "0.2.23",
+  "name": "@wellcometrust/pdfmake",
+  "version": "2.1.0",
  "description": "Client/server side PDF printing in pure JavaScript",


This PR is titled/described as adding accessibility tagging, but it also changes the published package name and bumps the version from 0.2.23 to 2.1.0. That’s a significant distribution/API change that should either be called out in the PR description/release notes or split into a separate PR to keep the change set scoped.

Copilot · 2026-03-23T14:45:10Z

src/printer.js

+	var tagger = null;
+	if (docDefinition.tagged) {
+		tagger = new AccessibilityTagger(this.pdfKitDoc);
+		tagger.initDocument();
+	}
+
+	renderPages(pages, this.fontProvider, this.pdfKitDoc, patterns, options.progressCallback, tagger);
+
+	if (tagger) {
+		tagger.finalise();
+	}


Accessibility tagging is newly introduced here (creating an AccessibilityTagger, passing it through rendering, and finalising it), but there are no corresponding tests asserting that tagged PDFs emit the expected structure tree / marked-content operations. Since tests/printer.js exists, consider adding coverage that verifies the tagger is invoked when docDefinition.tagged is true (e.g., spies on PDFKit struct/mark calls) and not invoked otherwise.

Copilot · 2026-03-23T14:45:10Z

src/accessibilityTagger.js

+	// Track what's currently open for page break handling
+	this.openStructures = [];
+


openStructures is initialized but never used, which makes it harder to tell what state is actually required for tagging. Consider removing it or wiring it into the page-break handling logic if it’s meant to track open elements.

Suggested change

// Track what's currently open for page break handling

this.openStructures = [];

stuartwilkes added 3 commits March 23, 2026 10:04

Initial restructure of pdfmake accessibility

8559e02

Issues resolved

d9bed8d

Released version 2.1.0: Refactor accessibility amends

346d921

stuartwilkes requested a review from Copilot March 23, 2026 14:38

Copilot started reviewing on behalf of stuartwilkes March 23, 2026 14:39 View session

Copilot AI reviewed Mar 23, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add accessibilty tagging#2

Add accessibilty tagging#2
stuartwilkes wants to merge 3 commits intowellcome-mainfrom
task/accessibilty-v2

stuartwilkes commented Mar 23, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Mar 23, 2026

Uh oh!

Copilot AI Mar 23, 2026

Uh oh!

Copilot AI Mar 23, 2026

Uh oh!

Copilot AI Mar 23, 2026

Uh oh!

Copilot AI Mar 23, 2026

Uh oh!

Copilot AI Mar 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

	for (let currentPage = 0; currentPage < nbPages; currentPage++) {
	for (var currentPage = 0; currentPage < nbPages; currentPage++) {

		// Track what's currently open for page break handling
		this.openStructures = [];

Conversation

stuartwilkes commented Mar 23, 2026

What does this change?

How to test

How can we measure success?

Have we considered potential risks?

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants