// guide

Pa11y CI Setup: Automated Accessibility Testing in Your Pipeline

Pa11y CI runs accessibility scans across your whole site and fails the build when violations appear. This guide walks through installing it, writing the config, testing from a sitemap, wiring it into GitHub Actions, and tracking results over time with Pa11y Dashboard.


// 01 · what pa11y ci is

What Pa11y CI Is

Pa11y CI is a command-line accessibility test runner built for continuous integration. Where the core pa11y command scans one URL and prints a report, pa11y-ci reads a list of URLs (or a sitemap), scans them all in a single run, and exits with a non-zero status code when any page exceeds its violation threshold. That exit code is the whole point: it is what makes a CI system mark the build as failed.
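The pass/fail rule described above can be modeled in a few lines of JavaScript. This is a simplified sketch, not Pa11y CI's source code: the pageFails and buildFails names and the shape of the pages array are assumptions made for illustration.

```javascript
// Simplified model of the gate: a page fails when its violation count
// exceeds its threshold, and one failing page fails the whole build.
// Function names and the input shape are illustrative assumptions.

function pageFails(errorCount, threshold = 0) {
  return errorCount > threshold;
}

function buildFails(pages) {
  return pages.some((page) => pageFails(page.errors, page.threshold));
}

const pages = [
  { url: 'http://localhost:3000/', errors: 0, threshold: 0 },
  { url: 'http://localhost:3000/legacy-page', errors: 5, threshold: 8 },
  { url: 'http://localhost:3000/contact', errors: 2, threshold: 0 },
];

// A CI runner only sees the exit status: zero passes, non-zero fails.
const failed = buildFails(pages);
console.log(failed ? 'build fails' : 'build passes'); // prints "build fails"
```

The legacy page stays green because 5 is within its threshold of 8, while the contact page's 2 violations against a threshold of 0 are enough to fail the run.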

Under the hood Pa11y CI drives a headless Chromium instance through Puppeteer, loads each page, and evaluates the rendered DOM against either the HTML CodeSniffer ruleset or axe-core. It is the natural next step after the automated accessibility testing overview: that guide compares the tools; this one sets one of them up end to end.

Pa11y vs Pa11y CI: pa11y https://example.com checks a single page interactively, while pa11y-ci checks many pages from a config file and fails the build on violations. Use the former while developing and the latter in your pipeline.

// 02 · installation

Installation

Install Pa11y CI as a dev dependency so the exact version is pinned in your lockfile and every contributor and CI run uses the same one. It pulls in its own copy of Chromium, so no separate browser install is needed.

Terminal
# Install as a dev dependency (recommended)
npm install --save-dev pa11y-ci

# Run it through npx
npx pa11y-ci --help

# Or scan a single URL with no config to confirm it works
npx pa11y-ci https://example.com

Add a script to package.json so the command is short and consistent across local runs and CI:

package.json
{
  "scripts": {
    "test:a11y": "pa11y-ci --config .pa11yci.json"
  }
}

// 03 · the .pa11yci config file

The .pa11yci Config File

Pa11y CI reads its settings from a JSON file — by convention .pa11yci or .pa11yci.json. It has two parts: defaults, applied to every URL, and urls, the list of pages to scan. A URL entry can be a plain string or an object that overrides the defaults for that one page.

.pa11yci.json
{
  "defaults": {
    "standard": "WCAG2AA",
    "runners": ["axe"],
    "timeout": 30000,
    "threshold": 0,
    "hideElements": "#cookie-banner, .third-party-widget"
  },
  "urls": [
    "http://localhost:3000/",
    "http://localhost:3000/patterns/",
    {
      "url": "http://localhost:3000/legacy-page",
      "threshold": 8
    }
  ]
}

Option       | What it does                             | Recommended value
-------------|------------------------------------------|-----------------------------------
standard     | WCAG conformance level to test against   | WCAG2AA
runners      | Which engine evaluates the page          | ["axe"]
threshold    | Violations allowed before the page fails | 0 for new pages
timeout      | Max milliseconds to load and test a page | 30000
hideElements | CSS selectors to exclude from the scan   | Third-party widgets you cannot fix

Standardize on the axe runner: setting "runners": ["axe"] aligns your CI results with the axe DevTools extension many developers already use in the browser, so a violation caught in CI can usually be reproduced locally with the same rule name.
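To make the defaults-plus-overrides idea concrete, here is a minimal sketch of how a config like the one above could be resolved into one settings object per page. The resolveUrlOptions helper is hypothetical and is not part of Pa11y CI; it only illustrates that per-URL keys win over defaults.

```javascript
// Sketch of how per-URL entries could be resolved against "defaults".
// resolveUrlOptions is a hypothetical helper, not part of Pa11y CI.

function resolveUrlOptions(config) {
  const defaults = config.defaults || {};
  return (config.urls || []).map((entry) => {
    // A plain string is shorthand for { url: "<string>" }.
    const overrides = typeof entry === 'string' ? { url: entry } : entry;
    // Later spreads win, so per-URL keys override the defaults.
    return { ...defaults, ...overrides };
  });
}

const config = {
  defaults: { standard: 'WCAG2AA', runners: ['axe'], threshold: 0 },
  urls: [
    'http://localhost:3000/',
    { url: 'http://localhost:3000/legacy-page', threshold: 8 },
  ],
};

const resolved = resolveUrlOptions(config);
for (const page of resolved) {
  console.log(`${page.url} -> threshold ${page.threshold}`);
}
```

The legacy page keeps the shared standard and runner but carries its own threshold of 8, which is the behavior the object form of a URL entry is meant to give you.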

// 04 · testing your whole site from a sitemap

Testing Your Whole Site from a Sitemap

Maintaining a hand-written URL list goes stale the moment someone adds a page. Point Pa11y CI at your sitemap.xml instead and every page in the sitemap is covered automatically:

Terminal
# Scan every URL listed in the sitemap
npx pa11y-ci --sitemap https://example.com/sitemap.xml

# Test a local build by rewriting the production host to localhost
npx pa11y-ci \
  --sitemap https://example.com/sitemap.xml \
  --sitemap-find https://example.com \
  --sitemap-replace http://localhost:3000

The --sitemap-find and --sitemap-replace flags rewrite each URL's hostname on the fly, so you can reuse your production sitemap to test a freshly built local copy or a deploy preview. Defaults from a .pa11yci file still apply when you scan from a sitemap, so you keep your standard, runners, and timeout settings.
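The rewrite itself amounts to a find-and-replace over each sitemap URL. The sketch below shows the idea under a simplifying assumption: it treats the find value as a literal string, whereas the real flag takes a pattern, so Pa11y CI's own matching may behave differently. The rewriteSitemapUrls name is made up for this example.

```javascript
// Sketch of a find-and-replace over sitemap URLs. rewriteSitemapUrls is a
// hypothetical helper that treats `find` as a literal string; Pa11y CI's
// own matching rules may differ.

function rewriteSitemapUrls(urls, find, replace) {
  return urls.map((url) => url.replaceAll(find, replace));
}

const productionUrls = [
  'https://example.com/',
  'https://example.com/patterns/',
];

const localUrls = rewriteSitemapUrls(
  productionUrls,
  'https://example.com',
  'http://localhost:3000'
);

console.log(localUrls);
```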

The sitemap drives a global threshold: sitemap mode applies one threshold from defaults to every page, so you cannot set per-URL thresholds the way you can with an explicit urls list. If you need different thresholds for legacy pages, list those pages explicitly instead.

// 05 · running pa11y ci in github actions

Running Pa11y CI in GitHub Actions

The highest-value place to run Pa11y CI is on every pull request, against a freshly built copy of the site. The workflow below builds the site, serves it locally, waits for the server, then scans it. A non-zero exit from pa11y-ci fails the job and blocks the merge.

.github/workflows/pa11y.yml
name: Accessibility (Pa11y CI)

on:
  pull_request:
    branches: [main]

jobs:
  pa11y:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm

      - name: Install dependencies
        run: npm ci

      - name: Build site
        run: npm run build

      - name: Serve site
        run: npx http-server ./dist -p 3000 &

      - name: Wait for server
        run: npx wait-on http://localhost:3000

      - name: Run Pa11y CI
        run: npm run test:a11y

Because the config points at http://localhost:3000, the same npm run test:a11y script runs identically on a developer's machine and in CI — no environment-specific branching.

// 06 · reading and acting on results

Reading and Acting on Results

Pa11y CI prints a per-URL summary and, for each failing page, the list of violations with the rule that fired, a human-readable message, and a CSS selector pointing at the offending element:

Terminal output
Running Pa11y on 3 URLs:
 > http://localhost:3000/ - 0 errors
 > http://localhost:3000/patterns/ - 0 errors
 > http://localhost:3000/contact - 2 errors

Errors in http://localhost:3000/contact:
 • Elements must have sufficient color contrast (color-contrast)
   (#newsletter-cta > .btn)
 • Form elements must have labels (label)
   (#email)

✘ 2/3 URLs passed

To capture results for a build artifact or a dashboard, add --reporter json and redirect the output to a file. The JSON reporter emits a machine-readable record of every URL and violation, which you can attach to the CI run or feed into your own reporting:

Terminal
npx pa11y-ci --config .pa11yci.json --reporter json > pa11y-results.json
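
Once the results are in a file, a short Node script can turn them into a per-URL summary. The sketch below assumes the report is an object with total, passes, errors, and a results map from each URL to an array of issues; verify that shape against the output of your installed version before relying on it. The summarizeReport name and the inline sample data are illustrative.

```javascript
// Summarize a Pa11y CI JSON report: violations per URL, worst first.
// ASSUMPTION: the report looks like { total, passes, errors, results },
// where results maps each URL to an array of issue objects.

function summarizeReport(report) {
  return Object.entries(report.results || {})
    .map(([url, issues]) => ({ url, count: issues.length }))
    .sort((a, b) => b.count - a.count);
}

// Inline sample standing in for the parsed contents of pa11y-results.json.
const sampleReport = {
  total: 3,
  passes: 2,
  errors: 2,
  results: {
    'http://localhost:3000/': [],
    'http://localhost:3000/patterns/': [],
    'http://localhost:3000/contact': [
      { code: 'color-contrast', selector: '#newsletter-cta > .btn' },
      { code: 'label', selector: '#email' },
    ],
  },
};

const summary = summarizeReport(sampleReport);
for (const { url, count } of summary) {
  console.log(`${count}\t${url}`);
}
```

To run it against a real file, replace the sample with JSON.parse(require('node:fs').readFileSync('pa11y-results.json', 'utf8')).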

Automated tests catch only about 30% of issues: Pa11y CI is excellent at the deterministic third of WCAG, such as contrast, labels, ARIA validity, and duplicate IDs. It cannot judge focus order, screen reader announcements, or whether alt text is meaningful. Pair it with the manual steps in the Testing Checklist.

// 07 · thresholds and baselines

Thresholds and Baselines

Turning on a zero-tolerance gate against an existing site floods the first build with failures and tempts the team to disable the whole thing. The fix is a baseline: set each page's threshold to its current violation count so the build passes today, then ratchet the numbers down as you fix issues. New pages start at zero.

.pa11yci.json
{
  "defaults": {
    "standard": "WCAG2AA",
    "runners": ["axe"],
    "threshold": 0
  },
  "urls": [
    "http://localhost:3000/",
    { "url": "http://localhost:3000/dashboard", "threshold": 4 },
    { "url": "http://localhost:3000/legacy-report", "threshold": 11 }
  ]
}
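
A baseline like this does not have to be typed by hand. As a rough sketch, the hypothetical buildBaseline helper below takes today's violation count per URL (the input format is an assumption) and produces a config in which every page passes at its current count.

```javascript
// Build a baseline config: each page's threshold equals its current
// violation count, so the first gated build passes.
// buildBaseline and the { url: count } input are illustrative assumptions.

function buildBaseline(violationCounts, defaults) {
  const urls = Object.entries(violationCounts).map(([url, count]) =>
    // Clean pages stay as plain strings and inherit the default of 0.
    count === 0 ? url : { url, threshold: count }
  );
  return { defaults: { ...defaults, threshold: 0 }, urls };
}

const baseline = buildBaseline(
  {
    'http://localhost:3000/': 0,
    'http://localhost:3000/dashboard': 4,
    'http://localhost:3000/legacy-report': 11,
  },
  { standard: 'WCAG2AA', runners: ['axe'] }
);

console.log(JSON.stringify(baseline, null, 2));
```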

The discipline is one-directional: a threshold may only ever go down. Every time you fix a violation, lower the number so the gap cannot silently refill. Record the totals over time — a steadily shrinking sum is proof your accessibility debt is being paid down rather than just contained.
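
The one-directional rule can be enforced mechanically rather than by convention. The following sketch compares two versions of the config and lists any URL whose threshold went up; findRaisedThresholds and thresholdMap are hypothetical helpers, and how you obtain the previous config (for example from the main branch) is left to your pipeline.

```javascript
// Ratchet check: report any URL whose threshold went up between two
// versions of the config. Both helpers are hypothetical.

function thresholdMap(config) {
  const fallback = (config.defaults && config.defaults.threshold) || 0;
  const map = new Map();
  for (const entry of config.urls || []) {
    const page = typeof entry === 'string' ? { url: entry } : entry;
    map.set(page.url, page.threshold ?? fallback);
  }
  return map;
}

function findRaisedThresholds(previousConfig, nextConfig) {
  const before = thresholdMap(previousConfig);
  const raised = [];
  for (const [url, threshold] of thresholdMap(nextConfig)) {
    // URLs new in nextConfig have no baseline, so they must start at 0.
    const allowed = before.has(url) ? before.get(url) : 0;
    if (threshold > allowed) raised.push({ url, from: allowed, to: threshold });
  }
  return raised;
}

const previous = {
  defaults: { threshold: 0 },
  urls: [
    { url: 'http://localhost:3000/dashboard', threshold: 4 },
    { url: 'http://localhost:3000/legacy-report', threshold: 11 },
  ],
};
const next = {
  defaults: { threshold: 0 },
  urls: [
    { url: 'http://localhost:3000/dashboard', threshold: 3 }, // lowered: fine
    { url: 'http://localhost:3000/legacy-report', threshold: 12 }, // raised: flagged
  ],
};

console.log(findRaisedThresholds(previous, next));
```

A check like this could run as its own CI step and fail the job whenever the returned list is not empty.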

Start permissive, then tighten: adopt the gate with today's counts as the baseline, get the whole team used to a green build, then drive each threshold toward zero. A gate the team trusts and keeps is worth more than a strict one they switch off.

// 08 · tracking trends with pa11y dashboard

Tracking Trends with Pa11y Dashboard

A per-build pass/fail answers "did this change break anything?" but not "are we getting better or worse over time?" Pa11y Dashboard answers the second question. It is a separate self-hosted web app that runs Pa11y against your URLs on a schedule, stores each run in a database, and charts the violation count per URL over time.

It is a deliberately different tool from pa11y-ci:

         | Pa11y CI                             | Pa11y Dashboard
---------|--------------------------------------|-------------------------------------
Runs     | On every push / pull request         | On a schedule (e.g. nightly)
Output   | Pass/fail exit code in the build log | Web UI with historical trend charts
Best for | Blocking regressions before merge    | Reporting progress to stakeholders

Most teams do not need the dashboard on day one. Start with pa11y-ci in the pipeline to stop regressions, and add Pa11y Dashboard later when someone needs to see the trend line — for a compliance report, a quarterly review, or to prove the backlog is shrinking. Both tools share the same underlying engine, so a violation looks identical in either.

The two-layer setup: pa11y-ci on pull requests catches new problems at the moment they are introduced. Pa11y Dashboard on a nightly schedule shows whether the overall trend is up or down. Together they cover both "don't regress" and "keep improving."

Frequently asked questions

What is the difference between Pa11y and Pa11y CI?

Pa11y is the core tool — it scans a single URL and prints a report. Pa11y CI is a separate command (pa11y-ci) built on top of it for automation: it reads a config file with a list of URLs (or a sitemap), scans them all in one run, and exits with a non-zero status code when any page exceeds its violation threshold. That non-zero exit is what lets a CI system fail the build. Use plain pa11y for ad-hoc checks while developing; use pa11y-ci in your pipeline.

How do I test my entire site with Pa11y CI?

Point it at your sitemap: pa11y-ci --sitemap https://example.com/sitemap.xml. Pa11y CI fetches the sitemap, extracts every <loc> URL, and scans each one. If you test a local build, use --sitemap-find and --sitemap-replace to rewrite the production hostname to http://localhost:3000. This is the lowest-maintenance setup because new pages are covered automatically as long as they appear in the sitemap.

Should Pa11y CI use the HTML CodeSniffer or axe-core runner?

Use axe-core. Set "runners": ["axe"] in your config. axe-core is actively maintained and designed to keep false positives low, so it generally produces less noise than the default HTML CodeSniffer runner. You can also run both (["axe", "htmlcs"]) to maximize coverage, but expect more duplicate and borderline findings. Most teams standardize on axe so their CI results line up with the axe DevTools extension developers use locally.

How do I stop Pa11y CI from failing on pre-existing violations?

Set a threshold — the number of violations allowed before a page fails. On a legacy site, start with a per-URL threshold equal to its current violation count so the build passes on day one, then ratchet each number down as you fix issues. New pages get "threshold": 0. This adopts the CI gate without a flood of failures, while preventing regressions on the code you have not cleaned up yet.

What is Pa11y Dashboard and do I need it?

Pa11y Dashboard is a separate self-hosted web app that runs Pa11y on a schedule and stores results in a database, so you can see accessibility trends per URL over time. It answers "are we getting better or worse?" — a question a per-build pass/fail cannot. It is optional: most teams start with pa11y-ci in their pipeline and only add the dashboard once they want historical tracking for reporting or compliance.

Can Pa11y CI test pages that need login or interaction?

Yes. Each URL entry in the config can carry an actions array — a list of steps like set field #email to user@example.com, click element #submit, and wait for url to be .... Pa11y replays those steps in the headless browser before scanning, so you can reach authenticated or multi-step pages. For complex flows, many teams instead run axe-core inside their existing Cypress or Playwright suite, which already handles auth.
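
As an illustration, a login flow could be expressed in a JavaScript config file, which the --config flag accepts alongside JSON. Everything specific in this sketch is a placeholder: the /login and /account URLs, the #password selector, and the TEST_PASSWORD environment variable are assumptions, not values from a real site.

```javascript
// Hypothetical .pa11yci.js: a JavaScript config exporting the same shape as
// the JSON file. Selectors, URLs, and the TEST_PASSWORD variable are
// placeholders for illustration.

const config = {
  defaults: {
    standard: 'WCAG2AA',
    runners: ['axe'],
    timeout: 30000,
  },
  urls: [
    {
      url: 'http://localhost:3000/login',
      // Steps are replayed in order before the page is scanned.
      actions: [
        'set field #email to user@example.com',
        `set field #password to ${process.env.TEST_PASSWORD || 'change-me'}`,
        'click element #submit',
        'wait for url to be http://localhost:3000/account',
      ],
    },
  ],
};

module.exports = config;
```

Reading the password from the environment keeps the credential out of the repository; in CI it would come from a secret rather than the fallback shown here.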