From 051e257a0f94250736b2405a14f18292bb82f388 Mon Sep 17 00:00:00 2001
From: Xuan-Ce Wang <x.wang4@uq.edu.au>
Date: Tue, 16 Jun 2026 01:49:51 +0800
Subject: [PATCH] feat(browser-qa): read-only safety default, baseline-or-die,
 honest a11y scope (#2186)

Hardening of skills/browser-qa/SKILL.md: mostly additive, with three existing lines tightened.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
---
 skills/browser-qa/SKILL.md | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

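Reviewer note: the two rules this patch introduces (the read-only default and the "no baseline means INCONCLUSIVE" verdict) can be sketched as plain functions. This is a minimal illustration, not part of the skill. The names `STAGING_HOSTS`, `may_run_mutating`, and `verdict` and the `.test` hostnames are hypothetical. Letting a known blocker take precedence over a missing baseline is also an assumption, since the patch does not say which wins.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; a real project would supply its own staging/preview hosts.
STAGING_HOSTS = {"staging.example.test", "preview.example.test"}


def may_run_mutating(url: str, opt_in: bool, staging_hosts=STAGING_HOSTS) -> bool:
    """Allow a mutating journey only with explicit opt-in AND a staging/preview URL."""
    host = (urlparse(url).hostname or "").lower()
    # Fail closed: unknown or unparseable hosts are treated as production.
    return opt_in and host in staging_hosts


def verdict(blockers: int, issues: int, baseline_present: bool) -> str:
    """Map QA findings to one of the four verdict strings used in the report."""
    if blockers > 0:
        return "DO NOT SHIP"   # assumption: a known blocker outranks a missing baseline
    if not baseline_present:
        return "INCONCLUSIVE"  # never a silent pass without a visual baseline
    if issues > 0:
        return "SHIP WITH FIXES"
    return "SHIP"
```

With these definitions, `verdict(0, 2, True)` gives `"SHIP WITH FIXES"`, matching the sample report in the skill, and `verdict(0, 0, False)` gives `"INCONCLUSIVE"`.
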
diff --git a/skills/browser-qa/SKILL.md b/skills/browser-qa/SKILL.md
index cda02445..bf069db0 100644
--- a/skills/browser-qa/SKILL.md
+++ b/skills/browser-qa/SKILL.md
@@ -18,6 +18,14 @@ origin: ECC
 
 Uses the browser automation MCP (claude-in-chrome, Playwright, or Puppeteer) to interact with live pages like a real user.
 
+### Safety first — blast radius (run read-only by default)
+
+Browser QA drives real auth and real user journeys, so bound the blast radius explicitly.
+Default to **read-only**. Never run a **mutating** journey (checkout, payment, delete,
+mass-update) against a production URL; a mutating run requires both an explicit opt-in
+**and** a staging/preview URL. Use seeded **test credentials**, never real production
+logins, and **redact** credentials/tokens/PII before saving any screenshot.
+
 ### Phase 1: Smoke Test
 ```
 1. Navigate to target URL
@@ -25,6 +33,7 @@ Uses the browser automation MCP (claude-in-chrome, Playwright, or Puppeteer) to
 3. Verify no 4xx/5xx in network requests
 4. Screenshot above-the-fold on desktop + mobile viewport
 5. Check Core Web Vitals: LCP < 2.5s, CLS < 0.1, INP < 200ms
+   (INP replaced FID in March 2024; thresholds per web.dev)
 ```
 
 ### Phase 2: Interaction Test
@@ -32,14 +41,17 @@ Uses the browser automation MCP (claude-in-chrome, Playwright, or Puppeteer) to
 1. Click every nav link — verify no dead links
 2. Submit forms with valid data — verify success state
 3. Submit forms with invalid data — verify error state
-4. Test auth flow: login → protected page → logout
+4. Test auth flow: login → protected page → logout (test creds only, never prod)
 5. Test critical user journeys (checkout, onboarding, search)
+   — read-only by default; only exercise mutating journeys against staging
+     with explicit opt-in (see "Safety first" above)
 ```
 
 ### Phase 3: Visual Regression
 ```
 1. Screenshot key pages at 3 breakpoints (375px, 768px, 1440px)
-2. Compare against baseline screenshots (if stored)
+2. Compare against committed baseline screenshots
+   — no baseline ⇒ report INCONCLUSIVE, never a silent PASS
 3. Flag layout shifts > 5px, missing elements, overflow
 4. Check dark mode if applicable
 ```
@@ -47,11 +59,15 @@ Uses the browser automation MCP (claude-in-chrome, Playwright, or Puppeteer) to
 ### Phase 4: Accessibility
 ```
 1. Run axe-core or equivalent on each page
-2. Flag WCAG AA violations (contrast, labels, focus order)
+2. Flag WCAG 2.2 AA violations (contrast, labels, focus order)
 3. Verify keyboard navigation works end-to-end
 4. Check screen reader landmarks
 ```
 
+> Note: automated checks catch only part of WCAG (commonly cited estimates are 30–40% of
+> issues; Deque reports about 57% on average for axe-core). A clean run is **necessary, not sufficient**: keyboard nav,
+> focus order, and a screen-reader pass still need a manual check. Don't report "accessible" from an automated pass alone.
+
 ## Output Format
 
 ```markdown
@@ -75,6 +91,7 @@ Uses the browser automation MCP (claude-in-chrome, Playwright, or Puppeteer) to
 - 2 AA violations: missing alt text on hero image, low contrast on footer links
 
 ### Verdict: SHIP WITH FIXES (2 issues, 0 blockers)
+<!-- Verdict is one of: SHIP, SHIP WITH FIXES, DO NOT SHIP, or INCONCLUSIVE (no visual baseline) -->
 ```
 
 ## Integration