Eivind's autonomous UX walk — skill scaffolded and routine wired

Status: Delivered
CAS: CAS-2737
Delivered: 2026-05-14
PRs: direct to master (88f7ef8a)

What’s new

Every UI-touching pull request now gets a structured UX walk from Eivind the Listener (the court’s UX Specialist agent) automatically; the engineer does not have to request it. Eivind derives the routes to walk from the files that changed, runs a five-step checklist per route, captures screenshots, diffs them against the established visual baseline, and posts a verdict on the PR: PASS, a FIND list, or FAILED-TO-WALK. A Paperclip routine checks for PRs to walk every 30 minutes.

How to use it

For engineers: No action needed. When your PR touches src/components/**, src/pages/**, or native Apple files, Eivind is woken automatically. His verdict appears as a PR comment before the CTO reviews.

For the CTO: A FIND verdict from Eivind must be resolved before merge: either the engineer fixes each finding, or the finding is documented as a known limitation. A PASS is a green signal on the UX dimension.

For the board: You can read Eivind’s walk verdicts directly on any UI PR. Findings cite a UX-* rule ID so you can see exactly which guideline was violated and how it was remediated.
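
The trigger condition described for engineers above can be sketched as a glob check over a PR's changed files. This is a minimal illustration in Python, not the contents of detect-ui-pr.sh: the function names are hypothetical, and because the text does not list the patterns behind "native Apple files", the `*.swift` entry is a stand-in assumption. The real script also maps files to routes using the table in SKILL.md, which is not reproduced here.

```python
from fnmatch import fnmatchcase

# Globs named in the release note.
UI_GLOBS = ["src/components/**", "src/pages/**"]

# ASSUMPTION: the note says "native Apple files" without giving patterns;
# "*.swift" is a hypothetical placeholder, not the real list.
NATIVE_APPLE_GLOBS = ["*.swift"]


def ui_files(changed_files, globs=None):
    """Return the changed files that match at least one UI glob.

    fnmatchcase treats "*" as matching any characters, including "/",
    so "src/components/**" matches files at any depth under that folder.
    """
    if globs is None:
        globs = UI_GLOBS + NATIVE_APPLE_GLOBS
    return [f for f in changed_files if any(fnmatchcase(f, g) for g in globs)]


def touches_ui(changed_files):
    """True when a PR should wake the UX walk."""
    return bool(ui_files(changed_files))
```

A PR that changes only documentation would return an empty list from `ui_files` and would not be walked.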

What changed under the hood

  • .agents/skills/eivind-pr-ux-walk/SKILL.md — the new skill file: route derivation table (26 page routes mapped to app paths), five-step walk checklist (cold launch → navigate → action → return → portrait/landscape), screenshot naming convention, visual diff instructions against docs/ux-baseline/, and the three verdict templates (PASS, FIND, FAILED-TO-WALK).
  • detect-ui-pr.sh — script that checks a PR’s changed files against the UI globs and returns the list of routes to walk.
  • routine-ux-walk-trigger.sh — polls open PRs, deduplicates via a state file (never walks the same PR + commit twice), and creates a per-PR Paperclip CAS assigned to Eivind.
  • Paperclip routine “Envoy’s Walk” — registered at */30 * * * *; runs routine-ux-walk-trigger.sh.
  • docs/ux-baseline/ios/ and docs/ux-baseline/macos/ — directory scaffold for baseline screenshots; Eivind populates these on first walk.
  • .agents/personas/envoy.md — updated to reference the new skill.
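
The deduplication rule in routine-ux-walk-trigger.sh (never walk the same PR + commit twice) can be sketched as follows. This is an illustrative Python version under stated assumptions, not the script itself: the JSON state format, the `number@sha` key, and all function names are hypothetical, and the step that creates the per-PR Paperclip CAS is left out.

```python
import json
from pathlib import Path


def load_state(path):
    """Load the set of already-walked "PR@commit" keys (empty if no file yet)."""
    p = Path(path)
    if not p.exists():
        return set()
    return set(json.loads(p.read_text()))


def save_state(path, seen):
    """Persist the walked keys so the next routine run can skip them."""
    Path(path).write_text(json.dumps(sorted(seen)))


def select_new_walks(open_prs, seen):
    """Return the (pr_number, head_sha) pairs not walked before.

    ASSUMPTION: a walk is identified by PR number plus head commit, so a
    new push to the same PR produces a new key and triggers a new walk.
    Mutates `seen` so the caller can save it afterwards.
    """
    new = []
    for number, sha in open_prs:
        key = f"{number}@{sha}"
        if key not in seen:
            seen.add(key)
            new.append((number, sha))
    return new
```

On each run the routine would call `load_state`, pass the open PRs through `select_new_walks`, create a CAS for each returned pair, and then call `save_state`.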

Why we built it

Every significant UI regression in the CAS-2460–2587 mobile arc (dead bands, stacked FABs, content obscured by the status bar) was discovered by the regent on a physical device after a TestFlight build shipped. In CAS-2390, Eivind demonstrated that a structured walk could catch 12 issues in a single pass. The problem was that the walk only happened when someone remembered to ask. This CAS makes the walk structural: it fires on every UI PR, automatically, before the CTO sees the diff.

Known limitations / follow-on work

  • The walk is currently manually guided (Eivind reads the skill and executes the checklist); it is not yet scripted end-to-end. Full automation (script-driven Simulator launch, automated screenshot capture) is deferred to follow-on work.
  • The Paperclip routine requires the Paperclip instance to be running. During the offline billing window the routine queues silently and fires on next startup.
  • The request-review.sh script also auto-adds Eivind when --screens is passed (CAS-2734); the routine is a belt-and-suspenders catch for PRs opened without that flag.