Astrid UX advisor: AI critique on every UI pull request

Status: Delivered (skill scaffold)
CAS: CAS-2743
Delivered: 2026-05-14
PRs: #727

What’s new

Astrid (the Reader) now has an astrid-ux-advisor skill. It lets her review screenshots from a UI PR and post, before the PR merges, a ranked list of the UX issues a real user would notice, each rated critical, major, or minor. This is an advisory layer on top of the mechanical gates (Loki visual diff, Playwright sanity, axe accessibility): Astrid adds the judgment that deterministic tools can’t provide. Her output is a PR comment tagged [UX Advisor — advisory only], not a merge blocker.

How to use it

When triggered on a UI PR, the skill runs as follows:

Astrid receives the PR diff, new screenshots (from Loki or Playwright CI artifacts), the affected route names, and relevant component snippets.
She identifies UX issues a real user would file and ranks them: Critical (blocks usage) / Major (degrades UX) / Minor (polish).
For each finding she cites the element or route, explains the failure mode in one sentence, and suggests a fix shape in one sentence.
The output is posted as a PR comment. Eivind reviews and triages the findings before approving.
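
The ranking and comment steps above can be sketched roughly as follows. This is a minimal illustration, not the contents of SKILL.md: the Finding class, its field names, the numeric severity ordering, and the comment layout are all assumptions. Only the three severity labels and the [UX Advisor — advisory only] tag come from this note.

```python
from dataclasses import dataclass

# Severity labels come from the release note; the numeric ordering is an
# assumption made so that findings can be sorted most-severe first.
SEVERITY_ORDER = {"critical": 0, "major": 1, "minor": 2}
ADVISORY_TAG = "[UX Advisor — advisory only]"


@dataclass(frozen=True)
class Finding:
    """One UX issue (hypothetical shape)."""
    severity: str      # "critical", "major", or "minor"
    citation: str      # the element or route the finding refers to
    failure_mode: str  # one-sentence explanation of what goes wrong
    fix_shape: str     # one-sentence suggestion for the fix

    def __post_init__(self):
        if self.severity not in SEVERITY_ORDER:
            raise ValueError(f"unknown severity: {self.severity!r}")


def rank_findings(findings):
    """Return findings sorted critical, then major, then minor.

    sorted() is stable, so findings of equal severity keep their input order.
    """
    return sorted(findings, key=lambda f: SEVERITY_ORDER[f.severity])


def render_comment(findings):
    """Build the PR comment body: the advisory tag, then one line per finding."""
    lines = [ADVISORY_TAG, ""]
    ranked = rank_findings(findings)
    if not ranked:
        lines.append("No UX issues found.")
    for index, finding in enumerate(ranked, start=1):
        lines.append(
            f"{index}. {finding.severity.capitalize()} | {finding.citation}: "
            f"{finding.failure_mode} Suggested fix: {finding.fix_shape}"
        )
    return "\n".join(lines)
```

Keeping the tag as the first line of the comment means the advisory status is visible before any finding is read, which matters because Eivind triages the findings and the comment never gates the merge.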

The skill is loaded via Astrid’s existing subscription mechanism — no changes to her agent configuration were required. Dispatch wiring (the GitHub Action or Paperclip routine that fires Astrid when screenshots are ready) is a follow-on step.

What changed under the hood

.agents/skills/astrid-ux-advisor/SKILL.md — New skill defining the system prompt, input schema (pr_diff, screenshots, routes, component_snippets), output schema (finding list with severity + citation + fix shape), and the PR comment template including the [UX Advisor — advisory only] tag.
Reuses existing Astrid infrastructure from CAS-2664 (project awareness, screenshot vision, LocalCli routing) — no new model spend.
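
As a rough illustration of the schemas named above, the input and a single output finding might be modelled like this. The four input field names (pr_diff, screenshots, routes, component_snippets) are taken from this note. Everything else is an assumption: the Python types, the output key names, and the validation rules are hypothetical, and SKILL.md remains the source of truth.

```python
from dataclasses import dataclass, field

# Severity labels from the release note; the output key names are assumed.
ALLOWED_SEVERITIES = ("critical", "major", "minor")
FINDING_KEYS = ("severity", "citation", "fix_shape")


@dataclass
class AdvisorInput:
    """Input to the skill. Field names are from the note; types are assumed."""
    pr_diff: str
    screenshots: list[str] = field(default_factory=list)   # e.g. artifact paths
    routes: list[str] = field(default_factory=list)        # affected route names
    component_snippets: dict[str, str] = field(default_factory=dict)


def input_problems(payload: AdvisorInput) -> list[str]:
    """Return human-readable problems; an empty list means the input is usable."""
    problems = []
    if not payload.pr_diff.strip():
        problems.append("pr_diff is empty")
    if not payload.screenshots:
        problems.append("no screenshots supplied")
    if not payload.routes:
        problems.append("no routes supplied")
    return problems


def finding_problems(finding: dict) -> list[str]:
    """Check one output finding for missing fields or an unknown severity."""
    problems = [f"missing field: {key}" for key in FINDING_KEYS if not finding.get(key)]
    severity = finding.get("severity")
    if severity and severity not in ALLOWED_SEVERITIES:
        problems.append(f"unknown severity: {severity}")
    return problems
```

Returning a list of problems instead of raising on the first one lets a caller report every gap in a payload at once.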

Why we built it

Loki catches pixel-level drift; axe catches accessibility violations. Neither catches the UX judgment calls — a button that is technically correct but confusingly labelled, a flow that works but overwhelms the user, a state transition that is logically sound but visually disorienting. Eivind’s manual walk catches these, but Eivind can only walk a PR when he’s explicitly invited and the simulator build succeeds. Astrid provides the judgment layer from screenshots alone, without a device, in seconds. The two signals (Eivind’s walk + Astrid’s critique) give the CTO a richer picture of a UI PR than either signal alone.

Known limitations / follow-on work

The dispatch trigger (GitHub Action workflow_run or Paperclip routine that fires Astrid when CI screenshots are ready) is not yet wired. Saga owns this follow-on work.
End-to-end verification (Astrid posting a real advisory comment on a real PR) is pending the dispatch trigger.
Astrid’s system prompt is a first draft; Folke will refine it after seeing the first few real outputs.
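
For the unwired dispatch trigger, one possible shape of the gating logic is sketched below, assuming the GitHub workflow_run route is chosen. The payload keys used (action, workflow_run.name, workflow_run.conclusion, workflow_run.pull_requests) are part of GitHub's workflow_run webhook event. The workflow names and the rule of dispatching only after a successful run are assumptions for illustration, not decisions recorded here.

```python
# Hypothetical names of the CI workflows that upload screenshots.
SCREENSHOT_WORKFLOWS = frozenset({"loki-visual-diff", "playwright-sanity"})


def prs_to_review(payload, screenshot_workflows=SCREENSHOT_WORKFLOWS):
    """Return the PR numbers Astrid should be fired for, given one
    workflow_run webhook payload. An empty list means: do nothing."""
    # Only act once the run has finished.
    if payload.get("action") != "completed":
        return []
    run = payload.get("workflow_run") or {}
    # Ignore workflows that do not produce screenshots.
    if run.get("name") not in screenshot_workflows:
        return []
    # Assumption: screenshots are only trusted when the run succeeded.
    if run.get("conclusion") != "success":
        return []
    # pull_requests may be empty, in which case nothing is dispatched.
    return [pr["number"] for pr in run.get("pull_requests") or []]
```

If visual-diff runs are expected to fail whenever pixels change, the success check would need relaxing so that Astrid still sees those screenshots.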