Astrid UX advisor: AI critique on every UI pull request
Status: Delivered (skill scaffold)
CAS: CAS-2743
Delivered: 2026-05-14
PRs: #727
What’s new
Astrid (the Reader) now has an astrid-ux-advisor skill that lets her review screenshots from a UI PR and post a ranked list of UX issues a real user would notice — critical, major, or minor — before the PR merges. This is an advisory layer on top of the mechanical gates (Loki visual diff, Playwright sanity, axe accessibility): Astrid adds the judgment that deterministic tools can’t provide. Her output is a PR comment tagged [UX Advisor — advisory only], not a merge blocker.
How to use it
Astrid’s skill is advisory and does not block merge on its own. When triggered on a UI PR:
- Astrid receives the PR diff, new screenshots (from Loki or Playwright CI artifacts), the affected route names, and relevant component snippets.
- She identifies UX issues a real user would file and ranks them: Critical (blocks usage) / Major (degrades UX) / Minor (polish).
- For each finding she cites the element or route, explains the failure mode in one sentence, and suggests a fix shape in one sentence.
- The output is posted as a PR comment. Eivind reviews and triages the findings before approving.
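The ranking and comment steps above can be sketched in a few lines of Python. This is an illustration only, not the contents of SKILL.md: the `Finding` class, `rank_findings`, `render_comment`, and the comment layout are all hypothetical names and choices. Only the three severity levels, the per-finding fields, and the advisory tag come from this note.

```python
from dataclasses import dataclass

# Severity levels and the tag text are from the release note.
# Everything else in this sketch is a hypothetical illustration.
SEVERITY_ORDER = {"critical": 0, "major": 1, "minor": 2}
ADVISORY_TAG = "[UX Advisor — advisory only]"


@dataclass(frozen=True)
class Finding:
    severity: str      # "critical", "major", or "minor"
    citation: str      # element or route the finding refers to
    failure_mode: str  # one-sentence explanation of what goes wrong
    fix_shape: str     # one-sentence suggestion for the shape of a fix

    def __post_init__(self):
        # Reject severities outside the three documented levels.
        if self.severity not in SEVERITY_ORDER:
            raise ValueError(f"unknown severity: {self.severity!r}")


def rank_findings(findings):
    """Return findings sorted critical first; ties keep their input order."""
    return sorted(findings, key=lambda f: SEVERITY_ORDER[f.severity])


def render_comment(findings):
    """Build the PR comment body (layout is an assumption, not the real template)."""
    lines = [ADVISORY_TAG, ""]
    ranked = rank_findings(findings)
    if not ranked:
        lines.append("No UX issues found.")
    for number, finding in enumerate(ranked, start=1):
        lines.append(
            f"{number}. **{finding.severity.capitalize()}** ({finding.citation}): "
            f"{finding.failure_mode} Suggested fix: {finding.fix_shape}"
        )
    return "\n".join(lines)
```

Because Python's `sorted` is stable, findings of equal severity stay in the order Astrid produced them, so any secondary ordering she applies is preserved.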
The skill is loaded via Astrid’s existing subscription mechanism — no changes to her agent configuration were required. Dispatch wiring (the GitHub Action or Paperclip routine that fires Astrid when screenshots are ready) is a follow-on step.
What changed under the hood
- .agents/skills/astrid-ux-advisor/SKILL.md: New skill defining the system prompt, input schema (pr_diff, screenshots, routes, component_snippets), output schema (finding list with severity + citation + fix shape), and the PR comment template including the [UX Advisor — advisory only] tag.
- Reuses existing Astrid infrastructure from CAS-2664 (project awareness, screenshot vision, LocalCli routing), so there is no new model spend.
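As a rough picture of the input schema named above, the four fields could be modelled as below. The field names are from this note; the types, the `AdvisorInput` class, and the `validate_input` helper are assumptions made for illustration, since SKILL.md may define them differently.

```python
from dataclasses import dataclass, field


@dataclass
class AdvisorInput:
    # Field names come from the release note; the types are assumed.
    pr_diff: str                      # unified diff text of the PR
    screenshots: list[str]            # paths or URLs of CI screenshot artifacts
    routes: list[str]                 # affected route names
    component_snippets: dict[str, str] = field(default_factory=dict)  # name -> source


def validate_input(payload: AdvisorInput) -> list[str]:
    """Return a list of human-readable problems; an empty list means usable input."""
    problems = []
    if not payload.pr_diff.strip():
        problems.append("pr_diff is empty")
    if not payload.screenshots:
        problems.append("screenshots is empty")
    if not payload.routes:
        problems.append("routes is empty")
    return problems
```

Treating component snippets as optional is a design guess: screenshots and the diff are what the critique depends on, while snippets only add context.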
Why we built it
Loki catches pixel-level drift; axe catches accessibility violations. Neither catches the UX judgment calls — a button that is technically correct but confusingly labelled, a flow that works but overwhelms the user, a state transition that is logically sound but visually disorienting. Eivind’s manual walk catches these, but Eivind can only walk a PR when he’s explicitly invited and the simulator build succeeds. Astrid provides the judgment layer from screenshots alone, without a device, in seconds. The two signals (Eivind’s walk + Astrid’s critique) give the CTO a richer picture of a UI PR than either signal alone.
Known limitations / follow-on work
- The dispatch trigger (GitHub Action workflow_run or Paperclip routine that fires Astrid when CI screenshots are ready) is not yet wired. Saga owns this follow-on work.
- End-to-end verification (Astrid posting a real advisory comment on a real PR) is pending the dispatch trigger.
- Astrid’s system prompt is a first draft; Folke will refine it after seeing the first few real outputs.