Astrid UX advisor: AI critique on every UI pull request

Status: Delivered (skill scaffold)
CAS: CAS-2743
Delivered: 2026-05-14
PRs: #727

What’s new

Astrid (the Reader) now has an astrid-ux-advisor skill. It lets her review the screenshots from a UI PR and, before the PR merges, post a list of UX issues a real user would notice, ranked as critical, major, or minor. This is an advisory layer on top of the mechanical gates (Loki visual diff, Playwright sanity, axe accessibility): Astrid adds the judgment that deterministic tools can’t provide. Her output is a PR comment tagged [UX Advisor — advisory only], not a merge blocker.

How to use it

Astrid’s skill is advisory and does not block merge on its own. When triggered on a UI PR:

  1. Astrid receives the PR diff, new screenshots (from Loki or Playwright CI artifacts), the affected route names, and relevant component snippets.
  2. She identifies the UX issues a real user would notice or report and ranks them: Critical (blocks usage) / Major (degrades UX) / Minor (polish).
  3. For each finding she cites the element or route, explains the failure mode in one sentence, and suggests a fix shape in one sentence.
  4. The output is posted as a PR comment. Eivind reviews and triages the findings before approving.
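
Steps 2 to 4 can be illustrated with a minimal Python sketch. It assumes a finding carries a severity, a citation, a one-sentence failure mode, and a one-sentence fix shape, as described above. The names Severity, Finding, and render_comment and the exact comment layout are hypothetical; the real template lives in SKILL.md and may look different.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    # Lower value sorts first, so Critical findings lead the comment.
    CRITICAL = 0  # blocks usage
    MAJOR = 1     # degrades UX
    MINOR = 2     # polish


@dataclass(frozen=True)
class Finding:
    severity: Severity
    citation: str      # element or route the finding refers to
    failure_mode: str  # one sentence
    fix_shape: str     # one sentence


# Tag taken from the release note; everything else in the layout is assumed.
ADVISORY_TAG = "[UX Advisor — advisory only]"


def render_comment(findings: list[Finding]) -> str:
    """Rank findings by severity and render them as a PR comment body."""
    # sorted() is stable, so findings keep their input order within a severity.
    ranked = sorted(findings, key=lambda f: f.severity)
    lines = [ADVISORY_TAG, ""]
    if not ranked:
        lines.append("No UX issues found.")
    for i, f in enumerate(ranked, start=1):
        lines.append(f"{i}. {f.severity.name.title()}: {f.citation}")
        lines.append(f"   Failure mode: {f.failure_mode}")
        lines.append(f"   Suggested fix: {f.fix_shape}")
    return "\n".join(lines)
```

Using an ordered enum keeps the ranking rule in one place: adding or reordering severity levels changes the sort without touching the rendering code.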

The skill is loaded via Astrid’s existing subscription mechanism — no changes to her agent configuration were required. Dispatch wiring (the GitHub Action or Paperclip routine that fires Astrid when screenshots are ready) is a follow-on step.

What changed under the hood

  • .agents/skills/astrid-ux-advisor/SKILL.md — New skill defining the system prompt, input schema (pr_diff, screenshots, routes, component_snippets), output schema (finding list with severity + citation + fix shape), and the PR comment template including the [UX Advisor — advisory only] tag.
  • Reuses existing Astrid infrastructure from CAS-2664 (project awareness, screenshot vision, LocalCli routing) — no new model spend.
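
As a rough illustration of the input schema, the four fields named above could be modelled as below. Only the field names (pr_diff, screenshots, routes, component_snippets) come from the skill description. The field types, the AdvisorInput class, and the validation rules are assumptions made for this sketch, not the contents of SKILL.md.

```python
from dataclasses import dataclass, field


@dataclass
class AdvisorInput:
    pr_diff: str
    screenshots: list[str]  # assumed: paths or URLs of CI screenshot artifacts
    routes: list[str]       # affected route names
    # assumed shape: file path -> source excerpt
    component_snippets: dict[str, str] = field(default_factory=dict)

    def validate(self) -> list[str]:
        """Return human-readable problems; an empty list means the payload looks usable."""
        problems = []
        if not self.pr_diff.strip():
            problems.append("pr_diff is empty")
        if not self.screenshots:
            problems.append("screenshots is empty; the advisor has nothing to look at")
        if not self.routes:
            problems.append("routes is empty")
        return problems
```

Returning a list of problems rather than raising on the first one lets a caller report everything that is missing in a single pass.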

Why we built it

Loki catches pixel-level drift; axe catches accessibility violations. Neither catches the UX judgment calls — a button that is technically correct but confusingly labelled, a flow that works but overwhelms the user, a state transition that is logically sound but visually disorienting. Eivind’s manual walk catches these, but Eivind can only walk a PR when he’s explicitly invited and the simulator build succeeds. Astrid provides the judgment layer from screenshots alone, without a device, in seconds. The two signals (Eivind’s walk + Astrid’s critique) give the CTO a richer picture of a UI PR than either signal alone.

Known limitations / follow-on work

  • The dispatch trigger (GitHub Action workflow_run or Paperclip routine that fires Astrid when CI screenshots are ready) is not yet wired. Saga owns this follow-on work.
  • End-to-end verification (Astrid posting a real advisory comment on a real PR) is pending the dispatch trigger.
  • Astrid’s system prompt is a first draft; Folke will refine it after seeing the first few real outputs.
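
Since the dispatch trigger is not wired yet, the following is only a sketch of the filtering a workflow_run-based trigger might need. It assumes the GitHub workflow_run webhook payload (action, workflow_run.name, workflow_run.conclusion, workflow_run.pull_requests). The function name and the workflow name used in the example are hypothetical, and the Paperclip routine alternative is not covered.

```python
from typing import Any


def pr_numbers_to_review(event: dict[str, Any], screenshot_workflow: str) -> list[int]:
    """Return the PR numbers to hand to the advisor, or [] if the event should be ignored."""
    run = event.get("workflow_run") or {}
    # Only act once the screenshot-producing workflow has finished.
    if event.get("action") != "completed":
        return []
    if run.get("name") != screenshot_workflow:
        return []
    # A failed or cancelled run is assumed to have no usable screenshots.
    if run.get("conclusion") != "success":
        return []
    # pull_requests can be empty, in which case there is nothing to comment on.
    return [pr["number"] for pr in run.get("pull_requests", [])]
```

For example, a completed, successful run of a workflow named "ui-screenshots" (a made-up name) that is attached to PR 727 would yield [727]; any other event yields an empty list and no advisory comment.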