A closed-loop design enforcement system that catches and prevents design drift during every agent coding session — verification, QA, lint, and drift detection running continuously
Daily Agentic Development Sessions
You're building features daily with Claude Code or Cursor. Each session needs to start with design context, end with visual verification, and have enforcement running throughout. This blueprint creates that loop.
Codex / Autonomous Agent Dispatches
You dispatch work to autonomous agents (Codex, Devin, etc.) that build without human supervision. Visual regression and lint gates become mandatory CI checks that catch drift before merge.
Post-Drift Recovery
Your site already has design drift from prior sessions. This blueprint shows how to inventory inconsistencies, define the canonical version, sweep the codebase, and verify the fix — the standard retrospective audit process.
Set Up the Screenshot-Driven QA Loop
Install Frontend Review MCP — the visual verification layer that lets agents self-check their work. The workflow: agent makes UI changes → captures before/after screenshots → Frontend Review MCP compares them → agent gets 'yes' (approved) or 'no' (with explanation) → agent refines. This creates a closed verification loop that catches design drift at the moment it happens, not after merge. For more advanced iterative refinement, use Playwright MCP + Pixelmatch to generate visual diff images that feed directly back to the agent as context.
// 1. Install Frontend Review MCP
// In your Claude Code config or Cursor settings:
{
  "mcpServers": {
    "frontend-review": {
      "command": "npx",
      "args": ["-y", "@anthropic-ai/frontend-review-mcp"],
      "env": {
        "REVIEW_MODEL": "Qwen/Qwen2-VL-72B-Instruct"
      }
    },
    "browser-tools": {
      "command": "npx",
      "args": ["-y", "@anthropic-ai/browser-tools-mcp"]
    }
  }
}
// 2. Playwright + Pixelmatch iterative loop
// scripts/visual-diff.ts
import { chromium } from "playwright";
import { PNG } from "pngjs";
import pixelmatch from "pixelmatch";
import { readFileSync, writeFileSync } from "fs";

async function captureAndDiff(url: string, baselinePath: string) {
  const browser = await chromium.launch();
  const page = await browser.newPage({ viewport: { width: 1280, height: 720 } });
  await page.goto(url);
  await page.waitForLoadState("networkidle");
  const currentBuffer = await page.screenshot({ fullPage: true });
  writeFileSync("current.png", currentBuffer);
  await browser.close();

  const baseline = PNG.sync.read(readFileSync(baselinePath));
  const current = PNG.sync.read(currentBuffer);
  // pixelmatch throws if the two images have different dimensions
  if (baseline.width !== current.width || baseline.height !== current.height) {
    throw new Error("Baseline and current screenshot dimensions differ; recapture the baseline.");
  }
  const diff = new PNG({ width: baseline.width, height: baseline.height });
  const numDiffPixels = pixelmatch(
    baseline.data, current.data, diff.data,
    baseline.width, baseline.height,
    { threshold: 0.1 }
  );
  writeFileSync("diff.png", PNG.sync.write(diff));

  const totalPixels = baseline.width * baseline.height;
  const diffPercent = ((numDiffPixels / totalPixels) * 100).toFixed(2);
  console.log(`Diff: ${diffPercent}% (${numDiffPixels} pixels)`);
  return parseFloat(diffPercent);
}
// Usage: agent runs this, sees diff%, refines code, runs again
// Loop until diff < 1%

Add Design System MCP for Live Component Context
Set up Storybook MCP so your agent has live access to your component catalog during coding. This is how monday.com achieved code that 'looks like someone who deeply understands the system wrote it.' The MCP exposes component lists, prop types with defaults, example code from stories, and documentation — all as structured JSON the agent queries in real-time.
// .storybook/main.ts — enable Component Manifest + MCP
const config = {
  addons: ["@storybook/addon-mcp"],
  experimentalComponentsManifest: true,
};
export default config;

// Claude Code MCP config:
{
  "mcpServers": {
    "storybook": {
      "command": "npx",
      "args": ["storybook", "mcp", "--config-dir", ".storybook"]
    }
  }
}

// Now your agent can query:
// "What props does the Card component accept?"
// "Show me the Button component variants"
// "What's the correct import path for PageLayout?"

// For Figma-connected workflows, add Figma MCP:
{
  "mcpServers": {
    "figma": {
      "command": "npx",
      "args": ["-y", "@anthropic-ai/figma-mcp"],
      "env": { "FIGMA_ACCESS_TOKEN": "your-token" }
    }
  }
}

Implement CI/CD Design Gates
Add automated enforcement that blocks merges when design drift is detected. Three gates: (1) ESLint custom rules ban raw values, (2) Stylelint detects non-token CSS values, (3) Playwright visual regression catches pixel-level drift. Builds fail on violations — this is the 'wash your hands' of design consistency.
# .github/workflows/design-gate.yml
name: Design Consistency Gate
on: [pull_request]
jobs:
  lint-design:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: "20" }
      - run: npm ci
      # Gate 1: No hardcoded design values
      - name: Lint design tokens
        run: npx eslint --rule 'no-hardcoded-design-values: error' 'src/**/*.{ts,tsx}'
      # Gate 2: No non-token CSS values
      - name: Lint CSS
        run: npx stylelint 'src/**/*.css'
      # Gate 3: Visual regression
      - name: Install Playwright
        run: npx playwright install --with-deps chromium
      - name: Start dev server
        run: npm run dev &
        env: { PORT: "3000" }
      - name: Wait for server
        run: npx wait-on http://localhost:3000 --timeout 30000
      - name: Run visual regression
        run: npx playwright test tests/visual-regression.spec.ts
      - name: Upload diff artifacts
        if: failure()
        uses: actions/upload-artifact@v4
        with:
          name: visual-diffs
          path: tests/visual-regression.spec.ts-snapshots/
      # For Percy (premium):
      # - name: Percy snapshot
      #   run: npx percy exec -- npx playwright test
      #   env: { PERCY_TOKEN: ${{ secrets.PERCY_TOKEN }} }

Create a Session-Start Design Checklist
Every agent session should start by loading design context. Create a hook or checklist that runs before any UI work. This is the 'context engineering' layer — making sure the agent has the right information before it starts coding.
// scripts/design-preflight.ts
// Run at the start of every agent session that touches UI
import { readFileSync, existsSync } from "fs";

function preflight() {
  const checks: { name: string; pass: boolean; detail: string }[] = [];

  // Check 1: Design tokens exist
  const tokensExist = existsSync("src/styles/tokens.css");
  checks.push({
    name: "Design tokens file",
    pass: tokensExist,
    detail: tokensExist ? "src/styles/tokens.css found" : "MISSING — create tokens first",
  });

  // Check 2: design-system.ts exists
  const dsExist = existsSync("src/lib/design-system.ts");
  checks.push({
    name: "Design system constants",
    pass: dsExist,
    detail: dsExist ? "src/lib/design-system.ts found" : "MISSING — create DS constants",
  });

  // Check 3: CLAUDE.md has design section
  if (existsSync("CLAUDE.md")) {
    const claude = readFileSync("CLAUDE.md", "utf-8");
    const hasDesign = claude.includes("Design System") || claude.includes("design-system");
    checks.push({
      name: "CLAUDE.md design section",
      pass: hasDesign,
      detail: hasDesign ? "Design rules found in CLAUDE.md" : "MISSING — add design rules to CLAUDE.md",
    });
  } else {
    checks.push({
      name: "CLAUDE.md design section",
      pass: false,
      detail: "MISSING — create CLAUDE.md with design rules",
    });
  }

  // Check 4: Visual regression baselines exist
  const baselinesExist = existsSync("tests/visual-regression.spec.ts-snapshots");
  checks.push({
    name: "VRT baselines",
    pass: baselinesExist,
    detail: baselinesExist ? "Baselines captured" : "Run: npx playwright test --update-snapshots",
  });

  // Report
  console.log("\n=== Design Preflight Check ===\n");
  for (const c of checks) {
    console.log(` ${c.pass ? "✓" : "✗"} ${c.name}: ${c.detail}`);
  }
  const allPass = checks.every((c) => c.pass);
  console.log(`\n${allPass ? "All checks passed." : "⚠ Fix failing checks before UI work."}\n`);
  return allPass;
}

preflight();

Build the Design Audit Script (Retrospective Sweep)
Create a script that audits the entire codebase for design drift — hardcoded hex values, non-standard fonts, incorrect max-widths, missing breadcrumbs, and other violations. Run this after each major build session or as part of your weekly maintenance. This is the retrospective layer that catches anything the CI gates missed.
// scripts/lint-design.ts
import { readFileSync, readdirSync, statSync, existsSync } from "fs";
import { join } from "path";

interface Violation {
  file: string;
  line: number;
  rule: string;
  value: string;
}

const violations: Violation[] = [];
const HEX_REGEX = /#[0-9a-fA-F]{3,8}/g;
const BANNED_FONTS = /["']?(Inter|Georgia|DM Serif|Times|serif)["']?/gi;
const BANNED_WIDTHS = /max-width:\s*(\d+)px/g;
const VALID_WIDTHS = [1200, 860];

function scanFile(filePath: string) {
  if (!filePath.match(/\.(tsx?|css)$/)) return;
  if (filePath.includes("node_modules")) return;
  const content = readFileSync(filePath, "utf-8");
  const lines = content.split("\n");
  lines.forEach((line, i) => {
    // Rule 1: No hardcoded hex
    const hexMatches = line.match(HEX_REGEX);
    if (hexMatches) {
      for (const hex of hexMatches) {
        // Skip CSS variable declarations in globals.css
        if (filePath.endsWith("globals.css") && line.includes("--")) continue;
        violations.push({ file: filePath, line: i + 1, rule: "no-hardcoded-hex", value: hex });
      }
    }
    // Rule 2: No banned fonts
    const fontMatches = line.match(BANNED_FONTS);
    if (fontMatches) {
      for (const font of fontMatches) {
        violations.push({ file: filePath, line: i + 1, rule: "no-banned-font", value: font });
      }
    }
    // Rule 3: Only valid max-widths
    let widthMatch;
    while ((widthMatch = BANNED_WIDTHS.exec(line)) !== null) {
      const width = parseInt(widthMatch[1], 10);
      if (!VALID_WIDTHS.includes(width)) {
        violations.push({ file: filePath, line: i + 1, rule: "invalid-max-width", value: `${width}px` });
      }
    }
  });
}

function scanDir(dir: string) {
  if (!existsSync(dir)) return; // skip roots that don't exist in this repo
  for (const entry of readdirSync(dir)) {
    const full = join(dir, entry);
    if (statSync(full).isDirectory()) {
      if (!["node_modules", ".next", ".git", "dist"].includes(entry)) {
        scanDir(full);
      }
    } else {
      scanFile(full);
    }
  }
}

// Run scan
scanDir("src");
scanDir("app");

// Report
if (violations.length === 0) {
  console.log("\n✓ No design violations found.\n");
} else {
  console.log(`\n✗ ${violations.length} design violations found:\n`);
  for (const v of violations) {
    console.log(` ${v.file}:${v.line} — [${v.rule}] ${v.value}`);
  }
  process.exit(1);
}

Establish the Drift Detection Workflow
Create a lightweight daily/weekly process that catches drift before it compounds. The key insight: budget 20-30% of capacity for design debt remediation (Pixelmojo framework). Treat design drift like bugs — track it, triage it, fix it. The workflow: (1) Run lint-design.ts at session end, (2) Run visual regression after every PR, (3) Review visual diffs weekly, (4) Update baselines only when design changes are intentional.
// Add to your CLAUDE.md or session protocol:
//
// ## Session-End Design Checklist
// 1. Run: npx tsx scripts/lint-design.ts
// 2. Run: npx playwright test tests/visual-regression.spec.ts
// 3. If violations found → fix before committing
// 4. If visual diffs found → verify they are intentional
// 5. Update baselines ONLY for intentional changes:
// npx playwright test --update-snapshots
# Claude Code Hook (runs automatically on UI file changes):
# .claude/hooks/post-edit.sh
#!/bin/bash
CHANGED_FILES=$(git diff --name-only HEAD)
if echo "$CHANGED_FILES" | grep -qE '\.(tsx|css)$'; then
echo "UI files changed — running design lint..."
npx tsx scripts/lint-design.ts
fi
// Weekly design health metric:
// scripts/design-health.ts
import { execSync } from "child_process";

const lintOutput = execSync("npx tsx scripts/lint-design.ts 2>&1 || true", { encoding: "utf-8" });
const violationCount = parseInt(lintOutput.match(/(\d+) design violations/)?.[1] ?? "0", 10);

const vrtResult = execSync("npx playwright test tests/visual-regression.spec.ts 2>&1 || true", { encoding: "utf-8" });
const vrtFailed = vrtResult.includes("failed");

console.log("=== Design Health Report ===");
console.log(` Lint violations: ${violationCount}`);
console.log(` Visual regression: ${vrtFailed ? "DRIFT DETECTED" : "Clean"}`);
console.log(` Health: ${violationCount === 0 && !vrtFailed ? "HEALTHY" : "NEEDS ATTENTION"}`);

Agent ignores CLAUDE.md design rules and generates inconsistent UI
CLAUDE.md is context, not enforcement. Add the CI gate (Step 3) — builds fail on violations regardless of agent behavior. The linter catches what the context misses. monday.com's key insight: 'The difference is whether the code already conforms to the design system or whether that work is left to the developer and the review process.'
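A minimal sketch of what such a custom lint rule could look like, using ESLint's rule shape (`meta` plus a `create` function returning AST visitors). The rule name, message, and hex-only check are illustrative; a real setup would package this in an ESLint plugin and extend it to spacing and font values:

```typescript
// Sketch of a custom ESLint-style rule that flags raw hex colors in string
// literals. Illustrative only — plugin packaging and config wiring omitted.
type ReportDescriptor = { node: unknown; messageId: string; data: { value: string } };

const noHardcodedDesignValues = {
  meta: {
    type: "problem" as const,
    messages: { hardcoded: "Use a design token instead of raw value '{{value}}'." },
    schema: [],
  },
  create(context: { report: (descriptor: ReportDescriptor) => void }) {
    const HEX = /#[0-9a-fA-F]{3,8}/;
    return {
      // Visit every literal node; report string literals containing a hex color
      Literal(node: { value: unknown }) {
        if (typeof node.value === "string" && HEX.test(node.value)) {
          context.report({ node, messageId: "hardcoded", data: { value: node.value } });
        }
      },
    };
  },
};
```

Because the gate runs in CI, the rule fires whether the violation came from a human, Claude Code, or an unsupervised Codex dispatch.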
Visual regression has too many false positives (dynamic content, animations, dates)
Set Playwright's maxDiffPixelRatio to 0.01–0.02 (i.e. tolerate 1–2% of pixels differing). Mask dynamic regions with Playwright's mask option. For heavy dynamic content, switch to Applitools Eyes, which uses semantic AI to recognize dynamic elements. Percy's AI agent automatically classifies 40% of diffs as false positives.
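Both options live on Playwright's `toHaveScreenshot` assertion. A sketch of the spec file the CI gate runs (the URL, snapshot name, and masked selectors are assumptions for your app):

```typescript
// tests/visual-regression.spec.ts — illustrative spec; selectors are assumptions
import { test, expect } from "@playwright/test";

test("homepage matches baseline", async ({ page }) => {
  await page.goto("http://localhost:3000");
  await page.waitForLoadState("networkidle");
  await expect(page).toHaveScreenshot("homepage.png", {
    fullPage: true,
    // Tolerate up to 2% of pixels differing (antialiasing, minor reflow)
    maxDiffPixelRatio: 0.02,
    // Black out regions whose content legitimately changes every run
    mask: [page.locator("[data-testid='timestamp']"), page.locator(".live-feed")],
  });
});
```

On first run this captures the baseline; later runs compare against it and write a diff image on failure, which the workflow uploads as an artifact.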
Multiple agents working in parallel create conflicting design changes
Visual regression baselines are stored in Git — merge conflicts surface design conflicts. Each agent branch gets compared against main's baselines. Use feature flags to gate large visual changes. The CI design gate (Step 3) runs on every PR, catching conflicts before merge.
Existing codebase has hundreds of violations — lint-design.ts output is overwhelming
Use 'progressive adoption': add violations to an allowlist initially, then resolve them in batches. Each sprint, reduce the allowlist by 20%. Track the metric weekly (Step 6). Never add new violations — only reduce the existing count.
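That allowlist can bolt onto lint-design.ts as a post-scan filter: known debt is recorded once, and only violations absent from the list fail the build. A sketch (the key format and example file names are assumptions):

```typescript
// Progressive-adoption filter for lint-design.ts (sketch).
interface Violation {
  file: string;
  line: number;
  rule: string;
  value: string;
}

// Keyed by file:rule:value — line numbers shift too often to make stable keys
function violationKey(v: Violation): string {
  return `${v.file}:${v.rule}:${v.value}`;
}

// Violations on the allowlist are tracked debt, not build failures;
// shrink the list each sprint and never let new keys in
function filterNewViolations(found: Violation[], allowlist: string[]): Violation[] {
  const known = new Set(allowlist);
  return found.filter((v) => !known.has(violationKey(v)));
}

const allowlist = ["src/legacy/Header.tsx:no-hardcoded-hex:#1a1a2e"];
const found: Violation[] = [
  { file: "src/legacy/Header.tsx", line: 12, rule: "no-hardcoded-hex", value: "#1a1a2e" },
  { file: "src/NewCard.tsx", line: 4, rule: "no-hardcoded-hex", value: "#ff6b6b" },
];
console.log(filterNewViolations(found, allowlist)); // only the NewCard violation remains
```

Fail the build only on the filtered list, and report the allowlist's size as the weekly health metric so the burn-down is visible.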