AHD · Positioning
What AHD is
AHD is a guardrail and evaluation layer for AI-generated design. Web UI, graphic design, illustration, image generation. It is not a generator itself; it sits beside any generator and measures whether the output exhibits the specific, repeated failure modes that mark AI-generated design as AI-generated.
Four pieces, one purpose
- A named taxonomy of AI design slop.
  Thirty-eight concrete tells across web, graphic and typographic surfaces. Enforced today by twenty-eight HTML and CSS rules, three SVG rules, and thirteen vision-critic rules on rendered pixels. The rule count is higher than the taxonomy count because several entries are covered by more than one rule.
- Style tokens as promptable design direction.
  Ten curated bundles spanning web, editorial, identity, illustration and image-generation surfaces. Each declares grid or composition, type, palette, forbidden list, required quirks, reference lineage and per-model prompt fragments.
- A brief compiler.
  Turns a structured intent into constrained model instructions for any surface, with a final mode for single-shot output and a draft mode for human-in-the-loop exploration.
- An empirical eval loop.
  A controlled raw-vs-compiled comparison across any set of text or image generators, scored against the taxonomy, with attempted-vs-scored counts, canonical model identifiers, per-model deltas and per-tell frequency. Vision critique on rendered pixels via a multimodal critic.
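The raw-vs-compiled loop above can be sketched in a few lines. This is a hypothetical illustration only: the record shape, function names (`run_pair`, `compile_brief`, `generate`, `lint`) and rule interface are invented for this sketch, not AHD's actual API.

```python
from dataclasses import dataclass

@dataclass
class PairResult:
    model: str                 # canonical model identifier
    attempted: int = 0         # generations requested
    scored: int = 0            # outputs that survived extraction and were lintable
    raw_tells: int = 0         # taxonomy hits under the raw brief
    compiled_tells: int = 0    # taxonomy hits under the compiled brief

    @property
    def delta(self) -> int:
        # Negative: the compiled brief reduced tells. Positive: a regression,
        # which the harness publishes rather than hides.
        return self.compiled_tells - self.raw_tells

def run_pair(model: str, raw_brief: str, compile_brief, generate, lint) -> PairResult:
    """Score one model on the same brief, raw and compiled."""
    result = PairResult(model=model)
    for label, prompt in (("raw", raw_brief), ("compiled", compile_brief(raw_brief))):
        result.attempted += 1
        output = generate(model, prompt)
        if output is None:            # extraction failure: counted, never scored
            continue
        result.scored += 1
        tells = lint(output)          # deterministic rules plus vision critic
        if label == "raw":
            result.raw_tells = len(tells)
        else:
            result.compiled_tells = len(tells)
    return result
```

Per-tell frequency falls out of aggregating the `lint` findings across models; the per-model delta is the headline number a published run reports.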
What AHD is not
Not a prompt pack. Prompt packs sell style recipes. AHD's value is the reproducible scoring that tells you whether any recipe — ours or yours — actually moves a given model off its median.
Not a canvas product. Galileo, v0, Lovable, Bolt, Magic Patterns, Subframe optimise "prompt to shipped UI." Midjourney, Krea, Lovart optimise "prompt to image." AHD sits beside any of them as an enforcement layer.
Not a design system. Design systems prescribe components. AHD prescribes the thirty-eight patterns a page or image must not exhibit, and measures compliance.
What makes this defensible
The moat is not the prompts. The moat is the taxonomy plus reproducible scoring. Anyone can rewrite a prompt; a named, versioned taxonomy with deterministic lint rules and a vision critic is an artefact that compounds with use. Anyone can fork a style token; a per-release eval harness that publishes attempted counts, extraction failures, exact model identifiers, confidence intervals and negative results is a cultural commitment competitors rarely match.
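To make "deterministic lint rules" concrete, here is a toy source rule in the spirit of the linter. The rule name, the hex values and the finding format are all invented for this sketch; none of AHD's real rules are shown.

```python
import re

# Toy rule: flag the stock purple-to-indigo gradient that generated pages
# default to. The specific hues are illustrative assumptions.
PURPLE_GRADIENT = re.compile(
    r"linear-gradient\([^)]*#(?:7c3aed|8b5cf6|a855f7|6366f1)",
    re.IGNORECASE,
)

def check_gradient_tell(css: str) -> list[str]:
    # A pure function of the source text: same input, same findings,
    # which is what makes the score reproducible across runs.
    return [f"gradient-tell at offset {m.start()}"
            for m in PURPLE_GRADIENT.finditer(css)]
```

Running it over a stylesheet containing `linear-gradient(135deg, #7C3AED, #3B82F6)` yields one finding; a neutral palette yields none. Vision-critic rules cover the tells that cannot be decided from source this way.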
Prior art
Pieces of AHD exist in the wild. The combination does not.
Prompt libraries for AI UI generation (uiprompt.io, Promter, GenDesigns, WebGardens) encode style direction; they do not carry a taxonomy or an eval.
Design-token linters (@lapidist/design-lint, stylelint-design-tokens-plugin) enforce token consistency in source. AHD's rules target AI-generated anti-patterns, not adherence to an internal design system.
Figma-era audit tools (DesignLint AI) audit design files against token rules. AHD audits rendered output and source, not design files.
AI UI benchmarks (UI Bench) score generated HTML on engineering quality — axe, Lighthouse, semantics. AHD rates a page's slop fingerprint under a paired raw-vs-compiled control.
What nobody else bundles: a named AI-slop taxonomy spanning web and image, a token-driven brief compiler, a deterministic linter for source-checkable tells, a vision critic for rendered tells, and a raw-vs-compiled empirical eval loop, all in one reproducible project.
What we promise and what we don't
We promise an honest, versioned taxonomy spanning web, graphic and illustration. We promise a deterministic source-level linter covering every taxonomy entry that can be decided from code. We promise a vision-critic pipeline that works on any rendered image. We promise an eval harness that publishes attempted, extracted and scored counts, canonical model identifiers, and per-model deltas including negative results.
We do not promise the compiled brief beats the raw brief for every model. It does not. The measured run publishes Claude Opus 4.7 dropping tells to zero, Qwen 2.5 Coder unmoved, Llama 3.3 70B regressing under the compiled prompt, and SDXL Lightning ignoring the image negative prompt entirely. The framework exposes these differences; it does not paper over them.
We do not promise aesthetic judgement. The linter catches tells, not taste. A page or image can pass every rule and still be bad design. AHD narrows the output; a human still picks.
Adjacent reading: the thirty-eight-tell taxonomy, how we measure, how to use AHD in production.