AHD · Artificial Human Design · v0.5.0-beta
Make it specific.
A guardrail and evaluation layer for AI-generated design. A named taxonomy of thirty-eight slop tells, a token-driven brief compiler, a deterministic linter, and a reproducible raw-vs-compiled eval loop. Across web UI and image generation.
Measured · 21 April 2026
Same brief, same seed, raw vs AHD-compiled. Five models, n=5 per cell. Aggregate 2.08 → 1.04 tells per page. Per-model readings follow the table.
- Claude Opus 4.7 · 100%↓
- Mistral Small 3.1 · 62%↓
- Llama 4 Scout 17B · 50%↓
- Qwen 2.5 Coder 32B · 0%
- Llama 3.3 70B · 150%↑ (regressed)
Claude Opus 4.7 dropped from a mean of 1.20 slop tells per raw page to zero under the compiled prompt. Five of five compiled samples scored clean across the twenty-eight-rule source linter. This is the cleanest result in the run; on an editorial-landing brief, Claude appears to respect every element of the compiled system prompt.
Mistral Small 3.1 dropped from 3.25 raw mean tells to 1.25 compiled. Four of five samples were scored in each cell: one raw and one compiled sample failed extraction (no usable HTML in the response). The gap between attempted and scored is published in the report rather than silently filtered.
Llama 4 Scout dropped from 2.40 raw mean tells to 1.20 compiled, with five of five samples scored in both conditions. A clean directional improvement on a small model: the compiled brief moved Scout toward the token's declared structure (grid, type pairing) without tripping new rules.
Qwen 2.5 Coder's raw and compiled means were identical at 2.80 tells per page: the compiled system prompt produced output that trips the same rules at the same rate as the raw prompt. Qwen is a code-specialised model and appears to hold its defaults across prompt changes on this brief. The framework's correct move here is to route around Qwen for this token, not to claim the intervention worked.
Llama 3.3 70B went from 0.40 raw mean tells to 1.00 compiled, a regression. Its raw output is a typographically thin page that trips few rules because it declares little; the compiled prompt elicits a richer page with more decision surface, which trips more rules. This is not a framework failure: it is the framework correctly reporting that the compiled layer does not help every model.
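For concreteness, the arithmetic behind these readings fits in a few lines of Python. This is a sketch, not the AHD eval code: the per-sample tell counts below are invented to reproduce the published Mistral means, and only the means, deltas, and scored counts come from the report. Means are taken over scored samples only; the attempted-vs-scored gap travels with the result.

```python
# Illustrative aggregation: mean tells per page and percent delta,
# computed over scored samples only. Not the AHD eval implementation.

def summarise(raw_tells, compiled_tells, attempted=5):
    """raw_tells / compiled_tells: per-sample tell counts that survived extraction."""
    raw_mean = sum(raw_tells) / len(raw_tells)
    compiled_mean = sum(compiled_tells) / len(compiled_tells)
    delta_pct = (compiled_mean - raw_mean) / raw_mean * 100 if raw_mean else float("nan")
    return {
        "raw_mean": raw_mean,                       # e.g. 3.25
        "compiled_mean": compiled_mean,             # e.g. 1.25
        "delta_pct": round(delta_pct),              # e.g. -62
        "scored": (len(raw_tells), len(compiled_tells)),
        "attempted": (attempted, attempted),        # published, never filtered
    }

# Mistral Small 3.1: one raw and one compiled sample failed extraction,
# so four of five samples are scored per condition. The counts here are
# hypothetical values consistent with the published 3.25 → 1.25 means.
print(summarise([4, 3, 3, 3], [1, 1, 2, 1]))
```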
Full report with attempted-vs-scored counts, per-tell frequency table, and the run manifest: eval · 21 April 2026. How to read the numbers: methodology.
Four pieces
- Named taxonomy · Thirty-eight concrete slop tells across web, graphic, and typographic surfaces, enforced by 28 HTML/CSS rules, 3 SVG rules, and 13 vision-critic rules on rendered pixels. Read the taxonomy. A minimal rule sketch follows this list.
- Style tokens · Ten curated design directions spanning Swiss-Editorial, Manual SF, Neubrutalist-Gumroad, Post-Digital, Monochrome-Editorial, Memphis-Clash, Heisei-Retro, Bauhaus-Revival, Editorial-Illustration and Ad-Creative-Collision. Each declares its own forbidden list, required quirks and reference lineage, sketched below the list.
- Brief compiler · ahd compile takes a structured intent and emits a token-anchored system prompt for any LLM. Draft mode for exploration, final mode for single-shot output. See how, and see the compiler sketch after this list.
- Empirical eval · Raw-vs-compiled controlled comparison across Claude Opus 4.7, GPT-5, Gemini 3 Pro, Llama 3.3 70B, Llama 4 Scout, Mistral Small 3.1, Qwen 2.5 Coder, DeepSeek R1, and the image generators FLUX.1 schnell, SDXL Lightning and DreamShaper. Attempted, extracted, and scored counts published. Negative results first-class.
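To make the deterministic source linter concrete: a rule in this style can be as small as a compiled pattern plus a message, with each match counted as one tell. The rule below is a hypothetical illustration (invented ID, pattern, and message), not one of the actual 28 HTML/CSS rules.

```python
# Hypothetical shape of one deterministic source-linter rule.
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    rule_id: str
    pattern: re.Pattern
    message: str

    def check(self, source: str) -> list[str]:
        # Every match of the pattern counts as one tell on this page.
        return [f"{self.rule_id}: {self.message}" for _ in self.pattern.finditer(source)]

# Invented example rule: flag the stock 135deg hero gradient.
GENERIC_GRADIENT = Rule(
    rule_id="web-xx",
    pattern=re.compile(r"linear-gradient\(\s*135deg", re.IGNORECASE),
    message="default 135deg gradient, a common generation tell",
)

page = "<div style='background: linear-gradient(135deg, #667eea, #764ba2)'></div>"
print(GENERIC_GRADIENT.check(page))  # one tell
```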
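Each style token is described as declaring a forbidden list, required quirks, and a reference lineage. A minimal sketch of that shape, with invented field values rather than the real Swiss-Editorial token:

```python
# Hypothetical token structure; the three declared fields come from the
# description above, the example values are made up.
from dataclasses import dataclass

@dataclass(frozen=True)
class StyleToken:
    name: str
    forbidden: tuple[str, ...]        # moves the compiled prompt must rule out
    required_quirks: tuple[str, ...]  # moves the compiled prompt must demand
    lineage: tuple[str, ...]          # reference works the token descends from

SWISS_EDITORIAL = StyleToken(
    name="Swiss-Editorial",
    forbidden=("decorative gradients", "rounded hero cards"),
    required_quirks=("strict modular grid", "single type pairing"),
    lineage=("Müller-Brockmann grid systems",),
)
```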
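And the compiler, reduced to its described contract: structured intent in, token-anchored system prompt out, with a draft mode and a final mode. This reuses the StyleToken sketch above and illustrates the contract only; it is not the actual ahd compile implementation, and the prompt wording is invented.

```python
# Illustrative compile step; assumes StyleToken / SWISS_EDITORIAL from
# the sketch above.

def compile_brief(intent: dict, token: "StyleToken", mode: str = "final") -> str:
    lines = [
        f"You are producing a {intent['surface']} for: {intent['goal']}.",
        f"Style token: {token.name}.",
        "Forbidden: " + "; ".join(token.forbidden) + ".",
        "Required quirks: " + "; ".join(token.required_quirks) + ".",
        "Lineage: " + "; ".join(token.lineage) + ".",
    ]
    if mode == "draft":
        lines.append("Draft mode: propose variations before committing.")
    else:
        lines.append("Final mode: one single-shot, production-ready output.")
    return "\n".join(lines)

print(compile_brief(
    {"surface": "landing page", "goal": "an editorial newsletter"},
    SWISS_EDITORIAL,
))
```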