Evaluation Tool | 2026-04-28 | template
LLM Experience Evaluation Scorecard
A scorecard for turning AI experience quality into measurable and reviewable dimensions.
LLM Eval, Scorecard, Design Ops
Scoring dimensions
The scorecard covers seven dimensions: intent understanding, factuality, context continuity, explainability, controllability, recovery, and task completion quality.
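One way to make these dimensions concrete is a small scoring record. The sketch below is illustrative only: the 1-to-5 scale, the identifier names, and the unweighted average are assumptions, not part of the template.

```python
from dataclasses import dataclass

# Dimension names come from the scorecard; the identifiers are illustrative.
DIMENSIONS = (
    "intent_understanding",
    "factuality",
    "context_continuity",
    "explainability",
    "controllability",
    "recovery",
    "task_completion_quality",
)


@dataclass
class Scorecard:
    """Scores for one evaluated interaction, one score per dimension."""

    scores: dict[str, int]  # dimension name -> score on an assumed 1-5 scale

    def __post_init__(self) -> None:
        # Require exactly the listed dimensions so scorecards stay comparable.
        missing = set(DIMENSIONS) - set(self.scores)
        unknown = set(self.scores) - set(DIMENSIONS)
        if missing or unknown:
            raise ValueError(f"missing={sorted(missing)} unknown={sorted(unknown)}")
        for name, value in self.scores.items():
            if not 1 <= value <= 5:
                raise ValueError(f"{name} must be between 1 and 5, got {value}")

    def overall(self) -> float:
        # Unweighted mean; a team may prefer weights or per-dimension review.
        return sum(self.scores.values()) / len(self.scores)
```

An unweighted mean is the simplest summary, but it can hide a single failing dimension, so reviewing the per-dimension scores alongside it is usually worthwhile.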
Usage
Keep samples, scoring rationale, and version information so that scores can be compared across future model and product changes.
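As a rough sketch of what such a record might look like, the example below stores each scored sample with its rationale and version fields, then compares two model versions on the samples they share. All field and function names here are hypothetical, and comparing by mean score difference is just one possible choice.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredSample:
    """One score for one sample on one dimension (hypothetical schema)."""

    sample_id: str
    dimension: str
    score: int
    rationale: str        # why the reviewer gave this score
    model_version: str    # version information kept for later comparison
    product_version: str


def score_deltas(records, baseline, candidate):
    """Mean score change per dimension from baseline to candidate model version.

    Only samples scored under both versions are compared, so the result is
    not skewed by samples that exist for one version only.
    """
    by_key = {(r.model_version, r.dimension, r.sample_id): r.score for r in records}
    diffs = defaultdict(list)
    for (version, dimension, sample_id), score in by_key.items():
        if version != candidate:
            continue
        base_score = by_key.get((baseline, dimension, sample_id))
        if base_score is not None:
            diffs[dimension].append(score - base_score)
    return {dim: sum(vals) / len(vals) for dim, vals in diffs.items()}
```

Keeping the rationale next to each score makes it possible to tell later whether a change in score reflects a change in the model or a change in how reviewers applied the scale.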