ResearchGate — From Holistic Evaluation to Structured Criteria: A Survey of Rubrics Across the Evolving LLM Landscape
preprint · source date 2026-05-31
1. Problems / challenges / motivations - As LLMs move from task-specific systems toward open-ended agents, one scalar score is often too opaque. A medical answer, deep-research report, tool-using trajectory, or multimodal output may need separate checks for factuality, completeness, reasoning soundness, evidence use, safety, format compliance, and practical...