Eval framework. Define correct, test against it, get results.
Updated Feb 17, 2026 · Go
A web-based interactive demo for the GuessArena evaluation framework
4-model parallel planning workflow with eval framework — Claude, Gemini, Codex, GLM-5 · OpenClaw ecosystem
🕵️‍♂️ Monitor Claude agents and code execution, identify issues, and gather insights to enhance AI workflows with Claudeye.
MCP server exposing portfolio tools (Semantic Search, Eval Framework, Observability) via Model Context Protocol
Enterprise agent/LLM platform with layered governance (RBAC, audit, policy-as-code); Azure OpenAI and RAG ready.