OrgX Autonomous Initiative Benchmark
Public scorecards, datasets, and methodology notes for the OrgX benchmark program.
Published benchmark weeks
- local-openai-gpt-5-nano-full-public-judge-20260411: benchmark version 2026-q1 with 15 tasks
- local-openai-gpt-5-nano-full-judge-20260530: benchmark version 2026-q1 with 15 tasks