OrgX Benchmark Week local-openai-gpt-5-nano-full-public-judge-20260411
Public benchmark scorecard, dataset, and task bundle for OrgX Benchmark Week local-openai-gpt-5-nano-full-public-judge-20260411.
Benchmark summary
- Benchmark version: 2026-q1
- Tasks evaluated: 15
- Dataset: /benchmarks/local-openai-gpt-5-nano-full-public-judge-20260411/examples.json
- Scorecard: /benchmarks/local-openai-gpt-5-nano-full-public-judge-20260411/scorecard.csv
- Independent judgments: /benchmarks/local-openai-gpt-5-nano-full-public-judge-20260411/judgments.json
- Inspectable artifacts: 45
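
The three files listed above could be read with the Python standard library along the lines of the sketch below. It assumes a local mirror of the site: `site_root` and `load_benchmark_files` are hypothetical names introduced here for illustration, and the sketch assumes only that `scorecard.csv` has a header row. The internal structure of the two JSON files is not assumed.

```python
import csv
import json
from pathlib import Path

# Run identifier copied from the paths listed in the summary.
RUN_ID = "local-openai-gpt-5-nano-full-public-judge-20260411"


def load_benchmark_files(site_root):
    """Load the dataset, scorecard, and judgments for RUN_ID.

    `site_root` is a hypothetical local directory mirroring the site layout,
    so the files are expected under <site_root>/benchmarks/<RUN_ID>/.
    """
    run_dir = Path(site_root) / "benchmarks" / RUN_ID

    # Parsed as generic JSON; no schema is assumed for either file.
    with open(run_dir / "examples.json", encoding="utf-8") as f:
        examples = json.load(f)
    with open(run_dir / "judgments.json", encoding="utf-8") as f:
        judgments = json.load(f)

    # Assumption: the CSV has a header row, so each data row becomes a dict
    # keyed by column name, with all values kept as strings.
    with open(run_dir / "scorecard.csv", newline="", encoding="utf-8") as f:
        scorecard_rows = list(csv.DictReader(f))

    return examples, scorecard_rows, judgments
```

Calling `load_benchmark_files("path/to/mirror")` returns the parsed dataset, a list of scorecard rows, and the parsed judgments. Comparing the number of loaded items against the counts in the summary is left to the reader, since the file schemas are not documented here.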