Simulation Bench
Methodology
Scoring guide: the 100-point human rubric.
Run protocol: required deliverables, token capture, intervention recording.
How this dashboard is built: data sources and pipeline.