How this dashboard is built — Simulation Bench

The dashboard is a static site rebuilt from two sources of truth in the repository:

scores/scores.db — SQLite DB rebuilt from scores/seed_scores.json.
submissions/<id>/ — one folder per run, holding code, results, conceptual model, and per-run metadata.
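
As a rough illustration of the first source, the sketch below recreates a SQLite file from a JSON seed. The seed layout (a list of objects with submission_id and total_score keys), the table definition, and the function name rebuild_scores_db are assumptions made for this example. The repository's real schema may differ.

```python
import json
import sqlite3
from pathlib import Path


def rebuild_scores_db(seed_path: Path, db_path: Path) -> int:
    """Recreate the SQLite database from the JSON seed; return rows written."""
    # Assumed seed layout: [{"submission_id": "...", "total_score": 7.5}, ...]
    records = json.loads(seed_path.read_text(encoding="utf-8"))

    # Start from an empty file so the database cannot drift from the seed.
    db_path.unlink(missing_ok=True)
    conn = sqlite3.connect(db_path)
    try:
        # Hypothetical schema: one row per submission.
        conn.execute(
            "CREATE TABLE scores ("
            "submission_id TEXT PRIMARY KEY, "
            "total_score REAL NOT NULL)"
        )
        conn.executemany(
            "INSERT INTO scores (submission_id, total_score) VALUES (?, ?)",
            [(r["submission_id"], r["total_score"]) for r in records],
        )
        conn.commit()
    finally:
        conn.close()
    return len(records)
```

Deleting and recreating the file on every build keeps the JSON seed as the only thing that needs review in version control, since the database is then a disposable build artifact.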

Pipeline

scores/scores.db ─┐
submissions/      ├─→ harness/build_dashboard.py ─→ dashboard/src/  ─→ astro build ─→ dist/  ─→ fly deploy
docs/methodology  ┘

Each value on the dashboard is read from a fixed field:

Quality: scores.scores.total_score, sourced from seed_scores.json.
Tokens: token_usage.json.total_tokens per submission. The method used to obtain the count (exact/reported/estimated/unknown) is shown on hover.
Time: run_metrics.json.runtime_seconds.
Intervention: submission.yaml.intervention.category.
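
The field lookups above can be sketched as a small reader for one submission folder. This is not the code in harness/build_dashboard.py. It assumes that token_usage.json and run_metrics.json sit directly inside submissions/<id>/, that the token method is stored under a "method" key, and that a missing file should yield None rather than an error. The function names are hypothetical. submission.yaml is left out because reading it needs a third-party YAML parser.

```python
import json
from pathlib import Path
from typing import Any


def read_json(path: Path) -> dict[str, Any]:
    """Return the parsed JSON object, or an empty dict if the file is absent."""
    if not path.is_file():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def collect_metrics(submission_dir: Path) -> dict[str, Any]:
    """Gather the Tokens and Time values for one submission folder."""
    tokens = read_json(submission_dir / "token_usage.json")
    run = read_json(submission_dir / "run_metrics.json")
    return {
        "id": submission_dir.name,
        # Field names below come from the list above.
        "total_tokens": tokens.get("total_tokens"),
        "runtime_seconds": run.get("runtime_seconds"),
        # Assumed key name; falls back to the documented "unknown" value.
        "token_method": tokens.get("method", "unknown"),
    }
```

For example, collect_metrics(Path("submissions") / "some-run") would return a dictionary with the four keys above, with None in place of any value whose file or field is missing.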

The site is fully baked into a Caddy container — there is no runtime database, API, or auth surface. To rebuild:

make dashboard   # rebuild from sources
make deploy      # build + push to fly.io