2026-04-29__001_synthetic_mine_throughput__claude-code__claude-opus-4-7__agent-teams-nelson-max-thinking

Date: 2026-04-29 · Benchmark: 001_synthetic_mine_throughput · Harness: claude-code · Model: claude-opus-4-7 (agent-teams-nelson-max-thinking) · ? Unrecorded

Scores

Category	Points	Max
Conceptual modelling	19	20
Data and topology	15	15
Simulation correctness	19	20
Experimental design	14	15
Results & interpretation	14	15
Code quality	9	10
Traceability	5	5
Total	95	100

Run metrics

Total tokens: — (method: unknown)
Input / output tokens: — / —
Runtime: — s
Reviewer model: unknown · harness: claude-code · on 2026-04-29
Recommendation: Strong submission
Notes: Clean SimPy DES with rigorous warmup-aware metrics and full event-log auditability; Nelson red-cell review correctly flagged a dead module-level dict that wasn't removed before submission, plus the unused PriorityResource priority parameter.

Evaluation report

Automated checks: 53 / 53 (100%)
Behavioural checks: — / —

Scenario	Mean throughput
baseline	12,960
trucks_4	7,813.333
trucks_12	12,996.667
ramp_upgrade	13,033.333
crusher_slowdown	6,560
ramp_closed	12,883.333

Source files

README.md · markdown · 14.9 KB
conceptual_model.md · markdown · 9.1 KB
data/dump_points.csv · csv · 134 B
data/edges.csv · csv · 2.5 KB
data/loaders.csv · csv · 160 B
data/nodes.csv · csv · 1.2 KB
data/scenarios/baseline.yaml · yaml · 632 B
data/scenarios/crusher_slowdown.yaml · yaml · 268 B
data/scenarios/ramp_closed.yaml · yaml · 200 B
data/scenarios/ramp_upgrade.yaml · yaml · 207 B
data/scenarios/trucks_12.yaml · yaml · 112 B
data/scenarios/trucks_4.yaml · yaml · 109 B
data/trucks.csv · csv · 424 B
prompt.md · markdown · 10.5 KB
requirements.txt · text · 91 B
results/evaluation_report.json · json · 12.3 KB
run.py · python · 2.5 KB
src/__init__.py · python · 52 B
src/metrics.py · python · 9.1 KB
src/run_experiments.py · python · 11.7 KB
src/scenario.py · python · 1.6 KB
src/simulation.py · python · 25.0 KB
src/topology.py · python · 7.8 KB
submission.yaml · yaml · 513 B
summary.json · json · 16.5 KB


Conceptual model

Conceptual Model: Synthetic Mine Throughput Simulation

This document describes the conceptual model underlying the discrete-event simulation (DES) of a synthetic open-pit mine haulage system. The simulation is built with SimPy and is designed to answer six operational decision questions about mine throughput under varying fleet sizes, road configurations, and crusher conditions.

System Boundary

Included in the model:

Truck fleet operating over an 8-hour shift
Road network connecting parking, loading points, the crusher, and bypass routes
Two ore loading points (North Pit and South Pit faces)
One primary crusher (dump destination)
Capacity-constrained road segments that limit simultaneous truck occupancy
Stochastic loading, dumping, and travel times
Dispatcher that assigns trucks to loaders and routes them via shortest-time paths

Excluded from the model:

Equipment breakdowns and unplanned downtime
Shift changes and crew availability
Fuel consumption, tyre wear, and maintenance schedules
Weather and ground condition variability
Ore grade and blending requirements
Blast cycles and face advance
Waste haulage (trucks run ore-to-crusher cycles only)
Multi-payload trucks (each truck carries a single 100-tonne payload per cycle)

Entities

Trucks are the only active entities. Each truck is characterised by:

truck_id — unique identifier
payload_tonnes — 100 t per truck
empty_speed_factor — 1.00 (full road speed when empty)
loaded_speed_factor — 0.85 (15 % speed reduction when carrying ore)
start_node — PARK (all trucks begin at the parking area)

Ore payloads are not modelled as separate entities; a loaded truck implicitly carries 100 tonnes and delivers them when it completes a dump cycle.

Resources

Resources limit simultaneous access and cause queuing when saturated.

Resource	Node / Edge	Capacity	Notes
Loader North	LOAD_N	1	Mean load time 6.5 min, SD 1.2 min
Loader South	LOAD_S	1	Mean load time 4.5 min, SD 1.0 min
Crusher	CRUSH	1	Mean dump time 3.5 min, SD 0.8 min
Ramp outbound	E03_UP (J2→J3)	1	Narrow uphill ramp; primary bottleneck
Ramp inbound	E03_DOWN (J3→J2)	1	Same physical constraint, separate edge
Crusher approach outbound	E05_TO_CRUSH (J4→CRUSH)	1	Single-lane dump approach
Crusher approach inbound	E05_FROM_CRUSH (CRUSH→J4)	1	Single-lane return
North pit face access outbound	E07_TO_LOAD_N (J5→LOAD_N)	1	Single-lane face road
North pit face access inbound	E07_FROM_LOAD_N (LOAD_N→J5)	1	Single-lane face road
South pit face access outbound	E09_TO_LOAD_S (J6→LOAD_S)	1	Single-lane face road
South pit face access inbound	E09_FROM_LOAD_S (LOAD_S→J6)	1	Single-lane face road

All edges with capacity < 999 are wrapped as SimPy Resource objects with that capacity. Edges with capacity = 999 are treated as unconstrained (trucks travel freely after a travel-time delay).
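The capacity rule above can be illustrated with a small standard-library sketch. It only partitions edges into "constrained" and "delay-only" sets; in the submission the constrained ones would then be wrapped as SimPy Resource objects. The column names `edge_id` and `capacity` and the three sample rows are assumptions for illustration, not the real contents of edges.csv.

```python
import csv
import io

UNCONSTRAINED = 999  # sentinel capacity described in the text

# Hypothetical excerpt; the real edges.csv has more rows and columns.
SAMPLE = """edge_id,capacity
E03_UP,1
E05_TO_CRUSH,1
E15,999
"""

def split_edges(csv_text):
    """Return (constrained, delay_only); constrained maps edge_id -> capacity."""
    constrained, delay_only = {}, []
    for row in csv.DictReader(io.StringIO(csv_text)):
        capacity = int(row["capacity"])
        if capacity < UNCONSTRAINED:
            constrained[row["edge_id"]] = capacity  # candidate for a SimPy Resource
        else:
            delay_only.append(row["edge_id"])       # travel-time delay only
    return constrained, delay_only

constrained, delay_only = split_edges(SAMPLE)
```

Keeping the threshold in one named constant makes it easy to test scenarios such as a capacity-limited bypass without touching the loading logic.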

Events

Each truck cycles through the following events repeatedly until the shift ends:

Truck dispatched — Dispatcher assigns a truck (at PARK or returning from crusher) to an available loader or the loader with the shortest expected wait.
Truck departs toward loader — Truck acquires capacity on each road segment in sequence along the shortest-time path.
Truck arrives at loader queue — Truck requests the loader resource.
Loading starts — Loader resource granted; loading time sampled from truncated normal.
Loading ends — Truck now carries 100 t; loader resource released.
Truck departs toward crusher — Truck travels loaded at 85 % of road speed, acquiring segment resources along the route.
Truck arrives at crusher queue — Truck requests the crusher resource.
Dumping starts — Crusher resource granted; dump time sampled from truncated normal.
Dumping ends — Truck delivers 100 t to the crusher; crusher resource released; tonnes_delivered counter incremented.
Truck returns empty — Truck travels empty back to PARK (or is immediately re-dispatched if a loader is waiting).
Shift end — At shift_length_hours * 3600 simulation seconds, in-flight cycles are counted if the truck has already completed loading (ore is already in transit); cycles not yet loaded are abandoned.

State Variables

Variable	Description
`truck.location`	Current node or edge of each truck
`truck.loaded`	Boolean — whether the truck is carrying ore
`truck.assigned_loader`	Loader currently assigned to this truck (None when idle)
`queue_length[resource]`	Number of trucks waiting for each loader/crusher/road segment
`resource_busy_time[resource]`	Cumulative seconds each resource has been in use
`tonnes_delivered`	Running total of ore (tonnes) dumped at the crusher per replication
`cycle_times`	List of full cycle durations (dispatch → dump end) per truck per replication
`truck_wait_time[loader|crusher]`	Time trucks spend queuing for a loader or the crusher

Assumptions

Derived from Data

Loader service times are normally distributed with parameters from loaders.csv (L_N: mean 6.5 min, SD 1.2 min; L_S: mean 4.5 min, SD 1.0 min).
Crusher dump times are normally distributed (mean 3.5 min, SD 0.8 min) from dump_points.csv.
Edge distances, maximum speeds, road types, and capacity limits are taken directly from edges.csv.
Truck payload (100 t) and speed factors are taken from trucks.csv.
The ramp edges (E03_UP, E03_DOWN) have capacity 1, making them the intended structural bottleneck.
The bypass route (J2→J7→J8→J4) exists as an alternative when the ramp is closed or congested, with edges E15, E16, E17 having capacity 999.

Introduced by the Model

Routing policy: shortest-time path computed with Dijkstra’s algorithm over open edges, where edge traversal time = distance / (max_speed × speed_factor). Capacity-constrained edges include expected wait time in the cost estimate.
Dispatch policy: nearest_available_loader — assign the idle loader with the shortest expected travel time from the truck’s current position. Tie-breaker: shortest_expected_cycle_time — prefer the loader that minimises total expected cycle duration including queue wait.
Travel time noise: Each edge traversal time is perturbed by a multiplicative factor sampled from a truncated normal distribution with coefficient of variation CV = 0.10 (i.e., SD = 10 % of mean travel time, truncated at ±30 %).
Loading/dumping time floor: Truncated normal distributions are lower-bounded at 1 minute to avoid zero or negative service times.
In-flight cycle counting: At shift end, trucks that have completed loading but have not yet dumped are counted as partial credit (ore in transit); trucks still loading or travelling empty are not counted.
No warmup period: The baseline scenario uses no warmup (warmup_minutes = 0); all trucks start from PARK at time 0.
Reproducibility: Each replication uses seed base_random_seed + replication_index to ensure independent, reproducible streams.
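As a rough illustration of the truncated-normal sampling described in this list, the sketch below uses rejection sampling with the standard library. This is one possible reading: the submission may clip values instead of resampling, and the function name `truncated_normal` is hypothetical.

```python
import random

def truncated_normal(rng, mean, sd, low, high=float("inf")):
    """Rejection-sample a normal variate restricted to [low, high]."""
    while True:
        x = rng.gauss(mean, sd)
        if low <= x <= high:
            return x

rng = random.Random(12345)

# North loader service time in minutes, floored at 1 minute.
load_min = truncated_normal(rng, 6.5, 1.2, low=1.0)

# Multiplicative travel-time noise: mean 1.0, CV 0.10, truncated at +/-30 %.
noise = truncated_normal(rng, 1.0, 0.10, low=0.7, high=1.3)
```

With these parameters the bounds sit roughly three or more standard deviations from the mean, so rejections are rare and the sample mean stays close to the nominal value.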

Limitations

No equipment failures or random breakdowns.
No shift-change effects (continuous 8-hour operation).
No fuel or maintenance constraints.
No ore blending or grade tracking.
Single loader per loading point (no shovel relocation).
Road capacity modelled as a count of simultaneous trucks, not a physical queue length.
No interaction between loaded and empty trucks on shared edges (both directions modelled as separate, independent resources).
Bypass route capacity is unlimited (999); in reality a bypass may also have width constraints.
The model does not account for truck acceleration/deceleration profiles.

Performance Measures

Measure	Definition	How Computed
Tonnes per hour (t/h)	Total ore delivered to crusher divided by shift duration	`tonnes_delivered / shift_length_hours` per replication; mean and 95 % CI across replications
Total tonnes delivered	Cumulative ore dumped at CRUSH per shift	Sum of 100 t increments at each dump event
Truck cycle time (min)	Time from truck dispatch to end of dump	Recorded for each completed cycle; mean and SD reported
Loader utilisation	Fraction of shift time a loader is actively loading	`resource_busy_time[loader] / shift_length_seconds`
Crusher utilisation	Fraction of shift time the crusher is actively dumping	`resource_busy_time[crusher] / shift_length_seconds`
Queue wait time (min)	Mean time trucks wait for each resource	`total_wait_time[resource] / number_of_service_events`
Top bottlenecks	Resources with highest utilisation or wait time	Ranked by utilisation across all resources
95 % confidence interval	Uncertainty estimate on mean t/h	`mean ± 1.96 × (SD / sqrt(replications))`

Results are aggregated across 30 replications per scenario and written to results.csv (one row per replication) and summary.json (scenario-level statistics).
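The utilisation and queue-wait formulas in the table can be written out as a minimal sketch. The function names and the input numbers are illustrative assumptions chosen to land on round values; they are not taken from results.csv.

```python
def utilisation(busy_seconds, shift_hours=8.0):
    """Fraction of the shift during which a resource was in use."""
    return busy_seconds / (shift_hours * 3600.0)

def mean_wait_minutes(total_wait_seconds, service_events):
    """Average queue wait per service event, in minutes."""
    if service_events == 0:
        return 0.0  # avoid division by zero for unused resources
    return total_wait_seconds / service_events / 60.0

# Hypothetical inputs: 27,360 busy seconds in a 28,800 s shift.
crusher_util = utilisation(27360.0)
# Hypothetical inputs: 35,880 s of total waiting over 130 dumps.
crusher_wait = mean_wait_minutes(35880.0, 130)
```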

README

Synthetic Mine Throughput — SimPy Discrete-Event Simulation

A discrete-event simulation of an open-pit mine haulage system built with SimPy. Six operational scenarios are modelled over an 8-hour shift with 30 replications each.

Install

Python 3.11+ is recommended.

pip install -r requirements.txt

Or with uv:

uv sync

How to Run

Run all six required scenarios:

python run.py

Run a single scenario:

python run.py --scenario baseline

Run with a custom replication count:

python run.py --scenario baseline --replications 30

Available scenario IDs: baseline, trucks_4, trucks_12, ramp_upgrade, crusher_slowdown, ramp_closed.

Run with a warmup period (excludes the first N minutes from queue / utilisation statistics; throughput denominator becomes shift - warmup):

python run.py --scenario baseline --warmup-minutes 30

Shipped scenarios use warmup_minutes: 0; the CLI flag overrides that for ad hoc steady-state analysis.
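A minimal sketch of how a warmup window could change the throughput calculation. It assumes that dumps completed before the warmup ends are dropped from the numerator as well as the denominator; the README only states the denominator change, so that part is an assumption, as is the function name.

```python
def tonnes_per_hour(dump_times_s, payload_t=100.0, shift_hours=8.0,
                    warmup_minutes=0.0):
    """Throughput over the post-warmup window of the shift."""
    warmup_s = warmup_minutes * 60.0
    shift_s = shift_hours * 3600.0
    # Count only dumps that finish inside the measurement window.
    counted = [t for t in dump_times_s if warmup_s <= t <= shift_s]
    window_hours = (shift_s - warmup_s) / 3600.0
    return len(counted) * payload_t / window_hours

# Hypothetical dump completion times in simulation seconds.
dumps = [600.0, 2400.0, 3600.0, 7200.0]
no_warmup = tonnes_per_hour(dumps)
with_warmup = tonnes_per_hour(dumps, warmup_minutes=30)
```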

How to Reproduce Results

Seeds are controlled per replication: seed = base_random_seed + replication_index. The baseline scenario uses base_random_seed = 12345, giving seeds 12345–12374 across 30 replications. All other scenarios inherit this setting unless overridden in their YAML. Running python run.py with no arguments reproduces the published results.csv, summary.json, and event_log.csv exactly.
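The seeding rule can be sketched in a few lines. Using one `random.Random` instance per replication is an assumption about how the streams are kept separate; the submission may seed NumPy or another generator instead.

```python
import random

BASE_RANDOM_SEED = 12345
REPLICATIONS = 30

# seed = base_random_seed + replication_index
seeds = [BASE_RANDOM_SEED + i for i in range(REPLICATIONS)]

# One private generator per replication, so reruns draw identical values.
streams = [random.Random(seed) for seed in seeds]
```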

Conceptual Model

See conceptual_model.md for the full model description.

Summary: Trucks cycle from a central parking area to one of two ore loaders (North Pit or South Pit), then travel loaded to the primary crusher, dump 100 t, and return empty. Resources that can form queues — loaders, crusher, and single-lane road segments — are modelled as SimPy Resource objects. The dispatcher assigns each idle truck to the nearest available loader, breaking ties by shortest expected cycle time. Routing uses shortest-time Dijkstra over open edges, so the bypass route (J2→J7→J8→J4) is used automatically when it is faster than the main ramp.

Main Assumptions

Loader and crusher service times are sampled from truncated normal distributions parameterised by mean and SD from the input CSV files.
Travel time per edge has multiplicative noise with CV = 0.10 (10 % standard deviation).
All trucks start at PARK at time 0; no warmup period.
A dump cycle is counted if dump_start < shift_end; a 60-minute grace window allows trucks already at the crusher at shift end to complete their delivery.
Capacity-constrained edges (capacity < 999 in edges.csv) are modelled as SimPy Resource objects with that capacity; all other edges are delay-only.
No breakdowns, maintenance, shift changes, or fuel constraints.

For the full assumptions list see conceptual_model.md.

Routing and Dispatching Logic

Routing: Shortest-time Dijkstra over open edges. Edge traversal time is computed as distance / (max_speed_kph × speed_factor), where speed_factor = 0.85 when loaded and 1.00 when empty. Closed edges are excluded from the graph. If no path exists, the simulation raises an error rather than producing silently wrong results.
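The routing rule can be sketched with the standard library's `heapq`. The toy network below reuses node names from the topology, but the distances, speeds, the direct J3→J4 link, and the assumption that distance is in kilometres are all illustrative rather than taken from edges.csv.

```python
import heapq

def shortest_time_path(edges, source, target, speed_factor=1.0):
    """edges: (from_node, to_node, distance_km, max_speed_kph, is_open) tuples."""
    graph = {}
    for u, v, dist_km, speed_kph, is_open in edges:
        if is_open:  # closed edges never enter the graph
            hours = dist_km / (speed_kph * speed_factor)
            graph.setdefault(u, []).append((v, hours))
    best, prev, heap = {source: 0.0}, {}, [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            path = [node]
            while node in prev:
                node = prev[node]
                path.append(node)
            return cost, path[::-1]
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, hours in graph.get(node, []):
            new_cost = cost + hours
            if new_cost < best.get(nxt, float("inf")):
                best[nxt], prev[nxt] = new_cost, node
                heapq.heappush(heap, (new_cost, nxt))
    raise ValueError(f"no open path from {source} to {target}")

def toy_edges(ramp_open):
    """Hypothetical five-edge network with a ramp and a longer bypass."""
    return [
        ("J2", "J3", 1.0, 18.0, ramp_open),
        ("J3", "J4", 1.0, 30.0, True),
        ("J2", "J7", 1.5, 30.0, True),
        ("J7", "J8", 1.5, 30.0, True),
        ("J8", "J4", 1.5, 30.0, True),
    ]

open_cost, open_path = shortest_time_path(toy_edges(True), "J2", "J4", 0.85)
closed_cost, closed_path = shortest_time_path(toy_edges(False), "J2", "J4", 0.85)
```

In this toy case the ramp route wins while it is open, and closing it reroutes the truck over the bypass at a higher travel time; an unreachable target raises `ValueError` instead of returning a bogus path.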

Dispatching: nearest_available_loader — the idle truck is assigned to the loader with the shortest expected travel time from the truck’s current position. When two loaders have equal travel time the tie is broken by shortest_expected_cycle_time, which accounts for estimated queue wait at each loader.
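A compact way to express this two-level rule is a tuple sort key: travel time first, expected cycle time second. The dictionary fields and the numbers below are hypothetical, and the sketch assumes both estimates have already been computed elsewhere.

```python
def choose_loader(candidates):
    """Pick the loader with the lowest (travel time, expected cycle time)."""
    best = min(candidates,
               key=lambda c: (c["travel_min"], c["expected_cycle_min"]))
    return best["loader_id"]

# Equal travel time: the tie-breaker prefers the shorter expected cycle.
tie_case = [
    {"loader_id": "L_N", "travel_min": 5.0, "expected_cycle_min": 30.0},
    {"loader_id": "L_S", "travel_min": 5.0, "expected_cycle_min": 27.0},
]

# Unequal travel time: the nearer loader wins regardless of cycle time.
near_case = [
    {"loader_id": "L_N", "travel_min": 4.0, "expected_cycle_min": 30.0},
    {"loader_id": "L_S", "travel_min": 5.0, "expected_cycle_min": 27.0},
]

tie_choice = choose_loader(tie_case)
near_choice = choose_loader(near_case)
```

Exact float ties are rare in practice, so a real dispatcher might round travel times or compare within a tolerance before falling through to the tie-breaker.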

Capacity-constrained segments: Edges with capacity = 1 in edges.csv are wrapped as SimPy Resource(env, capacity=1). A truck must acquire the resource before traversing the edge and releases it on arrival. Separate resources are used for each direction. Affected edges in the baseline topology:

Edge	Route	Direction
E03_UP	J2 → J3 (main ramp)	Outbound
E03_DOWN	J3 → J2 (main ramp)	Inbound
E05_TO_CRUSH	J4 → CRUSH	Outbound
E05_FROM_CRUSH	CRUSH → J4	Inbound
E07_TO_LOAD_N	J5 → LOAD_N	Outbound
E07_FROM_LOAD_N	LOAD_N → J5	Inbound
E09_TO_LOAD_S	J6 → LOAD_S	Outbound
E09_FROM_LOAD_S	LOAD_S → J6	Inbound

The bypass route (E15/E16/E17) has capacity 999 and is treated as unconstrained.

Key Results

All figures are from summary.json, 30 replications × 8-hour shift. 95 % CI computed using Student’s t-distribution with df = n − 1 (scipy.stats.t.interval(0.95, df=29)).

Scenario	Trucks	t/h (mean)	95 % CI	Crusher util	Avg cycle (min)
baseline	8	1620	[1611, 1629]	0.95	28.9
trucks_4	4	977	[972, 981]	0.57	24.1
trucks_12	12	1625	[1612, 1637]	0.95	42.7
ramp_upgrade	8	1629	[1620, 1639]	0.95	28.8
crusher_slowdown	8	820	[812, 828]	0.96	55.9
ramp_closed	8	1610	[1599, 1621]	0.95	29.1
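For readers without SciPy, the t-based interval can be approximated with the standard library and a tabulated critical value. The sketch hardcodes the two-sided 95 % Student's t critical value for df = 29 (about 2.045); the 30 sample values are synthetic placeholders, not replication results.

```python
import math
import statistics

T_CRIT_DF29 = 2.045  # two-sided 95 % critical value, Student's t, df = 29

def t_confidence_interval(samples, t_crit):
    """Return (low, high) for the mean using a supplied t critical value."""
    mean = statistics.mean(samples)
    half_width = t_crit * statistics.stdev(samples) / math.sqrt(len(samples))
    return mean - half_width, mean + half_width

# Synthetic throughput values for 30 replications (illustrative only).
samples = [1600.0 + i for i in range(30)]
low, high = t_confidence_interval(samples, T_CRIT_DF29)
```

With 30 replications the t critical value is only about 4 % larger than the normal-approximation value of 1.96, so the two methods give similar intervals at this sample size.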

Answers to the 6 Operational Decision Questions

Q1: What is the baseline throughput?

1620 t/h [95 % CI: 1611–1629], equivalent to 12,960 t per 8-hour shift. Mean truck cycle time is 28.9 minutes. The crusher runs at 95 % utilisation, indicating it is near saturation under the baseline 8-truck fleet.

Q2: What are the likely bottlenecks?

The top_bottlenecks ranking in summary.json (sorted by utilisation, then queue time) lists D_CRUSH (crusher) first in every scenario:

Scenario	Top bottleneck	Utilisation	Mean queue (min)
baseline	D_CRUSH	0.95	4.6
trucks_4	D_CRUSH	0.57	0.8
trucks_12	D_CRUSH	0.96	17.2
ramp_upgrade	D_CRUSH	0.95	4.6
crusher_slowdown	D_CRUSH	0.96	27.9
ramp_closed	D_CRUSH	0.95	4.7

The crusher is the binding resource in every configuration except trucks_4, where it runs at 0.57 utilisation and the fleet is the binding constraint instead. The South loader (L_S) is consistently second-highest by utilisation (0.91 baseline, 0.92 trucks_12) because the dispatcher preferentially sends trucks to the faster loader when it is idle.

Note on the narrow ramp (E03_UP). A naive ranking by mean queue time would surface E03_UP at the top of the baseline list (6.0 min mean queue), but its utilisation is only 3.3 % — inconsistent with a true bottleneck. This queue is a startup-stampede artifact: at t = 0 all 8 trucks dispatch simultaneously, the nearest-loader policy sends them all toward L_S via E03_UP, and they queue once. After the first cycle, the fleet has spread across both loaders and E03_UP is essentially unused — routing for L_N already bypasses it via J2→J7→J5. Sorting top_bottlenecks by utilisation (with queue time as tiebreaker) removes this misleading artifact while leaving the underlying data visible in results.csv.
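The effect of the sort order can be reproduced with three baseline figures quoted in this README (E03_UP: 3.3 % utilisation, 6.0 min queue; D_CRUSH: 0.95, 4.6 min; L_S: 0.91, 1.6 min). The dictionary layout is an assumption and does not mirror the structure of summary.json.

```python
resources = [
    {"id": "E03_UP", "utilisation": 0.033, "queue_min": 6.0},
    {"id": "D_CRUSH", "utilisation": 0.95, "queue_min": 4.6},
    {"id": "L_S", "utilisation": 0.91, "queue_min": 1.6},
]

# Naive ranking: longest mean queue first.
by_queue = sorted(resources, key=lambda r: r["queue_min"], reverse=True)

# Ranking described above: utilisation first, queue time as tie-breaker.
by_util = sorted(resources,
                 key=lambda r: (r["utilisation"], r["queue_min"]),
                 reverse=True)

naive_top = by_queue[0]["id"]
ranked_ids = [r["id"] for r in by_util]
```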

Q3: How sensitive is throughput to fleet size?

Fleet	t/h	Change vs. baseline
4 trucks	977	−40 %
8 trucks (baseline)	1620	—
12 trucks	1625	+0.3 %

The system is strongly fleet-limited below 8 trucks and crusher-saturated above 8. Adding trucks beyond the baseline provides almost no gain (1625 vs. 1620 t/h, within the confidence intervals). The crusher service rate (~3.5 min per dump, capacity 1) sets a theoretical ceiling of 60 / 3.5 × 100 ≈ 1714 t/h at 100 % crusher utilisation under the baseline payload; at the observed ~0.95 utilisation this corresponds to approximately 1629 t/h. Any further throughput gain requires either a faster crusher or a second crusher rather than additional trucks.
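A back-of-envelope check of the crusher-bound throughput, assuming back-to-back dumps at the mean service time and ignoring the small effect of truncating the service-time distribution. The function name is hypothetical.

```python
def crusher_ceiling_tph(mean_dump_min, payload_t=100.0, servers=1):
    """Upper bound on t/h if the crusher never idles."""
    dumps_per_hour = 60.0 / mean_dump_min
    return servers * dumps_per_hour * payload_t

ceiling = crusher_ceiling_tph(3.5)   # about 1714 t/h at 100 % utilisation
at_95_percent = 0.95 * ceiling       # about 1629 t/h at 0.95 utilisation
two_crushers = crusher_ceiling_tph(3.5, servers=2)
```

Under these assumptions the roughly 1629 t/h figure corresponds to the crusher running at about 0.95 utilisation, while a fully saturated single crusher would top out near 1714 t/h.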

Q4: What is the impact of upgrading the main ramp?

Marginal: 1629 t/h vs. 1620 t/h baseline — a 0.6 % improvement, within noise.

The ramp upgrade (E03_UP/DOWN capacity raised to 999, speed raised from 18/22 to 28 km/h) removes the capacity constraint on the main ramp. However, the baseline routing already directs L_N-bound trucks via the bypass (J2→J7→J5), which is faster than the narrow ramp. Only L_S-bound trucks use E03, and these are spread out enough in steady state that the ramp is not a binding constraint. The ramp is correctly absent from the top_bottlenecks list in both baseline and ramp_upgrade once results are ranked by utilisation, confirming the ramp was not limiting throughput.

Recommendation: Do not invest in a ramp upgrade to increase throughput. The crusher is the binding resource.

Q5: How sensitive is throughput to crusher service time?

Highly sensitive: a doubling of mean dump time (3.5 → 7.0 min) drops throughput by 49 % (1620 → 820 t/h).

Under crusher_slowdown, the crusher remains at 0.96 utilisation but now processes trucks at half the rate. Mean crusher queue time rises from 4.6 to 27.9 minutes, and average truck cycle time extends from 28.9 to 55.9 minutes. Loader utilisations drop sharply (L_S: 0.91 → 0.42; L_N: 0.50 → 0.37) as trucks spend most of their cycle waiting at the crusher. The system is highly sensitive to crusher throughput because the crusher is the single-server bottleneck for the entire fleet.
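As a consistency check on these figures, the crusher utilisation implied by a given throughput can be recomputed from the mean dump time. This sketch assumes every delivered tonne passes through the single crusher at the mean service time; the helper name is hypothetical.

```python
def implied_crusher_utilisation(tph, mean_dump_min, payload_t=100.0):
    """Busy minutes per hour divided by 60."""
    dumps_per_hour = tph / payload_t
    return dumps_per_hour * mean_dump_min / 60.0

baseline_util = implied_crusher_utilisation(1620.0, 3.5)
slowdown_util = implied_crusher_utilisation(820.0, 7.0)
throughput_drop = (1620.0 - 820.0) / 1620.0
```

Both implied values land close to the reported 0.95 and 0.96, and the relative drop is just over 49 %, which matches the headline figure.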

Recommendation: Crusher reliability and service rate are the most critical operational parameters. Even moderate crusher slowdowns (e.g. blocked chutes, liner wear) will have a disproportionate effect on shift tonnage.

Q6: What happens if the main ramp is closed?

Small impact: 1610 t/h vs. 1620 t/h baseline — a 0.6 % reduction. Rerouting via the bypass is fully viable.

When E03_UP and E03_DOWN are closed, the router automatically finds paths through the western bypass (J2→J7→J8→J4 for L_S-bound trucks; J2→J7→J5 for L_N-bound trucks). The bypass adds some travel distance but the route times are comparable. Crusher utilisation remains at 0.95 and truck utilisation is essentially unchanged (0.783 baseline vs. 0.782 ramp_closed). The confidence intervals overlap substantially ([1611–1629] baseline vs. [1599–1621] ramp_closed), so the difference is not statistically significant at the 95 % level.
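The overlap claim can be checked mechanically with the two intervals quoted above. Note that interval overlap is a conservative heuristic rather than a formal hypothesis test; a paired comparison on common seeds would be a stricter check.

```python
def intervals_overlap(a, b):
    """True if closed intervals a = (low, high) and b = (low, high) intersect."""
    return max(a[0], b[0]) <= min(a[1], b[1])

baseline_ci = (1611, 1629)
ramp_closed_ci = (1599, 1621)

overlap = intervals_overlap(baseline_ci, ramp_closed_ci)
overlap_width = min(baseline_ci[1], ramp_closed_ci[1]) - max(baseline_ci[0], ramp_closed_ci[0])
```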

Note: In results.csv, the edge_E03_UP_queue_time and edge_E03_DOWN_queue_time columns are 0.0 for the ramp_closed and ramp_upgrade scenarios. This is expected, not a data error: those edges do not exist as resources in those scenarios (closed or unconstrained respectively).

Recommendation: The bypass provides adequate rerouting capacity. A ramp closure need not halt production, though travel times are slightly longer for L_S-bound trucks.

Likely Bottlenecks

Based on utilisation and queue-time analysis across all scenarios:

Crusher (D_CRUSH) — the primary steady-state bottleneck in all scenarios except trucks_4. Utilisation 0.95–0.96; mean queue wait 4.6–27.9 min depending on service rate. Any reduction in crusher throughput has an immediate and disproportionate effect on overall t/h.
Loader South (L_S) — consistently second-highest utilisation (0.91 baseline, 0.92 trucks_12). The South loader is faster (4.5 min mean) but heavily loaded because the dispatcher preferentially assigns trucks there when it is idle. Queue time 1.6 min baseline, rising to 2.2 min with 12 trucks.
Crusher approach road (E05_TO_CRUSH) — single-lane access to the crusher, utilisation ~0.43–0.45. Not a bottleneck at current fleet sizes but could become one if throughput increases.
South pit return road (E09_FROM_LOAD_S) — single-lane pit access, utilisation ~0.48 baseline. Co-occupies the South pit cycle alongside L_S; not currently binding but the highest-utilisation road segment.

A startup-transient artifact appears on E03_UP (high queue time, ~3 % utilisation) for the first ~20 minutes of each replication while the fleet spreads from PARK. This is correctly excluded from the bottleneck ranking by sorting on utilisation; see Q2.

Limitations

No equipment breakdowns or random downtime for trucks, loaders, or crusher.
No shift changes, refuelling stops, or operator breaks during the 8-hour window.
Opposing-direction traffic on physically single-lane segments does not interact via meet-and-pass logic; each direction is an independent SimPy resource.
Bypass route (E15–E17) is treated as unconstrained (capacity 999); in reality a bypass may have width or grade limits.
The shipped scenarios use warmup_minutes: 0, so a startup-stampede transient on E03_UP is visible in the early minutes of each replication. Warmup support is implemented in the runner — pass --warmup-minutes 30 on the CLI to exclude the transient from queue and utilisation statistics for ad-hoc analysis.
Ore is delivered in fixed 100-tonne increments; no partial payloads or blend control.
Truck speed is a constant factor per edge; no acceleration/deceleration or switchback effects.
Service time distributions are simple truncated normals; bimodal or heavy-tailed effects (e.g. blocked chutes, swell factors) are not captured.
The edge_E03_UP_queue_time and edge_E03_DOWN_queue_time columns in results.csv are 0.0 for ramp_upgrade and ramp_closed scenarios because those edges are removed as constrained resources in those scenarios — this is expected behaviour, not missing data.

Suggested Improvements and Further Scenarios

Warmup period in shipped scenarios — warmup_minutes support is implemented in the runner but the shipped YAMLs use 0. Bump baseline to 30–60 min in the YAML to make the bottleneck ranking and queue-time statistics reflect steady-state operation only. The CLI override (--warmup-minutes 30) provides the same effect ad hoc.
Stochastic breakdowns — model loader and crusher failures using exponential time-to-failure and lognormal repair times to assess availability risk.
Second crusher scenario — add a second crusher unit to test whether a parallel dump point breaks the current throughput ceiling.
Shift-change scenario — introduce a 15-minute production pause at hour 4 to quantify the tonnage cost of crew changeover.
Dynamic dispatch with real-time queue feedback — upgrade the dispatcher to use live queue lengths rather than estimated wait times for assignment decisions.
Bypass capacity constraint scenario — set E15/E16/E17 capacity to 1 or 2 to test whether the bypass becomes a bottleneck if the ramp is closed long-term.
Sensitivity analysis on CV — vary travel time noise (CV = 0.05, 0.10, 0.20) to quantify how road condition variability affects throughput confidence intervals.

Output Files

File	Description
`results.csv`	One row per replication per scenario (180 rows); scenario-level and replication-level metrics
`summary.json`	Scenario-level statistics: mean t/h, 95 % CI, utilisations, queue times, bottleneck ranking
`event_log.csv`	Full event trace (~92,000 events); columns: time, truck_id, event_type, node, tonnes
`topology.png`	Visualisation of the road network graph with node types and edge capacities
`conceptual_model.md`	Formal conceptual model document
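A short sketch of how the event log could be audited against the reported tonnage, using the column names listed above. The event type label `dump_end`, the truck IDs, and the three sample rows are assumptions; the actual labels in event_log.csv may differ.

```python
import csv
import io

# Hypothetical excerpt using the documented column order.
SAMPLE_LOG = """time,truck_id,event_type,node,tonnes
1510.2,T01,dump_end,CRUSH,100
1725.9,T02,dump_end,CRUSH,100
1800.0,T03,load_start,LOAD_S,0
"""

def total_tonnes(csv_text, dump_event="dump_end"):
    """Sum the tonnes column over rows whose event_type marks a finished dump."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return sum(float(row["tonnes"]) for row in reader
               if row["event_type"] == dump_event)

delivered = total_tonnes(SAMPLE_LOG)
```

To audit a real run, the same function could be fed the contents of event_log.csv filtered to one scenario and replication, then compared with the matching row in results.csv.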
