# Conceptual Model: Synthetic Mine Throughput Simulation
- Benchmark: `001_synthetic_mine_throughput`
- Engine: SimPy discrete-event simulation
- Shift length: 480 minutes (8 hours), hard cut at `t = 480`

This document specifies the conceptual model that the SimPy implementation under
`src/mine_sim/` realises. It follows the modelling-and-simulation convention of
separating *system boundary*, *entities*, *resources*, *events*, *state
variables*, *assumptions* (split between data-derived and introduced), *model
limitations*, and *performance measures*.
---
## 1. System boundary
### 1.1 Inside the boundary
The model represents one ore haulage shift on the synthetic open-pit mine
described by `data/nodes.csv`, `data/edges.csv`, `data/trucks.csv`,
`data/loaders.csv`, and `data/dump_points.csv`. Inside the boundary we include:
- **The ore production cycle** `PARK -> LOAD_{N|S} -> CRUSH -> LOAD_{N|S} -> ...`
for every truck, as a sequence of travel, queue, load, and dump events.
- **All directed road segments** in `edges.csv` that lie on a path between
`PARK`, `LOAD_N`, `LOAD_S`, and `CRUSH` (including the `J3-J4` ramp and
bypass alternatives `J7`-`J8`).
- **Capacity-constrained edges** (`capacity <= 1` in the CSV) modelled as
independent SimPy `Resource` objects, one per directed edge: `E03_UP`,
`E03_DOWN`, `E05_TO_CRUSH`, `E05_FROM_CRUSH`, `E07_TO_LOAD_N`,
`E07_FROM_LOAD_N`, `E09_TO_LOAD_S`, `E09_FROM_LOAD_S`.
- **The two ore loaders** `L_N` (at `LOAD_N`, mean 6.5 min) and `L_S`
(at `LOAD_S`, mean 4.5 min), each capacity 1.
- **The primary crusher** `D_CRUSH` (at `CRUSH`, mean dump 3.5 min,
sd 0.8 min), capacity 1.
- **Dispatching logic**: an *empty* truck chooses the loader that minimises
`travel_to_loader + current_queue_len * mean_load_time + own_load_time`.
- **Routing logic**: static shortest-time Dijkstra paths per
`(scenario, origin, destination)`, recomputed once per scenario load
(so closures in `ramp_closed` are honoured).
- **Stochastic effects** on per-edge travel time (lognormal multiplier, mean 1,
cv 0.10) and on per-load and per-dump times (`normal_truncated`, implemented
as a normal draw clamped below at 0.1 min, i.e. `max(0.1, sample)`).
- **Seven scenarios**: `baseline`, `trucks_4`, `trucks_12`, `ramp_upgrade`,
`crusher_slowdown`, `ramp_closed`, plus the proposed combo
`trucks_12_ramp_upgrade`.
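
The routing rule in the list above can be sketched as a plain Dijkstra search over free-flow edge times. This is a minimal illustration rather than the code under `src/mine_sim/`: the function name `shortest_time_path`, the edge-dictionary layout, and the toy travel times are all assumptions made for the example.

```python
import heapq

def shortest_time_path(edges, origin, destination, closed=frozenset()):
    """Return (total_minutes, [nodes]) for the fastest open path, or (inf, [])."""
    # Build adjacency lists, skipping edges that the scenario has closed.
    adjacency = {}
    for edge_id, (src, dst, minutes) in edges.items():
        if edge_id not in closed:
            adjacency.setdefault(src, []).append((dst, minutes))

    best = {origin: 0.0}
    previous = {}
    frontier = [(0.0, origin)]
    while frontier:
        cost, node = heapq.heappop(frontier)
        if node == destination:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbour, minutes in adjacency.get(node, []):
            new_cost = cost + minutes
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(frontier, (new_cost, neighbour))

    if destination not in best:
        return float("inf"), []
    # Walk the predecessor links back to the origin.
    path = [destination]
    while path[-1] != origin:
        path.append(previous[path[-1]])
    return best[destination], path[::-1]

# Toy graph (illustrative times only, not taken from edges.csv).
toy_edges = {
    "A_B": ("A", "B", 2.0),
    "B_D": ("B", "D", 2.0),  # short "ramp"
    "A_C": ("A", "C", 3.0),
    "C_D": ("C", "D", 4.0),  # longer "bypass"
}
open_cost, open_path = shortest_time_path(toy_edges, "A", "D")
closed_cost, closed_path = shortest_time_path(toy_edges, "A", "D", closed={"B_D"})
```

Closing the short edge forces the search onto the longer alternative, which is the behaviour a closure scenario such as `ramp_closed` relies on. An unreachable destination returns an infinite cost, which a caller could turn into a loud failure.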
### 1.2 Outside the boundary
These elements are deliberately excluded so the model stays focused on ore
throughput to the primary crusher:
- **Waste haulage and the `WASTE` dump** (`D_WASTE`, edges `E13_*`).
Trucks never visit `WASTE` in this model.
- **Maintenance / refuelling at `MAINT`** (edges `E14_*`). The `availability`
field on trucks is treated as `1.0` for the active shift; we do not model
random breakdown, refuelling, or shift breaks.
- **Operator behaviour**: shift handovers, lunch breaks, manual overrides.
- **Weather, dust, visibility, grade-dependent fuel burn**, and any non-time
effects on cycle execution.
- **Ore quality / blending** at the crusher; tonnes are treated as a single
homogeneous bulk material.
- **Crusher downstream stockpile** dynamics; the crusher is always *able* to
receive a dump (only its service time constrains it).
- **Network effects between adjacent shifts**: the simulated shift starts
empty (all trucks at `PARK`, all queues empty) and ends with a hard cut at
`t = 480`.
### 1.3 Time horizon and termination
A single simulated shift lasts exactly 480 minutes. We enforce a **hard cut at
`t = 480`**: only `end_dump` events with `time_min < 480` contribute tonnes to
throughput. In-flight cycles at the cut are discarded. This is a deliberate
modelling choice that mirrors how an operator would value the *closed* tonnes
they can actually report at end-of-shift.
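
As a minimal sketch of the hard-cut rule, the snippet below counts only `end_dump` events that complete strictly before the cut. The column names `event_type` and `time_min` follow the event log described in this document; the function name `tonnes_delivered` and the sample rows are assumptions for illustration.

```python
SHIFT_MINUTES = 480.0

def tonnes_delivered(events, payload_tonnes=100.0, shift_minutes=SHIFT_MINUTES):
    """Credit one payload per end_dump that completes strictly before the cut."""
    closed_dumps = sum(
        1
        for e in events
        if e["event_type"] == "end_dump" and e["time_min"] < shift_minutes
    )
    return payload_tonnes * closed_dumps

# Made-up rows: two dumps close in time, one lands exactly on the cut.
sample_events = [
    {"event_type": "end_dump", "time_min": 31.2},
    {"event_type": "start_dump", "time_min": 476.9},
    {"event_type": "end_dump", "time_min": 479.9},
    {"event_type": "end_dump", "time_min": 480.0},  # at the cut: not credited
]
total = tonnes_delivered(sample_events)
rate = total / 8  # tonnes per hour over the 8-hour shift
```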
---
## 2. Entities
The dynamic, attribute-bearing things that flow through the system.
| Entity | Population | Key attributes | Lifecycle |
|---|---|---|---|
| **Truck** | 4, 8, or 12 (scenario-dependent), each starting at `PARK` | `truck_id`, `payload_tonnes` (100), `empty_speed_factor` (1.00), `loaded_speed_factor` (0.85), `availability` (1.00), current node, loaded flag, current loader assignment | dispatched at `t=0` -> repeat ore cycle until shift end |
A truck always carries either zero tonnes (empty) or `payload_tonnes` (loaded).
We treat the *ore payload* as an attribute on the truck rather than as a
separate entity, because no payload-level transformation occurs between the
loader and the crusher.
---
## 3. Resources
The static, capacity-bound things that *constrain* truck flow. All are SimPy
`Resource` objects so the engine handles waiting and FIFO queueing for us.
| Resource | Type | Capacity | Where in graph | Service-time distribution |
|---|---|---|---|---|
| `L_N` | Loader | 1 | node `LOAD_N` | `normal_truncated(mean=6.5, sd=1.2, lower=0.1)` min |
| `L_S` | Loader | 1 | node `LOAD_S` | `normal_truncated(mean=4.5, sd=1.0, lower=0.1)` min |
| `D_CRUSH` | Crusher (dump) | 1 | node `CRUSH` | `normal_truncated(mean=3.5, sd=0.8, lower=0.1)` min |
| `E03_UP` | Edge resource | 1 (or 999 in `ramp_upgrade`) | `J2 -> J3` | n/a (transit) |
| `E03_DOWN` | Edge resource | 1 (or 999 in `ramp_upgrade`, closed in `ramp_closed`) | `J3 -> J2` | n/a (transit) |
| `E05_TO_CRUSH` | Edge resource | 1 | `J4 -> CRUSH` | n/a |
| `E05_FROM_CRUSH` | Edge resource | 1 | `CRUSH -> J4` | n/a |
| `E07_TO_LOAD_N` | Edge resource | 1 | `J5 -> LOAD_N` | n/a |
| `E07_FROM_LOAD_N` | Edge resource | 1 | `LOAD_N -> J5` | n/a |
| `E09_TO_LOAD_S` | Edge resource | 1 | `J6 -> LOAD_S` | n/a |
| `E09_FROM_LOAD_S` | Edge resource | 1 | `LOAD_S -> J6` | n/a |
Edges with `capacity = 999` are treated as effectively unconstrained and are
modelled as plain time delays without a SimPy resource. This avoids the
per-request overhead of a `Resource` and keeps free roads out of the queue
records. Each direction of a single physical lane is mirrored *literally* from
the CSV as an independent `Resource`, in line with the Seed constraint.
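
The service-time entries in the table above can be sampled with a clamped normal draw. The sketch below assumes the clamp-at-`lower` reading of `normal_truncated` used in this document (a low sample is replaced by the floor rather than resampled); the function name `sample_service_time` and the seed are hypothetical.

```python
import random

def sample_service_time(rng, mean, sd, lower=0.1):
    """Draw a normal sample and clamp it at `lower` (no resampling)."""
    return max(lower, rng.gauss(mean, sd))

rng = random.Random(12345)  # arbitrary seed for the illustration
# Crusher parameters from the table: mean 3.5 min, sd 0.8 min.
crusher_draws = [sample_service_time(rng, mean=3.5, sd=0.8) for _ in range(10_000)]
sample_mean = sum(crusher_draws) / len(crusher_draws)
```

With these parameters the floor sits more than four standard deviations below the mean, so the clamp is hit very rarely and the sample mean stays close to 3.5 min.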
---
## 4. Events
Every truck cycle produces the events below. They are recorded into
`event_log.csv` with columns `time_min, replication, scenario_id, truck_id,
event_type, from_node, to_node, location, loaded, payload_tonnes, resource_id,
queue_length`.
| Event type | Trigger | Notes |
|---|---|---|
| `dispatch` | `t = 0` for every truck | Initial release, all trucks released simultaneously |
| `arrive_loader` | Truck reaches the assigned loader's node | Recorded *before* requesting the loader resource |
| `start_load` | Loader resource granted | Records loader queue length at start |
| `end_load` | Truncated-normal load duration elapses | Truck flips to `loaded = True` |
| `depart_loader` | Truck releases the loader and starts travelling toward `CRUSH` | |
| `arrive_crusher` | Truck reaches `CRUSH` node | Recorded before requesting `D_CRUSH` |
| `start_dump` | `D_CRUSH` granted | Records crusher queue length |
| `end_dump` | Truncated-normal dump duration elapses; tonnes credited if `time_min < 480` | The throughput-defining event |
| `depart_crusher` | Truck releases `D_CRUSH` and starts travelling back to a loader | |
| `edge_enter` | Truck acquires a capacity-1 edge resource | `resource_id = edge_id` |
| `edge_leave` | Truck releases that edge resource | |
Travel along a non-capacity-constrained edge is a `simpy.Environment.timeout`
of `(distance / (max_speed * speed_factor)) * lognormal_multiplier`, with no
explicit event log entry. Travel along a capacity-constrained edge is the same
delay *while holding* the edge `Resource`, bracketed by `edge_enter` /
`edge_leave` events for traceability.
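
The travel-time expression above can be sketched as follows. A lognormal variable with mean 1 and coefficient of variation `cv` has underlying normal parameters `sigma^2 = ln(1 + cv^2)` and `mu = -sigma^2 / 2`. The function names and the example edge (600 m at 30 kph) are assumptions for illustration; units follow the `distance_m` / `max_speed_kph` convention of the input data.

```python
import math
import random

def lognormal_params(cv):
    """(mu, sigma) of the underlying normal for a lognormal with mean 1 and this cv."""
    sigma_sq = math.log(1.0 + cv ** 2)
    return -0.5 * sigma_sq, math.sqrt(sigma_sq)

def edge_travel_minutes(rng, distance_m, max_speed_kph, speed_factor, cv=0.10):
    """Free-flow minutes on one edge, scaled by a fresh lognormal multiplier."""
    metres_per_minute = max_speed_kph * speed_factor * 1000.0 / 60.0
    mu, sigma = lognormal_params(cv)
    return (distance_m / metres_per_minute) * rng.lognormvariate(mu, sigma)

# Illustrative edge: 600 m at 30 kph, loaded truck (speed factor 0.85).
rng = random.Random(2024)  # arbitrary seed
free_flow = 600.0 / (30.0 * 0.85 * 1000.0 / 60.0)
draws = [edge_travel_minutes(rng, 600.0, 30.0, 0.85) for _ in range(20_000)]
mean_ratio = (sum(draws) / len(draws)) / free_flow  # should sit close to 1
```

Because the multiplier has mean 1, the noise changes the spread of travel times but not their expected value, so the free-flow times used for routing remain unbiased estimates.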
---
## 5. State variables
State that must be tracked to produce the required metrics, derived primarily
from SimPy's own bookkeeping plus a small per-replication accumulator object.
### 5.1 Per truck
- `current_node`: most recently arrived node.
- `loaded`: boolean.
- `current_loader_assignment`: `L_N` / `L_S` / `None`.
- `cycle_start_time` and `cycle_count`: rolling counters used to compute mean
cycle time.
- `productive_busy_time`: cumulative minutes spent in the productive part of
the cycle (loaded travel + dumping + dump-side queue + empty travel +
loading + load-side queue). Used for `truck_utilisation = productive / 480`.
### 5.2 Per resource
- For `L_N`, `L_S`, `D_CRUSH`: total `busy_time` (sum of service durations),
total `queue_wait_time` (sum of waits before a request is granted), and
number of services completed. Utilisation is `busy_time / 480`.
- For each capacity-1 edge resource: `busy_time`, `queue_wait_time`, and
number of traversals.
### 5.3 Per replication
- `total_tonnes_delivered`: `100 t * count(end_dump events with time < 480)`.
- `tonnes_per_hour`: `total_tonnes_delivered / 8`.
- `average_truck_cycle_time_min`: mean over completed full cycles (defined as
consecutive `end_dump` -> `end_dump` intervals, with the very first cycle
using `dispatch` -> `end_dump`).
- `average_truck_utilisation`: mean `productive_busy_time / 480` across trucks.
- `crusher_utilisation`: `D_CRUSH.busy_time / 480`.
- `loader_utilisation_{L_N, L_S}`: `loader.busy_time / 480`.
- `average_loader_queue_time_min`, `average_crusher_queue_time_min`: mean wait
time per service event at the loaders / crusher.
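
The cycle-time definition above can be computed directly from the event log, as in this minimal sketch. It assumes the log columns `truck_id`, `event_type`, and `time_min` described in this document, and it additionally assumes that only dumps completed before the cut close a cycle; the function name `mean_cycle_time` and the sample rows are hypothetical.

```python
def mean_cycle_time(events, shift_minutes=480.0):
    """Mean duration of completed cycles across all trucks, or None if there are none."""
    last_mark = {}  # truck_id -> time of dispatch or of the previous end_dump
    durations = []
    for e in sorted(events, key=lambda e: e["time_min"]):
        truck = e["truck_id"]
        if e["event_type"] == "dispatch":
            last_mark[truck] = e["time_min"]
        elif e["event_type"] == "end_dump" and e["time_min"] < shift_minutes:
            # First cycle runs dispatch -> end_dump, later ones end_dump -> end_dump.
            durations.append(e["time_min"] - last_mark[truck])
            last_mark[truck] = e["time_min"]
    return sum(durations) / len(durations) if durations else None

# Made-up log: T01 completes two cycles, T02 completes one.
log = [
    {"truck_id": "T01", "event_type": "dispatch", "time_min": 0.0},
    {"truck_id": "T02", "event_type": "dispatch", "time_min": 0.0},
    {"truck_id": "T01", "event_type": "end_dump", "time_min": 30.0},
    {"truck_id": "T02", "event_type": "end_dump", "time_min": 35.0},
    {"truck_id": "T01", "event_type": "end_dump", "time_min": 55.0},
]
average = mean_cycle_time(log)
```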
### 5.4 Per scenario
- Across the 30 replications, every per-replication metric is summarised as a
mean and a 95% Student-t confidence interval with `n - 1 = 29` degrees of
freedom.
- `top_bottlenecks`: ranked by composite score
`utilisation * mean_queue_wait_min`, computed for every loader, the
crusher, and every capacity-1 edge resource.
---
## 6. Assumptions
The benchmark prompt explicitly asks us to separate assumptions sourced from
the data from those we have introduced.
### 6.1 Data-derived assumptions
These come directly from the CSV / YAML inputs and are reproduced literally
in the model:
- **Topology**: 15 nodes (`PARK`, `J1`-`J8`, `LOAD_N`, `LOAD_S`, `CRUSH`,
`WASTE`, `MAINT`) and 35 directed edges, taken verbatim from `nodes.csv` /
`edges.csv`.
- **Capacity-constrained edges**: edges with `capacity <= 1` are modelled as
shared single-lane resources. From the CSV these are `E03_UP`, `E03_DOWN`,
`E05_TO_CRUSH`, `E05_FROM_CRUSH`, `E07_TO_LOAD_N`, `E07_FROM_LOAD_N`,
`E09_TO_LOAD_S`, `E09_FROM_LOAD_S`.
- **Loaders**: two loaders, capacity 1, with means 6.5 / 4.5 min and standard
deviations 1.2 / 1.0 min from `loaders.csv`.
- **Crusher**: single dump with capacity 1, mean 3.5 min, sd 0.8 min from
`dump_points.csv`.
- **Truck fleet**: 12 trucks defined in `trucks.csv`, each with payload
100 t, `empty_speed_factor = 1.00`, `loaded_speed_factor = 0.85`,
`availability = 1.00`, starting at `PARK`. Scenarios cap the active fleet
at 4, 8, or 12.
- **Free-flow edge times**: `distance_m / (max_speed_kph * speed_factor * 1000 / 60)`
minutes per edge, where `speed_factor` is the truck's empty or loaded speed factor.
- **Scenario semantics**: closures, capacity overrides, and crusher service
changes are read from the YAML override blocks (`edge_overrides`,
`dump_point_overrides`, `fleet`).
- **Stochasticity recipe**: the YAML specifies
`loading_time_distribution: normal_truncated`,
`dumping_time_distribution: normal_truncated`, and
`travel_time_noise_cv: 0.10`.
### 6.2 Introduced assumptions
These choices fill in gaps the data does not specify; each is required to
make the simulation runnable and is documented here.
1. **Routing is static shortest-time per scenario**, recomputed by Dijkstra on
free-flow edge times whenever a scenario changes the edge set (closures or
capacity upgrades). Trucks do *not* re-plan during a replication, even if
queues form on capacity-1 edges. This trades a small amount of realism for
reproducibility and traceability.
2. **Travel-time noise** is a per-edge-traversal lognormal multiplier with
mean 1 and coefficient of variation 0.10. This honours
`travel_time_noise_cv: 0.10` while keeping multipliers strictly positive.
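3. **Loading and dumping** are sampled as `normal_truncated` with the
loader/crusher mean and sd, implemented as a clamp, `max(0.1, sample)`: a
sample below 0.1 min is replaced with 0.1 min rather than rejected and
resampled. This avoids zero or negative durations. With the baseline
parameters the floor lies more than four standard deviations below each mean
(e.g. `(3.5 - 0.1) / 0.8 = 4.25` for the crusher), so the clamp is rarely hit
and its upward bias on the mean is negligible.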
4. **Dispatch policy**: each empty truck is assigned to
`argmin(travel_to_loader + current_queue_len * mean_load_time + own_load_time)`.
`current_queue_len` includes the truck currently being served. Ties are
broken by lower `loader_id` (`L_N` before `L_S`).
5. **Initial dispatch**: all trucks are released simultaneously at `t = 0`
from `PARK`. There is no staged ramp-up.
6. **Hard cut at `t = 480`**: only dumps completed strictly before 480 min
count toward throughput. In-flight loads or dumps at the cut are
discarded. This is consistent with the operator-facing "tonnes closed at
end of shift" interpretation.
7. **Truck utilisation = productive only**: time spent travelling, queueing,
loading, or dumping inside the ore cycle counts; idle time at `PARK` does
not. Specifically, post-shift idle time after the hard cut is excluded.
8. **Reachability self-check** at scenario load: if any of the OD pairs
`PARK<->LOAD_N`, `PARK<->LOAD_S`, `LOAD_N<->CRUSH`, `LOAD_S<->CRUSH` is
unreachable in the post-override graph, the scenario fails loudly rather
than silently producing zero throughput.
9. **Per-replication seed**: `seed_r = base_random_seed + replication_index`.
This makes individual replications independently reproducible while the
scenario as a whole is deterministic.
10. **`WASTE` and `MAINT` are out of scope** for this throughput study and
their edges are kept in the graph but never used. Routing therefore never
detours to them.
11. **Edge resources are independent per direction**, mirroring `edges.csv`
literally (`E03_UP` and `E03_DOWN` are two separate `Resource` objects).
A more realistic single-physical-lane model would couple them, but
the data treats them as separate edges and we follow the data.
12. **Crusher tonnes are credited at `end_dump`**, not at `start_dump` or
`arrive_crusher`. This follows the usual discrete-event convention of
crediting output at service completion and aligns with the prompt's
instruction that throughput is measured by completed dump events.
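
The dispatch policy in item 4 can be sketched as a small scoring function. This is an illustration under stated assumptions, not the implementation: the function name `choose_loader` and the dictionary layout are hypothetical, the travel times are invented, and `own_load_time` is approximated by the loader's mean load time.

```python
def choose_loader(loaders):
    """Pick the loader with the lowest estimated completion time.

    `loaders` maps loader_id -> dict with travel_min, queue_len, mean_load_min.
    `queue_len` is assumed to include the truck currently being served.
    """
    def score(loader_id):
        info = loaders[loader_id]
        estimate = (
            info["travel_min"]
            + info["queue_len"] * info["mean_load_min"]
            + info["mean_load_min"]  # own load time, approximated by the mean
        )
        return (estimate, loader_id)  # tie-break: lower loader_id first

    return min(loaders, key=score)

# Mean load times come from the loader data; travel times are made up.
busy_south = {
    "L_N": {"travel_min": 4.0, "queue_len": 1, "mean_load_min": 6.5},  # 17.0
    "L_S": {"travel_min": 6.0, "queue_len": 2, "mean_load_min": 4.5},  # 19.5
}
tied = {
    "L_N": {"travel_min": 2.0, "queue_len": 0, "mean_load_min": 6.5},  # 8.5
    "L_S": {"travel_min": 4.0, "queue_len": 0, "mean_load_min": 4.5},  # 8.5
}
first_choice = choose_loader(busy_south)
tie_choice = choose_loader(tied)
```

Returning a `(estimate, loader_id)` tuple as the sort key makes the tie-break rule explicit: equal estimates fall back to comparing the identifiers, so `L_N` wins over `L_S`.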
### 6.3 Combo scenario rationale
In addition to the six required scenarios, we propose
**`trucks_12_ramp_upgrade`**: 12 trucks combined with the upgraded ramp.
The rationale is that `trucks_12` alone is expected to saturate at the
capacity-1 ramp, and `ramp_upgrade` alone is expected to be limited by fleet
size at 8. The combo isolates the joint effect, telling the operator whether
the two investments are complementary (super-additive), substitutive
(sub-additive), or independent.
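
One way to make the complementary / substitutive / independent distinction concrete is to compare the combo gain with the sum of the two individual gains. The sketch below is an assumption about how that comparison could be coded: the function names, the tolerance, and the throughput figures are invented for illustration and are not simulation results.

```python
def interaction_effect(baseline, trucks_12, ramp_upgrade, combo):
    """Combo gain minus the sum of the individual gains (all in t/h)."""
    individual_gains = (trucks_12 - baseline) + (ramp_upgrade - baseline)
    return (combo - baseline) - individual_gains

def classify(effect, tolerance):
    """Label the interaction; `tolerance` could reflect the CI half-widths."""
    if effect > tolerance:
        return "complementary"
    if effect < -tolerance:
        return "substitutive"
    return "independent"

# Invented throughput values in t/h, purely to exercise the arithmetic.
effect = interaction_effect(
    baseline=1000.0, trucks_12=1100.0, ramp_upgrade=1050.0, combo=1300.0
)
label = classify(effect, tolerance=10.0)
```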
---
## 7. Limitations
These are areas where the model is deliberately simpler than the real system,
and a user of the results should keep them in mind.
- **No re-routing during the shift**: trucks commit to the static
shortest-time path even if a capacity-1 edge develops a long queue. In
reality, a dispatcher might divert a truck through the bypass.
- **Independent edge directions**: `E03_UP` and `E03_DOWN` are treated as two
separate single-lane resources. If the physical ramp is genuinely a single
lane shared by both directions, real congestion will be worse than
modelled.
- **No truck-truck interaction on free-flow edges**: capacity 999 edges are
treated as effectively infinite. Real haul roads have finite headway.
- **Deterministic mechanical availability**: no random truck breakdowns,
flat tyres, or refuelling; `availability = 1.00` for all 480 minutes.
- **No operator-level decisions**: no shift change, lunch break, or manual
override. Trucks cycle continuously.
- **No look-ahead in dispatch**: dispatch uses the queue length *at the
moment of decision*. A truck already en route to a loader is not counted and
does not influence later dispatch decisions until it physically arrives.
- **Single-shift horizon, no warm-up**: each replication simulates one
480-minute shift from an empty system, with no warm-up trimming. The
empty-system bias is small because trucks reach steady state within the first
few cycles, but it is not zero.
- **Crusher always available**: the crusher never blocks (no full-bin
back-pressure from downstream stockpile, no maintenance windows).
- **Single ore type / single payload**: every dump is exactly 100 t and
treated as homogeneous.
- **Deterministic node coordinates**: animation uses Euclidean coordinates
from `nodes.csv` even though real haul roads bend. This affects the
visualisation only, not metrics.
---
## 8. Performance measures
The performance measures below are computed per replication and aggregated
per scenario across 30 replications using a Student-t 95% CI with `n - 1 = 29`
degrees of freedom.
### 8.1 Primary throughput measures
- **`total_tonnes_delivered`** (t):
`payload_tonnes * count(end_dump events at CRUSH with time_min < 480)`.
This is the headline number and the answer to operational question 1.
- **`tonnes_per_hour`** (t/h): `total_tonnes_delivered / 8`.
### 8.2 Cycle-level measures
- **`average_truck_cycle_time_min`**: mean wall-clock duration of completed
full cycles (between consecutive `end_dump` events for a truck, with the
first cycle measured from `dispatch`).
- **`average_truck_utilisation`**: mean per-truck
`productive_busy_time / 480`. "Productive" = travel + queue + load + dump.
### 8.3 Resource-level measures
- **`crusher_utilisation`** = `D_CRUSH.busy_time / 480`.
- **`loader_utilisation`** per loader = `loader.busy_time / 480`.
- **`average_loader_queue_time_min`** = mean wait per loader-service event,
averaged across loaders.
- **`average_crusher_queue_time_min`** = mean wait per crusher-service event.
- **Edge resource utilisation and queue wait** for every capacity-1 edge.
### 8.4 Bottleneck ranking
`top_bottlenecks` lists every constraining resource (loaders, crusher,
capacity-1 edges) ranked by
```
composite_score = utilisation * mean_queue_wait_min
```
This composite captures both *how busy* a resource is and *how much actual
delay* it imposes. A near-saturated resource with little or no queue (e.g. a
fast loader whose trucks rarely wait) is correctly down-weighted relative to a
near-saturated resource that is also creating long waits.
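
The ranking can be sketched from the per-resource accumulators (`busy_time`, `queue_wait_time`, number of services). The function name `rank_bottlenecks`, the dictionary keys, and the sample figures below are assumptions for illustration, not output from the model.

```python
def rank_bottlenecks(stats, shift_minutes=480.0):
    """Return [(resource_id, composite_score)] sorted from worst to least bottleneck."""
    ranked = []
    for resource_id, s in stats.items():
        utilisation = s["busy_time"] / shift_minutes
        # Mean wait per service event; a resource that served nothing scores 0.
        mean_wait = s["queue_wait_time"] / s["services"] if s["services"] else 0.0
        ranked.append((resource_id, utilisation * mean_wait))
    return sorted(ranked, key=lambda item: item[1], reverse=True)

# Made-up accumulators in minutes.
sample_stats = {
    "L_N": {"busy_time": 240.0, "queue_wait_time": 120.0, "services": 40},
    "D_CRUSH": {"busy_time": 432.0, "queue_wait_time": 600.0, "services": 120},
    "E03_UP": {"busy_time": 96.0, "queue_wait_time": 0.0, "services": 100},
}
ranking = rank_bottlenecks(sample_stats)
```

In this invented example the crusher ranks first because it is both highly utilised and imposes waits, while the edge with zero queue wait scores zero regardless of its utilisation.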
### 8.5 Uncertainty quantification
For every reported scalar `x`, the 95% confidence interval is
```
mean(x) +/- t_{0.975, n-1} * std(x) / sqrt(n)
```
with `n = 30` and `std(x)` taken as the sample standard deviation (`n - 1`
denominator). The bounds are reported as `xxx_ci95_low` / `xxx_ci95_high` in
`summary.json`.
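
A standard-library sketch of this interval is shown below. The critical value 2.045 is the two-sided 95% Student-t quantile for 29 degrees of freedom, so the helper is only valid for `n = 30`; the function name `ci95` and the replication values are assumptions for illustration.

```python
import math
import statistics

T_975_DF29 = 2.045  # t quantile at 0.975 with 29 degrees of freedom

def ci95(values, t_crit=T_975_DF29):
    """Return (mean, low, high) for a Student-t 95% interval over 30 replications."""
    n = len(values)
    if n != 30:
        raise ValueError("t_crit is hard-coded for n = 30 replications")
    mean = statistics.mean(values)
    # statistics.stdev uses the n - 1 denominator (sample standard deviation).
    half_width = t_crit * statistics.stdev(values) / math.sqrt(n)
    return mean, mean - half_width, mean + half_width

# Invented per-replication throughput values (t/h), not simulation output.
replications = [1000 + (i % 5) * 10 for i in range(30)]
mean, low, high = ci95(replications)
```

For a different replication count the critical value would need to change; a statistics library with a t-distribution quantile function could supply it instead of the hard-coded constant.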
### 8.6 Decision-question linkage
The operational decision questions are answered using these measures:
| Question | Primary measure(s) |
|---|---|
| Q1 baseline throughput | `tonnes_per_hour_mean` and CI for `baseline` |
| Q2 likely bottlenecks | `top_bottlenecks` for `baseline` |
| Q3 more trucks helps? | `tonnes_per_hour` for `trucks_4` vs `baseline` vs `trucks_12` |
| Q4 ramp upgrade helps? | `tonnes_per_hour` for `baseline` vs `ramp_upgrade`; cross-checked with combo |
| Q5 crusher sensitivity | `tonnes_per_hour` and `crusher_utilisation` for `crusher_slowdown` vs `baseline` |
| Q6 ramp closed impact | `tonnes_per_hour` for `ramp_closed` vs `baseline` and route lengths |
| Combo (proposed) | `tonnes_per_hour` for `trucks_12_ramp_upgrade` vs `trucks_12` and `ramp_upgrade` individually |
All numerical answers in `README.md` reference values from `summary.json` so
that the conceptual model and the reported answers stay in lockstep.