
Multi-year forecast loop over (season_year, init_date) pairs

Failure-mode flowchart — multi-year forecast loop

Drive the per-init forecast pipeline across N (season_year, init_date) tuples against an already-trained run_dir. There is no dedicated multi-year orchestrator: the pattern is caller-driven — loop the standard forecast entry point. Each tuple owns a disjoint subtree under <run_dir>/forecast/<season_year>/<init_date>/ (market_insights_models/src/commodity_hindcast/lib/results/results_slice.py:328), so the loop is parallel-safe by construction (lib/results/results_slice.py:303-308, drafts/decisions/ADR-004-forecast-path-restructure.md).

Composes on top of forecast_per_init.md — read that first, then return here for the loop pattern, parallelisation, and per-tuple bookkeeping.

1. When to use

  • Multi-year client outlook from a single init: forecast 2026, 2027, 2028 corn off the same April-2026 init_date. This is the use case PR-369 was built for (drafts/decisions/ADR-004-forecast-path-restructure.md).
  • Backtesting forecast quality across init_dates: hold season_year fixed and sweep init_date to study how the point-in-time forecast tracks in-season truth.
  • Comparison series across both axes: a (season_year, init_date) grid produces a matrix of deliveries for CI-coverage studies and "how the forecast evolved" charts.

Out of scope: training a new model (run hindcast or fit-production first; see drafts/runbook/full_hindcast_rerun.md), back-filling missing historical folds, and any non-commodity_hindcast forecast pipeline.

2. Preconditions

All preconditions from drafts/runbook/forecast_per_init.md section 2 apply unchanged (existing run_dir, production model artefacts, included_geo_identifiers.txt, canonical hindcast pred.parquet, forecast.residual_mode set, residual-mode requirements satisfied). They are checked once per tuple by validate_residual_mode at market_insights_models/src/commodity_hindcast/stages/run_forecast.py:91 (called from run at line 161) and by preflight at market_insights_models/src/commodity_hindcast/run/preflight.py:42.

Two additional requirements are specific to the multi-year loop:

  • Disk budget: each tuple writes a full forecast subtree (indices.zarr, features/pred.parquet, preds/walk_forward_preds.parquet, preds/year_data.parquet, postprocessed/national.parquet, three delivery CSVs — lib/results/results_slice.py:333,343,353,358,368,378). Budget N × [PLACEHOLDER: typical subtree size] of free disk under run_dir. The indices zarr dominates.
  • Climo zarr coverage for every season_year. When a tuple's season_year exceeds the observed extent, the long-range stub at stages/run_forecast.py:301 synthesises climo from a trailing-window median (stress stub at :310 for stress sources). Stubs no-op when the source covers the season; output interval coverage is degraded beyond observed extent but the run does not fail.
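The disk-budget check can be scripted up front. The sketch below assumes a locally mounted run_dir (shutil.disk_usage does not see through cloud paths) and a per-tuple size you have measured yourself from an existing subtree (e.g. with du -sh):

```python
import shutil

def check_disk_budget(run_dir: str, n_tuples: int, per_tuple_bytes: int,
                      headroom: float = 1.2) -> bool:
    """True if free space under run_dir covers n_tuples forecast subtrees plus headroom."""
    free = shutil.disk_usage(run_dir).free
    needed = int(n_tuples * per_tuple_bytes * headroom)
    return free >= needed

# Example: 6 tuples at a measured per-tuple size (measure yours first):
# check_disk_budget('/path/to/run_dir', 6, measured_bytes)
```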

3. Procedure

The unit of work is one (season_year, init_date) tuple. Run sequentially when you want predictable execution and a simple log to read back; run in parallel when wall-clock matters and the MLflow serialisation caveats in section 3b are acceptable.

3a. Sequential loop (safe default)

A Python loop calling the orchestrator entrypoint directly. Use this when you want predictable, single-threaded execution and a simple log to read back.

uv run python -c "
from datetime import date
from cloudpathlib import AnyPath
from market_insights_models.src.commodity_hindcast.stages import run_forecast

run_dir = AnyPath('/path/to/run_dir')
season_years = [2025, 2026, 2027]
init_dates = [date.fromisoformat(s) for s in ['2026-04-01', '2026-05-01']]

for sy in season_years:
    for id_ in init_dates:
        run_forecast.run(run_dir, season_year=sy, init_date=id_)
"

Source: run_forecast.run at market_insights_models/src/commodity_hindcast/stages/run_forecast.py:143. Each call validates residual mode (line 161), builds features (line 163), and runs predict (line 164) for that single tuple.
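One caveat with a bare loop: the first exception aborts all remaining tuples, even though their subtrees are independent. A failure-tolerant variant is a small extension — a sketch, where the runner argument stands in for run_forecast.run so the helper stays generic:

```python
from itertools import product

def run_sweep(runner, run_dir, season_years, init_dates):
    """Call runner(run_dir, season_year=..., init_date=...) for every tuple,
    recording failures instead of aborting the sweep."""
    failures = []
    for sy, id_ in product(season_years, init_dates):
        try:
            runner(run_dir, season_year=sy, init_date=id_)
        except Exception as exc:
            failures.append((sy, id_, repr(exc)))
    return failures

# In the real pipeline:
# run_sweep(run_forecast.run, run_dir, [2025, 2026, 2027], init_dates)
```

Returned failures map directly onto the per-tuple recovery procedure in section 5.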

The CLI form is equivalent — wrap in a shell loop:

for sy in 2025 2026 2027; do
  for id in 2026-04-01 2026-05-01; do
    uv run -m market_insights_models.src.commodity_hindcast.cli run forecast \
      --run-dir /path/to/run_dir \
      --season-year "$sy" \
      --init-date "$id"
  done
done

CLI source: market_insights_models/src/commodity_hindcast/cli.py:450 (@run.command("forecast")).

3b. Parallel loop (faster, with caveats)

Because each tuple writes a disjoint subtree (lib/results/results_slice.py:303-308), N forecast calls can run concurrently against the same run_dir. Use GNU parallel or xargs -P:

# pairs.txt — one "<season_year> <init_date>" per line:
2025 2026-04-01
2026 2026-04-01
2027 2026-04-01
2025 2026-05-01
2026 2026-05-01
2027 2026-05-01

cat pairs.txt | xargs -P 4 -L 1 bash -c '
  uv run -m market_insights_models.src.commodity_hindcast.cli run forecast \
    --run-dir /path/to/run_dir \
    --season-year "$0" \
    --init-date "$1"
'

Or with GNU parallel:

parallel --colsep " " -j 4 \
  uv run -m market_insights_models.src.commodity_hindcast.cli run forecast \
    --run-dir /path/to/run_dir --season-year {1} --init-date {2} \
  :::: pairs.txt
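Rather than hand-writing pairs.txt, the Cartesian product of the two axes can be generated, which avoids missed tuples on larger grids:

```python
from itertools import product

season_years = [2025, 2026, 2027]
init_dates = ["2026-04-01", "2026-05-01"]

# One "<season_year> <init_date>" pair per line, matching the xargs/parallel input format.
with open("pairs.txt", "w") as fh:
    for sy, id_ in product(season_years, init_dates):
        fh.write(f"{sy} {id_}\n")
```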

MLflow caveat: every forecast call opens an MLflow run. When all tuples write to the same SQLite-backed tracking DB (mlruns.db), concurrent writes can serialise on a "database is locked" OperationalError — see auto-memory "MLflow DB locking". Two options:

  • Accept the lock at low parallelism (-P 2 or -P 4 is usually fine for short bursts).
  • Isolate per-process by setting MLFLOW_TRACKING_URI to a per-tuple sqlite file or a non-sqlite backend before each call:
cat pairs.txt | xargs -P 8 -L 1 bash -c '
  export MLFLOW_TRACKING_URI="sqlite:///mlruns_${0}_${1}.db"
  uv run -m market_insights_models.src.commodity_hindcast.cli run forecast \
    --run-dir /path/to/run_dir \
    --season-year "$0" \
    --init-date "$1"
'

MLflow runs from isolated DBs can be merged later if needed.

Disk I/O contention to the climo zarr is the other common parallel bottleneck; it is read-only and tolerates concurrent readers, but at high -P values throughput plateaus regardless.

4. Verification

Confirm three things after the loop completes:

  • N delivery CSV sets on disk, one per tuple. For each (season_year, init_date) in the input list:
<run_dir>/forecast/<season_year>/<init_date>/delivery/
  Treefera_<experiment_key>_ADM0_Forecast_<init_date>.csv
  Treefera_<experiment_key>_ADM1_Forecast_<init_date>.csv
  Treefera_<experiment_key>_ADM2_Forecast_<init_date>.csv

Filename construction: lib/results/results_slice.py:377 (delivery_csv).

  • Postprocessed national parquet per tuple at <run_dir>/forecast/<season_year>/<init_date>/postprocessed/national.parquet. Each parquet should hold one row per (year, init_date) combination in the production walk-forward predictions for that single init — i.e. a single-init slice, not a sweep. Path property at lib/results/results_slice.py:368.

  • MLflow run status FINISHED for every tuple. When using a shared tracking DB, query the experiment for runs created during the window and count FINISHED against RUNNING/FAILED. Any FAILED with delivery CSVs on disk is a partial write — treat that tuple as suspect and re-run.
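When the shared backend is sqlite, the status tally can be read straight from the tracking DB. This sketch assumes the default mlflow sqlite schema, in which the runs table carries a status column:

```python
import sqlite3
from collections import Counter

def run_status_counts(db_path: str) -> Counter:
    """Tally mlflow run statuses (FINISHED/RUNNING/FAILED) in a sqlite tracking DB."""
    with sqlite3.connect(db_path) as conn:
        rows = conn.execute("SELECT status FROM runs").fetchall()
    return Counter(status for (status,) in rows)

# Example: run_status_counts('mlruns.db')
```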

Quick all-in-one check (expect three CSVs per tuple):

uv run python -c "
from cloudpathlib import AnyPath
run_dir = AnyPath('/path/to/run_dir')
for child in sorted((run_dir / 'forecast').glob('*/*/delivery')):
    print(child.parent.relative_to(run_dir), len(list(child.glob('*.csv'))))
"

5. Failure modes and recovery

  • One tuple fails, others continue. Subtrees are disjoint (lib/results/results_slice.py:303-308); a failure in (2027, 2026-04-01) does not corrupt (2026, 2026-04-01). Inspect the failed tuple's subtree at <run_dir>/forecast/<sy>/<id>/: missing delivery/ = pre-delivery failure, missing postprocessed/ = earlier, missing preds/ = earlier still. Per-init runbook section 5 covers each stage. Re-run the failed tuple in isolation; do not re-run the whole sweep.

  • Long-range climo stub fires for season_year beyond observed extent. Triggered inside _build_forecast_features at stages/run_forecast.py:301 (stress equivalent at :310). Not an error — fills missing years from a trailing-window median per county. Interval coverage is degraded for those tuples; flag in the deliverable. Removable once the upstream climo zarr covers the horizon (in-source TODO markers).

  • Disk pressure mid-loop. Each tuple ≈ [PLACEHOLDER: typical subtree size]. If the loop dies on ENOSPC partway through, completed tuples on disk remain valid (per-stage atomic writes). Free disk and resume with the remaining tuples.
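Resuming then reduces to listing the tuples whose delivery subtree is absent or incomplete. A local-path sketch (the pipeline itself uses AnyPath, so adapt accordingly for cloud-backed run_dirs):

```python
from pathlib import Path

def missing_tuples(run_dir, pairs, expected_csvs=3):
    """Return (season_year, init_date) pairs whose delivery dir is absent
    or holds fewer than the expected three delivery CSVs."""
    missing = []
    for sy, id_ in pairs:
        delivery = Path(run_dir) / "forecast" / str(sy) / str(id_) / "delivery"
        complete = delivery.is_dir() and len(list(delivery.glob("*.csv"))) >= expected_csvs
        if not complete:
            missing.append((sy, id_))
    return missing
```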

  • MLflow database is locked under high parallelism. See auto-memory "MLflow DB locking" and drafts/runbook/mlflow_db_recovery.md. Drop -P, or isolate MLFLOW_TRACKING_URI per process per section 3b.

  • Stale forecast features parquet from a partial earlier run. When re-running a tuple that crashed mid-feature-build, pass --force to forecast-features (cli.py:376; existence guard at stages/run_forecast.py:207) so indices.zarr and features/pred.parquet are rebuilt unconditionally.

For residual-mode and preflight failures, diagnosis is identical to drafts/runbook/forecast_per_init.md section 5.

6. Rollback

The unit of rollback is the per-tuple subdirectory. To roll back a single tuple:

  1. Stop any consumer reading the delivery CSVs for that (experiment_key, season_year, init_date).
  2. Remove (or archive) <run_dir>/forecast/<season_year>/<init_date>/ in its entirety.
  3. Re-run that single tuple via the procedure in section 3a.

Rollback of one tuple does not touch any other tuple, the trained artefacts under <run_dir>/models/, or the canonical hindcast features. There is no shared cache, no shared writes, and no MLflow side-effect that survives directory removal beyond the original run record (drafts/decisions/ADR-004-forecast-path-restructure.md, lib/results/results_slice.py:412-427).
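For a locally mounted run_dir, steps 2-3 of the per-tuple procedure reduce to a directory removal followed by a re-run; a sketch (archive first if in doubt, and use the equivalent recursive delete for cloud-backed run_dirs):

```python
import shutil
from pathlib import Path

def rollback_tuple(run_dir, season_year, init_date):
    """Remove one tuple's forecast subtree; every other tuple is untouched."""
    subtree = Path(run_dir) / "forecast" / str(season_year) / str(init_date)
    if subtree.is_dir():
        shutil.rmtree(subtree)
```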

To roll back the entire multi-year sweep, repeat the per-tuple procedure for each tuple, or remove <run_dir>/forecast/ outright if no other forecast output should survive (note: this is destructive across all (season_year, init_date) tuples currently under that run_dir).

If any delivery CSV has already been served to a client, follow the data-correction process owned by [PLACEHOLDER] before deleting. Re-running with the same (season_year, init_date) overwrites the per-tuple subtree in place.

References

  • Per-init runbook (composed on): drafts/runbook/forecast_per_init.md.
  • Source: market_insights_models/src/commodity_hindcast/stages/run_forecast.py (run at :143, validate_residual_mode at :91 and :161, run_features at :167, long-range climo stub at :301, long-range stress stub at :310).
  • Source: market_insights_models/src/commodity_hindcast/lib/results/results_slice.py (ForecastSlice at :302, root at :328, disjoint-subtree guarantee at :303-308, per-artefact path properties at :333,343,353,358,368,378, training delegation at :412-427).
  • Source: market_insights_models/src/commodity_hindcast/run/preflight.py (run_preflight at :42).
  • CLI: market_insights_models/src/commodity_hindcast/cli.py (@run.command("forecast") at :450, force flag at :376).
  • ADR: drafts/decisions/ADR-004-forecast-path-restructure.md.
  • Wiki: wiki/commodity_hindcast/pipelines/multi_year_forecast.md, wiki/commodity_hindcast/entities/ForecastSlice.md, wiki/commodity_hindcast/sources/prs/PR-369.md.
  • Related runbooks: drafts/runbook/full_hindcast_rerun.md, drafts/runbook/mlflow_db_recovery.md.