Skip to content

PR #339 — refactor(commodity_hindcast): 9-phase restructure to match SYNTHESIS

At a glance

  • Author: ai-tommytf
  • Merged: 2026-04-29
  • Branch: tl/refactor-commodity-hindcast
  • Net effect: Structural refactor of the commodity_hindcast package (22,968 lines, 205 files changed, +4897/-3059). Deleted inner src/, steps/, forecast/, tfds/, main.py. Built canonical subpackage tree matching SYNTHESIS.md. Broke the long-standing forecast/results.pysteps/experiment_result.py import cycle by unifying HindcastSlice and ForecastSlice in experiments/slices.py. Manifest-gated: wheat and corn hindcast outputs are byte-for-byte identical before and after (sha256 verified). Tests: 212 → 296 (84 newly discoverable).
  • Why this matters: Before this PR the package had an inner src/ namespace, a defunct tfds/ directory, two parallel composition paths for hindcast vs forecast, and a circular import worked around with lazy imports. After: one clean subpackage tree, two thin orchestrators (hindcast.py, forecast.py), no cycles, no deprecation shims.

PR body (faithful extract)

The problem (in one paragraph)

Imagine a kitchen where the spice rack lives inside the freezer, the chopping board hangs from
the ceiling, and there are two near-identical cookbooks taped to opposite walls — one for
hindcasts, one for forecasts. Cooking still works, but every new chef takes a week to find the
salt. Worse, the two cookbooks have started disagreeing in subtle ways. That kitchen is the
pre-refactor commodity_hindcast: an inner '/src' namespace nested inside an outer 'src/', a
'tfds/' folder named after a long-defunct prototype, two parallel composition paths (one in
'main.py', one in 'forecast/wrappers'), and a stubborn import cycle between
'forecast/results.py' and 'steps/experiment_result.py' that everyone worked around with lazy
imports.

Before / after tree

BEFORE                                      AFTER
commodity_hindcast/                         commodity_hindcast/
└── src/                                    ├── hindcast.py    ┐
    ├── steps/     (regression, detrend,    ├── forecast.py    │ orchestrators
    │               experiments,            │                  ┘
    │               postprocess,            ├── preprocess.py  ┐
    │               delivery — jumbled)     ├── train.py       │
    ├── forecast/  (deliver/predict/        ├── predict.py     │ 6 stage
    │               postprocess wrappers)   ├── evaluate.py    │ modules
    ├── tfds/      (defunct prototype)      ├── deliver.py     │
    └── main.py    (484 LOC, both pipes)    ├── postprocess/   ┘
                                            ├── infra/
                                            ├── preprocessors/
                                            ├── features/
                                            ├── models/
                                            ├── experiments/
                                            ├── delivery/
                                            ├── diagnostics/
                                            └── tracking/

Nine phases, each a squash-merge commit

# Phase Squash sha What changed
0 Snapshot baseline (oracle) aedc9783 Captured 12 CSV + 4 parquet sha256s on 2026-04-27
1 Promote tfds/config.py to root cd495482 58 import sites rewritten via libcst
2 Flatten inner src/ namespace 98db03eb 113 zero-byte renames + 293 import-site rewrites
3 Lift 6 stage modules to root 5219a512 preprocess, train, predict, evaluate, postprocess, deliver
4 Build infra/ + preprocessors/ d6a1cf0b Domain primitives grouped; geo/ + reference_data/ re-homed
5 Build ML subpackages + cycle break 65e4c4a1 models/, experiments/, postprocess/, delivery/ — cycle dies
6 Build diagnostics/ + tracking/ af710000 reporting + metrics + plots; MLflow 4-way split
7 Orchestrators + delete forecast/ c2dafddc hindcast.py and forecast.py; parallel-tree dissolved
8 BC + complexity + reference-integrity sweep 0f483d31 4-group scan: zero deprecation residue, dead code, stale paths
9 Final cleanup; tree matches SYNTHESIS 9c4b082e steps/, main.py, tfds/ gone; tests consolidated

The regression oracle — manifest-driven gates

Before any phase started, wheat and corn hindcast outputs were snapshotted as sha256 hashes in baseline_2026-04-27/MANIFEST.yaml. Every phase commit had to regenerate those outputs and produce identical hashes. The gate caught a real regression at Phase 2: cli.py used __file__.resolve().parents[1] / 'configs' which resolved one level too high after the flatten. Fix: parents[1]parents[0]. Local lint, type-check, and unit tests all passed; only the manifest gate caught it.

Parquet oracle defect: pipeline_run_id is a per-run UUID; byte-for-byte equality is impossible. Solution: scripts/manifest_parquet_data_sha.py computes a UUID-excluded data sha. The parquet invariant is (file_size_bytes, row_count, parquet_data_sha256_uuid_excluded).

Phase 5 — The cycle break (centrepiece)

The cycle: forecast/results.py (owned ForecastSlice) ↔ steps/experiment_result.py (owned HindcastSlice). Each needed the other; the only workaround was function-body lazy imports at two sites.

Fix: create experiments/slices.py and move both classes into the same module. Two lazy imports deleted. forecast/results.py deleted entirely (its only export was ForecastSlice).

Wrinkle: HindcastSlice.training called ExperimentResult.from_run_dir(...).production, which would recreate a cycle even after the move. Minimum-rewrite carve-out: extract production_hindcast_slice() — a shared free function both slice classes call without needing ExperimentResult.

# experiments/slices.py
def production_hindcast_slice(run_dir: Path, commodity: str) -> HindcastSlice | None:
    ...

class HindcastSlice: ...
class ForecastSlice: ...

Phase 5 — Second cycle (PEP-562 __getattr__)

postprocess/__init__.py had to expose postprocess_experiment without triggering:

experiments/slices.py → postprocess/bias_correction.py
  → postprocess/__init__.py (eager re-export)
  → postprocess/orchestrator.py → experiments/experiment_result.py
  → experiments/slices.py  ← partial-init crash

Fix: PEP-562 module-level __getattr__ defers resolution until first access:

def __getattr__(name: str) -> Any:
    if name == "postprocess_experiment":
        from .orchestrator import postprocess_experiment as _pe
        return _pe
    raise AttributeError(...)

Phase 7 — Orchestrators dissolve the parallel-tree

Before: main.py:run_hindcast (hindcast) + forecast/*.py wrappers (forecast) — two separate composition paths for the same six-stage pipeline.

After: hindcast.py and forecast.py at the package root, each a thin composition. The entire forecast/ directory was deleted. forecast.py (the file) is now the sole owner of the commodity_hindcast.forecast namespace.

Manifest gate — final proof

Wheat ADM2 hindcast (414,250 rows, 71 MB), sha256 from MANIFEST.yaml:

c50eb52e4901b2918c06fdf8df1adc9abd4b4df4f418150baa54c2048184ae40

Post-Phase-9 run: same hash. Byte-for-byte identical.

Five documented deviations from SYNTHESIS

  1. runs.py absent — no RunDir class exists to extract; deferred.
  2. Root postprocess.py absent — replaced by postprocess/ package (PEP-562 carve-out required it).
  3. experiments/predict_experiment.py new — absorbed steps/predict/experiment.py.
  4. postprocess/orchestrator.py new — absorbed steps/postprocess/orchestrator.py.
  5. delivery/{export,geo_normalise}.py new — relocated warehouse-export modules; British-English rename (-ize-ise).

Five invariants honoured throughout

  1. Exact same outputs (manifest oracle).
  2. Move, don't rewrite (relocate + import update, no logic changes).
  3. DRY/SRP/ETC opportunistic (fix what the move reveals; don't hunt).
  4. Crash early; no silent fall-backs.
  5. Each commit small, named, reversible.

Files / lines touched

Additions Deletions File
+734 -0 SHOWBOAT.md
+435 -0 market_insights_models/src/commodity_hindcast/diagnostics/metrics.py
+380 -0 .claude/skills/phased-refactor-harness/references/common_blockers.md
+361 -0 market_insights_models/src/commodity_hindcast/DESIGN_ADDENDUM_334.md
+325 -0 market_insights_models/src/commodity_hindcast/IMPLEMENTATION_PLAN.md
+259 -0 .claude/skills/phased-refactor-harness/references/plan_skeleton.md
+239 -0 .claude/skills/phased-refactor-harness/references/critic_brief_template.md
+239 -0 market_insights_models/src/commodity_hindcast/ORCHESTRATION.md
+192 -0 .claude/skills/phased-refactor-harness/references/tmux_recipes.md
+190 -0 .claude/skills/phased-refactor-harness/SKILL.md

(Total: 205 files changed, +4897/-3059)

Cross-references

Lessons captured

  • The canonical package tree is: config.py, cli.py, hindcast.py, forecast.py, six stage modules at root, then subpackages infra/, preprocessors/, features/, models/, experiments/, postprocess/, delivery/, diagnostics/, tracking/, app/, configs/.
  • HindcastSlice and ForecastSlice are defined in experiments/slices.py; that is the only module that should own them.
  • The postprocess/__init__.py uses PEP-562 __getattr__ to avoid a partial-init cycle; do not add eager re-exports to that __init__.
  • forecast/ is deleted; forecast.py (the file at the package root) is the sole owner of the commodity_hindcast.forecast namespace.
  • Manifest-based regression gates (sha256 on delivery CSVs, UUID-excluded data sha on parquets) are the right tool for structural refactors.
  • libcst codemods were used for import-site rewrites; sed would have broken if TYPE_CHECKING: blocks and multi-line imports.
  • cli.py path resolution uses Path(__file__).resolve().parents[N] / "configs" where N depends on where cli.py sits in the tree; Phase 2 regression: parents[1] must be parents[0] after the flatten.