
Risk register — commodity_hindcast

Last updated: 2026-05-08
Total risks tracked: 23 (R1..R14 from the original register; R15..R23 merged in from risks_addendum.md on 2026-05-08; 7 refinements applied to existing rows; new "Fixed in flight" section added with 18 historical lessons)

Severity legend

  • Critical: would corrupt a production delivery silently OR block a release
  • High: likely to cause an incident / needs intervention this quarter
  • Medium: known footgun; workarounds exist
  • Low: minor / cosmetic

Risks

# Title Severity Likelihood Description Mitigation Owner Sources
1 TMI selection bias correction wrong sign Critical High 2020 corn flips from QUBE +2.23 to TMI -8.31 bu/ac under an identical SBC formula — wrong sign with ~3.7x different magnitude (NOT the inherited "~17x smaller" figure, which came from a now-deleted gen_report.py::convert_metrics_to_bu_acre re-unitting selection_bias_kg_ha to bu/ac without renaming). The historical "weather features drop core counties via global dropna" mechanism is no longer dominant: training_dropna_subset at config.py:812-822 now returns ONLY [com.target_col, com.target_detrended_col] and no longer splats commodity.feature_cols. Root-cause status is therefore REOPEN — dropna-on-target alone cannot explain a 287-vs-972 county collapse. Two-track mitigation. (B) Quick unblock — make dropna column-aware per consuming estimator: largely landed at config.py:812-822 (subset narrowed to target/detrended target only); add a regression test that pins training_dropna_subset returns exactly that pair so a future widening cannot re-introduce the bug. (A) Root-cause — fix the weather feature builder so EDD z-scores cover all 935 top-95% counties; re-investigate the 287-vs-972 collapse now that dropna-splat is ruled out (1980 component fixed at features/builders/yields.py:185-199; Wayne County NC climo issue [PLACEHOLDER: not located in current tree]). Live dropna call sites: stages/run_fit.py:94 and run/experiment_protocol.py:54. Switch evaluation to the full prediction universe; the included_geo_identifiers contract is codified at DESIGN.md:114. [PLACEHOLDER: owner needed] MEMORY.md/project_sbc_tmi_bug.md, config.py:812-822, stages/run_fit.py:94, run/experiment_protocol.py:54, features/builders/yields.py:185-199, DESIGN.md:114, sessions A2-A5
2 DESIGN.md vs code drift on postprocessed/national.parquet High High DESIGN.md clause specifies postprocessed/{experiment_key}_national.parquet but code writes postprocessed/national.parquet (no infix). Any consumer that reads DESIGN.md as a code contract will fail, and the wiki pipelines/postprocess.md documents the discrepancy. Either update DESIGN.md:73 to drop the {experiment_key}_ infix, or update the writer to include it. Code path: lib/results/results_slice.py:368, lib/results/run_result.py:155. [PLACEHOLDER: owner needed] DESIGN.md:73, lib/results/results_slice.py:368, lib/results/run_result.py:155, LINT_REPORT.md§<section>
3 S3 path anchoring on local-only sinks High High INPUT_DATA_DIR is s3://... in QA; code that calls .as_posix(), pathlib.Path(AnyPath(...)), or anchors sqlite/lockfiles under data_root crashes (AttributeError: 'S3Path' object has no attribute 'as_posix') or produces sqlite:///s3://.... Recurring class — last incident f66f4ac9 (2026-04-29) broke wheat/corn/soy/cotton in QA. Branch on isinstance(x, CloudPath) for any URI sink; use str(path) not .as_posix(); route local-only sinks (sqlite, FileLock) to /tmp/<run-id>/...; use AnyPathParam (PR-345) for click args; add parametrised tests covering both Path and S3Path anchors. [PLACEHOLDER: owner needed] MEMORY.md/feedback_s3_path_anchoring.md, wiki/commodity_hindcast/sources/prs/PR-345.md, commit f66f4ac9, fix 13117727
4 MLflow SQLite locking on parallel same-commodity runs High Medium Running two pipelines for the same commodity in parallel hits SQLAlchemy OperationalError because two writers contend for the same MLflow SQLite DB. Blocks concurrent backfills. Run same-commodity pipelines sequentially; longer term, move MLflow tracking to a server backend (Postgres/MySQL) or per-run isolated DB files. [PLACEHOLDER: owner needed] MEMORY.md Known Issues section, mlruns.db artefacts in repo root
5 _OOS_MODES frozenset hand-maintained vs ResidualMode Literal High Low stages/run_forecast.py:82 keeps a hand-maintained frozenset of OOS modes used by validate_residual_mode. Any new value added to ResidualMode in models/meta_models/types.py that is not also added to _OOS_MODES would silently allow a non-OOS mode through the validation gate when no conformal sidecar exists. Replace frozenset with set(get_args(ResidualMode)) - {"in_sample_pooled"}; add a unit test that pins the derivation. [PLACEHOLDER: owner needed] LINT_REPORT.md§<section>, stages/run_forecast.py:82, models/meta_models/types.py, wiki/commodity_hindcast/sources/prs/PR-372.md
6 Wheat sub-types listed in config but not produced High Medium Wheat preprocessor only emits crop_type WHEAT. The 2026-04-17 unified-builder commit c7041b68 ("refactor(commodity_hindcast/features): unify feature builders on one season-DOY path") consolidated all wheat configs onto a single season-DOY path; WINTER_WHEAT survives only as a NASS crop_type filter at configs/wheat_usa.yaml:214, while SPRING_DURUM_WHEAT and SPRING_EXCL_DURUM_WHEAT are not represented in current configs. Sub-type reintroduction would now require its own season_start_doy/freeze_cap_sdoy/season_length triple per sub-type and is untested under the unified path. Either implement the sub-type split in the wheat preprocessor (with the per-sub-type season-DOY triples) or remove the unsupported entries from configs and document WHEAT as the only produced type. Treat sub-type reintroduction as orthogonal to unification. [PLACEHOLDER: owner needed] MEMORY.md Known Issues — Wheat sub-types, commit c7041b68, configs/wheat_usa.yaml:214, session A6
7 Sugar / non-US preprocessing missing geometries High Medium Default CROP_YIELD_GEOBOUNDARIES_FILE=/data/processing/yield_forecast/raw/boundaries/geometry.parquet is US-only and has no IND/GHA/BRA rows; non-US runs need the 52 k-row all_geoboundaries_processed.parquet. Wrong env var ⇒ silently empty geometry joins. Document the override; add a config-time check that asserts geography rows exist for the configured commodity; default Brazil/India/Ghana configs should set the path. [PLACEHOLDER: owner needed] MEMORY.md Known Issues — Sugar/non-US preprocessing
8 Two upward-import layering violations in delivery/ Medium Medium Two modules in delivery/ import from the stages/ layer rather than from the shared lib/ layer, breaking the single-direction import DAG (DESIGN.md Clause 19). Risks reintroducing the cycle that PR-339 spent nine phases to break. Move shared helpers (conformal apply, unit conversion) into lib/; have both delivery/ and stages/run_meta_models.py import from there. Tracked in BOUNDED_CONTEXTS.md. [PLACEHOLDER: owner needed] LINT_REPORT.md§<section>, wiki/commodity_hindcast/sources/prs/PR-339.md, wiki/commodity_hindcast/sources/prs/PR-353.md open question §9
9 Long-range climo stub silently retrendless beyond zarr Medium Medium forecast_long_range_stub.py (PR-369) fills missing z-score features for season_year beyond the materialised climo zarr's coverage with per-county trailing-3-year medians. Long-range forecasts collapse to trend-only output by design (season_doy_weather_weight = 0); this is correct only as long as the schedule is honoured. The filename is deliberately temporary — removal blocked on extending the climo zarr. Extend the materialised climo zarr horizon; remove the stub when no caller imports from it. Until then, keep the three WARNING log lines and the docstring removal criteria visible. Use materialise_for_forecast(...) (centralised) rather than ad-hoc DOY collapse. [PLACEHOLDER: owner needed] wiki/commodity_hindcast/sources/prs/PR-369.md, features/forecast_long_range_stub.py, MEMORY.md/feedback_centralised_climo.md
10 Forecast walk-forward has no per-fold checkpoint Medium Low Walk-forward CV restart re-does all earlier folds; an interrupted multi-year hindcast wastes hours of compute. Tracked as a §9 Open Question in DOMAIN_MODEL2.md. Add per-fold completion sentinels under run_dir/; resume by skipping folds whose artefacts already exist and pass schema validation. [PLACEHOLDER: owner needed] wiki/.../sources/prs/PR-353.md §9 Open Questions, DOMAIN_MODEL2.md §9
11 Streamlit dashboard regressions invisible until manual launch Medium Medium The app/ package has no automated test that exercises streamlit run startup. PR-360's reference-data refactor silently broke dashboard imports until PR-363 patched three independent bugs by hand. The required PYTHONPATH + two env vars are also fragile — streamlit run bypasses uv's package discovery. Add a smoke test that imports app/app.py under the same PYTHONPATH shape Streamlit uses; document the launch incantation in the runbook. [PLACEHOLDER: owner needed] wiki/commodity_hindcast/sources/prs/PR-363.md, wiki/commodity_hindcast/sources/prs/PR-360.md, MEMORY.md/project_streamlit_app_launch.md
12 QA RDS unreachable from dev EC2 forces local_data symlink Medium High Fresh worktrees that run dev_tools/run_model_local.py time out fetching geometry from QA Postgres because the SG/route blocks the dev box. The script only hits the DB when local_data/<model_id>/geometry.parquet is missing — every worktree needs the symlink ritual. Mirror the cached geometry.parquet files into a documented shared location; or open the SG route from dev EC2; or make the script fail fast with the symlink instruction. [PLACEHOLDER: owner needed] MEMORY.md/project_local_data_symlink.md
13 Wiki broken-link backlog (84 broken refs across 47 files) Medium High Lint pass found 84 broken relative links and 6 high-value missing pages (notably pipelines/hindcast.md, referenced by 19 entity pages). New onboarding engineers will hit dead-ends across the entity catalogue. Write pipelines/hindcast.md first (highest fan-in); fix PR-345/361/369/372 cross-ref paths; backfill concepts/experiment_protocol.md and concepts/season_doy.md (17 and 25 inbound refs respectively). [PLACEHOLDER: owner needed] LINT_REPORT.md§Broken Links, LINT_REPORT.md§Recommended Next Pass
14 entities/EditRule.md vs entities/EditRuleConfig.md naming ambiguity Low Low PR-353.md references EditRule.md while only EditRuleConfig.md exists. May be one missing page (the rule operation, distinct from the rule config) or a stale name. Decide whether EditRule (operation) is a separate entity from EditRuleConfig (declarative config); write the missing page or update the cross-ref. [PLACEHOLDER: owner needed] LINT_REPORT.md§<section>, wiki/commodity_hindcast/sources/prs/PR-353.md
15 tau2 floor 24 orders of magnitude tighter than QUBE High Medium TMI sets tau2 = max(..., 1e-30) in partial-pooling EB shrinkage; the in-file comment at models/detrend/partial_pooling_detrend.py:326-330 explicitly notes QUBE uses 1e-6 and that the TMI floor produces λ ≈ 0 (pure national prior) where QUBE produces λ ≈ 1e-6/SE². On full production panels (~927 counties) τ² ≈ 0.44 so the floor is never active, but on small/synthetic/degenerate-slope panels the TMI floor is effectively zero and shrinkage collapses to a single county's slope. The risk is acknowledged in the source comment but the floor has not been raised. Raise floor to 1e-6 to match QUBE; add a regression test that fits an EB shrinkage on a degenerate slope panel and asserts the shrinkage does not collapse to a single county. Pin a synthetic-panel snapshot of the resulting eb_lambdas so the QUBE-equivalent regime is testable in CI. [PLACEHOLDER: owner needed] models/detrend/partial_pooling_detrend.py:334 (floor), :326-330 (comment), :138, :465, :493 (self._tau2), session A2 (3804f325)
16 Per-commodity yield-range bounds are the only delivery-side defensive net Medium Medium config.py:324 defines yield_range: tuple[float, float] on CommodityConfig ("A new commodity MUST declare its own range; this crashes early at config load if omitted"). The canonical clip helper clip_yield_to_delivery_range(df, yield_range, value_cols, log_tag=...) lives at lib/unit_utils.py:93-130 and is called once at the delivery boundary by delivery/conversions.py:396-398 (import at :47). Per-commodity values: corn [50.0, 250.0] (configs/corn_usa.yaml:88), wheat [0.0, 260.0] (configs/wheat_usa.yaml:95), cotton [400.0, 1200.0] (configs/cotton_usa.yaml:90). Wheat's 0.0 lower bound is intentionally permissive (county panels include failed-crop years), but it admits the 94 bu/ac wheat-mean-from-misaligned-climo defect that A6 originally hit; the clip would not have caught it. run/preflight.py performs file-existence checks only — no value-quality / null-rate / z-score-std checks. The dashboard at app/_dashboard_config.py:143, 161, 181, 204, 219 re-uses the same yield_range for axis bounds, so loosening the YAML silently widens the dashboard charts too. Document yield_range as a load-bearing guardrail in DESIGN.md; tighten ranges where defensible (e.g. wheat lower bound could be 5.0 if the failed-crop-year edge case is handled upstream); add upstream feature-quality assertions (climo null-rate ceiling, z-score std ceiling) so the clip is not the sole net; add a unit test that exercises clip_yield_to_delivery_range on out-of-range delivery rows and asserts both the clip behaviour and the warn-on-clip log line via log_tag. [PLACEHOLDER: owner needed] config.py:324, lib/unit_utils.py:93-130, delivery/conversions.py:47, :396-398, configs/corn_usa.yaml:88, configs/wheat_usa.yaml:91-95, configs/cotton_usa.yaml:90, app/_dashboard_config.py, session A6 (6f8c9256)
17 climo_lag_days regression watch — unification shifted coefficients across all commodities High Low Pre-fix, climo_lag_days = 30 was applied to harvest-init training rows, not just inference. The training row at harvest sdoy=184 saw a window [1..154] for corn — i.e. 30 days of end-of-season weather missing from training, contradicting DESIGN.md ("SHALL fit on harvest-time data"). Post-fix (Option B): harvest-row uses lag=0; other rows use lag=1. Reported coefficient shifts: 10-95% on most features (gdd_zscore_gstd -16.65 → -32.12, +93%; tavg_zscore_gstd -42.21 → -57.54, +36%; stress_score_lag1 -4.87 → +2.00 sign flip). Affects every commodity's forecasts. Live default at config.py:350 (climo_lag_days: int = 1); call site at features/builders/climo.py:123. Add a regression test that asserts harvest-init training rows use lag=0 and other rows use lag=1; pin a coefficient-sign baseline on a fixture; assert config.commodity.climo_lag_days >= 1 at config load (negatives already guarded by df7ea52f, but no test). Material change; warrants a release-note callout. [PLACEHOLDER: owner needed] config.py:350, features/builders/climo.py:123, domain-modelling/schema.yaml:219, fixed in commit c7041b68 (2026-04-17), guard added in df7ea52f, session A6
18 Leap-year off-by-one in legacy season-array slicer Medium Low Pre-fix legacy code did zarr[year=Y, dayofyear=start..366] regardless of leap-year state. For non-leap years (e.g. 2022 Oct 1-Dec 31 = 92 days) the slice gave 93 positions; the legacy path silently let the season array be one slot longer than season_length. Caught by the unified prototype's shape assertion. The _legacy variant has been removed; all callers go through features/builders/climo.py:34 (_build_season_array), features/builders/ndvi.py:97, or features/builders/weather.py:55 (build_season_array_from_daily_zarr). features/builders/weather_stress.py:29 imports the same shared engine. Verify no consumer outside the unified path still calls a calendar-DOY zarr slice without checking is_leap_year(Y). Add a property-based test (hypothesis) that for any season window [start, end] and year Y, the returned array has length end - start + 1 regardless of leap-year state. [PLACEHOLDER: owner needed] Removed in c7041b68; current call sites in features/builders/{weather,climo,ndvi,weather_stress}.py, session A6
19 Silent config drift between TMI and QUBE-parity baseline High Medium Three corn config knobs were silently misaligned with QUBE per A5: (a) correction_shrinkage defaulted to 1.0 vs QUBE 0.3 (3.3x larger weather corrections); (b) season_doy_weather_weight ramp commented out, effectively 1.0 everywhere vs QUBE's active ramp; combined effect ~45x larger weather weight in early season; (c) weather_correction_fit_level was at one point ADM2 vs QUBE ADM0. 2023/2024 symptom: opposite-sign weather corrections vs QUBE. Fix collapsed max wx-correction diff from 6.36 bu/ac (with sign flips) to 0.071 bu/ac. Live configs/corn_usa.yaml:283 pins weather_correction_fit_level: ADM0; season_doy_weather_weight: is present at :326 (block-mapping form, not exhaustively verified — could still be a no-op stub). correction_shrinkage no longer appears in any configs/*.yaml — [PLACEHOLDER: knob may have been renamed, absorbed into the per-row weight ramp, or deleted; lineage needs git log -- configs/corn_usa.yaml]. Cross-commodity coverage (soy, cotton, wheat) is unverified by this round. Add a config-parity regression test pinning weather_correction_fit_level=ADM0 and the active season_doy_weather_weight ramp against a snapshot YAML; trace what happened to correction_shrinkage and document the rename / removal in DESIGN.md; add an equivalent parity check for soy, cotton, wheat configs. [PLACEHOLDER: owner needed] configs/corn_usa.yaml:283, :326, issues/20260415_tmi_qube_weather_correction_and_trend_alignment.md, session A5 (8f327031)
20 1980 row preservation — fix landed but no test Medium Low Pre-fix if not prior_mask.any(): continue dropped 1980 entirely (972 county-rows), per A5 accounting for ~89% of the trend drift between TMI and QUBE. The detrender uses raw NASS yield not lagged features, so 1980 was always usable. Live (verified): features/builders/yields.py:185-199 emits 1980 with NaN lagged features (yield_last, yield_avg_3, yield_avg_5 set to np.full(n, np.nan)) instead of skipping. Comment at :191-193 codifies the rationale. Add a unit test asserting 1980 rows survive _compute_yield_features with NaN lags rather than being dropped; assert the row count of the panel equals n_geos × n_years. [PLACEHOLDER: owner needed] features/builders/yields.py:165, :185-199, session A5
21 union_fit_pred_for_production_ranking sweeps unpopulated pred years Medium Medium stages/run_hindcast.py:135 calls prod_panel = union_fit_pred_for_production_ranking(fit_data, pred_data) followed by select_by_production(prod_panel, ..., max_year=max(config.experiment_protocol.test_years)) at :139-145. Helper defined at lib/geo/selection.py:10. The bug shape: the union sweeps in the unpopulated pred year, and max_year references a year with all-NaN production, biasing the top-95% production ranking by ~13 counties. A2/A3's claim that "primary worktree at src/main.py:87-92 already passes fit_data only" does NOT hold in the live tree on tl/bra-soy-update; the live tree still passes the union. Filter all-NaN-production years before ranking inside union_fit_pred_for_production_ranking, or change the call site to pass fit_data only and explicitly set max_year=int(fit["year"].max()); add a unit test on the helper that constructs a union frame with a trailing all-NaN year and asserts the ranking is unchanged versus a fit-only frame. [PLACEHOLDER: owner needed] stages/run_hindcast.py:135, :139-145, lib/geo/selection.py:10, sessions A2, A3
22 Multi-worktree drift on shared files (process risk) Medium Medium Sibling worktrees still exist on disk: treefera-market-insights-commodity-hindcast/, treefera-market-insights-commodity-hindcast-minim-impl-model-update/, treefera-market-insights-corn-yield-productionisation-v2/, treefera-market-insights-mergediag/, treefera-market-insights-forecast-wt/, treefera-market-insights.wt-validation-reports/. The drift pattern persists in principle: any fix landed in one worktree but not the others silently keeps the bug. A2 also documented a concurrent "TREND_AXIS refactor" session editing the same files mid-orchestration; the TrendAxis machinery now lives at models/detrend/time_axis.py:12, so the axis refactor did land — but cross-worktree consistency is unverified. Consolidate to a single worktree; if multiple worktrees are required, document them and add a CI check that diffs critical files (config.py, models/detrend/partial_pooling_detrend.py, stages/run_hindcast.py, stages/run_fit.py) across worktrees and fails the build on divergence outside an explicit allowlist. [PLACEHOLDER: owner needed] parallel copies of stages/run_*.py, models/detrend/*.py, config.py across worktrees, sessions A2, A3
23 Test coverage gap at tier-1 ADR surfaces (walk-forward driver, conformal modes) Medium High The test suite at tests/unit/commodity_hindcast/ (83 .py files) and tests/integration/commodity_hindcast/ exists and is healthy in aggregate, but two tier-1 surfaces are unexercised: (1) Walk-forward driver (ADR-001) — run_walk_forward (run/runner.py:27) and _predict_fold_rolling (run/runner.py:86) have zero direct test coverage (grep -rn "run_walk_forward\|_predict_fold_rolling" tests/ returns no matches). These are the rolling-fold entry points the hindcast pipeline drives through, so a regression here would only surface in end-to-end runs. (2) Conformal residual modes (ADR-002) — only 2 of the 4 supported modes appear in any test; out_of_sample_per_year and hindcast_oos_pooled are completely uncovered. ADR-003 (validate_residual_mode) IS covered by tests/integration/commodity_hindcast/test_forecast_residual_mode_validation.py. Add a unit test for run_walk_forward over a small synthetic panel asserting fold-by-fold prediction shape and that _predict_fold_rolling advances training years monotonically. Parametrise the existing conformal test over all four ResidualMode values so out_of_sample_per_year and hindcast_oos_pooled are exercised. Cross-link to R15 (EB shrinkage path needs its own regression test) and R16 (delivery-clip helper needs a test). [PLACEHOLDER: owner needed] run/runner.py:27, :86, tests/unit/commodity_hindcast/test_apply_conformal_experiment.py, tests/unit/commodity_hindcast/test_postprocess.py, ADR-001/002/003 cross-check, verification 2026-05-08
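Sketch for R3 (S3 path anchoring). A minimal, stdlib-only illustration of two of the mitigation rules: use str(path) rather than .as_posix(), and route local-only sinks to a temp directory keyed by run id. The scheme check on str(path) is an assumed stand-in for the isinstance(x, CloudPath) branch the register recommends, and the names is_remote, local_sink_dir, sqlite_url and tracking.db are hypothetical.

```python
import tempfile
from pathlib import Path


def is_remote(path: object) -> bool:
    """True for URI-style anchors such as s3://bucket/prefix.

    Assumption: a scheme check on str(path) stands in for the
    isinstance(path, CloudPath) branch recommended in the register.
    """
    return "://" in str(path)


def local_sink_dir(data_root: object, run_id: str) -> Path:
    """Directory for local-only sinks (sqlite files, lock files)."""
    if is_remote(data_root):
        # Remote anchor: sqlite and file locks need a real local filesystem.
        return Path(tempfile.gettempdir()) / run_id
    return Path(str(data_root)) / run_id


def sqlite_url(data_root: object, run_id: str, name: str = "tracking.db") -> str:
    """Build a sqlite URL that never embeds a remote URI."""
    db_path = local_sink_dir(data_root, run_id) / name
    return f"sqlite:///{db_path}"
```

A parametrised test over both a local path and an s3:// string, as the mitigation asks for, would assert that the returned URL never contains "s3://".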
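Sketch for R10 (per-fold checkpoint). One possible shape for the completion sentinels under run_dir/, assuming one JSON marker per fold year. The file naming, run_folds and the fit_fold callback are hypothetical; the schema validation of existing artefacts that the mitigation calls for is only noted in a comment.

```python
import json
from pathlib import Path
from typing import Callable, Iterable


def sentinel_path(run_dir: Path, fold_year: int) -> Path:
    """Hypothetical naming scheme: one marker file per fold year."""
    return run_dir / f"fold_{fold_year}.done.json"


def run_folds(
    run_dir: Path,
    fold_years: Iterable[int],
    fit_fold: Callable[[int], dict],
) -> list[int]:
    """Run each fold at most once; return the years computed in this call."""
    run_dir.mkdir(parents=True, exist_ok=True)
    computed: list[int] = []
    for year in fold_years:
        marker = sentinel_path(run_dir, year)
        if marker.exists():
            # A fuller version would also validate the fold's artefacts here.
            continue
        summary = fit_fold(year)
        # Write to a temp name, then rename, so an interrupted write never
        # leaves a marker that looks complete.
        tmp = marker.with_suffix(".tmp")
        tmp.write_text(json.dumps(summary))
        tmp.replace(marker)
        computed.append(year)
    return computed
```

Calling run_folds a second time with an extended list of years only computes the folds that have no marker yet.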
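Sketch for R15 (tau2 floor). A numeric illustration of why the floor only matters on degenerate panels. It assumes the textbook empirical-Bayes weight λ = τ² / (τ² + SE²), which matches the "λ ≈ 1e-6/SE²" remark quoted from the source comment when SE² is much larger than τ²; the production formula may differ in detail, and eb_lambda is a hypothetical name.

```python
def eb_lambda(tau2_raw: float, se2: float, floor: float) -> float:
    """Shrinkage weight on a county's own slope: tau2 / (tau2 + se2).

    Assumption: textbook empirical-Bayes form, not copied from
    partial_pooling_detrend.py.
    """
    tau2 = max(tau2_raw, floor)
    return tau2 / (tau2 + se2)


SE2 = 4.0  # illustrative per-county slope variance, not a measured value

# Degenerate panel: the estimated between-county variance is zero.
lam_tmi = eb_lambda(0.0, SE2, floor=1e-30)   # effectively 0
lam_qube = eb_lambda(0.0, SE2, floor=1e-6)   # about 1e-6 / SE2

# Production-like panel (tau2 of about 0.44): the floor is inactive.
lam_prod_tmi = eb_lambda(0.44, SE2, floor=1e-30)
lam_prod_qube = eb_lambda(0.44, SE2, floor=1e-6)
```

The two floors give identical weights on the production-like panel and differ by roughly 24 orders of magnitude on the degenerate one, which is the regime the proposed regression test should pin.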
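Sketch for R18 (leap-year off-by-one). The arithmetic behind the 92-vs-93 discrepancy can be reproduced with the standard library alone. legacy_slots mirrors the described pre-fix slice (dayofyear=start..366); both function names are hypothetical.

```python
import calendar
from datetime import date


def legacy_slots(start_doy: int) -> int:
    """Pre-fix behaviour as described: always slice dayofyear=start..366."""
    return 366 - start_doy + 1


def real_days(year: int, start_doy: int) -> int:
    """Calendar days from start_doy to 31 December of that year."""
    days_in_year = 366 if calendar.isleap(year) else 365
    return days_in_year - start_doy + 1


oct1_2022 = date(2022, 10, 1).timetuple().tm_yday  # non-leap year
oct1_2024 = date(2024, 10, 1).timetuple().tm_yday  # leap year

extra_2022 = legacy_slots(oct1_2022) - real_days(2022, oct1_2022)
extra_2024 = legacy_slots(oct1_2024) - real_days(2024, oct1_2024)
```

For 2022 the legacy slice is one slot too long (93 positions for 92 days); for 2024 the two agree. A property-based test would generalise this comparison over arbitrary years and start days.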
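Sketch for R21 (unpopulated pred years in the production ranking). The first mitigation option, filtering all-NaN-production years before ranking, is shown here on plain dict rows rather than on the real data frames. The row layout and both function names are assumptions made for illustration; the actual helper is union_fit_pred_for_production_ranking.

```python
import math


def populated_years(rows: list[dict]) -> set[int]:
    """Years that have at least one non-NaN production value."""
    return {r["year"] for r in rows if not math.isnan(r["production"])}


def drop_unpopulated_years(rows: list[dict]) -> list[dict]:
    """Remove every row belonging to a year with all-NaN production."""
    keep = populated_years(rows)
    return [r for r in rows if r["year"] in keep]


fit_rows = [
    {"geo": "A", "year": 2022, "production": 10.0},
    {"geo": "B", "year": 2022, "production": 30.0},
    {"geo": "A", "year": 2023, "production": 12.0},
    {"geo": "B", "year": 2023, "production": float("nan")},
]
pred_rows = [
    {"geo": "A", "year": 2024, "production": float("nan")},
    {"geo": "B", "year": 2024, "production": float("nan")},
]

union = fit_rows + pred_rows
cleaned = drop_unpopulated_years(union)
max_year = max(populated_years(union))
```

After filtering, the trailing pred year is gone and max_year falls back to the last year with real production, so the ranking input equals the fit-only frame. A partially populated year (2023 here) is kept intact.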

Risks deferred

  • Code-style feedback items (feedback_fstrings.md, feedback_no_backwards_compat.md, feedback_no_claude_attribution.md) — these are review-time conventions, not production risks; covered in the contributor guide.
  • feedback_qa_leave_conab_columns.md — agent-behaviour guidance for QA reports, not a pipeline risk.
  • DESIGN.md "TODO: need to define the Delivery module job" (DESIGN.md:110) — editorial gap rather than runtime risk; rolled into wiki backlog (Risk 13).
  • Custom exception hierarchy / marketing_year collapse / forecast.md See Also (DOMAIN_MODEL2.md §9 Open Questions) — design open questions with no current incident pressure.
  • delivery/conversions.py aliases obs-yield to nass_actual regardless of geography (PR-360 follow-up) — values correct, label cosmetic; tracked there.
  • WASDE/commodity_ prefix path drift fixed by PR-361 — historical, no longer a live risk.

Fixed in flight

These items were diagnosed AND fixed within their session; the underlying files have since been restructured but the design intent survives. Citations re-anchored where possible. Kept here for institutional memory; CLOSED, not open risks.

  • gen_report.py silent unit double-labelling (A4 NEW-1) — convert_metrics_to_bu_acre re-units 7+ metric columns including selection_bias_kg_ha, mae, rmse while leaving the _kg_ha suffix in place. Fixed in commit 6f7132cf (now appends _bu_ac). The file gen_report.py no longer exists anywhere in the tree; DESIGN.md:117 still references gen_report.py:convert_metrics_to_bu_acre as the canonical converter, but the implementation has moved. [PLACEHOLDER: locate the current renamer in the post-restructure layout.] Residual transition risk only — any consumer joining metrics_table.csv by old column names will silently miss columns post-rename.
  • included_geos defaulting to single test-fold (A4 NEW-2) — Pre-fix eval.py:173 built included_geos from test_data["geo_identifier"] (one fold's split). Fixed by 6f7132cf (derive from fit_data_full) and reinforced by 2b5545fa (required kwarg, no fallback). The file eval.py no longer exists; the contract survives at DESIGN.md:114 ("required keyword argument", "no default", "no fallback"). Runtime enforcement path needs re-verification post-restructure to confirm the kwarg is still threaded through evaluate_model → gen_metrics → estimate_walk_forward_selection_bias_kg_ha → compute_selection_bias_for_year_kg_ha.
  • DESIGN.md unit-discipline contract (A4 NEW-3) — kg/ha as canonical internal unit; included_geo_identifiers as the single required parameter name. Verified at DESIGN.md:114-117.
  • PcaRidgeRegressor national-mode fillna(0) made explicit (A2 Risk 8) — class lives at models/regression/pca_ridge_regressor.py:65. [PLACEHOLDER: re-anchor "fillna(0) made explicit" comment to specific line.]
  • Imputer re-export shim removed and extract_sample_weight inlined (A5 finding 12) — imputation utilities live at lib/edit_and_imputation/imputation.py; partition_groups_by_valid_obs at :146 is consumed directly by models/detrend/partial_pooling_detrend.py:25. Re-export shim absence is consistent with the "no backwards compatibility patterns" rule.
  • TMI PartialPoolingDetrend and PcaRidgeRegressor proven QUBE-equivalent (A5 finding 7) — historical equivalence claim. No current code change required.
  • aggregate_weighted_frame() ↔ QUBE _aggregate_national() cross-test bit-for-bit identical (A5 finding 8) — historical, settled.
  • QUBE stale feature cache identified (A5 finding 9) — TMI is correct; QUBE has the data bug. Documented as a known TMI-vs-QUBE divergence.
  • QUBE MultiStageEstimator.predict() silent county drop (A5 finding 10) — TMI is arguably more correct; documented divergence.
  • Wheat dim-order crash (A6 N1) — pre-fix ds[var_name].values[:, mask] assumed (geoid, time) but conus_adm2_wheat.zarr is (time, geoid). Fix verified live at features/builders/weather.py:75-76: var_da = var_da.transpose(geo_id_col, time_dim); raw = var_da.values[:, mask]. Transpose-to-canonical in place; landed in c7041b68.
  • QUBE wheat climo silently wrong (A6 N3) — gstd.sdoy_start = 91 was re-interpreted by QUBE as calendar DOY 91 = April 1, missing the autumn vegetative phase entirely (74% null rate, std=7 z-scores, 94 bu/ac county mean). Fixed by unification in c7041b68 (build_climo over season-DOY for all commodities); the redundant weather_builder config field was deleted. Wheat now uses the same season-DOY path as corn and soy.
  • Redundant july_*_county features (A6 N10) — july_edd_county / july_precip_county removed. Live configs use edd_jul/precip_jul only.
  • Imputer audit confirmed no yield imputation exists (A5 finding 5) — partial_pooling_detrend.py:233, 268 use partition_groups_by_valid_obs (row filter, not impute). The Imputer plumbing fills only the trend line; all regressors enforce nan_policy: raise (verified at models/regression/pca_ridge_regressor.py:79, 97, 114, 138). [PLACEHOLDER: _assert_no_raw_yield_in_features claimed as guardrail in runtime.py could not be located via grep; the guard may live elsewhere, have been renamed, or been removed entirely.]
  • edd_zscore_apr_jul zero-fill discussion (A2 Risk 2) — historical design discussion.
  • XGBoost native-NaN handling routed through median imputer (A2 Risk 10) — historical design trade-off.
  • regression_params: dict[str, Any] typed-schema gap (A2 Risk 9) — partial protection only; full mitigation out of scope.
  • Pre-commit hook silently rewrote files (A2 Risk 7) — process lesson.
  • Stop-hook E902 ruff cwd issue (A6 N11) — ~/.claude/hooks/lint-check.py runs ruff from the session cwd, not from the git toplevel. Patch sketched in A6 but not applied as of the verification round. Developer-experience drag, not a code regression.
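Sketch for the stop-hook cwd item (A6 N11). This is not the patch sketched in A6, which is not reproduced in this register. It is one assumed way a hook could resolve the repository root before invoking ruff, instead of relying on the session cwd. find_repo_root is a hypothetical name.

```python
from pathlib import Path


def find_repo_root(start: Path) -> Path:
    """Walk upwards from start until a .git entry is found.

    .exists() rather than .is_dir() matters here: in a linked worktree
    .git is a file, not a directory.
    """
    current = start.resolve()
    for candidate in (current, *current.parents):
        if (candidate / ".git").exists():
            return candidate
    raise FileNotFoundError(f"no .git entry found at or above {start}")
```

The hook could then pass the result as the cwd argument of subprocess.run when launching ruff. Running git rev-parse --show-toplevel is the git-native way to get the same answer.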

Heatmap

Risk severity × likelihood heatmap

The current PNG renders only R1..R14 from the original register; rows R15..R23 are not yet plotted. [NOTE: heatmap re-render required after this merge to include R15..R23.]
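Until the PNG is re-rendered, a text tally of all 23 rows can serve as a stopgap. The sketch below transcribes the Severity and Likelihood columns from the Risks table and counts risks per cell; it does not reproduce the plotting code behind the existing image, and tally and render are hypothetical names.

```python
from collections import Counter

SEVERITIES = ["Critical", "High", "Medium", "Low"]
LIKELIHOODS = ["Low", "Medium", "High"]

# (severity, likelihood) per risk number, transcribed from the Risks table.
REGISTER = {
    1: ("Critical", "High"), 2: ("High", "High"), 3: ("High", "High"),
    4: ("High", "Medium"), 5: ("High", "Low"), 6: ("High", "Medium"),
    7: ("High", "Medium"), 8: ("Medium", "Medium"), 9: ("Medium", "Medium"),
    10: ("Medium", "Low"), 11: ("Medium", "Medium"), 12: ("Medium", "High"),
    13: ("Medium", "High"), 14: ("Low", "Low"), 15: ("High", "Medium"),
    16: ("Medium", "Medium"), 17: ("High", "Low"), 18: ("Medium", "Low"),
    19: ("High", "Medium"), 20: ("Medium", "Low"), 21: ("Medium", "Medium"),
    22: ("Medium", "Medium"), 23: ("Medium", "High"),
}


def tally(register: dict) -> Counter:
    """Count risks per (severity, likelihood) cell."""
    return Counter(register.values())


def render(counts: Counter) -> str:
    """Plain-text grid: severities as rows, likelihoods as columns."""
    lines = [f"{'':10}" + "".join(f"{lik:>8}" for lik in LIKELIHOODS)]
    for sev in SEVERITIES:
        cells = "".join(f"{counts.get((sev, lik), 0):>8}" for lik in LIKELIHOODS)
        lines.append(f"{sev:10}" + cells)
    return "\n".join(lines)


print(render(tally(REGISTER)))
```

The densest cells are Medium × Medium (6 risks) and High × Medium (5 risks); R1 is the only Critical entry.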