Lint Report — P7

Summary

- Total pages: 109 (including domain_model/ and wiki/ trees)
- Backlog items applied: 17
- Backlog items punted: 1 (item 7, verified no action needed)
- New broken links found: 84 (across 47 files)
- Orphans found: 2 (new index files, expected)
- Contradictions found: 3
- Missing concept pages identified: 6 high-value gaps

Backlog Applied

| # | Item | Status |
|---|------|--------|
| 1 | EditOperation sub-table in ENTITIES.md | Applied: added 6-row table listing Clip, Flag, Drop, Fail, DeductiveImpute, PanelTrailingMedian with edit.py line refs |
| 2 | AGGREGATES.md CalibrationResult "transient" claim | Applied: replaced with correct description noting save (conformalise.py:215) and load (conformalise.py:225) with per-mode parquet sidecars |
| 3 | EvaluationConfig existence check | Applied: removed from AGGREGATES.md children list; noted it exists in DOMAIN_MODEL2.md and DESIGN.md as conceptual but has no class in config.py |
| 4 | AssembleStressConfig in AGGREGATES ExperimentConfig children | Applied: added to children list with config.py:203 citation |
| 5 | HINDCAST_DELIVERY production edge in ER Diagram 2 | Applied: added HINDCAST_SLICE \|\|--\|\| HINDCAST_DELIVERY : "assembled into" and FORECAST_SLICE \|\|--o\| HINDCAST_DELIVERY : "may produce" |
| 6 | meta_models.md types.py:14 → types.py:16 | Applied |
| 7 | Normalise bare sources: fields to sources: [] | Verified no-op: all 30 flagged files have valid YAML list values on the following lines; no truly bare fields exist |
| 8 | fit_production note in stages.md | Applied: added paragraph after the public-surface listing |
| 9 | timeline.md commit count 87→90, 111→116 | Applied: updated table and summary paragraph; added miguelboland-treefera: 1 row |
| 10 | DESIGN.md wiki EARS clause count 35→54 | Applied: updated description from "35 named clauses" to "54 clause-bullets" with explanation that the wiki index covers 35 of them |
| 11 | Write commits/index.md | Applied: new file with frontmatter and link to timeline.md |
| 12 | Write code/index.md | Applied: new file listing all 11 subsystem pages with descriptions |
| 13 | corn_usa.md dead cross-refs | Verified: entities/ExperimentConfig.md and entities/Commodity.md both exist; links resolve correctly |
| 14 | postprocess.md clause number | Applied: updated to cite "wiki Clause 34; actual DESIGN.md:73" |
| 15 | Four missing cross-ref targets | Applied: wrote entities/SeasonWindow.md (~80 lines), concepts/conformal_calibration.md (redirect to conformal_modes.md), concepts/bias_correction.md (~80 lines), concepts/resolvable_path.md (~80 lines) |
| 16 | Entity cross-refs in adm_levels.md and multi_year_forecast.md | Applied: added See Also sections to both pages |
| 17 | forecast.md See Also | Applied: added ForecastSlice.md link |
| 18 | postprocess output path drift in LINT_REPORT | Applied: documented in Architectural Drift Items section below |

Additionally, corrected conformalise.py:224 → :225 in meta_models.md and entities/CalibrationResult.md (off-by-one found during item 2 verification).
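The item 7 check (telling a truly bare `sources:` key apart from one whose YAML list continues on the following lines) can be sketched roughly as below. This is a line-based approximation rather than the tool actually used for the lint pass; the function name `has_bare_sources` is hypothetical, and it assumes the frontmatter is passed in as plain text.

```python
def has_bare_sources(frontmatter: str) -> bool:
    """Return True if a `sources:` key has neither an inline value nor a block list."""
    lines = frontmatter.splitlines()
    for i, line in enumerate(lines):
        if line.strip() != "sources:":
            continue  # different key, or an inline value such as `sources: []`
        # Inspect the next non-blank line: a "- item" entry means a block list follows.
        for following in lines[i + 1:]:
            if following.strip():
                return not following.lstrip().startswith("- ")
        return True  # `sources:` was the last non-blank line
    return False


# A key followed by list items is valid YAML, so it is not reported as bare.
print(has_bare_sources("title: Fold\nsources:\n  - edit.py\n  - config.py\n"))
# A key followed directly by another key is truly bare.
print(has_bare_sources("sources:\ntitle: Fold\n"))
```

A full YAML parser would be more robust; the sketch only illustrates why the 30 flagged files were false positives.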

New Issues Found by Standard Scan

84 broken relative links across 47 files. The dominant pattern is links to pages that were planned in the entity/concept/pipeline framework but not yet written. Grouped by category:

Planned pipeline pages not yet written (19 entity pages link to these):
- pipelines/hindcast.md: referenced by 19 entity pages (ExperimentConfig, Fold, HindcastSlice, CalibrationResult, ReferenceYieldSpec, Yield, BiasCorrectorConfig, RunDir, ExperimentProtocolConfig, Commodity, InitDate, SeasonYear, Region, PostprocessConfig, ExperimentResult)
- pipelines/pipeline_features.md: 2 references (FeatureBuilder.md, features.md source page)
- pipelines/delivery.md: 1 reference (DeliveryConfig.md)
- pipelines/features.md: 1 reference (EditRuleConfig.md)

Concept pages not yet written:
- concepts/detrending.md: referenced by ModelConfig.md
- concepts/abstract_slice.md: referenced by HindcastSlice.md, ForecastSlice.md
- concepts/stage_isolation.md: referenced by ExperimentResult.md, RunDir.md
- concepts/season_doy.md: referenced by CommodityConfig.md
- concepts/unit_conversion.md: referenced by Yield.md, Commodity.md
- concepts/conformal_prediction.md: referenced by DeliveryConfig.md
- concepts/delivery_transforms.md: referenced by DeliveryConfig.md, PR-331.md
- concepts/geo_identifier_format.md: referenced by Region.md
- concepts/spatial_aggregation.md: referenced by Region.md
- concepts/forecast_path_layout.md: referenced by ForecastSlice.md, PR-369.md
- concepts/long_range_climo_stub.md: referenced by ForecastSlice.md, PR-369.md
- concepts/fellegi_holt.md: referenced by EditRuleConfig.md
- concepts/centralised_climo.md: referenced by ForecastConfig.md
- concepts/experiment_protocol.md: referenced by stages.md
- concepts/delivery.md: referenced by stages.md
- concepts/pipeline_dag.md: referenced by PR-339.md, PR-353.md
- concepts/import_cycle.md: referenced by PR-339.md
- concepts/bounded_contexts.md: referenced by PR-353.md
- concepts/unit_conventions.md: referenced by PR-331.md, PR-360.md, PR-363.md
- concepts/reference_data_union.md: referenced by PR-360.md
- concepts/dashboard_truth_sources.md: referenced by PR-340.md
- concepts/panel_imputer.md: referenced by PR-369.md
- concepts/reference_data.md: referenced by plots.md
- concepts/delivery_schema.md: referenced by plots.md

Entity pages not yet written:
- entities/EditRule.md: referenced by PR-353.md (separate from EditRuleConfig.md)
- entities/WasdeLoader.md: referenced by PR-363.md

PR cross-ref pattern (source pages link to wrong paths):
- PR-345.md links to ../../code/cli.md and ../../code/run_predict.md; these should be ../code/orchestration.md and ../code/stages.md
- PR-361.md, PR-372.md link to ../../code/meta_models.md and ../../code/run_forecast.md; these should be ../code/meta_models.md and ../code/stages.md
- PR-369.md links to ../../code/results_slice.md and ../../code/run_forecast.md
- PR-345.md links to ../../../../memory/MEMORY.md; this external path is invalid in the wiki context

Source page path errors:
- sources/code/detrend.md links to runtime.md (non-existent sibling)
- sources/code/dashboard.md links to ../../orchestration.md and ../../regression.md (wrong paths; should be orchestration.md and regression.md as siblings)
- sources/code/delivery.md links to ../../../commodity_hindcast_kb/domain_model/BOUNDED_CONTEXTS.md (traverses above wiki root; incorrect path)
- sources/code/features.md links to configs.md (non-existent sibling)
- sources/code/regression.md links to three source-code directories (not wiki pages)

AGENTS.md example links:
- AGENTS.md contains ../entities/Commodity.md (example link in schema documentation) and relative/path.md (template placeholder). Both are illustrative examples in the schema definition, not actual cross-references, and can be ignored.
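For readers who want to reproduce a scan of this kind, the following is a minimal sketch of a relative-link checker. It is not the scanner that produced the counts above: the function name `broken_relative_links`, the regex, and the rule that a link escaping the wiki root counts as broken are all assumptions made for illustration.

```python
import re
from pathlib import Path

# Matches inline Markdown links such as [text](../code/stages.md).
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def broken_relative_links(wiki_root: Path) -> dict[str, list[str]]:
    """Map each page (path relative to wiki_root) to its unresolved relative links."""
    root = wiki_root.resolve()
    broken: dict[str, list[str]] = {}
    for page in sorted(root.rglob("*.md")):
        for target in LINK_RE.findall(page.read_text(encoding="utf-8")):
            if "://" in target or target.startswith(("#", "mailto:")):
                continue  # external URL or in-page anchor, not a relative file link
            path_part = target.split("#", 1)[0]  # drop any trailing #fragment
            resolved = (page.parent / path_part).resolve()
            # Assumed rule: broken if the file is missing or lies outside the wiki root.
            if not resolved.is_file() or not resolved.is_relative_to(root):
                key = page.relative_to(root).as_posix()
                broken.setdefault(key, []).append(target)
    return broken
```

Calling `broken_relative_links(Path("wiki"))` (the directory name is a placeholder) returns a dictionary keyed by page, which can then be grouped by target directory in the way the categories above are. A checker like this would also flag the illustrative links in AGENTS.md, so such files need an explicit allow-list.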

Orphans

Only the two new index pages (sources/code/index.md, sources/commits/index.md) have zero inbound links, which is expected since the main index.md has not yet been updated to reference them. Recommend adding links from index.md.
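Orphan detection reduces to finding pages with no inbound edge in the link graph. The sketch below assumes the graph has already been built as a mapping from each page to the set of pages it links to; the function name `find_orphans`, the sample graph, and the choice to exempt the top-level index.md as an entry point are illustrative assumptions, not a description of the actual lint tool.

```python
def find_orphans(
    links: dict[str, set[str]],
    entry_points: frozenset[str] = frozenset({"index.md"}),
) -> list[str]:
    """Return pages that no other page links to, excluding known entry points."""
    # Collect every page that is the target of a link from a different page.
    linked = {
        target
        for source, targets in links.items()
        for target in targets
        if target != source  # a self-link does not count as an inbound link
    }
    return sorted(p for p in links if p not in linked and p not in entry_points)


# Hypothetical miniature link graph mirroring the situation described above.
example_links = {
    "index.md": {"entities/Fold.md"},
    "entities/Fold.md": {"index.md"},
    "sources/code/index.md": {"sources/code/stages.md"},
    "sources/code/stages.md": {"entities/Fold.md"},
    "sources/commits/index.md": set(),
}
print(find_orphans(example_links))
```

Once index.md links to the two new index pages, the same call would return an empty list.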

Contradictions

  1. conformalise.py load line number: meta_models.md and CalibrationResult.md both previously cited :224; the actual line is :225. Fixed in this lint pass.

  2. AGGREGATES.md vs ENTITIES.md on CalibrationResult — AGGREGATES.md (before fix) described CalibrationResult as "transient"; ENTITIES.md Tier 3 described it as having save/load. Now aligned.

  3. DESIGN.md clause count — wiki DESIGN.md page said "35 named clauses"; actual DESIGN.md has 54 EARS-style bullet-clauses. Fixed to "54 clause-bullets" with note about the wiki's numbered index covering 35 of them.

Suggested Missing Pages

The following terms appear in two or more wiki files but have no dedicated page (or, where a page exists, are flagged as already covered):

| Term | Occurrences (files) | Suggested page |
|------|---------------------|----------------|
| geo_identifier / GeoIdentifier | 42 | concepts/geo_identifier_format.md (distinct from adm_levels.md, which is higher-level) |
| season_doy | 25 | concepts/season_doy.md: the season_doy vs calendar_doy distinction and cross-year crop handling |
| pipelines/hindcast.md | 19 | High priority: the hindcast pipeline page is the single most-linked missing page; 19 entity pages reference it |
| walk_forward_cv | 18 | concepts/walk_forward_cv.md exists; 18 files reference it, so it is already well covered |
| experiment_protocol | 17 | concepts/experiment_protocol.md: ExpandingFoldGenerator, fold schedule, test-years logic |
| unit_conversion | 2 | concepts/unit_conversion.md: kg/ha ↔ bu/acre boundaries |

Architectural Drift Items (Out of Scope for Lint, Escalated)

  • DESIGN.md artefact contract vs code postprocess output path — DESIGN.md (DESIGN.md:73) specifies postprocessed/{experiment_key}_national.parquet but the code writes postprocessed/national.parquet (without the {experiment_key} infix). The run_dir path already encodes the experiment key via the timestamped directory name. The wiki pipelines/postprocess.md already documents this discrepancy. Maintainers should either: (a) update DESIGN.md:73 to say postprocessed/national.parquet, or (b) update the code to include the experiment key. The current state means DESIGN.md is technically incorrect as a code contract.

  • _OOS_MODES frozenset hand-maintained (run_forecast.py:82) — the set of OOS residual mode strings used by validate_residual_mode is a hand-maintained frozenset rather than being derived from the ResidualMode Literal in types.py. Any addition to ResidualMode that is not added to _OOS_MODES will silently allow a non-OOS mode through the validation gate for run_dirs that lack a conformal sidecar. Drift risk: medium. Fix: replace frozenset with set(get_args(ResidualMode)) - {"in_sample_pooled"}.

  • Two upward-import layering violations in delivery/ — existing tech debt; tracked in BOUNDED_CONTEXTS.md. Two modules in delivery/ import from stages/ layer rather than from the shared library layer, breaking the single-direction import DAG (DESIGN.md Clause 19).

  • entities/EditRule.md vs entities/EditRuleConfig.md — PR-353.md references EditRule.md while the actual entity page is EditRuleConfig.md. These may be two different intended pages (the rule config vs the rule operation), or a naming inconsistency. Needs clarification.

Recommended Actions

  • Write pipelines/hindcast.md: 19 entity pages have broken links until this exists, making it the highest-priority missing page. It should cover the full run_hindcast.run() orchestration, walk-forward loop, and production fit.
  • Fix PR source page cross-refs: PR-345.md, PR-361.md, PR-372.md, and PR-369.md all use incorrect relative paths to code source pages.
  • Write concepts/experiment_protocol.md: ExpandingFoldGenerator, fold-label semantics, test_years ordering; referenced by 17 files.
  • Write concepts/season_doy.md: season-DOY vs calendar-DOY distinction; cross-year crop handling for wheat; referenced by 25 files.
  • Update wiki/index.md to add entries for the new pages created in this pass (SeasonWindow, conformal_calibration, bias_correction, resolvable_path, code/index.md, commits/index.md).
  • Fix DESIGN.md:73 drift — align the artefact contract clause with what the code actually writes (postprocessed/national.parquet).