Artifact A: helios-provenance-spec v0.1 Review Pack
Agent: A (background, dispatched 2026-05-17)
Branch: feat/v0.1-rfc (local; not pushed)
Local tag: v0.1.0 (annotated; not pushed)
Commits: 3 on top of scaffolding (schema → tests → docs/RFC)
TL;DR
Substantial and clean. 98 tests passing at 98% coverage on src/helios_provenance/. ruff check, ruff format --check, and mypy --strict all green. The centerpiece worked example (a HeliosFusedOutputRecord with full 3-step lineage tracing back through conformal wrapping → BMA averaging → isotonic calibration to three Scoreboard A inputs) is implemented at schema/examples/11-fused-sep-all-clear.json and round-trips through tamper-detecting hashing.
Recommend merging after reviewing the 4 open questions below.
What landed (file-by-file highlights)
Schema (schema/)
- `helios-provenance-v0.1.json`: JSON Schema 2020-12 with 4 record types under a `oneOf` discriminator. Uses `unevaluatedProperties: false` per branch (cleaner than `additionalProperties` for `allOf` composition).
- `examples/01..11-*.json`: 11 valid records: DONKI flare, Scoreboard A (UMASEP-10), SWPC Kp, CDDIS GIM TEC, GOES protons, DSCOVR solar wind, isotonic/BMA/conformal transformations, and the fused output.
- `crosswalks/{spase,prov,ro-crate}.md`: field-by-field mappings.
Python reference impl (src/helios_provenance/)
- `__init__.py`: version 0.1.0; full public API exported.
- `models.py`: pydantic v2 with `extra="forbid"`; `parse_record()`, `to_jsonld()`, `HeliosFusedOutputRecord.build_with_hash()`, `verify_hash()`.
- `hashing.py`: RFC 8785 JCS canonical-JSON hashing with documented fallback, null-stripping normalization, rejection of non-finite floats. Hash payload covers `lineage + prediction_target + timestamp + value + value_units + schema_version`. Does NOT cover `conformal_interval`, `location`, `agent`; see open question #3 (RFC §6 Q8).
- `validator.py`: `HeliosProvenanceValidator` class + `helios-provenance-validate` CLI (supports stdin via `-`).
- `crosswalk.py`: `dataset_to_spase_xml()` + `records_to_prov_json()`.
- `_schema/helios-provenance-v0.1.json`: schema shipped inside the wheel.
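The hashing behaviour described for `hashing.py` can be illustrated with a stdlib-only sketch. This is not the library's implementation: `json.dumps` with sorted keys stands in for RFC 8785 canonicalization (the two differ in number formatting and in key-ordering edge cases), SHA-256 is an assumed digest, and `HASHED_FIELDS`, `normalize`, `payload_hash`, and the sample record values are hypothetical names and data chosen for illustration.

```python
import hashlib
import json
import math
from typing import Any

# Fields the review pack lists as covered by the hash payload.
HASHED_FIELDS = (
    "lineage", "prediction_target", "timestamp",
    "value", "value_units", "schema_version",
)


def normalize(obj: Any) -> Any:
    """Drop dict keys whose value is None and reject non-finite floats."""
    if isinstance(obj, float) and not math.isfinite(obj):
        raise ValueError("non-finite float cannot be hashed")
    if isinstance(obj, dict):
        return {k: normalize(v) for k, v in obj.items() if v is not None}
    if isinstance(obj, list):
        return [normalize(v) for v in obj]
    return obj


def payload_hash(record: dict) -> str:
    """Hash only the covered fields (assumption: SHA-256 over sorted-key JSON)."""
    payload = {k: record[k] for k in HASHED_FIELDS if k in record}
    canonical = json.dumps(
        normalize(payload),
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
        allow_nan=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Illustrative record; every value here is made up.
record = {
    "schema_version": "0.1",
    "timestamp": "2026-05-17T00:00:00Z",
    "prediction_target": "sep_all_clear",
    "value": 0.92,
    "value_units": "probability",
    "lineage": [{"step": "isotonic"}, {"step": "bma"}, {"step": "conformal"}],
    "conformal_interval": [0.85, 0.97],  # outside the hash payload
    "agent": None,                       # outside the hash payload
}
digest = payload_hash(record)
```

Under these assumptions, editing `conformal_interval` leaves `digest` unchanged, while editing `value` or any lineage step changes it, which is the tamper-evidence boundary the hash payload draws.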
Tests (tests/)
`conftest.py`, `test_smoke.py`, `test_models.py`, `test_hashing.py`, `test_validator.py`, `test_crosswalk.py`: 98 tests total, 0.91s runtime.
RFC + docs
- `rfc/RFC-0001-feature-lineage.md`: ~2000 words. Section 6 has 8 explicit open questions flagged for community comment (see below).
- `docs/{index,schema,examples,api}.md` + mirrored crosswalks/RFC.
- `mkdocs.yml` updated.
Release-prep additions
- `CHANGELOG.md` (new): v0.1.0 entry.
- `README.md` refreshed.
- `CITATION.cff` bumped to 0.1.0.
- `pyproject.toml`: added `jsonschema`, `rfc8785`, `rfc3339-validator`, `rfc3987` runtime deps; added `helios-provenance-validate` script entry point.
Open questions: your call
1. Push the `v0.1.0` tag now?
   - Plan says no (operator gates releases). Agent followed plan. The tag is annotated and on `feat/v0.1-rfc` locally.
   - Recommend: review the diff, run `pytest --cov` yourself, then `git checkout main && git merge --no-ff feat/v0.1-rfc && git push origin main && git push origin v0.1.0`.
2. Open Issue #1 ("RFC-0001: feature-level provenance for heliophysics fusion systems") on GitHub.
   - Agent deferred this until the branch is on main. Reasonable: the issue body cites `rfc/RFC-0001-feature-lineage.md` and the link needs to resolve.
   - After merge, run `gh issue create --repo 577Industries/helios-provenance-spec --title "RFC-0001: feature-level provenance for heliophysics fusion systems" --body-file rfc/RFC-0001-feature-lineage.md` (or a hand-edited intro pointing to the file).
3. The 8 open RFC §6 questions: community-comment items. Worth scanning to confirm none should be pre-resolved before publishing the RFC. The two most consequential per my read:
   - Q1 (`code_ref` shape): free-form string vs. structured `{git_url, sha, path}`. Structured is more rigorous; free-form is easier for early adopters. The agent left it as string. Reasonable for v0.1 RFC.
   - Q8 (hash payload composition): should `conformal_interval`, `location`, `agent` be inside the hash? Trade-off: including them makes records tamper-evident across MORE dimensions but breaks hash stability when, e.g., a conformal recalibration recomputes intervals without changing the underlying fused value. The agent excluded them; the rationale is reasonable and worth confirming.
4. `@context` URI is a placeholder (`577industries.github.io/.../v0.1.jsonld`) in `to_jsonld()` output. Real URL becomes the MkDocs site address once docs deploy. Fix as a v0.1.1 patch alongside docs deployment.
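The two `code_ref` shapes weighed in RFC Q1 can be compared with a small sketch. The field names `git_url`, `sha`, and `path` come from the question itself; the `StructuredCodeRef` class, the `validate_code_ref` helper, and the hex-length rule are hypothetical and not part of the shipped schema, which keeps `code_ref` as a string.

```python
import re
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class StructuredCodeRef:
    """Hypothetical structured form: {git_url, sha, path}."""
    git_url: str
    sha: str
    path: str


# Either shape: the free-form string (as in v0.1) or the structured form.
CodeRef = Union[str, StructuredCodeRef]

# Assumption: abbreviated or full hex commit ids, 7 to 64 characters.
_SHA_RE = re.compile(r"[0-9a-f]{7,64}")


def validate_code_ref(ref: CodeRef) -> CodeRef:
    """Return ref unchanged if acceptable, else raise ValueError."""
    if isinstance(ref, str):
        # A free-form string can only be checked for emptiness.
        if not ref.strip():
            raise ValueError("code_ref must not be empty")
        return ref
    # The structured form allows per-field checks.
    if not _SHA_RE.fullmatch(ref.sha):
        raise ValueError(f"sha is not a hex commit id: {ref.sha!r}")
    if not ref.git_url or not ref.path:
        raise ValueError("git_url and path are required")
    return ref
```

The sketch shows the trade-off in miniature: the string form accepts almost anything, which lowers the barrier for early adopters, while the structured form can reject a malformed commit id before a record is published.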
Merge readiness checklist (per master plan §"Per-Artifact 'citable'-readiness")
- ✅ CI green on main (will run on push; local pytest/ruff/mypy all green)
- ✅ README with badges + working quick-start
- ✅ LICENSE (Apache 2.0) + NOTICE + CITATION.cff
- ⏳ Tagged v0.1.0 (local; not pushed)
- ⏳ Published to PyPI (post-merge via GH release)
- ⏳ DOI minted via Zenodo (post-tag-push)
- ⏳ RFC issue open and circulated (post-merge; see open question #2)
- N/A in this RFC pass: pre-registration on OSF (that's for Artifact C)
Sequence the operator should run
```bash
# 1. Pre-merge review
cd ~/577i-Projects/helios-provenance-spec
git diff main..feat/v0.1-rfc | less
git checkout feat/v0.1-rfc
pip install -e '.[dev]'
pytest --cov && ruff check . && ruff format --check . && mypy

# 2. Merge
git checkout main
git merge --no-ff feat/v0.1-rfc -m "feat: helios-provenance-spec v0.1.0 RFC

JSON Schema 2020-12 for 4 record types (Dataset / ModelOutput / Transformation / FusedOutput). pydantic v2 reference implementation with tamper-evident lineage hashing. 11 worked examples including end-to-end fused SEP all-clear lineage. SPASE / PROV-JSON / RO-Crate crosswalks. RFC-0001 issued for community comment.

98 tests, 98% coverage."

# 3. Push main + tag
git push origin main
git push origin v0.1.0

# 4. Open GitHub release (auto-trigger PyPI publish if trusted publishing is configured)
gh release create v0.1.0 --generate-notes --repo 577Industries/helios-provenance-spec

# 5. Open RFC discussion issue
# Single quotes around --body keep the shell from treating the backticks as command substitution.
gh issue create --repo 577Industries/helios-provenance-spec \
  --title "RFC-0001: feature-level provenance for heliophysics fusion systems" \
  --body 'See `rfc/RFC-0001-feature-lineage.md`. Comments welcome on the 8 open questions in §6.'

# 6. Notify the helios-program companion that A has shipped
cd ~/577i-Projects/helios-program
python -m orchestration.companion_sync
git add companion/footnotes.yaml
git commit -m "chore: companion sync after helios-provenance-spec v0.1.0"
git push
```
Downstream impact
Once A v0.1.0 lands and is pushed:
- Connectors (Artifact B) has a placeholder ProvenanceRecord it can now swap for from helios_provenance.models import HeliosModelOutputRecord etc. Dispatch a follow-up agent against helios-spaceweather-connectors on feat/v0.2-real-provenance to do this swap, update tests, and tag v0.2.0.
- Fusion engine (Artifact C) also has placeholder types in src/helios_fusion/types.py. Same swap. Dispatch in parallel.
- Companion document updates automatically via companion_sync — once v0.1.0 is tagged on GH, companion/footnotes.yaml will reflect version: 0.1.0 and status: in-development.
Confidence notes
- The schema design is conservative (composes existing standards rather than inventing) but adds a genuinely novel feature-level lineage record. That's the right shape for an RFC — easier to gather community comments than to defend something fully novel.
- The `rfc8785` library handles JCS canonical JSON. If the maintainers ever break compatibility, the documented fallback (in `hashing.py`) is meant to keep existing record hashes verifiable. Reasonable defensive choice.
- The agent added 4 runtime deps (`jsonschema`, `rfc8785`, `rfc3339-validator`, `rfc3987`). All small, well-maintained. Acceptable for a spec library.
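One way a canonicalization fallback of this kind can be structured is sketched below. How the fallback in `hashing.py` is actually triggered and implemented is not described here, so this is an assumption: the sketch falls back only when `rfc8785` cannot be imported, and the `canonicalize` name is hypothetical. The stdlib path is not full RFC 8785 (number formatting and key-ordering edge cases differ), though it yields the same bytes for simple payloads of strings and small integers.

```python
import json
from typing import Any

try:
    import rfc8785  # third-party JCS implementation

    def canonicalize(obj: Any) -> bytes:
        """Canonical bytes via the rfc8785 package."""
        return rfc8785.dumps(obj)

except ImportError:

    def canonicalize(obj: Any) -> bytes:
        """Stdlib approximation: sorted keys, compact separators, UTF-8."""
        return json.dumps(
            obj,
            sort_keys=True,
            separators=(",", ":"),
            ensure_ascii=False,
            allow_nan=False,  # mirror the rejection of non-finite floats
        ).encode("utf-8")


canonical = canonicalize({"b": "x", "a": 1})
```

Pinning a known digest for one of the shipped example records in the test suite would surface any drift between the two paths.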
Bottom line: ready for your review and merge. No blocking issues found; 4 questions are gated on your decision, none of which change the implementation.