
Pre-Registration of Metrics (frozen before runs)

Following 577's HELIOS/ASEMA discipline: the headline metrics, their thresholds, the test-vector set, and the kill-gate logic are defined and frozen here before any transformation run, so results cannot be cherry-picked after the fact. This file is committed; reviewers can diff the proposal-time freeze against the live repository at the cited commit hash.

All metrics are measured on the synthetic, unclassified surrogate in surrogate/. They are preliminary and framework-level, not government-validated. See EXCLUSIONS.md.

Frozen metric definitions

| Metric | Definition | Pre-registered threshold |
|---|---|---|
| Parse coverage | % of surrogate C#/.NET projects successfully parsed by the Discovery Engine | ≥ 95% |
| Rule-extraction F1 | F1 of extracted business rules vs. the hand-labeled surrogate/gold/business-rules.gold.ttl | ≥ 0.85 |
| Discrete equivalence | Equality of discrete mission outputs (tasking GO/NO-GO, waypoint counts, validation flags) legacy vs. modern, over the frozen vector set | 0 violations (tolerance = 0) |
| Continuous equivalence | Relative error of continuous outputs (great-circle distance, interpolated coordinates, time-on-target) within tolerance | ≤ 1e-9 relative error; 0 out-of-tolerance |
| Equivalence confidence | Multiplicative-Chernoff upper bound on operational-deviation probability at the corpus N, δ=0.999 | reported (lower is better); N frozen below |
| Complexity reduction | Max method cyclomatic complexity, legacy → modern, for the transformed component | legacy CC ≥ 30 → modern max CC < 10 |
| STIG remediation | Count of Application-Security-&-Development / .NET STIG findings present in legacy and absent after transform | reported (before vs. after) |
| Crypto inventory | Count of weak / quantum-vulnerable cryptographic usages found in the surrogate | reported |
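
The "Equivalence confidence" row names a multiplicative-Chernoff bound without giving its exact form. One possible reading, offered as a sketch and not as the project's definition: if all N vectors pass with zero deviations, the lower-tail bound P(X ≤ (1-ε)μ) ≤ exp(-ε²μ/2) with ε = 1 and μ = N·p gives P(X = 0) ≤ exp(-N·p/2), so any deviation probability p above 2·ln(1/α)/N is ruled out at confidence 1-α. The function name, the independence assumption, and the reading of δ=0.999 as a confidence level (α = 0.001) are assumptions made for illustration.

```python
import math


def chernoff_deviation_upper_bound(n_vectors: int, confidence: float = 0.999) -> float:
    """Upper bound on the per-vector deviation probability after 0 observed deviations.

    Assumptions (not stated in the source): vectors are independent draws,
    and 0.999 is a confidence level, so alpha = 1 - confidence.
    Derivation: multiplicative Chernoff lower tail at eps = 1, mu = n * p,
    i.e. P(X = 0) <= exp(-n * p / 2) <= alpha  =>  p <= 2 * ln(1 / alpha) / n.
    """
    if n_vectors <= 0:
        raise ValueError("n_vectors must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    alpha = 1.0 - confidence
    # A probability bound above 1 carries no information, so clamp it.
    return min(1.0, 2.0 * math.log(1.0 / alpha) / n_vectors)


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        print(n, chernoff_deviation_upper_bound(n))
```

Under these assumptions the bound shrinks as 1/N, which is why the corpus size has to be frozen before the runs: enlarging N afterwards would tighten the reported figure.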

Frozen test-vector set

  • Corpus is generated by surrogate/corpus/generate.py with a fixed RNG seed (recorded in that file). The committed corpus tag at submission is the frozen set.
  • Required tag coverage: nominal, anti-meridian, leap-second, overflow, degenerate-route, precision-drift.
  • Target corpus size N: recorded in surrogate/corpus/manifest.json at freeze time.
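
The reproducibility and tag-coverage requirements above can be sketched as follows. This is a hypothetical illustration, not the contents of surrogate/corpus/generate.py: the seed value, the vector fields, and the names `generate_corpus` and `missing_tags` are all assumptions.

```python
import random

REQUIRED_TAGS = (
    "nominal", "anti-meridian", "leap-second",
    "overflow", "degenerate-route", "precision-drift",
)
SEED = 12345  # hypothetical value; the real seed is recorded in generate.py


def generate_corpus(n: int, seed: int = SEED) -> list[dict]:
    """Generate n tagged vectors deterministically from a private RNG."""
    if n < len(REQUIRED_TAGS):
        raise ValueError("n is too small to cover every required tag")
    rng = random.Random(seed)  # private instance: independent of global RNG state
    vectors = []
    for i in range(n):
        # Emit each required tag once first so coverage is guaranteed,
        # then sample the remaining tags from the seeded RNG.
        tag = REQUIRED_TAGS[i] if i < len(REQUIRED_TAGS) else rng.choice(REQUIRED_TAGS)
        vectors.append({
            "id": i,
            "tag": tag,
            "lat": rng.uniform(-90.0, 90.0),    # illustrative payload fields
            "lon": rng.uniform(-180.0, 180.0),
        })
    return vectors


def missing_tags(vectors: list[dict]) -> set[str]:
    """Return the required tags that no vector in the corpus carries."""
    return set(REQUIRED_TAGS) - {v["tag"] for v in vectors}
```

Two calls with the same seed and size return identical corpora, which is what lets a reviewer regenerate the frozen set and compare it with the committed one.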

Kill-gate logic (Base period)

  • KG#1 (end of Month 3): PASS requires rule-extraction F1 ≥ 0.85 AND the mission-data-aware oracle harness runs end-to-end on the surrogate. FAIL → reduce to single-module scope and document.
  • KG#2 (end of Month 6): formal go/no-go. PASS requires 0 discrete violations, continuous outputs within tolerance, and the cATO artifact bundle auto-generated. PASS → recommend Option/Phase II.
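
The two gates reduce to boolean checks over the frozen thresholds. The sketch below is illustrative only: the function names are hypothetical, and matching extracted rules to gold rules by identifier is an assumption, since the matching criterion against the gold .ttl file is not described here.

```python
F1_THRESHOLD = 0.85  # frozen rule-extraction threshold from the metric table


def f1_score(extracted: set[str], gold: set[str]) -> float:
    """F1 of extracted rule identifiers against the hand-labeled gold set."""
    true_positives = len(extracted & gold)
    if true_positives == 0:
        return 0.0  # also covers empty inputs, avoiding division by zero
    precision = true_positives / len(extracted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def kg1_passes(f1: float, harness_ran_end_to_end: bool) -> bool:
    """KG#1: F1 at or above threshold AND the oracle harness ran end-to-end."""
    return f1 >= F1_THRESHOLD and harness_ran_end_to_end


def kg2_passes(discrete_violations: int, out_of_tolerance: int,
               cato_bundle_generated: bool) -> bool:
    """KG#2: zero discrete violations, zero out-of-tolerance continuous
    outputs, and the cATO artifact bundle was auto-generated."""
    return (discrete_violations == 0
            and out_of_tolerance == 0
            and cato_bundle_generated)
```

Keeping the thresholds as module constants means any post-freeze change shows up as a one-line diff.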

Intentional-divergence policy

Where the modern implementation corrects a known legacy bug (e.g., a mishandled anti-meridian case), its output intentionally diverges from the legacy output. Such cases are recorded as findings (IsIntentionalDivergence = true), not equivalence failures, and are listed explicitly in the equivalence report. Listing them keeps corrected bugs visible to reviewers instead of hiding them or miscounting them as failures.
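
A minimal sketch of how a comparison step might separate intentional divergences from equivalence failures. The `Finding` type, `compare_outputs`, and the idea of a pre-declared set of intentional vector ids are hypothetical; only the 1e-9 tolerance and the IsIntentionalDivergence flag come from this file. `math.isclose` measures relative error against the larger of the two magnitudes, which is one possible reading of "relative error".

```python
import math
from dataclasses import dataclass

REL_TOL = 1e-9  # frozen continuous tolerance from the metric table


@dataclass(frozen=True)
class Finding:
    vector_id: str
    field: str
    legacy: object
    modern: object
    is_intentional_divergence: bool


def compare_outputs(vector_id, legacy, modern, continuous_fields, intentional_ids):
    """Compare one vector's legacy and modern outputs field by field."""
    findings = []
    for field, legacy_value in legacy.items():
        modern_value = modern[field]
        if field in continuous_fields:
            # Continuous outputs: relative tolerance only, no absolute slack.
            same = math.isclose(legacy_value, modern_value,
                                rel_tol=REL_TOL, abs_tol=0.0)
        else:
            # Discrete outputs: exact equality (tolerance = 0).
            same = legacy_value == modern_value
        if not same:
            findings.append(Finding(vector_id, field, legacy_value, modern_value,
                                    vector_id in intentional_ids))
    return findings


def equivalence_failures(findings):
    """Only findings not declared intentional count against the gate."""
    return [f for f in findings if not f.is_intentional_divergence]
```

In this sketch the intentional set is declared before the run, so a mismatch cannot be reclassified as intentional after the results are known.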

Change control

Any change to a threshold, the seed, or the vector set after freeze requires a tracked commit that references this file and a note in governance/REVIEW_GATES.md, mirroring the ECP audit trail the government described in the topic Q&A.

Human-in-the-Loop Review Gates

The government's topic Q&A states the expectation directly: "a lot of human involvement until the product is verified; then ECP to baseline and less human involvement." FORGE EVOLVE for TMPC encodes that as explicit, recorded gates. No transformation is accepted into the modern baseline without a human sign-off carrying the rule-diff, the equivalence delta, and the STIG delta.

Pipeline gates (per transformed component)

  1. Design gate - after Discovery + Planning: human approves the proposed microservice boundary and the migration unit scope.
  2. Translation gate - after Transformation: human reviews the emitted modern code, the extracted business rules it must honor, and the diff.
  3. Acceptance gate - after Validation: human reviews the equivalence report (vectors passed, per-oracle deltas, intentional divergences) and the cATO deltas before the component is accepted.

Each gate produces a ReviewGate record (see ForgeEvolve.Contracts) appended to the tamper-evident provenance chain.
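
One common way to make an append-only log tamper-evident is a hash chain, where each record stores the hash of its predecessor. The sketch below illustrates that idea in Python; it is not the ReviewGate contract from ForgeEvolve.Contracts. The field names, the use of SHA-256, and the all-zero genesis value are assumptions.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # hypothetical anchor for the first record


def _digest(body: dict) -> str:
    # Canonical JSON (sorted keys) so the hash does not depend on key order.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_gate(chain: list, gate: str, reviewer: str, outcome: str, evidence: str) -> dict:
    """Append a review-gate record linked to the previous record's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    body = {"gate": gate, "reviewer": reviewer, "outcome": outcome,
            "evidence": evidence, "prev_hash": prev_hash}
    record = {**body, "hash": _digest(body)}
    chain.append(record)
    return record


def verify_chain(chain: list) -> bool:
    """Recompute every hash and link; any edited record breaks verification."""
    prev_hash = GENESIS_HASH
    for record in chain:
        body = {k: v for k, v in record.items() if k != "hash"}
        if record["prev_hash"] != prev_hash or record["hash"] != _digest(body):
            return False
        prev_hash = record["hash"]
    return True
```

A chain like this only detects edits if the latest hash is also held somewhere the editor cannot rewrite, for example in a signed commit.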

Program gates (human checkpoints for the PI / 577 Industries)

| Gate | When | Decision/action |
|---|---|---|
| H0 | After Phase 0 | Approve surrogate scope + frozen interface contracts |
| H1 | After surrogate build | Confirm surrogate is representative AND unmistakably synthetic |
| KG#1 | End of Month 3 (Base) | F1 ≥ 0.85 + oracle harness runs → continue full scope |
| KG#2 | End of Month 6 (Base) | 0 discrete violations + cATO bundle → recommend Option/Phase II |
| H-cost | Proposal cost volume | Supply real company/labor/rate numbers |
| H5 | After verification | Accept the all-auditors-pass package |
| H6 | Publish | Create/push the public repo under 577's GitHub auth |
| H7 | Submit | Complete DSIP webforms (FWA, foreign affiliations) + final certify |

Decisions are recorded here with date, gate id, outcome, and evidence pointer.