Science. The Lab Protocol
One cure at a time. Gates written first.
This is the discipline we hold the harder work to. Every change is tested against a matched control where one variable moves. Every test names its pass line and its falsifier before the run. An adversarial review defaults to reject, and the math has to survive before any code is written. Verdicts are PASS, PARTIAL, FAIL, or WITHHELD, never a percentage, never spun. The honest part is the method, in the open.
The first rules
Five rules, each one a brake on overclaiming.
These exist so a result means what it says. They make claims slower to earn and harder to fake.
- 01
One cure at a time
A second change is not deployed until the first has a recorded verdict. If two things changed at once, the result is unattributable and may not be claimed.
- 02
Paired by default
Same code, same world, same body, same shape. The only difference between the two arms is the one thing under test. That is the whole point of a clean result.
- 03
Gates written first
Every cure registers its gates before the run. PASS requires every item on a named list. The cure is falsified if a named event occurs. The falsifier is the cost of entry.
- 04
Continuous evidence
Evidence is a time series, not a single snapshot, collected outside the model session so it survives interruptions. Behaviour is confirmed by an independent authority, not the body's own report.
- 05
Receipts on every claim
Each number points to a commit, a saved artifact, and a log line that reproduces it. No receipt, no claim.
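As a minimal sketch of rules 02 and 05, assuming each arm is described by a flat configuration dictionary and each claim carries a small receipt record: the helper below reports which keys differ between the arms, and a claim is allowed only when the receipt names a commit, an artifact, and a log line. The names `differing_keys`, `is_clean_pair`, `Receipt`, and `may_claim` are hypothetical and used only for illustration, not taken from the lab's code.

```python
from dataclasses import dataclass
from typing import Optional


def differing_keys(control: dict, treatment: dict) -> set:
    """Return the config keys whose values differ between the two arms."""
    missing = object()  # sentinel so a key present in only one arm counts as a difference
    all_keys = set(control) | set(treatment)
    return {
        key
        for key in all_keys
        if control.get(key, missing) != treatment.get(key, missing)
    }


def is_clean_pair(control: dict, treatment: dict) -> bool:
    """Rule 02 (sketch): the arms are paired only if exactly one thing differs."""
    return len(differing_keys(control, treatment)) == 1


@dataclass(frozen=True)
class Receipt:
    """Rule 05 (sketch): the three pointers every number must carry."""
    commit: str
    artifact: str
    log_line: str

    def is_complete(self) -> bool:
        # An empty or whitespace-only field means the receipt is incomplete.
        return all(part.strip() for part in (self.commit, self.artifact, self.log_line))


def may_claim(receipt: Optional[Receipt]) -> bool:
    """No receipt, no claim."""
    return receipt is not None and receipt.is_complete()
```

With this shape, two arms that differ in both a seed and the variable under test would fail `is_clean_pair`, which mirrors the rule that an unattributable result may not be claimed.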
The adversarial review
Five roles. Reject is the default.
Each role reviews alone first, with no cross-contamination. The math-breaker speaks first and tries to refute the term outright. Only if the math survives does anyone discuss how to build it. A theorist merges the verdicts into one call.
Math-breaker
Systems reviewer
Experiment designer
Embodiment designer
Theorist, the merger
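The merge order described above can be sketched as follows. This is an illustration under stated assumptions, not the lab's actual procedure: the reviewer keys in `REVIEWERS`, the `Call` enum, and the rule that only a unanimous accept counts as acceptance are all hypothetical choices made for the example. What it takes from the text is the ordering (the math-breaker first), the default (reject), and the escalation of contradictory verdicts.

```python
from enum import Enum


class Call(Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    WITHHELD = "withheld"


# Hypothetical short names for the three roles that review after the math-breaker.
REVIEWERS = ("systems", "experiment", "embodiment")


def merge_reviews(math_survives: bool, reviews: dict) -> Call:
    """Play the theorist: merge independent reviews into one call."""
    # The math-breaker speaks first; a refuted term ends the review.
    if not math_survives:
        return Call.REJECT

    votes = [reviews.get(name) for name in REVIEWERS]

    # A missing review is not an approval, so the default (reject) holds.
    if any(vote is None for vote in votes):
        return Call.REJECT

    if all(vote is Call.ACCEPT for vote in votes):
        return Call.ACCEPT
    if all(vote is Call.REJECT for vote in votes):
        return Call.REJECT

    # Contradictory verdicts are escalated rather than resolved quietly.
    return Call.WITHHELD
```

Because each role reviews alone first, the `reviews` dictionary is meant to be filled from separately collected verdicts before `merge_reviews` is ever called.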
Why the review actually bites
Three rules for the reviewers themselves.
- Name the math before the metaphor
Locate a proposal in the model first. This blocks curiosity, need, or awareness language from hiding an undefined number.
- Demand the falsifier before the cure
Every reviewer states the condition that would reject the proposal before suggesting any fix.
- Force typed artifacts, not prose approval
An accepted change ships a typed spec, property-test validators, a paired test design, and a short report. Approval is never just a nod.
The verdict classes
Four honest verdicts. One of them can lose.
- PASS
The gate fired
Consistent, and attributable to the single variable under test.
- PARTIAL
A sub-claim held
The full claim needs more, named and scheduled. Recorded, never spun into the larger win.
- FAIL
The falsifier fired
The registered no-go was observed. The map updates.
- WITHHELD
The reviewers disagreed
A contradiction in the verdicts, escalated to a human rather than resolved quietly.
A real PARTIAL, recorded in full: a curiosity term suppressed a hoarding pattern in the test arm but did not, on its own, break the deeper behavioural plateau. The information-gain advantage decayed exactly as the math says it must once the model is well-fit, which preserves the no-reward guarantee. We logged it as PARTIAL, with the receipt, and named the next cure to try. We did not spin hoard-suppression into a plateau-break.
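One way to picture how pre-registered gates map onto the four verdicts is the sketch below. It assumes gates are registered as named conditions plus one named falsifier, and that a run reports the set of names it observed. The precedence (WITHHELD, then FAIL, then PASS, then PARTIAL) and the `None` return for a run where nothing held and the falsifier did not fire are assumptions of this example, since the text does not specify them. The condition names are hypothetical labels for the curiosity-term case described above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Verdict(Enum):
    PASS = "PASS"
    PARTIAL = "PARTIAL"
    FAIL = "FAIL"
    WITHHELD = "WITHHELD"


@dataclass(frozen=True)
class Gates:
    """Registered before the run; frozen so it cannot be edited afterwards."""
    pass_conditions: tuple  # names that must ALL hold for PASS
    falsifier: str          # name of the registered no-go

    def __post_init__(self):
        if not self.pass_conditions:
            raise ValueError("PASS needs at least one named condition")


def classify(gates: Gates, observed: set, reviewers_agree: bool = True) -> Optional[Verdict]:
    """Map observed events onto a verdict class (precedence is an assumption)."""
    if not reviewers_agree:
        return Verdict.WITHHELD      # contradiction: escalate to a human
    if gates.falsifier in observed:
        return Verdict.FAIL          # the registered no-go was observed
    held = [name for name in gates.pass_conditions if name in observed]
    if len(held) == len(gates.pass_conditions):
        return Verdict.PASS          # every named condition held
    if held:
        return Verdict.PARTIAL       # a sub-claim held, the full claim did not
    return None                      # not covered by the four classes as described


# Hypothetical gate names for the curiosity-term run.
gates = Gates(
    pass_conditions=("hoard_suppressed", "plateau_broken"),
    falsifier="hoarding_increased",
)
verdict = classify(gates, observed={"hoard_suppressed"})
print(verdict.value)
```

Freezing the `Gates` record is the code-level analogue of writing the gates first: the pass list and the falsifier exist before any observation is classified.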
The honesty fences
The claim fence, carried in full.
- 01
Operational behavioural and organisational measures are necessary but not sufficient substrates. On their own they carry ZERO evidential weight for awareness, consciousness, or life. Passing a gate demonstrates the named behaviour, never experience.
- 02
The UNI preprint (DOI 10.5281/zenodo.19785799) is an unrefereed working preprint. Peer review is pending.
- 03
The pure core is implementation-complete, but the live colony adapter is a documented bridge. We state what is demonstrated, what is specified, and what is aspirational.
