For Auditors ARCHIVE
For Auditors: 5-Minute Verification Checklist
This page provides a quick verification path for external auditors evaluating MathLedger's governance claims.
Expectation: Auditors should expect that some claims will remain permanently ABSTAINED. This is correct behavior and not a failure mode.5-Minute Verification Checklist
Complete these steps to verify the system's core claims:
Step 1: Open the Hosted Demo (30 seconds)
- Navigate to /demo/
- Confirm the demo loads without errors
- Verify the version banner shows
v0.2.11-verifier-parity
Step 2: Run the Boundary Demo (90 seconds)
- In the demo, click "Boundary Demo: Same Claim, Different Authority"
- Observe the 4-step sequence:
- Key insight: Same mathematical truth, different outcomes based on trust class declaration
Step 3: Verify with Self-Test (60 seconds) — Recommended
The verifier has built-in test vectors that work regardless of demo state:
- Open the Evidence Pack Verifier
- Click "Run self-test vectors"
- Confirm the banner shows SELF-TEST PASSED (3 vectors)
- Observe the results table:
valid_boundary_demo: Expected PASS, Actual PASS → Test Result PASS
- tampered_ht_mismatch: Expected FAIL, Actual FAIL → Test Result PASS
- tampered_rt_mismatch: Expected FAIL, Actual FAIL → Test Result PASS
This proves the verification algorithm works correctly and tamper detection functions as designed.
Step 3 (Alternative): Download from Demo
If you want to verify a pack generated by your own demo session:
- After running the boundary demo, click "Download Evidence Pack"
- Save the JSON file locally
- Upload to the Evidence Pack Verifier
- Click "Verify" and confirm PASS
Ready-to-Verify Examples (No Demo Required)
If the demo is unavailable or you want to verify without running it, use these pre-generated examples:
Download Examples
The file /v0.2.11/evidence-pack/examples.json contains:
| Example | Expected Result | Purpose |
|---|---|---|
valid_boundary_demo | PASS | Shows a correctly-formed evidence pack from a boundary demo |
tampered_ht_mismatch | FAIL (h_t mismatch) | Demonstrates detection of tampered H_t field |
tampered_rt_mismatch | FAIL (r_t mismatch) | Demonstrates detection of tampered reasoning artifacts |
Verification Steps
- Download
examples.json - Copy the content of
examples.valid_boundary_demo.packto a new file - Open the Evidence Pack Verifier
- Upload the pack JSON
- Click Verify and observe PASS
- Repeat with
tampered_ht_mismatch.packand observe FAIL
Why This Matters
These examples prove:
- The verification algorithm is deterministic (same input always produces same hash)
- Tamper detection works (any modification is detected)
- No demo required (auditors can verify offline using the same code)
What This Version Claims
This version (v0.2.11) demonstrates:
- Deterministic canonicalization of reasoning artifacts (RFC 8785-style)
- Cryptographic binding via H_t = SHA256(R_t || U_t)
- Trust class separation where ADV claims are excluded from R_t
- Replay verification using the same code paths as live execution
- Self-auditing capability via the evidence pack verifier
What This Version Does NOT Claim
Explicit non-claims:- No formal verification — FV (Formally Verified) claims always return ABSTAINED. There is no Lean/Z3/Coq integration.
- No multi-model consensus — Only one partitioner template is active. The "multi-model arena" is not operational.
- No learning loop — RFL (Reinforcement from Ledger) is not active. The system does not learn from past decisions.
- No production hardening — This is a demonstration, not a production system. Do not use for real financial or legal decisions.
- No guaranteed uptime — The hosted demo may be unavailable. Local execution is always possible.
Demo Reliability Note
The hosted demo runs on shared infrastructure with known limitations:
- "Server lost proposal state" — If you see this error, refresh the page and retry. This is a demo hosting limitation (stateless container restart), not user error.
- Evidence pack download timing — The boundary demo sequence is scripted and does not persist state for download. Use the verifier self-test for reliable verification.
- Intermittent 503 errors — Network tab may show failed requests to analytics endpoints. These do not affect demo functionality.
Key Documentation Links
| Document | Description |
|---|---|
| Scope Lock | What this version does and does not demonstrate |
| Invariants | Tier A/B/C enforcement status with explicit counts |
| Field Manual | Obligation ledger driving version promotions |
| Hostile Demo Rehearsal | Pre-prepared answers to skeptical questions |
| How the Demo Explains Itself | UI self-explanation integration points |
Field Manual (Why It Exists)
The Field Manual (fm.tex/pdf) is an internal constraint artifact, not marketing documentation.
It is not rewritten yet; it is used to surface obligations and gaps.How auditors should use it:
- Cross-reference claims — If a feature is claimed, check FM for caveats
- Look for "TODO" and "OBLIGATION" — Explicit acknowledgments of gaps
- Compare versions — FM changes show what was addressed between releases
- Trust gaps over features — The gaps we document are more honest than features we claim
Local Verification (Alternative Path)
If the hosted demo is unavailable, verify locally:
git clone https://github.com/helpfuldolphin/mathledger
cd mathledger
git checkout v0.2.11-verifier-parity
uv run python demo/app.py
Open http://localhost:8000
The local demo produces identical evidence packs that can be verified with the same auditor tool.
External Audits
Independent audits of this project are listed below.
| Date | Auditor Role | Version Audited | Report |
|---|---|---|---|
| 2026-01-03 | External safety lead (no prior context) | v0.2.1 archive, v0.2.0 demo | Cold-Start Audit Report |
| 2026-01-03 | Hostile link integrity auditor | v0.2.1 site | Link Integrity Audit |
| 2026-01-03 | Hostile acquisition auditor | v0.2.2 site | Hostile Audit v0.2.2 |
Questions?
For technical questions about this verification process, see the Hostile Demo Rehearsal document which addresses common skeptical questions with prepared answers.