PUBLIC

Signed Receipts and Replayable Evidence

Important work should leave proof that can be checked later

Autonomous work should emit signed receipts and replayable evidence.

CURRENT 5 min Advanced Paper
Maps to: HELM AI Kernel
Status: PUBLIC
Reviewed: 2026-06-08

Proof-safe research note.

Autonomous execution needs more than logs. This paper explains why proposals, verdicts, approvals, execution results, and evidence references should be captured for audit and replay.

Receipts, EvidencePack, Replay

What this does and does not claim.

Does
  • Frames signed receipts and replayable evidence as a research lens for governed AI execution.
  • Separates model proposal from execution authority.
  • Keeps product claims tied to current public HELM evidence surfaces.
Does not
  • Claim that every described pattern is generally available in production.
  • Claim third-party certification, vendor partnership, or compliance attestation.
  • Treat local demos, tests, or diagrams as equivalent to live customer proof.

Claim, boundary, evidence implication.

Claim

Autonomous work should emit signed receipts and replayable evidence.

Boundary

The article describes the proof model and does not claim all production deployments emit every artifact today.

Evidence

Proof claims need receipt hashes, verifier coverage, and EvidencePack references.

Diagram interlude

Receipts make execution replayable as evidence.

Every governed action needs a receipt trail that can be inspected without turning private context into public proof.

ProofGraph Event Trail: replayable, signed, hash-chained
Every action leaves a chain of signed evidence that can be replayed and verified.
[Diagram: ProofGraph Event Trail. 8 cryptographic events form a chain from Proposal through Receipt to Replay, each with timestamp, actor, policy version, and hash.]
Text description
Event           | Timestamp     | Actor             | Hash       | Decision
Proposal        | 14:32:01.442Z | agent-sprint-twin | a4f2…c891  | Submitted
Policy Snapshot | 14:32:01.443Z | helm-pep          | e7b1…3d40  | Captured
Approval State  | 14:32:01.510Z | helm-cpi          | 91c8…f712  | Verified
Tool Contract   | 14:32:01.511Z | helm-connector    | b3d0…8a21  | Matched
Action          | 14:32:01.580Z | connector-github  | f4e2…1b09  | Executed
Receipt         | 14:32:01.581Z | helm-proof        | c912…4df8  | Signed
EvidencePack    | 14:32:01.590Z | helm-evidence     | d1a7…92e3  | Bundled
Replay          |               | auditor           | full chain | Verifier pass
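The hash-chained trail described above can be illustrated with a minimal Python sketch. It is not the HELM implementation: the record layout, the GENESIS placeholder, and the function names (event_hash, build_chain, verify_chain) are assumptions made for illustration, and SHA-256 over sorted-key JSON is simply one common way to link events.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical placeholder hash preceding the first event


def event_hash(event: dict, prev_hash: str) -> str:
    """Hash an event together with the hash of its predecessor."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def build_chain(events: list[dict]) -> list[dict]:
    """Link each event to the hash of the record before it."""
    chain, prev = [], GENESIS
    for event in events:
        digest = event_hash(event, prev)
        chain.append({"event": event, "prev_hash": prev, "hash": digest})
        prev = digest
    return chain


def verify_chain(chain: list[dict]) -> bool:
    """Recompute every hash; an edited or reordered record fails the check."""
    prev = GENESIS
    for record in chain:
        if record["prev_hash"] != prev:
            return False
        if event_hash(record["event"], prev) != record["hash"]:
            return False
        prev = record["hash"]
    return True


# Illustrative events using actor names from the table above.
events = [
    {"event": "Proposal", "actor": "agent-sprint-twin", "decision": "Submitted"},
    {"event": "Policy Snapshot", "actor": "helm-pep", "decision": "Captured"},
    {"event": "Action", "actor": "connector-github", "decision": "Executed"},
]
chain = build_chain(events)
intact = verify_chain(chain)

# Copy the chain, change one recorded decision, and verify again.
tampered = [dict(r, event=dict(r["event"])) for r in chain]
tampered[1]["event"]["decision"] = "Skipped"
tampered_ok = verify_chain(tampered)
```

Because each hash covers the previous hash, changing any earlier record invalidates every record after it, which is what lets an auditor check the full chain rather than individual entries.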

When an API call is made by a human developer, the audit trail is straightforward: an identity, a timestamp, and a request payload. When an autonomous AI agent executes a sequence of actions, the context is far more complex. Traditional logging is not enough to answer the critical questions: what the agent proposed, which policy evaluated it, who approved it, and what was actually executed.

Signed Receipts Section

Cryptographic Provenance

In this model, every governed action generates a signed receipt and replayable evidence.
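As a rough sketch of what signing a receipt can look like, the Python below attaches a keyed tag to a receipt body and checks it later. Everything here is an assumption for illustration: the receipt fields, the DEMO_KEY value, and the use of HMAC-SHA256 from the standard library as a stand-in. A production system would more likely use an asymmetric signature scheme with managed keys so that verifiers never hold the signing key.

```python
import hashlib
import hmac
import json

# Hypothetical demo key, for illustration only.
DEMO_KEY = b"demo-only-signing-key"


def canonical_bytes(body: dict) -> bytes:
    """Serialize with sorted keys so the same body always yields the same bytes."""
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_receipt(body: dict, key: bytes) -> dict:
    """Return the receipt body together with its HMAC-SHA256 tag."""
    tag = hmac.new(key, canonical_bytes(body), hashlib.sha256).hexdigest()
    return {"body": body, "signature": tag}


def verify_receipt(receipt: dict, key: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    expected = hmac.new(key, canonical_bytes(receipt["body"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, receipt["signature"])


# Hypothetical receipt body; field names are not taken from HELM.
receipt = sign_receipt(
    {"action": "open_pull_request", "actor": "connector-github", "result": "Executed"},
    DEMO_KEY,
)
valid = verify_receipt(receipt, DEMO_KEY)

# A receipt whose body was altered after signing no longer verifies.
forged = {"body": dict(receipt["body"], result="Skipped"), "signature": receipt["signature"]}
forged_ok = verify_receipt(forged, DEMO_KEY)
```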

The Evidence Pack

Whenever a proposal is generated and evaluated, the system compiles an Evidence Pack. This is a signed, tamper-evident record containing:

  • The original intent that initiated the action.
  • The specific context state available at that time.
  • The exact structured spec generated by the model.
  • The policy evaluation result.
  • The human approval signature if required.
  • The final execution result.
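The six items listed above could be modeled as a single record with a stable digest. The sketch below is one possible shape, not the actual EvidencePack schema: the field names, the example values, and the digest method are assumptions chosen to mirror the list.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class EvidencePack:
    """Hypothetical record mirroring the six items in the list above."""
    intent: str                        # original intent that initiated the action
    context_state: dict                # context available at that time
    model_spec: dict                   # structured spec generated by the model
    policy_result: str                 # policy evaluation result
    approval_signature: Optional[str]  # human approval signature, if required
    execution_result: dict             # final execution result

    def digest(self) -> str:
        """SHA-256 over a sorted-key JSON encoding of all fields."""
        payload = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# All values below are invented for illustration.
pack = EvidencePack(
    intent="Open a pull request for the release notes",
    context_state={"repo": "example/repo", "policy_version": "v1"},
    model_spec={"tool": "create_pull_request", "branch": "release-notes"},
    policy_result="allow",
    approval_signature=None,
    execution_result={"status": "executed"},
)
digest_a = pack.digest()

# Changing any field produces a different digest.
digest_b = replace(pack, policy_result="deny").digest()
```

A digest like this is what a receipt or event-trail entry could reference, so the pack itself can stay private while its hash is shared.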

Replayability for Trust and Debugging

Because every input and state transition is captured, any execution can be replayed:

  • For auditors: This gives reviewers a chain from human intent to machine execution. It is evidence, not a certification.
  • For engineers: This allows developers to load a failed execution state locally, inspect what the model saw, and debug the specific failure point.
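A minimal sketch of replay, assuming the policy is a deterministic function of the captured spec and context: re-run the policy on the recorded inputs and compare the verdict with the one that was stored. The replay function, the dictionary keys, and example_policy are hypothetical. Note that this sketch re-evaluates the decision only and does not re-execute the action or its side effects.

```python
from typing import Callable

Policy = Callable[[dict, dict], str]


def replay(pack: dict, policy: Policy) -> dict:
    """Re-run the policy on captured inputs and compare with the recorded verdict."""
    replayed = policy(pack["model_spec"], pack["context_state"])
    return {
        "recorded": pack["policy_result"],
        "replayed": replayed,
        "match": replayed == pack["policy_result"],
    }


def example_policy(spec: dict, context: dict) -> str:
    """Hypothetical rule: allow only repositories listed in the captured context."""
    return "allow" if spec["repo"] in context["allowed_repos"] else "deny"


# Invented captured state for illustration.
pack = {
    "model_spec": {"tool": "create_pull_request", "repo": "example/repo"},
    "context_state": {"allowed_repos": ["example/repo"]},
    "policy_result": "allow",
}
report = replay(pack, example_policy)

# If the stored verdict disagrees with the replayed one, the mismatch is surfaced.
drift_report = replay(dict(pack, policy_result="deny"), example_policy)
```

An engineer could load a failed execution the same way, inspect pack["context_state"] to see what the model saw, and step through the policy locally.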

Execution should not be a black box. It should leave a receipt that people can inspect.
