Trust Thermodynamics — Paper 1 of 3

Mean Time to Epistemic Failure: The Autoimmune Paradox and 72-Hour Decay Dynamics in Autonomous AI Systems

Author: Jason Doffing
Published: March 9, 2026
DOI: 18929815
Repository: github.com/trust-thermodynamics
Classification: Open Science / Practitioner Research
NIST Docket: NIST-2025-0035 (AI Agent Standards Initiative)

Abstract

We present the first empirical measurement of Mean Time to Epistemic Failure (MTEF) for autonomous multi-agent AI systems. Across an experimental corpus spanning 1,100+ agent generations, 4,182 individually scored claims, and four model architectures (Claude, Gemini, Mistral, Groq/Llama), we discover the autoimmune paradox: governed systems with active verification mechanisms lose factual accuracy faster than ungoverned ones, retaining 24.3% of original facts versus 55.2% over 25 generations.

We derive and validate a 72-hour failure horizon through Monte Carlo simulation (10,000 trials, 72.0 ± 23.0 hours) and demonstrate throughput-dependent compression to under 30 minutes at production pipeline velocities. We develop and deploy measurement instruments for real-time epistemic health monitoring, including truth-correspondence entropy (H_truth), damage-weighted truth scoring (D_truth), and verification diagnostic scoring (VDS-2D). All experimental protocols, instrument specifications, and data descriptions are published for community replication.
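To make the instruments concrete, the sketch below shows one plausible reading of H_truth and D_truth: H_truth as Shannon entropy over the 5-state truth-correspondence taxonomy, and D_truth as a linearly damage-weighted score. The state labels and damage weights here are placeholders chosen for illustration; the actual taxonomy, weighting scheme, and the VDS-2D specification are given in Appendix B.

```python
from collections import Counter
from math import log2

# Hypothetical 5-state truth-correspondence taxonomy (labels assumed;
# the real taxonomy is specified in Appendix B).
STATES = ["verified", "supported", "unverifiable", "distorted", "fabricated"]

# Hypothetical damage weights: 0.0 = harmless, 1.0 = maximally damaging.
DAMAGE = {"verified": 0.0, "supported": 0.1, "unverifiable": 0.4,
          "distorted": 0.8, "fabricated": 1.0}

def h_truth(claim_states):
    """Truth-correspondence entropy (bits) of a document's claims.

    0 when every claim sits in a single state; log2(5) ~ 2.32 when claims
    are spread evenly across all five states.
    """
    counts = Counter(claim_states)
    n = len(claim_states)
    return -sum((c / n) * log2(c / n) for c in counts.values())

def d_truth(claim_states):
    """Damage-weighted truth score: 1.0 = fully intact, 0.0 = fully damaged."""
    n = len(claim_states)
    return 1.0 - sum(DAMAGE[s] for s in claim_states) / n

# Example: a document of ~16 claims whose truth content has started to drift.
doc = ["verified"] * 9 + ["supported"] * 3 + ["distorted"] * 2 + ["fabricated"] * 2
print(round(h_truth(doc), 2), round(d_truth(doc), 2))  # ~1.67 bits, ~0.76
```

Under these assumed definitions, a document can hold its claim count and confidence steady while H_truth rises and D_truth falls, which is the failure mode the monitoring instruments are built to surface.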

Key Findings
01 Autoimmune Paradox: Governed systems decay 2.2× faster than ungoverned systems. Verification consumes the cognitive budget needed for truth preservation.
02 Conviction Parity: Fabricated claims carry 97% of the conviction of true claims (CIS 4.4 vs 4.5 on 5-point scale). Confidence-based detection is architecturally blind.
03 72-Hour Failure Horizon: MTEF = 72.0 ± 23.0h at human-supervised rates, compressing to ~29 minutes at high-velocity pipeline rates (see the throughput sketch after this list).
04 Non-Monotonic Severity: Subtle distortions cause higher collapse rates than blatant fabrications. The lies that break your system are the ones that are almost right.
05 Repair Window Extinction: Fact correction is possible in Generations 6–12. After Generation 12, zero repairs observed. The window closes permanently.
06 Conservation of Document Volume: Documents maintain ~16–17 claims per generation while truth content collapses. Monitoring that tracks output volume, confidence, or structure will not detect the degradation.
07 Cross-Model Validation: Four distinct decay architectures identified—gradual erosion (Claude), narrative takeover (Gemini), frozen document (Mistral), selective erosion (Groq/Llama). The paradox reproduces across all architectures.
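The compression in Finding 03 follows if the failure horizon is fixed in generations rather than wall-clock hours, so MTEF shrinks in proportion to pipeline throughput. The sketch below illustrates that rescaling; the baseline and pipeline generation rates are hypothetical placeholders chosen so a roughly 150× speedup reproduces the ~29-minute figure, and the inverse-throughput scaling is an illustrative assumption rather than the paper's derivation (the measured temporal dynamics are in Appendix D).

```python
def mtef_at_rate(mtef_baseline_hours: float,
                 baseline_gens_per_hour: float,
                 pipeline_gens_per_hour: float) -> float:
    """Rescale a wall-clock MTEF to a faster pipeline.

    Assumes the failure horizon is fixed in *generations*, so wall-clock
    MTEF scales inversely with generation throughput.
    """
    return mtef_baseline_hours * baseline_gens_per_hour / pipeline_gens_per_hour

# Hypothetical cadences: 1 generation/hour under human supervision vs.
# 150 generations/hour in an unattended production pipeline.
mtef_hours = mtef_at_rate(72.0, 1.0, 150.0)                      # 0.48 hours
print(f"MTEF at pipeline velocity: {mtef_hours * 60:.0f} minutes")  # ~29 minutes
```

On this reading, the horizon is set by the number of generative hops, not elapsed time, which is why faster pipelines fail sooner in wall-clock terms.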

Supplementary Materials

Seven appendices provide complete experimental protocols, instrument specifications, and analytical detail:

Appendix A: Experimental Protocols (seed documents, system prompts, API parameters, scoring rubrics)
Appendix B: Scoring Instruments (H_truth 5-state taxonomy, D_truth weighting, VDS-2D specification)
Appendix C: Cascade Analysis (EXP-CASCADE-002 cross-analysis, non-monotonic severity data)
Appendix D: Temporal Dynamics (per-generation trajectories, convergence classification, transition matrices)
Appendix E: Intervention Matrix (fork experiments, dose-response, repair window analysis)
Appendix F: Pipeline Validation (MANDELA-001a methodology, claim extraction, soft lineage resolution)
Appendix G: Framework Terminology (Trust Thermodynamics axioms, glossary, derivation constants)

Citation

Doffing, J. (2026). Mean Time to Epistemic Failure: The Autoimmune Paradox and 72-Hour Decay Dynamics in Autonomous AI Systems. Zenodo. DOI:18929815

This paper is published as open science under the Probabilistic Resilience Engineering research program. The findings were submitted to the NIST AI Agent Standards Initiative (Docket NIST-2025-0035) as evidence for epistemic integrity as a fourth verification domain for AI agent governance.

For a practitioner-accessible analysis, see Confident, Wrong, and Undetectable.