Broken by Default: Structural Fragility

Updated 4 July 2026

Broken by Default is a cross-disciplinary concept denoting inherent fragility encoded in a system's baseline configuration and assumptions.
It highlights that systems that appear secure or verified, from TLS servers to AI-generated code, can harbor vulnerabilities unless they are actively hardened.
The topic spans software security, formal semantics, and financial modeling, and offers insight into detecting and rectifying hidden failure modes.

“Broken by default” is not a single formal doctrine but a recurring analytical motif across software security, program verification, logic, physics, and mathematical finance. In each domain, the phrase denotes a setting in which the nominal baseline state of a system—its shipped configuration, its verification boundary, its inferential semantics, or its market mechanism—already encodes fragility, unsafe behavior, or structural incompleteness unless explicit countermeasures are added. In TLS server software, the relevant problem is that enabling TLS with minimal effort often leaves legacy protocols and weak options in place rather than delivering a modern secure posture (Stanek, 2017). In web applications, an automated suite can be continuously green while user-facing defects still escape because the tests do not observe the seams where failures occur (Bilal et al., 21 Jun 2026). In contextuality theory, the analogous claim is that a measurement should not be identified across contexts a priori; it must be treated as a contextual collection of random variables “by default” (Tezzin et al., 2020). This breadth suggests that the phrase names a family of structurally similar failure modes rather than a single technical property.

1. Baseline fragility as a general pattern

A common feature across the literature is that “default” does not mean merely “initial” or “standard”; it means the operational assumption that is adopted before any corrective structure is imposed. In some settings the default is a software configuration. In others it is an observation model, a proof-theoretic semantics, or a market microstructure. The resulting breakage is often not total failure. Martin Stanek’s TLS study is explicit that the tested products were not uniformly “catastrophically unsafe out of the box”; rather, the pattern was insecure or suboptimal defaults that preserved legacy compatibility and left administrators exposed unless they actively hardened the system (Stanek, 2017).

This distinction matters because several of the papers reject a simplistic interpretation in which “broken” means immediately unusable. The test-automation study reports a suite of 1,553 test cases that passed continuously, yet defects still reached production because the green signal referred only to the part of reality the suite actually observed (Bilal et al., 21 Jun 2026). The contextuality paper similarly does not say that measurements become meaningless across contexts; it says that cross-context identity must not be assumed as the starting point (Tezzin et al., 2020). In finance, “default” itself may cease to be a surprise event under certain information structures or contractual arrangements, so the brokenness lies in the model or mechanism that turns what should be an inaccessible event into a predictable or amplified one (Bedini et al., 2016; Kenyon et al., 2013).

A plausible implication is that “broken by default” functions as a diagnosis of hidden commitments. Systems appear stable until one asks what is being held fixed: protocol versions, test harnesses, context labels, transitive dependencies, intensity assumptions, collateral rules, or future trading opportunities. The moment those defaults are made explicit, the latent failure mode becomes analyzable.

2. Insecure configurations and unsafe generated artifacts

In software security, the phrase is used most literally. Stanek’s study of TLS tested twelve web and application servers on a fully updated Ubuntu 16.04 LTS, using default server configurations and making only the minimal changes required to enable TLS, typically by pointing to certificate and key files (Stanek, 2017). All tested servers disabled SSL 2.0 and SSL 3.0 by default, but the broader secure-by-default principle was “followed rarely.” The majority, including apache, nginx, tomcat, and wildfly, enabled TLS 1.0, TLS 1.1, and TLS 1.2 by default; caddy and hiawatha did not enable TLS 1.0 by default; and gunicorn accepted only TLS 1.0 by default. Cipher defaults were likewise mixed: some servers did not define a cipher order, gunicorn offered NULL or export-grade ciphers, a few servers exposed RC4 or SEED, and all servers supported some cipher suites offering perfect forward secrecy (PFS). The study therefore sharpened the meaning of breakage: default TLS was often functional, but not securely configured.
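The kind of active hardening that such defaults leave to the administrator can be illustrated with Python's standard `ssl` module. This is a minimal sketch and not a configuration taken from the study: the function name `hardened_server_context` and the chosen cipher string are assumptions made for illustration, and certificate loading (normally done with `load_cert_chain`) is omitted.

```python
import ssl

def hardened_server_context() -> ssl.SSLContext:
    """Build a server-side context that states its security choices explicitly."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Refuse TLS 1.0 and TLS 1.1 rather than relying on a library or distro default.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # Limit TLS 1.2 suites to forward-secret AEAD suites (illustrative choice);
    # NULL, export-grade, RC4 and SEED suites are excluded as a consequence.
    ctx.set_ciphers("ECDHE+AESGCM:ECDHE+CHACHA20")
    # Define a cipher order: prefer the server's ranking over the client's.
    ctx.options |= ssl.OP_CIPHER_SERVER_PREFERENCE
    return ctx

ctx = hardened_server_context()
cipher_names = [cipher["name"] for cipher in ctx.get_ciphers()]
```

Inspecting `cipher_names` after construction is a cheap way to confirm that the effective suite list matches the intent, instead of trusting whatever the platform ships.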

The 2026 formal-verification study on AI-generated code pushes the same theme into code synthesis. Across 3,500 artifacts generated by seven frontier models on 500 security-critical prompts, 55.8% contained at least one COBALT-identified vulnerability, and 1,055 findings were formally proven exploitable by Z3 satisfiability witnesses (Blain et al., 7 Apr 2026). The benchmark covered MEM, INT, AUTH, CRYPTO, and INP categories; the canonical arithmetic proof obligation was the overflow condition

$n \times \mathit{sizeof}(T) < n$

with $n$ modeled as a BitVec(32). No model achieved a grade better than D; GPT-4o scored 62.4% vulnerable artifacts and received F, while Gemini 2.5 Flash scored 48.4% and still received D. The auxiliary findings strengthened the “by default” diagnosis: explicit security instructions reduced the mean vulnerability rate by only 4 percentage points, six industry tools combined missed 97.8% of the Z3-proven findings, and the same models identified their own vulnerable outputs in review mode 78.7% of the time while generating vulnerable outputs 55.8% of the time by default (Blain et al., 7 Apr 2026).
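The overflow condition can be explored without a solver by emulating 32-bit unsigned arithmetic directly. The sketch below is only an illustration of what a satisfying witness looks like; it does not use Z3 and is not the study's tooling. The helper names and the 4-byte element size are assumptions.

```python
MASK32 = 0xFFFFFFFF  # emulate 32-bit unsigned arithmetic, as with a BitVec(32)

def wrapped_size(n: int, elem_size: int) -> int:
    """Return n * elem_size as a 32-bit unsigned multiplication would compute it."""
    return (n * elem_size) & MASK32

def is_witness(n: int, elem_size: int) -> bool:
    """True when n satisfies the overflow condition n * sizeof(T) < n."""
    return wrapped_size(n, elem_size) < n

def find_witness(elem_size: int):
    """Search powers of two for an element count that satisfies the condition."""
    for shift in range(32):
        n = 1 << shift
        if is_witness(n, elem_size):
            return n
    return None

witness = find_witness(4)  # hypothetical 4-byte element type
# A count that wraps around without satisfying the inequality:
wraps_but_no_witness = (0x60000000 * 4 > MASK32) and not is_witness(0x60000000, 4)
```

With a 4-byte element, the first witness found is `n = 2**30`, for which the computed allocation size wraps to zero. The second check shows that wraparound can also occur without satisfying the inequality, so in this sketch the condition acts as a sufficient witness for overflow rather than a complete test.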

Taken together, these studies treat default as a baseline production mode rather than a corner case. In TLS, the shipped configuration leaves legacy risk active. In AI code generation, the baseline output distribution itself is vulnerability-prone. This suggests that “secure by default” is not merely a usability slogan but a claim about the location of responsibility: hardening cannot be delegated entirely to administrators, downstream reviewers, or post hoc static tools.

3. Verification boundaries, seam blindness, and update breakage

The paper “All Green, Still Broken” shows that default breakage can arise even when the testing posture appears extensive (Bilal et al., 21 Jun 2026). The studied rental-search assistant combined LLM output, browser-driven behavior, external data sources, and multi-market internationalization. Over six weeks, its automated suite grew to 1,364 test functions, expanded by pytest into 1,553 test cases across 144 files, and a passing run was required before deployment. Yet the project still recorded 252 bug-fix commits, six user-facing defects that reached production, and one recurring defect. The empirical core of the paper is the four-seam framework: Runtime, Market, Flow, and System. These are “boundaries where our code meets something it does not control,” and they are precisely the places component-level unit tests do not observe. The measured defect distribution was Flow 35, Runtime 29, Market 23, and System 23, totaling 110 of 252 bug-fix commits, or about 44%. One defect shipped twice because the first fix left no guard at the seam where the bug had escaped.
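The seam tally quoted above can be checked with a few lines of arithmetic. The figures are the ones reported in the text; the variable names are illustrative.

```python
# Bug-fix commits attributed to each seam, as reported in the text.
seam_commits = {"Flow": 35, "Runtime": 29, "Market": 23, "System": 23}
total_bug_fix_commits = 252

seam_total = sum(seam_commits.values())
seam_share = seam_total / total_bug_fix_commits
summary = f"{seam_total} of {total_bug_fix_commits} = {seam_share:.1%}"
```

The share evaluates to 43.7%, which the text rounds to about 44%.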

The paper’s central technical claim is that component tests make seams deterministic by replacing the uncontrolled side with a stand-in: stored HTML instead of a live browser, the default market instead of a non-default market, a stub instead of a neighboring component, or a fixed constant instead of a changing system baseline (Bilal et al., 21 Jun 2026). This yields determinism but also removes the production condition that would have exposed the defect. A plausible implication is that the default testing posture is itself a source of breakage: the suite certifies only the internal slice of the system that the harness makes observable.
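The effect of testing only against a stand-in can be shown with a toy example. Everything here is hypothetical: the `format_rent` function, the market table, and the missing entry for `"JP"` are invented to illustrate a Market-seam defect and are not taken from the studied system.

```python
SYMBOLS = {"US": "$", "DE": "€"}        # hypothetical currency table
SUPPORTED_MARKETS = ["US", "DE", "JP"]  # "JP" was enabled without a table entry

def format_rent(amount: int, market: str = "US") -> str:
    """Format a monthly rent for display in the given market."""
    return f"{SYMBOLS[market]}{amount:,}"

def component_test() -> bool:
    """Exercises only the default market, so the suite stays green."""
    return format_rent(1200) == "$1,200"

def seam_guard() -> list:
    """Exercises every supported market and returns the ones that fail."""
    failures = []
    for market in SUPPORTED_MARKETS:
        try:
            format_rent(1200, market)
        except KeyError:
            failures.append(market)
    return failures

component_green = component_test()
seam_failures = seam_guard()
```

The component test passes while the seam guard reports `["JP"]`: the defect lives entirely in the condition that the default-market stand-in removed.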

“Breaking-Good” analyzes a related form of hidden breakage in Java/Maven dependency updates (Reyes et al., 2024). A breaking dependency update is defined as a pair of commits with a passing build before the update and a failing build after updating one dependency version. The paper studies 243 real-world compilation-failing breaking updates and classifies them into Direct compilation failure, Indirect compilation failure, Java version incompatibility, and Werror failure. The key argument is that the root cause is often buried in transitive dependencies, Java version mismatches, or client-specific configuration rather than in the direct dependency bump visible in the manifest. Breaking-Good therefore blends log analysis, dependency-tree differencing, AST analysis via Spoon, and changed-construct detection via japicmp to generate explanations. The reported result is that the tool generates automatic explanations for about 70% of the studied breaking updates (Reyes et al., 2024).
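The dependency-tree differencing step can be sketched on flattened trees. This is not the Breaking-Good implementation and does not invoke Maven; the artifact names and versions are made up, and a real tool would obtain the resolved trees from the build system.

```python
def diff_dependency_trees(before: dict, after: dict) -> dict:
    """Compare two flattened dependency trees given as artifact -> version."""
    changes = {"added": {}, "removed": {}, "changed": {}}
    for name in sorted(set(before) | set(after)):
        old, new = before.get(name), after.get(name)
        if old is None:
            changes["added"][name] = new
        elif new is None:
            changes["removed"][name] = old
        elif old != new:
            changes["changed"][name] = (old, new)
    return changes

def transitive_changes(changes: dict, direct_bump: str) -> dict:
    """Version changes other than the dependency edited in the manifest."""
    return {n: v for n, v in changes["changed"].items() if n != direct_bump}

# Hypothetical resolved trees before and after bumping one direct dependency.
before = {"com.example:http-client": "4.5", "com.example:json": "2.9",
          "com.example:logging": "1.2"}
after = {"com.example:http-client": "5.0", "com.example:json": "2.15",
         "com.example:io": "3.1"}

changes = diff_dependency_trees(before, after)
hidden = transitive_changes(changes, "com.example:http-client")
```

Only `http-client` was edited in the manifest, yet the diff also surfaces a changed `json` version, a removed `logging` artifact, and a newly added `io` artifact, which are the kinds of indirect changes an explanation would need to point at.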

These two papers converge on the same structural lesson. In one case the hidden boundary is the seam between the test harness and the real runtime. In the other it is the seam between a visible dependency edit and the deeper dependency tree, compiler configuration, or JVM environment. In both cases, “broken by default” means that the system’s default observability model is narrower than the system that actually fails.

4. Formal semantics of default: context, priority, and self-breaking structure

In foundations and logic, the phrase shifts from configuration to semantics. The contextuality paper formalizes a compatibility scenario as

$\mathcal S \equiv (\mathcal X,\mathcal C,O),$

where $\mathcal X$ is the set of measurements, $\mathcal C$ the family of contexts, and $O$ the outcome set (Tezzin et al., 2020). A behavior assigns to each context $C\in\mathcal C$ a probability distribution $p^C$ on $O^C$. The paper’s central claim is that the contextuality-by-default idea is already implicit in the compatibility-hypergraph formalism: once behaviors are interpreted probabilistically, what one obtains are context-indexed random variables $R_x^C$, not a single context-free random variable for each measurement. The behavior-level analogue of consistent connectedness is non-degeneracy,

$p^{C}_{x} = p^{C'}_{x} \quad \text{for all } C, C' \in \mathcal C \text{ and all } x \in C \cap C',$

where $p^{C}_{x}$ denotes the marginal of $p^C$ on the single measurement $x$. This condition is weaker than non-disturbance. The paper proves that the set of non-degenerate behaviors is a polytope and that a behavior is noncontextual in the standard sense if and only if it is non-degenerate and noncontextual in the extended sense (Tezzin et al., 2020). Here “broken by default” does not mean empirical malfunction; it means that measurement identity across contexts must be constructed, not presumed.
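A small check makes the context-indexed reading concrete. The sketch assumes that non-degeneracy means the single-measurement distributions of a shared measurement coincide across contexts, mirroring consistent connectedness; the data layout (a dict from context tuples to outcome distributions) and the two example behaviors are illustrative choices, not the paper's notation.

```python
from itertools import combinations

def marginal(behavior, context, x):
    """Distribution of measurement x within one context (the variable R_x^C)."""
    idx = context.index(x)
    dist = {}
    for outcomes, prob in behavior[context].items():
        dist[outcomes[idx]] = dist.get(outcomes[idx], 0.0) + prob
    return dist

def is_non_degenerate(behavior, tol=1e-9):
    """Check that every shared measurement has the same marginal in each context."""
    for c1, c2 in combinations(behavior, 2):
        for x in set(c1) & set(c2):
            m1, m2 = marginal(behavior, c1, x), marginal(behavior, c2, x)
            for outcome in set(m1) | set(m2):
                if abs(m1.get(outcome, 0.0) - m2.get(outcome, 0.0)) > tol:
                    return False
    return True

# Two contexts sharing measurement "b"; outcomes are 0/1 pairs.
consistent = {
    ("a", "b"): {(0, 0): 0.5, (1, 1): 0.5},
    ("b", "c"): {(0, 1): 0.5, (1, 0): 0.5},
}
disturbed = {
    ("a", "b"): {(0, 0): 0.5, (1, 1): 0.5},
    ("b", "c"): {(0, 0): 0.9, (1, 1): 0.1},
}
```

In `disturbed`, the measurement `"b"` has a different distribution in each context, so treating it as a single context-free random variable would already be an error at the level of marginals.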

Rintanen’s study of prioritized default logic exposes a related instability in nonmonotonic inference (Rintanen, 2011). Reiter default logic already admits multiple extensions; priorities are introduced to control conflict resolution. The paper compares the prioritized default logics of Baader and Hollunder, Brewka, and a lexicographic formalization. For propositional theories, skeptical inference in the Baader–Hollunder and Brewka variants remains $\Pi_2^p$-complete in the general case, while lexicographic prioritized default logic with strict total priorities is complete for a class on a higher level of the polynomial hierarchy (Rintanen, 2011). The paper also identifies tractable islands, especially when priorities form a strict total order and the default theory is syntactically restricted. The important point is that priorities do not automatically repair default reasoning. Depending on the semantics, they either preserve second-level complexity or push reasoning higher in the polynomial hierarchy. This suggests that ambiguity in default reasoning is not fixed “for free” by adding an ordering.
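The role of priorities can be seen in a toy setting. The sketch below handles only normal defaults over literals and resolves conflicts by greedy application in priority order; it is not any of the three formalizations the paper compares, and the example theory (the familiar Nixon diamond) is chosen purely for illustration.

```python
from itertools import permutations

def neg(literal: str) -> str:
    """Negate a literal written as 'p' or '-p'."""
    return literal[1:] if literal.startswith("-") else "-" + literal

def apply_in_order(facts, defaults, order):
    """Apply normal defaults (prerequisite, consequent) greedily in the given order."""
    beliefs = set(facts)
    changed = True
    while changed:
        changed = False
        for i in order:
            pre, cons = defaults[i]
            if pre in beliefs and cons not in beliefs and neg(cons) not in beliefs:
                beliefs.add(cons)
                changed = True
                break  # restart so earlier-ranked defaults are always tried first
    return frozenset(beliefs)

def all_extensions(facts, defaults):
    """Enumerate the belief sets reachable under every application order."""
    orders = permutations(range(len(defaults)))
    return {apply_in_order(facts, defaults, order) for order in orders}

facts = {"quaker", "republican"}
defaults = [("quaker", "pacifist"), ("republican", "-pacifist")]

extensions = all_extensions(facts, defaults)
preferred = apply_in_order(facts, defaults, [1, 0])  # rank the second default first
```

Without priorities the toy theory has two extensions; a strict total order selects one of them. The enumeration over all orders is factorial in the number of defaults, which hints at why priorities do not come for free computationally.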

A more literal self-breaking mechanism appears in Dixon’s supersymmetry paper (Dixon, 2010). In the CSSM, a non-minimal supersymmetric Standard Model with right neutrinos and a Higgs singlet, the theory contains composite BRST-cohomology operators called Outfields. Coupling these Outfields to dotted-spinor superfields and adding the finite completion terms yields a BRST-invariant action before electroweak symmetry breaking. But once a particular gauge-invariant term is added and the Higgs vacuum expectation value breaks the electroweak gauge symmetry, the relevant Outfields no longer survive in local BRST cohomology, the BRST Poisson bracket becomes nonzero, and no local term can restore it. The resulting SUSY breaking depends on the same parameter that produces the VEV and is explicitly described as not spontaneous, so the vacuum energy remains zero (Dixon, 2010). In this case the system is not externally broken; it is arranged so that its own internal cohomological structure loses SUSY invariance once electroweak breaking occurs.

5. Default as predictability, control, and term-structure irregularity

Several mathematical-finance papers treat “default” not as a metaphor but as a stopping time whose default assumptions determine whether the model is structurally well behaved. In the Brownian-bridge credit model, the market observes an information process constructed from a Brownian motion and the bankruptcy time, with the bankruptcy time independent of the Brownian motion (Bedini et al., 2016). The observable information process is a Brownian bridge on the random interval ending at the bankruptcy time and is pinned to zero at that time, so default is detectable once it occurs. The main question is whether default becomes predictable before it occurs. The answer depends on the geometry of the support of the default-time law: under conditions on that support, the bankruptcy time is a predictable stopping time; a sufficient condition is finiteness of a suitable Hausdorff measure of the support, which includes countable support (Bedini et al., 2016). The paper therefore shows that once singular default-time laws are allowed, default can become foreseeable from the information structure itself.
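A Brownian bridge pinned to zero at a random time can be simulated with the standard construction $W_t - (t/\tau) W_\tau$ on $[0,\tau]$. The sketch below uses that construction as an assumption about how the information process is built; the symbols $W$ and $\tau$, the three-point default-time law, and the grid size are illustrative choices and are not taken from the paper.

```python
import random

def simulate_information_process(tau, steps, rng):
    """Sample W on a grid of [0, tau] and return the bridge W_t - (t / tau) * W_tau."""
    dt = tau / steps
    w = [0.0]
    for _ in range(steps):
        w.append(w[-1] + rng.gauss(0.0, dt ** 0.5))
    times = [tau * k / steps for k in range(steps + 1)]
    bridge = [w_t - (t / tau) * w[-1] for t, w_t in zip(times, w)]
    return times, bridge

rng = random.Random(0)
# Hypothetical default-time law with countable (here finite) support,
# drawn separately from the Brownian increments to keep the two independent.
tau = rng.choice([1.0, 2.0, 3.0])
times, beta = simulate_information_process(tau, 500, rng)
```

The simulated path starts at zero and returns to zero exactly at `tau`. With a support as sparse as this one, an observer only has to watch whether the path hits zero at one of a few known dates, which gives some intuition for why thin supports can make the default time foreseeable.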

The stochastic maximum-principle paper places default inside a controlled SDE and an enlarged filtration (Cherif et al., 2020). The state equation includes a term driven by the compensated default martingale, the paper defines a corresponding Hamiltonian, and the adjoint equation is a BSDE driven jointly by two martingale terms (Cherif et al., 2020). The paper proves sufficient and necessary maximum principles and applies them to a logarithmic-utility problem. Here the “default” dimension of the model is not reducible to a drift correction; it alters the filtration, the admissible controls, the adjoint dynamics, and the first-order optimality condition.
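The compensated default martingale can be illustrated in its simplest textbook form, $M_t = \mathbf 1_{\{\tau \le t\}} - \lambda \min(t, \tau)$, where the default time $\tau$ is exponential with constant intensity $\lambda$. The constant intensity and the notation are assumptions made for this sketch; the paper's setting is more general. A Monte Carlo average confirms that the compensator removes the drift of the default indicator.

```python
import random

def compensated_default_martingale(t, tau, lam):
    """M_t = 1{tau <= t} - lam * min(t, tau) for a constant intensity lam."""
    indicator = 1.0 if tau <= t else 0.0
    return indicator - lam * min(t, tau)

def monte_carlo_mean(t, lam, n_paths, seed=0):
    """Average M_t over independently sampled exponential default times."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        tau = rng.expovariate(lam)  # exponential default time with rate lam
        total += compensated_default_martingale(t, tau, lam)
    return total / n_paths

mean_m = monte_carlo_mean(t=2.0, lam=0.5, n_paths=100_000)
```

The sample mean is close to zero, as expected for a martingale started at zero. The jump of size one at default is what distinguishes this driver from a Brownian term and is why default cannot be absorbed into a drift correction.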

Fontana and Schmidt show that default also breaks the geometry of standard term-structure models when the intensity paradigm is abandoned (Fontana et al., 2016). For a zero-recovery bond they propose a price representation containing an additional random-measure term, which records information about dates at which default can occur with positive probability (Fontana et al., 2016). This is necessary because a purely absolutely continuous HJM forward curve cannot represent maturity discontinuities caused by risky predictable dates. The resulting no-arbitrage conditions include one for the short end and one for jump matching between the default compensator and the term-structure discontinuity (Fontana et al., 2016). The paper’s central message is that once default need not be totally inaccessible, a smooth intensity-based term structure is no longer sufficient.
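A toy calculation shows how a risky predictable date produces a discontinuity in maturity. The sketch assumes zero interest rates, a constant default intensity, and an independent fixed default probability at one announced date; these simplifications and all parameter values are illustrative and are not the paper's model.

```python
import math

def zero_recovery_bond_price(maturity, intensity, risky_dates):
    """Time-0 price under zero interest rates: the survival probability to maturity."""
    survival = math.exp(-intensity * maturity)  # absolutely continuous part
    for date, default_prob in risky_dates:
        if date <= maturity:  # the risky date falls within the bond's life
            survival *= 1.0 - default_prob
    return survival

# Hypothetical risky date: a 10% chance of default exactly at t = 1.0.
risky_dates = [(1.0, 0.10)]
price_just_before = zero_recovery_bond_price(1.0 - 1e-9, 0.02, risky_dates)
price_at_date = zero_recovery_bond_price(1.0, 0.02, risky_dates)
jump_ratio = price_at_date / price_just_before
```

The price drops by a factor of about 0.9 as the maturity crosses the risky date. No continuous forward-rate curve integrated over maturity can reproduce such a jump, which is the gap the additional term is meant to fill.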

The portfolio paper with CDS trading studies a different repair mechanism (Fei et al., 10 Apr 2025). Default does not only create an equity jump loss; it also terminates future trading in the defaulted security. For a CARA investor, the certainty equivalent is characterized by a semilinear PDE, and the optimal CDS position hedges not only the direct equity loss upon default but also the loss of future trading opportunities. The paper shows that if the underlying equity market is complete absent default, then the equity–CDS market is complete accounting for default, and the numerical application finds that the optimal CDS policies are “essentially static” while investing in CDS dramatically increases indirect utility (Fei et al., 10 Apr 2025). This gives a precise mathematical version of the idea that default can break a dynamic control problem and that a credit derivative can restore what default destroys.

6. Endogenous amplification, spillover, and post-default valuation

The literature on credit contagion and collateralization studies settings in which the system is destabilized by its own response to default risk. In “Collateral-Enhanced Default Risk,” the central claim is that increased collateralization can raise the standalone probability of default even as it reduces counterparty credit transmission (Kenyon et al., 2013). The model unifies Merton-type terminal default and Black–Cox first-passage default through a barrier-option survival formula, with discrete remargining incorporated through the Broadie–Glasserman–Kou barrier correction. The key economic point is that more frequent mark-to-market and higher collateral barriers convert temporary volatility into realized default risk. In the trigger-style scenario reported in the paper, simultaneous increases in volatility and collateralization produced a maximum increase of about 4000 bps (40%) in equivalent CDS spread in the non-bank case (Kenyon et al., 2013). The paper explicitly argues that central counterparties, while reducing credit-risk transmission, can systematically increase default risk through this mechanism.
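The mechanism by which monitoring frequency and barrier height turn volatility into realized default can be illustrated by simulation. This is a toy sketch and not the paper's barrier-option formula or the Broadie–Glasserman–Kou correction: the firm value is assumed to follow a martingale geometric Brownian motion, and the barrier levels, volatility, and grid are arbitrary illustrative values.

```python
import math
import random

def simulate_paths(n_paths, n_steps, horizon, vol, seed=0):
    """Martingale geometric Brownian motion paths for firm value, started at 1.0."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    paths = []
    for _ in range(n_paths):
        value, path = 1.0, [1.0]
        for _ in range(n_steps):
            z = rng.gauss(0.0, 1.0)
            value *= math.exp(-0.5 * vol * vol * dt + vol * math.sqrt(dt) * z)
            path.append(value)
        paths.append(path)
    return paths

def default_probability(paths, barrier, monitor_every):
    """Share of paths at or below the barrier on at least one monitoring date."""
    hits = 0
    for path in paths:
        monitored = path[monitor_every::monitor_every]
        if any(value <= barrier for value in monitored):
            hits += 1
    return hits / len(paths)

paths = simulate_paths(n_paths=1000, n_steps=120, horizon=1.0, vol=0.3)
pd_terminal = default_probability(paths, barrier=0.7, monitor_every=120)  # terminal check only
pd_frequent = default_probability(paths, barrier=0.7, monitor_every=1)    # check at every step
pd_high_barrier = default_probability(paths, barrier=0.85, monitor_every=1)
```

Because the same paths are reused, the ordering is guaranteed by construction: checking more often can only add default events, and so can raising the barrier. Dips that a terminal-only test would forgive become defaults under frequent remargining.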

“A default system with overspilling contagion” generalizes interacting-intensity models by allowing defaults to alter the stochastic environment itself (Coculescu et al., 2017). Debtor defaults are decomposed into a specific part and an environmental part, and contagion is introduced by a change of measure that adds both direct and indirect impact terms to default intensities. Under the contagion measure, the environment is no longer autonomous: the intensities of the environmental stopping times are themselves modified by past defaults. Survival probabilities are represented through a recursive system of adapted SDEs, and the number of equations required by the full brute-force computation shrinks when only a subset of names can produce indirect contagion (Coculescu et al., 2017). The paper’s formal point is that default propagation becomes non-Markovian once default can feed back into the environment that governs future default intensities.

“Loss given default after default” shifts from default occurrence to valuation after default has already happened (Pomazanov, 14 Nov 2025). The paper proposes a way to build post-default LGD without training a separate model, using a pre-default LGD model, the average repayment time after default, repayment volumes and dates, the loan interest rate, and the recovery rate implied by the pre-default LGD. It specifies a recovery law, derives a Bayesian-style posterior mean of ultimate recovery, and computes the predicted recovery by a given time, from which the post-default estimate follows.

On the reported portfolios, the estimated average repayment times were 23.47 months for mortgages and 11.3 months for consumer loans, with low reported values of the fit statistic for the exponential fit (Pomazanov, 14 Nov 2025). In this setting, “broken by default” no longer refers to the onset of default but to the need to update loss severity dynamically once the loan has entered workout.

Across these financial models, the common structure is endogenous amplification. Collateral rules can convert liquidity pressure into default; defaults can spill into the environment and alter future intensities; the information process can render default predictable; and after default, recovery and LGD remain dynamic rather than static quantities. This suggests that “default” is not merely an event label. It is a structural intervention point at which the governing equations, filtrations, and optimization problems often change regime.