Taxonomy of Unreliable Behaviors
- A taxonomy of unreliable behaviors is a framework that categorizes design flaws and operational anomalies across systems in order to pinpoint structural errors and vulnerabilities.
- It systematically distinguishes between qualitative design errors, quantitative faults, and empirical markers using statistical methods and anomaly detection techniques.
- The taxonomy aids in developing targeted mitigation strategies, enhancing system robustness in areas such as spreadsheet modeling, distributed computation, and human–AI interactions.
Unreliable behaviors are systematic patterns, design flaws, or operational phenomena in systems—human, computational, or organizational—that undermine their dependability, trustworthiness, or interpretability. Across technical domains, taxonomies of unreliable behaviors provide structured frameworks to classify, analyze, and ultimately mitigate these latent or manifest threats. Taxonomies arise in areas ranging from spreadsheet modeling and distributed computation to news reliability, human–AI interaction, and scientific communication, each with domain-specific characteristics, criteria, and implications.
1. Structural and Hierarchical Taxonomies: Spreadsheet Modeling
A foundational taxonomy of unreliable behaviors in spreadsheet models distinguishes between qualitative (design/development-phase) and execution (user operation-phase) errors (Przasnyski et al., 2011). The design-phase taxonomy posits four primary logical groupings:
Category | Criteria/Examples | Typical Manifestation |
---|---|---|
Input Data Structure | Hard-coded values in formulas; duplicated input values; poorly identified inputs | Modifications become error-prone or brittle |
Semantics | Missing/wrong labels; poor layout; ambiguous documentation | Misinterpretation, hidden assumptions |
Extendibility | Layout that hinders extension; bad copy/paste logic; incorrect references | Poor scalability; errors propagate upon changes |
Formula Integrity | Spurious formulas (e.g., misuse of SUM); absence of explicit formulas | Hidden calculations, spurious output |
This hierarchical framework enables organizations to systematically flag models that, although numerically correct in the short term, are unreliable in structure and thus prone to future failure. For evaluation, a simple binary “flagging” method is used: the presence of any error type warrants attention, regardless of multiplicity.
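The binary flagging method can be sketched as a small checker over a spreadsheet model represented as a cell-to-formula mapping. The detection rules below are illustrative heuristics of my own, not the cited paper's method; the category names follow the table above.

```python
import re

# Hypothetical sketch: flag design-phase error categories in a spreadsheet
# model given as {cell: formula_string}. Flagging is binary: a category is
# reported if any instance occurs, regardless of multiplicity.

HARDCODED = re.compile(r"[-+*/(]\s*\d+(\.\d+)?")  # numeric literal inside a formula

def flag_model(cells: dict[str, str]) -> set[str]:
    flags = set()
    for formula in cells.values():
        if formula.startswith("=") and HARDCODED.search(formula):
            flags.add("Input Data Structure")   # hard-coded value in a formula
        if formula.startswith("=SUM(") and "," in formula:
            flags.add("Formula Integrity")      # enumerated SUM args instead of a range
    return flags

model = {"B2": "=A1*1.21", "B3": "=SUM(B1,B2)"}
print(sorted(flag_model(model)))
```

A real checker would parse formulas properly; the point here is the binary presence/absence evaluation rather than error counting.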
2. Phenomenon-Oriented Systems: Moving Beyond Ambiguous "Errors"
The Asheetoxy taxonomy eschews the ambiguous term “error” and instead focuses on observable phenomena (Kulesz et al., 2018). At the top level, an “anomaly” encompasses any negative occurrence, which is then decomposed:
- Wrong action: Negative human-initiated actions (e.g., improper data entry).
- Defect: Undesired states/formats in the artifact, subdivided into:
  - Imperfection: Qualitatively negative (e.g., poor maintainability) without immediate result errors.
  - Fault: Likely to have a quantitative impact, subdivided into "logic faults" (formulas that can go wrong) and "data faults" (bad data entries).
- Failure: A fault manifests in an observably incorrect computation.
- Problem: A failure precipitates real-world negative consequences.
The focus on phenomena observable in the artifact—rather than psychological intent or design process—provides a robust, generalizable structure.
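The phenomenon hierarchy above can be rendered as a type hierarchy, which makes the subsumption relations explicit. This encoding is my own illustration; the Asheetoxy paper defines concepts, not code.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative encoding of the Asheetoxy hierarchy: every record is an
# Anomaly; defects split into imperfections and faults; faults may escalate
# to failures, and failures to problems.

@dataclass
class Anomaly:
    description: str

@dataclass
class WrongAction(Anomaly):   # negative human-initiated action
    actor: str = "user"

@dataclass
class Defect(Anomaly):        # undesired state/format in the artifact
    pass

@dataclass
class Imperfection(Defect):   # qualitatively negative, no wrong result yet
    pass

@dataclass
class Fault(Defect):          # likely quantitative impact
    kind: str = "logic"       # "logic" or "data"

@dataclass
class Failure(Anomaly):       # fault manifests as incorrect computation
    cause: Optional[Fault] = None

@dataclass
class Problem(Anomaly):       # failure with real-world consequences
    cause: Optional[Failure] = None

fault = Fault("SUM range misses last row", kind="logic")
failure = Failure("quarterly total too low", cause=fault)
print(isinstance(fault, Defect) and isinstance(fault, Anomaly))
```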
3. Empirical and Statistical Approaches: Unreliable Behaviors in Information Systems
Taxonomies in information credibility (e.g., unreliable news and online reviews) are grounded in empirical statistical analysis of behaviors (Gruppi et al., 2018; Berry, 21 May 2024):
- In unreliable news articles, common markers include:
- Lower readability scores, simpler language, yet longer sentences or titles.
- Heavy use of informal or sensationalist cues: capitalization, exclamation/question marks.
- These features exhibit universality across languages and cultures.
Classification relies on extracting content-based features (complexity, style), using ANOVA and effect sizes (Cohen's d) to identify significant differentiators; SVM classifiers trained on these features achieve notable predictive accuracy.
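The effect-size step can be illustrated with the standard pooled-standard-deviation form of Cohen's d, applied to one content feature such as average sentence length. The sample values below are made up for the sketch.

```python
from statistics import mean, stdev

# Cohen's d with pooled standard deviation: a standardized difference
# between two groups on a single feature (here, hypothetical average
# sentence lengths for reliable vs. unreliable articles).

def cohens_d(a: list[float], b: list[float]) -> float:
    na, nb = len(a), len(b)
    pooled = (((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2)
              / (na + nb - 2)) ** 0.5
    return (mean(a) - mean(b)) / pooled

reliable   = [14.1, 15.0, 13.8, 14.6]   # invented sentence-length samples
unreliable = [18.9, 20.2, 19.5, 18.7]
print(round(cohens_d(unreliable, reliable), 2))  # large positive effect
```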
- For online consumer deception, predictive modeling links unreliable behavior to demographic and attitudinal predictors, quantifying propensity from structured survey data. Patterns suggest that unreliable behavior correlates with, for example, higher education and complex trust attitudes, not simply with low trust or low status.
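Propensity modeling of this kind is commonly implemented as a logistic score over coded survey variables. The variable names and coefficients below are invented for illustration and are not taken from the cited study.

```python
import math

# Hypothetical logistic propensity score over coded demographic and
# attitudinal predictors. Coefficients are placeholders, not fitted values.

COEFFS = {"education_level": 0.4, "trust_complexity": 0.6, "age_decade": -0.1}
INTERCEPT = -2.0

def deception_propensity(respondent: dict) -> float:
    z = INTERCEPT + sum(COEFFS[k] * respondent.get(k, 0.0) for k in COEFFS)
    return 1.0 / (1.0 + math.exp(-z))   # probability-like score in (0, 1)

print(deception_propensity({"education_level": 4, "trust_complexity": 3, "age_decade": 3}))
```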
4. Unreliability and Expressive Limitation in Distributed Systems
In computational models such as population protocols and enterprise crowdsourcing, taxonomies of unreliable behaviors emerge from both theoretical and empirical constraints (Raskin, 2019, Dwarakanath et al., 2018).
- In unreliable population protocols, message loss or non-atomic interactions constrain the model's power to "counting predicates": Boolean combinations of threshold tests on population counts (e.g., tests of the form "#a ≥ k" for a state a and constant k). Unreliability thus collapses richer models to the power of immediate observation protocols, demonstrating that it universally limits computational expressiveness.
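A counting predicate is easy to state concretely: each atomic test compares a state's population count against a constant, and predicates are Boolean combinations of such tests. The particular predicate below is my own example; the expressiveness result is from the cited paper.

```python
from collections import Counter

# A counting predicate over a population (multiset of agent states):
# Boolean combinations of threshold tests "#sigma >= k".

def threshold(pop: Counter, sigma: str, k: int) -> bool:
    return pop[sigma] >= k

def example_predicate(pop: Counter) -> bool:
    # (#a >= 3) AND NOT (#b >= 5): one Boolean combination of thresholds
    return threshold(pop, "a", 3) and not threshold(pop, "b", 5)

pop = Counter({"a": 4, "b": 2})
print(example_predicate(pop))  # True
```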
- In crowdsourcing, unreliable behaviors are categorized by:
- Submission quality (non-adherence to requirements or best practices, malicious code).
- Timeliness (missed deadlines).
- Ownership and legal liability (license violations, IP leakage).
- Network and data security.
Mitigation approaches distinguish between individual screening (reputation, credentials) and system-level validation (peer review, static analysis, use of contests). Empirical analysis reveals that system-based controls are more robust than reputation-based prediction.
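A system-level validation pass can combine several of the categories above, for instance a timeliness check plus a naive static scan for disallowed patterns. The rules and field names below are invented for the sketch and stand in for real static analysis and license scanning.

```python
import re
from datetime import datetime, timezone

# Illustrative system-level screening of a crowdsourced submission:
# timeliness plus a toy static scan for risky code and license markers.

FORBIDDEN = [re.compile(r"\beval\s*\("), re.compile(r"GPL", re.IGNORECASE)]

def screen(submission: dict) -> list[str]:
    issues = []
    if submission["submitted_at"] > submission["deadline"]:
        issues.append("timeliness: missed deadline")
    for pat in FORBIDDEN:
        if pat.search(submission["code"]):
            issues.append(f"quality/legal: matched {pat.pattern!r}")
    return issues

sub = {
    "submitted_at": datetime(2024, 1, 2, tzinfo=timezone.utc),
    "deadline": datetime(2024, 1, 1, tzinfo=timezone.utc),
    "code": "result = eval(user_input)",
}
print(screen(sub))
```

Because these checks run on the artifact itself rather than on the contributor's reputation, they exemplify the system-based controls the empirical analysis favors.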
5. Unreliable Behaviors in AI/ML and Human–AI Systems
Recent taxonomies extend unreliable behaviors into machine learning and human–AI domains, highlighting new forms of risk:
- In LLMs, qualitative error categories mirror human cognitive biases—anchoring, confirmation, availability—eliciting predictable misinterpretation, overgeneralization, or high-impact failures (e.g., destructive code actions) (Jones et al., 2022).
- In federated learning, unreliable client behaviors are given a formal mathematical model, which enables analysis of convergence bounds and the design of defensive DNN-based aggregation mechanisms for detecting abnormal updates (Ma et al., 2021).
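Abnormal-update detection can be illustrated with a much simpler stand-in than the paper's DNN-based aggregator: a robust (median/MAD) outlier test on the L2 norms of client updates, averaging only the updates that pass.

```python
from statistics import median

# Not the cited paper's method: a minimal robust aggregator that drops
# client updates whose L2 norm is a median/MAD outlier, then averages.

def l2(v: list[float]) -> float:
    return sum(x * x for x in v) ** 0.5

def robust_aggregate(updates: list[list[float]], cut: float = 3.0) -> list[float]:
    norms = [l2(u) for u in updates]
    med = median(norms)
    mad = median(abs(n - med) for n in norms) or 1e-12  # avoid division by zero
    kept = [u for u, n in zip(updates, norms) if abs(n - med) / mad <= cut]
    return [sum(u[i] for u in kept) / len(kept) for i in range(len(kept[0]))]

updates = [[0.1, 0.2], [0.12, 0.18], [0.09, 0.21], [5.0, -4.0]]  # last is abnormal
print(robust_aggregate(updates))
```

The median/MAD test resists masking by the outlier itself, which a mean/standard-deviation z-score would not on such a small cohort.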
- For conversational and companion AI systems, taxonomies distinguish between utterance-level and context-sensitive unsafety, and further refine role-based harms (perpetrator, instigator, facilitator, enabler). Categories of harm include relational transgression, verbal abuse, self-harm, harassment, mis/disinformation, and privacy violations (Zhang et al., 26 Oct 2024; Sun et al., 2021).
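The utterance-level vs. context-sensitive distinction can be made concrete with a rule-based sketch: some replies are unsafe in isolation, while others become unsafe only given the preceding turn. The keyword lists below are invented placeholders for real safety classifiers.

```python
# Hypothetical sketch of the two unsafety levels. A reply affirming a risky
# question is unsafe only in context; some utterances are unsafe anywhere.

UNSAFE_UTTERANCES = {"how to build a weapon"}         # unsafe in any context
RISKY_CONTEXTS = {"can i double my medication dose"}  # makes affirmation unsafe
AFFIRMATIONS = {"yes", "sure", "go ahead"}

def classify(context: str, reply: str) -> str:
    c, r = context.lower(), reply.lower()
    if any(p in r for p in UNSAFE_UTTERANCES):
        return "utterance-level unsafe"
    if any(q in c for q in RISKY_CONTEXTS) and any(a in r for a in AFFIRMATIONS):
        return "context-sensitive unsafe"
    return "safe"

print(classify("Can I double my medication dose?", "Sure, go ahead."))
```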
6. Domain-Independent Frameworks, Uncertainty, and Disinformation
Broad, systematic taxonomies address unreliable behaviors by analyzing sources of uncertainty or misinformation:
- The system-theoretic uncertainty taxonomy (Gansch et al., 2023) distinguishes:
- Aleatory uncertainty (inherent randomness, captured probabilistically).
- Epistemic uncertainty (lack of parameter or structural knowledge, reducible in principle with more information).
- Ontological uncertainty (incompleteness; “unknown unknowns”).
- Mitigation strategies include uncertainty prevention, removal, tolerance, and forecasting, mirroring classical dependability taxonomies.
- In scientific disinformation, a multidimensional taxonomy spans actors (individuals, organizations, governments), outlets (journals, events, media), and methods (deceiving scholarly communication, gaming media, leveraging legal systems) (McIntosh et al., 2023). Real-world case studies demonstrate that mischaracterization, manipulation, and misuse of institutional structures are central tactics.
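The aleatory/epistemic split above can be illustrated with a common toy decomposition over a model ensemble: disagreement between ensemble members approximates epistemic uncertainty, while the average predicted noise variance approximates aleatory uncertainty. This construction is a standard illustration, not taken from the cited taxonomy paper.

```python
from statistics import mean, variance

# Toy uncertainty decomposition for an ensemble whose members each predict
# (mean, noise_variance) for one input: epistemic ~ spread of member means,
# aleatory ~ average predicted noise variance.

def decompose(preds: list) -> tuple:
    means = [m for m, _ in preds]
    epistemic = variance(means)           # disagreement across members
    aleatory = mean(v for _, v in preds)  # irreducible-noise estimate
    return epistemic, aleatory

ensemble = [(1.0, 0.2), (1.1, 0.25), (0.9, 0.2)]  # invented predictions
ep, al = decompose(ensemble)
print(ep, al)
```

Collecting more training data shrinks the epistemic term (members converge) but leaves the aleatory term, mirroring the reducible/irreducible distinction in the taxonomy.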
7. Future Directions, Challenges, and Areas for Extension
Across domains, limitations and needs for further development are recurring themes:
- Taxonomies often begin in restricted contexts (e.g., simple spreadsheet models) and require extension to capture emerging complexities such as inter-sheet logic, macros, or social/systemic factors (Przasnyski et al., 2011).
- Empirical reliability—particularly inter-rater agreement—remains an open challenge for practical deployment of taxonomies.
- Rapidly evolving environments, as seen in disinformation tactics or AI system deployment, demand regular taxonomy refinement.
- There is a growing impetus for integrating standardized quantitative tools, such as automated anomaly detectors or system audits, which operationalize taxonomies for large-scale monitoring and intervention.
Conclusion
The taxonomy of unreliable behaviors refers to structured frameworks—varying by domain—that systematically classify, describe, and investigate the patterns and causes of unreliable, erroneous, or otherwise risk-prone phenomena in technical, organizational, and social systems. By distinguishing between types, causes, and manifestations of unreliability, taxonomies facilitate more rigorous detection, prevention, and mitigation strategies, improving the dependability, interpretability, and trustworthiness of complex systems across disciplines.