Component-Level Defect Taxonomy
- Component-level defect taxonomy is a hierarchical, multidimensional framework that categorizes defects in software, AI, and manufacturing systems.
- It integrates orthogonal attributes—such as data, learning, spatial, and causal dimensions—to enable precise defect identification and risk assessment.
- The taxonomy supports automated quality assurance and standardizes defect reporting through formalized ontologies, validation techniques, and domain-specific extensions.
A component-level defect taxonomy is a hierarchical and multidimensional scheme for rigorously classifying defects specific to discrete units of software, data, or hardware systems. Such taxonomies underpin defect identification, prioritization, and analysis by establishing standardized categories and attributes per component type and technological context. Recent research develops taxonomies tailored to domains including AI-based software systems (Alannsary, 25 Aug 2025), deep learning (Humbatova et al., 2019), logging code (Wang et al., 15 Aug 2025), and metal additive manufacturing (Carraturo et al., 2022), each capturing domain-specific defect sources, properties, and impacts. The structures described below exemplify the state of the art.
1. Multidimensional Classification Schemes
Component-level taxonomies integrate multiple, often orthogonal, dimensions. For AI-based software, AI-ODC introduces an “AI Attribute” axis—distinct from legacy ODC dimensions—enabling separation of Data, Learning, Thinking, and Not Related defects (Alannsary, 25 Aug 2025). Deep learning component fault taxonomies employ faceted classification across model, tensor, hyperparameter, and computational resource axes (Humbatova et al., 2019). Manufacturing defect ontologies use spatial, phenomenological, and causal source axes (Carraturo et al., 2022). This multidimensional approach ensures high specificity and coverage at the component granularity.
AI-ODC AI Attribute Categories and Definitions (Alannsary, 25 Aug 2025):
| Category | Definition |
|---|---|
| Data | Defects arising from training/test datasets, e.g., mislabeled samples, schema errors |
| Learning | Defects in model fitting/training, e.g., convergence issues, optimizer misconfiguration |
| Thinking | Defects in inference/decision logic, e.g., runtime errors, post-processing flaws |
| Not Related | Traditional software (non-AI) defects, e.g., API misuse, build/package errors |
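The AI Attribute axis above can be modeled as a simple lookup over defect records. A minimal sketch follows; the category names are taken from AI-ODC, but the `Defect` record shape and the phase-based assignment heuristic are illustrative assumptions, not part of the published scheme:

```python
from dataclasses import dataclass
from enum import Enum

class AIAttribute(Enum):
    """AI-ODC AI Attribute categories (Alannsary, 25 Aug 2025)."""
    DATA = "Data"                 # dataset defects: mislabeled samples, schema errors
    LEARNING = "Learning"         # training defects: convergence, optimizer misconfiguration
    THINKING = "Thinking"         # inference defects: runtime errors, post-processing flaws
    NOT_RELATED = "Not Related"   # traditional (non-AI) software defects

@dataclass
class Defect:
    """Minimal defect record; the fields are illustrative, not from the paper."""
    summary: str
    phase: str  # "data", "training", "inference", or "other"

# Hypothetical phase-to-attribute mapping used for first-pass triage.
_PHASE_MAP = {
    "data": AIAttribute.DATA,
    "training": AIAttribute.LEARNING,
    "inference": AIAttribute.THINKING,
}

def classify(defect: Defect) -> AIAttribute:
    """Assign the AI Attribute from the lifecycle phase where the defect arose."""
    return _PHASE_MAP.get(defect.phase, AIAttribute.NOT_RELATED)
```

For example, `classify(Defect("mislabeled samples in training set", "data"))` yields `AIAttribute.DATA`, while any defect outside the AI lifecycle falls through to `NOT_RELATED`.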
Manufacturing defect ontologies similarly encode spatial (surface vs. internal), phenomenological (e.g., porosity, cracking), and causal (design, equipment, material, process) axes in formal OWL/DL (Carraturo et al., 2022).
2. Hierarchical Taxonomic Structures
Hierarchical structure offers granularity: high-level defect classes partition into context-specific subcategories, often recursively. In deep learning, 15 leaf categories are grouped into model-type, architectural, tensor, data, training, and API dimensions (Humbatova et al., 2019). Examples include:
- Model Type & Properties: Architecture suitability and initialization
- Layer Properties: Parameters, dimensions, neuron/filter counts
- Activation Function: Omission or misconfiguration of activations
- Hyperparameters: Learning rate, batch size mis-setting
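A hierarchy like this is straightforward to hold in a nested mapping so that tooling can enumerate leaf fault categories. The sketch below encodes only the partial subset listed above; the full taxonomy in Humbatova et al. (2019) has more dimensions and leaves:

```python
# Partial, illustrative subset of the deep-learning fault taxonomy:
# top-level dimensions map to leaf fault categories.
DL_FAULT_TAXONOMY = {
    "Model Type & Properties": ["Unsuitable architecture", "Wrong initialization"],
    "Layer Properties": ["Wrong layer parameters", "Wrong dimensions",
                         "Wrong neuron/filter count"],
    "Activation Function": ["Missing activation", "Wrong activation"],
    "Hyperparameters": ["Bad learning rate", "Bad batch size"],
}

def leaf_categories(taxonomy: dict) -> list[str]:
    """Flatten the two-level hierarchy into its leaf fault categories."""
    return [leaf for leaves in taxonomy.values() for leaf in leaves]
```

Keeping the dimensions as keys preserves the faceted structure, while `leaf_categories` gives the flat list a checklist or mutation-testing tool would iterate over.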
In logging code, seven high-level patterns—each with sub-scenario granularity—capture recurring fault modes (Wang et al., 15 Aug 2025):
| Pattern | Scenario Example | Definition/Example |
|---|---|---|
| RD | RD₁: Domain Jargon | Nontransparent technical terms in logs (e.g., "GC-compaction") |
| VR | VR₂: Placeholder Mismatch | Number of string placeholders ≠ supplied variables in log |
| SS | SS₁: Plain-text Credentials | Logging of passwords/tokens |
| PF | PF₁: Hot-path Logging | Logging inside tight computation loops causes performance issues |
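A pattern such as VR (placeholder mismatch) translates directly into a static check. The sketch below counts printf-style `%` placeholders in a log format string and compares the count with the supplied arguments; the regular expression is a simplified assumption that handles common conversions and `%%` escapes but not named placeholders:

```python
import re

# Matches printf-style conversion specifiers such as %s, %d, %0.2f.
# Simplification: named placeholders (%(name)s) are not handled.
_PLACEHOLDER = re.compile(r"%(?!%)[-+#0 ]*\d*(?:\.\d+)?[sdifouxXeEgGrc]")

def placeholder_mismatch(fmt: str, args: tuple) -> bool:
    """Return True if the number of placeholders != number of arguments (VR)."""
    return len(_PLACEHOLDER.findall(fmt)) != len(args)
```

A linter built on this would flag `log.info("user %s failed %d times", user)` because two placeholders receive only one argument.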
Such hierarchies allow mapping precise component failure modes and support extensibility for emerging defect types.
3. Severity, Impact, and Attribute Extensions
Taxonomies incorporate severity and impact dimensions, further structuring triage and risk analysis. AI-ODC extends the canonical four-tier ODC severity scale to five levels by introducing “Catastrophic,” mapped through a matrix based on application criticality (e.g., safety-critical vs. entertainment), harm reversibility, and failure scope (Alannsary, 25 Aug 2025):
| Application Type | Low | Medium | High | Critical | Catastrophic |
|---|---|---|---|---|---|
| Non-critical | Localized, reversible | | | | |
| Enterprise | | Localized | | Systemic/irreversible | |
| Safety-critical | | | | Recoverable system-wide | Irreversible system-wide |
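One reading of the matrix is a lookup keyed on application criticality, failure scope, and harm reversibility. The sketch below follows that reading; the exact cell assignments are assumptions where the published matrix is not fully specified, with "Catastrophic" reserved for irreversible, system-wide failures in safety-critical applications:

```python
# (application type, failure scope, harm reversibility) -> severity level.
# Cell assignments are illustrative assumptions based on one reading of the
# AI-ODC severity matrix, not a verbatim transcription.
SEVERITY_MATRIX = {
    ("non-critical", "localized", "reversible"): "Low",
    ("enterprise", "localized", "reversible"): "Medium",
    ("enterprise", "systemic", "irreversible"): "Critical",
    ("safety-critical", "system-wide", "recoverable"): "Critical",
    ("safety-critical", "system-wide", "irreversible"): "Catastrophic",
}

def severity(app_type: str, scope: str, reversibility: str,
             default: str = "Medium") -> str:
    """Look up severity; fall back to a default for unlisted combinations."""
    return SEVERITY_MATRIX.get((app_type, scope, reversibility), default)
```

Encoding the matrix as data rather than branching logic keeps the triage policy auditable and easy to extend per deployment context.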
Impact in AI-ODC leverages a layered quality model: AI-level (Accuracy, Robustness, Trustworthiness, etc.) and AIP-level (Reliability, Maintainability, etc.), associating each defect category with directly degraded characteristic(s). Representative mapping (Alannsary, 25 Aug 2025):
| Defect Category | AI/AIP | Impacted Quality |
|---|---|---|
| Missing dense layer | AI | Trustworthiness, Accuracy |
| Wrong tensor shape | AIP | Reliability |
| Wrong loss function | AI | Trustworthiness, Robustness |
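The mapping in the table can be encoded as a lookup from defect category to quality-model level and degraded characteristics. The entries mirror the representative mapping above; the dictionary shape itself is an implementation assumption:

```python
# Defect category -> (quality-model level, impacted quality characteristics),
# mirroring the representative AI-ODC mapping in the table above.
IMPACT_MAP = {
    "Missing dense layer": ("AI", {"Trustworthiness", "Accuracy"}),
    "Wrong tensor shape": ("AIP", {"Reliability"}),
    "Wrong loss function": ("AI", {"Trustworthiness", "Robustness"}),
}

def impacted_qualities(category: str) -> set[str]:
    """Return the quality characteristics degraded by a defect category."""
    _level, qualities = IMPACT_MAP.get(category, (None, set()))
    return qualities
```

Aggregating these sets over a defect backlog gives a direct view of which quality characteristics are under the most pressure.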
The defect impact mapping replaces IBM’s classical CUPRIMD model, aligning defect analysis to the unique concerns of complex, learning-based systems.
4. Formalization, Axiomatization, and Validation
Modern taxonomies for component-level defect analysis employ both empirical (bottom-up) and knowledge-representation (ontology-driven) approaches. Deep learning fault taxonomies are constructed by labeling observed faults in repositories, issues, and developer interviews, then validated through practitioner surveys (on average, practitioners had encountered 9.7 of the 15 categories, roughly 66%; occurrence rates ranged from 95% for Training Data Quality down to 24% for Missing Layer) (Humbatova et al., 2019).
Manufacturing defect ontologies encode definitions and axioms in OWL 2 DL, enabling machine reasoning:
- Example OWL axiom (illustrative of the style, not quoted from the ontology): SurfaceDefect ≡ Defect ⊓ ∃hasLocation.ComponentSurface, asserting that a surface defect is exactly a defect located on a component surface
Ontology alignment leverages bridge axioms, e.g., mapping physical phenomena to process-induced subtypes, importing classes from standards such as NIST4DefectOnt and MASON4DefectOnt to ensure semantic interoperability (Carraturo et al., 2022).
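The kind of machine reasoning such axioms enable can be sketched with a minimal subsumption checker over subclass and equivalence assertions. The class names below follow the text, but the tiny reasoner is an illustration only; real ontologies would be served by an OWL 2 DL reasoner such as HermiT or Pellet:

```python
# Minimal subsumption reasoner over subclass axioms (illustrative only).
# Each pair (child, parent) asserts child ⊑ parent.
SUBCLASS_AXIOMS = {
    ("SurfaceDefect", "Defect"),
    ("Porosity", "InternalDefect"),
    ("InternalDefect", "Defect"),
    # Bridge axiom: PhysicalObject ≡ SpatialObject, encoded as mutual subsumption.
    ("PhysicalObject", "SpatialObject"),
    ("SpatialObject", "PhysicalObject"),
}

def is_subclass(sub: str, sup: str) -> bool:
    """Transitive subsumption check: does sub ⊑ sup follow from the axioms?"""
    if sub == sup:
        return True
    seen, frontier = set(), [sub]
    while frontier:
        cls = frontier.pop()
        if cls == sup:
            return True
        if cls in seen:
            continue  # guard against equivalence cycles
        seen.add(cls)
        frontier.extend(parent for child, parent in SUBCLASS_AXIOMS if child == cls)
    return False
```

With this, a query like "is porosity a defect?" reduces to `is_subclass("Porosity", "Defect")`, and the bridge axiom makes the two aligned classes interchangeable in queries.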
Logging code defect taxonomies formalize pattern–scenario relationships as a classification function f(s, c) → (pattern, scenario), where s is a log statement and c its surrounding code context (Wang et al., 15 Aug 2025).
5. Application to Component-Level QA and Automation
Component-level taxonomies serve as actionable frameworks for defect detection, risk assessment, and quality improvement:
- AI-ODC: Guides investment—“Learning+Catastrophic” clusters indicate high-ROI for automated guardrails; Data defects prompt governance, Learning defects training orchestration, Thinking defects inference validation (Alannsary, 25 Aug 2025).
- Deep Learning: Used for checklist-driven testing, code review, static analysis (e.g., ensuring .to(device) is called before distributed execution), and mutation testing (fault seeding) (Humbatova et al., 2019).
- Logging: Directly translatable to linter rules and static analysis, e.g., enforcing matching placeholder counts, credential masking, and performance-guarded logging (Wang et al., 15 Aug 2025).
- Manufacturing: Ontology-backed taxonomies enable automated query and diagnostic tasks, as well as explainable monitoring for cyber-physical systems (Carraturo et al., 2022).
Concrete QA measures are typically mapped from defect category, severity, and impacted quality characteristic to interventions, e.g., automated data validation, loss monitoring, or code-style enforcement.
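The category-to-intervention mapping described here can likewise be tabulated. The sketch below keys interventions on the AI attribute alone for brevity; the specific interventions listed are examples consistent with the text, not a prescribed standard:

```python
# AI attribute -> suggested QA intervention.
# Interventions are illustrative examples consistent with the text,
# not a prescribed standard.
INTERVENTIONS = {
    "Data": "automated data validation and governance checks",
    "Learning": "training orchestration with loss/convergence monitoring",
    "Thinking": "inference-time validation and output guards",
    "Not Related": "conventional code review and static analysis",
}

def suggest_intervention(ai_attribute: str) -> str:
    """Map a defect's AI attribute to a suggested QA intervention."""
    return INTERVENTIONS.get(ai_attribute, "manual triage")
```

A fuller version would key on (category, severity, impacted quality) triples, so that, e.g., a "Learning + Catastrophic" cluster routes to automated guardrails rather than ordinary monitoring.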
6. Domain-Specific Extensions and Integration
Emerging domains require adaptation and extension of foundational taxonomies. AI-ODC directly incorporates AI-specific defect attributes and impact domains not present in legacy ODC (Alannsary, 25 Aug 2025). Logging code taxonomies add fine-grained scenarios for readability, data leakage, and runtime performance (Wang et al., 15 Aug 2025). Manufacturing ontologies integrate spatial and physical principles, causal models, and interoperability with industrial standards (Carraturo et al., 2022).
Integration strategies reuse existing ontologies for shared concepts (Process, Material, Device), minimizing redundancy and enhancing semantic richness. Class equivalences and bridge axioms interlink imported modules, e.g., unifying PhysicalObject ≡ SpatialObject, necessary for cross-system analytics (Carraturo et al., 2022).
Component-level defect taxonomies thus provide comprehensive, rigorously structured foundations for systematic defect analysis and quality engineering across diverse technical domains.