
Cisco Integrated AI Security and Safety Framework Report

Published 15 Dec 2025 in cs.CR and cs.AI | (2512.12921v1)

Abstract: AI systems are being readily and rapidly adopted, increasingly permeating critical domains: from consumer platforms and enterprise software to networked systems with embedded agents. While this has unlocked potential for human productivity gains, the attack surface has expanded accordingly: threats now span content safety failures (e.g., harmful or deceptive outputs), model and data integrity compromise (e.g., poisoning, supply-chain tampering), runtime manipulations (e.g., prompt injection, tool and agent misuse), and ecosystem risks (e.g., orchestration abuse, multi-agent collusion). Existing frameworks such as MITRE ATLAS, the National Institute of Standards and Technology (NIST) AI 100-2 Adversarial Machine Learning (AML) taxonomy, and the OWASP Top 10s for LLMs and Agentic AI Applications provide valuable viewpoints, but each covers only slices of this multi-dimensional space. This paper presents Cisco's Integrated AI Security and Safety Framework ("AI Security Framework"), a unified, lifecycle-aware taxonomy and operationalization framework that can be used to classify, integrate, and operationalize the full range of AI risks. It integrates AI security and AI safety across modalities, agents, pipelines, and the broader ecosystem. The AI Security Framework is designed to be practical for threat identification, red-teaming, and risk prioritization; it is comprehensive in scope and extensible to emerging deployments in multimodal contexts, humanoids, wearables, and sensory infrastructures. We analyze gaps in prevailing frameworks, discuss design principles for our framework, and demonstrate how the taxonomy provides structure for understanding how modern AI systems fail, how adversaries exploit these failures, and how organizations can build defenses across the AI lifecycle that evolve alongside capability advancements.

Summary

  • The paper presents a unified taxonomy that integrates adversarial threat modeling with content safety to address evolving AI risks.
  • The paper details a four-tiered structure comprising objectives, techniques, subtechniques, and procedures, enabling precise detection and compliance mapping.
  • The paper demonstrates the framework’s applicability for lifecycle management and regulatory integration in dynamic, multi-modal AI deployments.

Cisco Integrated AI Security and Safety Framework: An Authoritative Synthesis

Context and Motivation

The integration and escalating autonomy of AI systems across enterprise, industrial, and consumer domains have precipitated a rapid expansion in risk surface area. This acceleration in capability deployment far outpaces organizations' abilities to mature risk governance, workforce readiness, and security posture, resulting in critical gaps in both awareness and operational defense against evolving AI-specific threats. Current frameworks—including MITRE ATLAS, NIST's AML taxonomy, and the OWASP Top 10s—address only partial views of the AI risk landscape and lack comprehensive lifecycle coverage, cohesive treatment of multi-agent and multimodal settings, and integration across both security and safety domains.

The Cisco Integrated AI Security and Safety Framework (“AI Security Framework”) directly addresses these deficiencies by unifying adversarial threat taxonomies and content safety concerns within a structured, lifecycle-aware, and highly extensible operationalization framework. This approach supports practical threat modeling, red teaming, risk management, and compliance with emergent global regulatory mandates.

Conceptual Foundations: Definitions and Scope

The Framework advances precise definitions for AI security and AI safety rooted in both traditional cybersecurity and the unique challenges of modern AI:

  • AI Security: Concerned with protecting AI systems and pipelines against unauthorized use, integrity compromise, and availability attacks, while foregrounding accountability as a primary axis of defense, particularly salient in agentic and autonomous system deployments.
  • AI Safety: Ensures AI system behaviors are reliable, ethical, fair, transparent, and value-aligned, encompassing both direct system effects and the broad societal, organizational, or personal harms associated with unaligned model outputs.

By operationalizing these domains in tandem, the Framework allows organizations to concurrently mitigate technical compromise and enforce content/output safety, capturing convergence scenarios where threat actors combine adversarial exploits with goal misalignment or content policy violations.

Framework Taxonomy: Architecture and Granularity

The AI Security Framework’s taxonomy is four-tiered, supporting strategic-to-operational integration:

  • Objectives (19): High-level adversarial goals (e.g., goal hijacking, communication compromise, supply chain attacks), each uniquely indexed and contextualized for domain-specific threat modeling.
  • Techniques (40): Detailed classes of attack methodologies (e.g., direct prompt injection, agentic manipulation, model extraction, code execution).
  • Subtechniques (112): Highly granular variants and vector-specific realizations, enabling precise detection, testing, and response planning.
  • Procedures: Concrete implementation specifics, acknowledging the dynamic, emergent nature of TTPs and focusing the taxonomy on abstraction above the instance level.
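The four-tier hierarchy above can be sketched as a simple nested data structure. This is an illustrative model only: the class names, identifier scheme (`OBJ-01`, `TEC-01`, `SUB-01`), and the sample entries are assumptions for clarity, not the framework's actual indices.

```python
from dataclasses import dataclass, field

@dataclass
class Subtechnique:
    id: str
    name: str

@dataclass
class Technique:
    id: str
    name: str
    subtechniques: list[Subtechnique] = field(default_factory=list)

@dataclass
class Objective:
    id: str
    name: str
    techniques: list[Technique] = field(default_factory=list)

# Illustrative entry using example names from the summary; identifiers are hypothetical.
goal_hijacking = Objective(
    id="OBJ-01",
    name="Goal Hijacking",
    techniques=[
        Technique(
            id="TEC-01",
            name="Direct Prompt Injection",
            subtechniques=[Subtechnique("SUB-01", "System-prompt override")],
        )
    ],
)
```

Procedures sit below this structure as free-form instance records, which is why the taxonomy deliberately stops enumerating at the subtechnique level.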

In addition, the taxonomy subsumes Model Context Protocol (MCP) threat vectors, supply chain attacks, and a 25-category harmful content safety taxonomy segmented into cybersecurity, safety/toxicity, integrity, intellectual property, and privacy—spanning all major modalities. The framework is thus uniquely positioned for systematic and fine-grained assessment across both input attack surfaces and output risks, as well as technical and content axes.
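The five-segment content safety taxonomy could be operationalized as a lookup from flagged categories to segments. The paper's 25 concrete categories are not enumerated in this summary, so the entries below are hypothetical placeholders under each of the named segments.

```python
# Hypothetical sketch: segment names come from the summary; the example
# categories under each segment are assumptions, not the paper's actual list.
CONTENT_SAFETY_TAXONOMY = {
    "cybersecurity": {"malware generation", "exploit assistance"},
    "safety_toxicity": {"hate speech", "self-harm content"},
    "integrity": {"disinformation"},
    "intellectual_property": {"copyrighted reproduction"},
    "privacy": {"pii disclosure"},
}

def segments_touched(flagged: set[str]) -> set[str]:
    """Return the taxonomy segments implicated by a set of flagged categories."""
    return {
        segment
        for segment, categories in CONTENT_SAFETY_TAXONOMY.items()
        if flagged & categories
    }
```

A moderation pipeline could call `segments_touched` on classifier output to route an incident to the owning policy team per segment.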

Lifecycle and Modality Awareness

Lifecycle coverage extends from data collection, training, and deployment to runtime operations and post-incident remediation, allowing defense-in-depth strategies to be mapped to evolving threats at each system phase. The explicit accommodation of multi-agent orchestration (e.g., agent collaboration risks, tool mediation, protocol exploits), multi-modality (text, vision, audio, sensor, code), and cross-modal attack vectors (e.g., image-text fusion exploits, context window attacks) broadens both the reach and applicability of the framework for frontier systems, including RAG pipelines, agentic stacks, wearables, and embedded/sensory infrastructure.
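A minimal sketch of the lifecycle-to-threat mapping described above, assuming phase names and groupings that are not spelled out in the summary; the example threats are drawn from the abstract's own list.

```python
# Illustrative mapping; phase names and threat placement are assumptions.
LIFECYCLE_THREATS = {
    "data_collection": ["data poisoning"],
    "training": ["supply-chain tampering"],
    "deployment": ["model extraction"],
    "runtime": ["prompt injection", "tool and agent misuse"],
    "post_incident": ["incomplete remediation"],
}

def threats_for(phase: str) -> list[str]:
    """Return the example threat classes mapped to a lifecycle phase."""
    return LIFECYCLE_THREATS.get(phase, [])
```

Defense-in-depth planning then reduces to iterating over phases and checking that each mapped threat has at least one corresponding control.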

Operationalization and Regulatory Integration

The Framework is designed for direct integration with risk management and compliance processes:

  • Control Mapping: Each objective and technique can be associated with preventive, detective, and corrective controls, facilitating coverage analysis and resource optimization.
  • Testing and Red Teaming: The taxonomy enables granular test case generation across both systems and content, supporting adversarial evaluation suites and CI-integrated red team pipelines.
  • Incident Response and Reporting: Adoption of a common risk language promotes interoperability for incident response, sectoral information sharing, and regulatory reporting, including alignment with frameworks such as NIST AI RMF, the EU AI Act, and global cybercrime conventions.
  • Supply Chain Integrity: By cataloging attack vectors in AI artifacts, tool dependencies, build/pipeline stages, and runtime environments, the Framework supports both deep provenance analysis (AIBOMs) and actionable supply chain defense.
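The control-mapping bullet above implies a coverage analysis: for each technique, check whether preventive, detective, and corrective controls exist, and surface the gaps. The sketch below assumes a simple dict-based mapping; the technique and control names are illustrative, not taken from the framework.

```python
def coverage_gaps(mappings: dict[str, dict[str, list[str]]],
                  control_type: str) -> list[str]:
    """Return the techniques that have no control of the requested type."""
    return [
        technique
        for technique, controls in mappings.items()
        if not controls.get(control_type)
    ]

# Hypothetical control mappings for two techniques named in the summary.
mappings = {
    "direct_prompt_injection": {
        "preventive": ["input sanitization"],
        "detective": ["prompt-anomaly monitoring"],
        "corrective": [],
    },
    "model_extraction": {
        "preventive": ["rate limiting"],
        "detective": [],
        "corrective": ["credential rotation"],
    },
}
```

Running `coverage_gaps(mappings, "detective")` would flag `model_extraction` as lacking detective coverage, giving red teams and control owners a concrete prioritization signal.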

Comparative Assessment and Extension Beyond Prior Work

Unlike MITRE ATLAS, which focuses on threat enumeration without safety/content coverage or extensive lifecycle scope, and the OWASP Top 10s, which cover only a selective set of critical risks with minimal granularity, Cisco's Framework unifies these domains and addresses the observed gaps:

  • Full-spectrum integration: Simultaneously represents technical threats, content harms, and multi-modal risks.
  • Lifecycle extensibility: Embeds awareness from data collection through runtime and remediation, supporting both defensive engineering and governance.
  • Agentic/systemic threats: Accounts for emergent risks in multi-agent, agent-tool, and protocol-driven orchestrations, not yet widely covered in industry frameworks.
  • Dynamic taxonomy updates: Commits to continuous expansion and refinement, anticipating continued evolution in adversarial techniques and deployment practices.

Notable Claims and Numerical Scope

The authors specifically claim that:

  • Only 33% of organizations have formal change management plans for AI adoption, and only 29% feel fully equipped for AI-specific threats.
  • The Framework consists, as of its initial release, of 19 objectives, 40 techniques, and 112 subtechniques, representing the most comprehensive AI threat taxonomy to date.
  • Agentic, supply chain, and multi-modal risks are covered in operational detail, supporting current and anticipated future deployment topologies.

Implications and Future Research Trajectories

From a practical perspective, the Framework supports actionable defense, system maturity benchmarking, prioritized threat assessment, and regulatory compliance workflows. Its modular extensibility positions it for rapid adaptation as new adversarial modalities (e.g., agent swarms, distributed toolchains) and content risks materialize.

Theoretically, the Framework’s fusion of security and safety (including content) creates an integrated research foundation for understanding coupled system-threat-content failures, which is relevant for both safe system design and for next-generation sociotechnical risk modeling. The direct mapping to regulatory frameworks (e.g., Budapest Convention, EU AI Act) implies that adoption could facilitate legal risk management and coherent international incident reporting.

As agentic AI and autonomous multimodal systems become increasingly prominent—encompassing real-world actuation, cross-organizational workflows, and sensitive decision-making—the extensibility of Cisco’s Framework and its incorporation of multi-agent and supply chain categories positions it as a standardization candidate and a baseline for both vendor and regulatory harmonization.

Conclusion

The Cisco Integrated AI Security and Safety Framework constitutes a unified, extensible, and highly granular taxonomy for operationalizing AI risk management. By tightly integrating adversarial threat modeling, harmful content taxonomy, supply chain compromise, and lifecycle awareness into a modular and extensible framework, it addresses the acute limitations of prior art in coverage, granularity, and applicability. The approach enables systematic control selection, adversarial testing, and cross-functional communication, while facilitating regulatory compliance in a rapidly evolving legal and threat environment. The ongoing evolution and open taxonomy approach ensure relevance for both current and anticipated AI system architectures, including those embedding multi-agent and multimodal capabilities. As organizations confront the continuing expansion of the AI threat landscape, frameworks of this scope and structural clarity will be central for translating research to resilient, trustworthy deployment and governance.
