Concept Cones: Theory and Applications

Updated 21 February 2026
  • Concept cones are geometric structures defined as all nonnegative linear combinations of a set of vectors in a real vector space, ensuring convexity and clear boundaries.
  • They are applied across neural network interpretability, formal logic, decision theory, and hyperbolic embeddings to model abstract behaviors and hierarchical relations.
  • Their use in activation space interventions demonstrates practical control over model outputs while unifying theories in geometry, logic, and physics.

A concept cone is a fundamental geometric structure defined as the collection of all nonnegative linear combinations of a finite set of direction vectors in a real vector space. This structure surfaces prominently in diverse areas such as neural network interpretability, geometric representation learning, formal logic, decision theory, hyperbolic embeddings, and Lorentzian/Finsler geometry. Modern applications encompass modeling logical concepts, abstract behaviors in neural networks, compositionality in generative models, and formalizing causal or semantic structures in mathematical logic and physics.

1. Formal Definition and Core Properties

Given the real vector space $\mathbb{R}^d$ and a set of $k$ normalized vectors $v_1, \dots, v_k \in \mathbb{R}^d$, the associated concept cone is

$$\mathcal{C}(v_1, \ldots, v_k) = \left\{ \sum_{j=1}^k \lambda_j v_j : \lambda_j \ge 0 \right\} \setminus \{0\}$$

This cone is always a closed, pointed, polyhedral convex cone when the $v_j$ are linearly independent. The geometry admits efficient computation of projections and duals (via polar cones) and supports full orthonegation (Leemhuis et al., 2020). In high dimensions, different concept cones can intersect non-trivially, and equivalence or containment between cones is central to geometric interpretability (Rocchi--Henry et al., 8 Dec 2025).
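A minimal numeric sketch of the membership test implied by this definition, assuming linearly independent generators as above (the helper name is illustrative):

```python
import numpy as np

def in_cone(V, x, tol=1e-8):
    """Check whether x lies in the cone spanned by the columns of V.

    With linearly independent columns, the coefficients of any exact
    representation x = V @ lam are unique, so x is in the cone exactly
    when that representation exists with lam >= 0.
    """
    lam, *_ = np.linalg.lstsq(V, x, rcond=None)
    return bool(np.allclose(V @ lam, x, atol=tol) and np.all(lam >= -tol))

# Cone in R^2 spanned by v1 = e1 and v2 = e1 + e2.
V = np.array([[1.0, 1.0],
              [0.0, 1.0]])
inside = in_cone(V, np.array([2.0, 1.0]))    # equals 1*v1 + 1*v2
outside = in_cone(V, np.array([-1.0, 0.0]))  # would need a negative coefficient
```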

Key properties:

  • Convexity: All positive combinations of elements remain within the cone.
  • Polarity: For closed convex cones, the polar cone $a^\circ$ induces an orthocomplementation operation supporting involutive negation.
  • Orthogonality of basis: Most construction procedures (e.g., for LLM activations) enforce orthonormality among the spanning vectors to facilitate interpretability and intervention (Yu et al., 27 May 2025, Wollschläger et al., 24 Feb 2025).
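Orthonormal generators are what make projections cheap: the problem decouples per axis. A sketch, assuming orthonormal columns (an assumption stated in the bullet above, not a general-purpose cone projector):

```python
import numpy as np

def project_onto_cone(V, x):
    """Euclidean projection of x onto the cone spanned by the orthonormal
    columns of V. With mutually orthogonal generators the projection
    decouples: each coordinate <v_j, x> is simply clamped to be nonnegative.
    """
    lam = np.clip(V.T @ x, 0.0, None)
    return V @ lam

V = np.eye(3)[:, :2]                          # orthonormal generators e1, e2
p = project_onto_cone(V, np.array([1.0, -2.0, 5.0]))
# the e2-coordinate is clamped to 0; the e3 component is orthogonal to the cone
```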

2. Concept Cones in Machine Learning: Interpretability and Representation

LLMs and Factual/Refusal Behaviors

Recent works have empirically demonstrated that LLMs encode abstract concepts, such as “truth” or “refusal,” via multi-dimensional concept cones embedded in hidden-layer residual streams. For a typical layer $l$ in an LLM, a $k$-dimensional cone is specified by an orthonormal matrix $V = [v_1, \ldots, v_k] \in \mathbb{R}^{d_m \times k}$, and all vectors of the form $v = \sum_j \lambda_j v_j$ with $\lambda_j \geq 0$ mediate a causal effect on the model’s output regarding the associated concept (Yu et al., 27 May 2025, Wollschläger et al., 24 Feb 2025).

Interventions add (or ablate) such vectors in the activation space to flip (“Yes”/1) or suppress (“No”/0) responses to specific prompts, with multidimensional cones supporting richer control than one-dimensional “concept directions.” Empirical metrics, such as Answer-Switching Rate (ASR) for “truth” and Attack Success Rate (ASR) for “refusal,” robustly indicate high efficacy and generalization when manipulating entire cones (Yu et al., 27 May 2025). Notably, cones generalize across architectures and model scales, and multidimensional cones outperform single directions, especially in large models (Wollschläger et al., 24 Feb 2025, Yu et al., 27 May 2025).
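In code, both interventions reduce to simple linear algebra on a hidden state. The sketch below uses a random orthonormal basis as a stand-in for a cone found by optimization; the dimensions and coefficients are illustrative, not taken from the cited papers:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, k = 16, 3

# Stand-in orthonormal cone basis (in practice discovered by optimization).
V, _ = np.linalg.qr(rng.standard_normal((d_model, k)))

def steer(h, V, lam):
    """Add a cone vector sum_j lam_j v_j (lam >= 0) to a hidden state."""
    assert np.all(lam >= 0), "cone coefficients must be nonnegative"
    return h + V @ lam

def ablate(h, V):
    """Remove the component of h lying in span(V) (directional ablation)."""
    return h - V @ (V.T @ h)

h = rng.standard_normal(d_model)
h_steered = steer(h, V, np.array([1.0, 0.5, 0.0]))
h_ablated = ablate(h, V)
```

After ablation the state has no component along any cone direction, so the cone-mediated behavior can no longer be expressed linearly at that layer.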

Supervised and Unsupervised Concept Discovery

Concept Bottleneck Models (CBMs) and Sparse Autoencoders (SAEs) both instantiate concept cones in activation space, albeit via different selection mechanisms—supervision in CBMs and sparsity in SAEs. The set of concept axes forms a cone that encodes either human-annotated or emergent (discovered) concepts. Their geometric relationship is quantified via cone containment, normalized reconstruction error, and geometric/statistical alignment metrics (Rocchi--Henry et al., 8 Dec 2025).

An illustrative trade-off emerges: increased sparsity in SAEs sharpens the cone axes (better interpretability, lower coverage), while higher expansion factors improve the coverage (more faithful reconstructions, less axis fidelity). These metrics enable principled comparison and hybridization of interpretability paradigms.

3. Concept Cones in Formal Logic and Decision Theory

Cone Logics for Conceptual Spaces

The set of all closed convex cones in $\mathbb{R}^n$, ordered by inclusion, forms an ortholattice $(\mathcal{C}_n, \leq, \wedge, \vee, {}^\circ)$, where $\vee$ is the conic hull and ${}^\circ$ is the polar cone. This structure underlies a non-distributive, antitone, involutive logic for reasoning about conceptual spaces in knowledge representation (Leemhuis et al., 2020). Negation is implemented as polarity; distributivity generally fails, but a partial orthomodularity rule (pOM) holds everywhere.

Soundness and completeness results show that the propositional logic $L = O_{\min} + \mathrm{(pOM)}$ is sound and complete for the class of ortholattices formed by closed convex cones. Applications include ontology embedding, multi-label classification with geometric regularizers, and expressivity for full negation and disjointness constraints.

Dual Representations in Topological Vector Spaces

Any pointed cone $C \subseteq X$ in a locally convex Hausdorff space has a dual-family representation

$$C = \{ x \in X : \forall K \in \mathscr{K},\ \exists f \in K \text{ with } f(x) \geq 0 \},$$

where $\mathscr{K}$ is a family of nonempty subsets of the topological dual $X'$. In Banach spaces, $C$ is closed if and only if $\mathscr{K}$ can be chosen to consist of nonempty convex compact sets (Leonetti et al., 2022). This duality provides a framework for multi-utility representation of preferences in decision theory and for the characterization of preorders satisfying independence axioms. The separation of convex, open, and closed cones corresponds to precise properties of the representing family.
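In the finite-dimensional polyhedral special case, the representing family can be taken as singletons $\{f\}$, recovering the familiar halfspace description. A toy check (the specific dual generators below are chosen for illustration):

```python
import numpy as np

# Halfspace description C = {x : f(x) >= 0 for every dual generator f}.
# Dual generators for the 2-D cone spanned by (1, 0) and (1, 1):
F = np.array([[0.0, 1.0],    # f1(x) = x2
              [1.0, -1.0]])  # f2(x) = x1 - x2

def in_cone_dual(x, tol=1e-9):
    """Membership via the dual description: all functionals nonnegative."""
    return bool(np.all(F @ x >= -tol))
```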

4. Geometric and Physical Generalizations: Shadow Cones and Lorentz-Finsler Geometry

Shadow Cones in Hyperbolic Embedding

For representing partial orders and hierarchical relations (e.g., in knowledge graphs), shadow cones in hyperbolic space generalize previous entailment cone models by introducing variability in the light source and the opaque objects casting the shadow. The umbral variant uses a point light and volumetric objects, while the penumbral variant uses a volumetric light and point objects. These constructions excel at modeling partial orders, avoid implementation pathologies (e.g., “holes”), generalize across hyperbolic models (Poincaré, half-space, Lorentz), and achieve superior empirical link-prediction accuracy on standard taxonomy benchmarks (Yu et al., 2023).

Key experimental findings:

Model/Embedding         F1 (Mammal, 2D)    F1 (WordNet-Noun, 5D)
Entailment cone         66.5%              92.1%
Umbral-Half-space       80.3%              96.4%
Penumbral-Half-space    72.3%              88.3%

Shadow cones also provide physically intuitive metaphors and enhanced numerical properties.

Lorentz-Finsler Spacetimes and Cone Structures

A smooth, strong cone structure on a manifold $M$ is a hypersurface in $TM \setminus \{0\}$ such that each fiber is a strong cone (conic, salient, strongly convex in the interior) (Javaloyes et al., 2018). A canonical representation is as the zero set of a Lorentz-Finsler metric $L$, and any such $L$ in a fixed anisotropic conformal class defines the same cone. Cone geodesics correspond to null pregeodesics for any such metric, enabling a causal ordering and underpinning the geometry of time-dependent control problems (e.g., Zermelo navigation). Cone triples $(\Omega, T, F)$ provide local and global constructions of compatible Lorentz-Finsler metrics.
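The simplest instance is the Minkowski light cone, where the metric is quadratic and the cone is its zero set; a quick numeric check of the three regimes:

```python
import numpy as np

def L(v):
    """Minkowski quadratic form -v_t^2 + |v_x|^2; the light cone is its
    zero set, the simplest example of such a cone structure."""
    return -v[0] ** 2 + float(np.dot(v[1:], v[1:]))

null_vec = np.array([1.0, 1.0, 0.0])    # on the cone:      L = 0
timelike = np.array([2.0, 1.0, 0.0])    # inside the cone:  L < 0
spacelike = np.array([1.0, 2.0, 0.0])   # outside the cone: L > 0
```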

5. Algorithmic and Operational Aspects: Discovery and Manipulation

Neural Network Concept Cones

  • Gradient-based discovery: In both LLMs and diffusion models, concept cones are discovered by defining a loss function encoding desired interventions (e.g., flipping truth/falsehood, causing refusal, subject imprinting), and optimizing for a set of orthonormal direction vectors or neuron indices (Yu et al., 27 May 2025, Wollschläger et al., 24 Feb 2025, Liu et al., 2023).
  • Activation space interventions: Addition and ablation of cone directions enable causal manipulation of model responses with empirical metrics (ASR, KL-divergence) confirming specificity and efficacy.
  • Compositionality: In diffusion models, cones (sparse neuron subsets) for distinct subjects can be composed by mask union, supporting multi-concept generation with high storage efficiency (Liu et al., 2023).
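The mask-union composition in the last bullet can be sketched as follows; the mask indices and parameter values are hypothetical stand-ins for learned sparse neuron subsets:

```python
import numpy as np

n_params = 8
theta_base = np.zeros(n_params)
theta_a = np.full(n_params, 1.0)   # hypothetical parameters for subject A
theta_b = np.full(n_params, 2.0)   # hypothetical parameters for subject B

# Sparse masks selecting each subject's concept neurons.
mask_a = np.zeros(n_params, dtype=bool); mask_a[[0, 2]] = True
mask_b = np.zeros(n_params, dtype=bool); mask_b[[5, 6]] = True

# Compose by mask union: only the masked entries are overwritten, so
# storing a concept costs just its sparse mask and values.
theta = theta_base.copy()
theta[mask_a] = theta_a[mask_a]
theta[mask_b] = theta_b[mask_b]
```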

Quantitative Metrics for Concept Cones

  • Containment and alignment: Metrics such as normalized reconstruction error, cone coverage, geometric axis alignment, and Hausdorff-style cone distance probe the correspondence between learned and reference cones in supervised/unsupervised interpretability settings (Rocchi--Henry et al., 8 Dec 2025).
  • Trade-offs: Sparsity and expansion factor in dictionary learning directly modulate the fidelity and coverage of emergent concept cones, uncovering “sweet spots” for interpretable and semantically aligned representations.
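A toy version of one such containment metric, assuming linearly independent test-cone generators (the metric name and exact form are illustrative, not the papers' definitions):

```python
import numpy as np

def axis_coverage(V_ref, V_test, tol=1e-6):
    """Fraction of reference-cone axes exactly expressible as nonnegative
    combinations of the test cone's generators (assumes linearly
    independent columns in V_test, so coefficients are unique)."""
    covered = 0
    for v in V_ref.T:
        lam, *_ = np.linalg.lstsq(V_test, v, rcond=None)
        if np.allclose(V_test @ lam, v, atol=tol) and np.all(lam >= -tol):
            covered += 1
    return covered / V_ref.shape[1]

V_test = np.eye(3)[:, :2]              # test cone over e1, e2
V_ref = np.array([[1.0, 0.0],
                  [0.0, -1.0],
                  [0.0, 0.0]])         # reference axes e1 and -e2
```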

6. Broader Connections and Theoretical Insights

Concept cones provide a unifying geometric language bridging neural network interpretability, knowledge representation, logic, decision theory, and geometry. In machine learning, they cast concept formation, reasoning, and manipulation as tractable operations in activation or parameter space. In formal logic and ontology modeling, cones underwrite ortholattices with full negation and partial orthomodularity, offering algorithmic benefits for symbolic-numeric systems. In geometric and physical settings, cone structures capture causal and hierarchical flows in hyperbolic and Lorentzian contexts.

A plausible implication is that future research in interpretable AI, causal modeling, and hierarchical representation will continue to leverage the geometric flexibility and operational tractability of concept cones as the foundation for both theory and practice.
