Every Language Model Has a Forgery-Resistant Signature (2510.14086v1)
Abstract: The ubiquity of closed-weight LLMs with public-facing APIs has generated interest in forensic methods, both for extracting hidden model details (e.g., parameters) and for identifying models by their outputs. One successful approach to these goals has been to exploit the geometric constraints imposed by the LLM architecture and parameters. In this work, we show that a lesser-known geometric constraint--namely, that LLM outputs lie on the surface of a high-dimensional ellipse--functions as a signature for the model and can be used to identify the source model of a given output. This ellipse signature has unique properties that distinguish it from existing model-output association methods like LLM fingerprints. In particular, the signature is hard to forge: without direct access to model parameters, it is practically infeasible to produce log-probabilities (logprobs) on the ellipse. Secondly, the signature is naturally occurring, since all LLMs have these elliptical constraints. Thirdly, the signature is self-contained, in that it is detectable without access to the model inputs or the full weights. Finally, the signature is compact and redundant, as it is independently detectable in each logprob output from the model. We evaluate a novel technique for extracting the ellipse from small models and discuss the practical hurdles that make it infeasible for production-scale models. Finally, we use ellipse signatures to propose a protocol for LLM output verification, analogous to cryptographic symmetric-key message authentication systems.
Explain it Like I'm 14
What is this paper about?
This paper shows that every modern LLM (like ChatGPT or Llama) leaves a natural “signature” in its outputs. That signature isn’t hidden in the words themselves — it’s in the model’s internal numbers called log-probabilities (logprobs). The authors explain that these logprobs always land on the surface of a special, high-dimensional shape called an ellipsoid (think “a stretched sphere”), and that shape is unique to each model. Because of this, you can use the shape as a reliable way to tell which model produced a given output.
What questions does it ask?
In simple terms, the paper asks:
- Do LLMs naturally produce outputs that sit on a unique, stretched-sphere shape (an ellipsoid)?
- Can we use that shape to prove which model made an output?
- Is it easy or hard for someone to fake outputs that look like they came from that model’s shape?
- Can we turn this idea into a practical way to verify and trust model outputs?
How did the researchers study it?
Here’s the everyday version of what’s going on inside an LLM:
- Before the final step, the model “normalizes” its internal signals. A good analogy is putting all arrows onto the surface of a perfect sphere: every arrow has length 1, only their directions differ.
- Then the model “stretches and rotates” that sphere in complex ways to produce scores for each word in the vocabulary. That stretching turns the sphere into an ellipsoid — a sphere squashed or stretched along some directions.
- The API you call doesn’t give raw scores (logits); it gives log-probabilities (logprobs). But because of how softmax works, these logprobs still sit on a version of that same ellipsoid after a simple centering step.
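To make this concrete, here is a minimal numerical sketch of that geometry (our own illustration, not code from the paper). The toy hidden size d, vocabulary size v, and the random stand-in unembedding matrix W are assumptions chosen for readability: normalization puts the hidden state on the unit sphere, the linear layer maps the sphere onto an ellipsoid, and centering the returned logprobs recovers a point on that same ellipsoid, which can be checked by pulling it back through the linear map.

```python
import numpy as np

rng = np.random.default_rng(0)
d, v = 8, 50                      # hypothetical hidden size and vocabulary size
W = rng.normal(size=(v, d))       # random stand-in for the model's unembedding matrix

def logprobs(x):
    h = x / np.linalg.norm(x)                 # normalization: h lands on the unit sphere
    z = W @ h                                 # linear layer stretches the sphere
    return z - np.log(np.exp(z).sum())        # log-softmax gives the logprobs an API returns

C = np.eye(v) - np.ones((v, v)) / v           # centering matrix (subtracts the mean)
CW = C @ W                                    # the ellipsoid is the image of the sphere under CW

def on_ellipsoid(lp, tol=1e-8):
    y = lp - lp.mean()                        # centered logprobs equal centered logits
    h = np.linalg.pinv(CW) @ y                # pull the point back through the linear map
    return np.allclose(CW @ h, y, atol=tol) and abs(np.linalg.norm(h) - 1.0) < tol

print(on_ellipsoid(logprobs(rng.normal(size=d))))        # True: on this model's ellipsoid
print(on_ellipsoid(np.log(rng.dirichlet(np.ones(v)))))   # False: arbitrary logprobs miss it
```

Note that the check uses logprobs over more tokens than the hidden size (v > d here), which is why the verification discussions below keep returning to "at least d tokens."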
Using this, the researchers:
- Tested several open-source models (like Llama, Qwen, Olmo, GPT-OSS) and showed that each model’s outputs fit best onto its own ellipsoid, not others. That means the ellipsoid acts like a model-specific signature.
- Built and tested algorithms that try to recover a model’s ellipsoid just from many API outputs. They found this is manageable for tiny models but becomes wildly expensive and slow for real, large models (a toy version is sketched after this list).
- Compared this “ellipse signature” to other methods (like text watermarks or linear fingerprints) and explained why this one is both naturally present and hard to forge.
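The recovery idea in the second bullet can be illustrated with a deliberately simplified fit. The paper uses semidefinite programming; this sketch swaps in plain least squares on a toy problem, and the secret map M and all sizes are illustrative assumptions. Points x = Mu with ‖u‖ = 1 satisfy xᵀAx = 1 for A = (MMᵀ)⁻¹, so fitting the ellipsoid means solving for the roughly d²/2 entries of A:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 6                                      # toy "hidden size"
M = rng.normal(size=(d, d))                # the secret linear map we try to recover
A_true = np.linalg.inv(M @ M.T)            # every point x = M u (unit u) obeys x^T A x = 1

n = d * (d + 1)                            # comfortably more than the d(d+1)/2 unknowns
U = rng.normal(size=(n, d))
U /= np.linalg.norm(U, axis=1, keepdims=True)
X = U @ M.T                                # "observed" points on the ellipsoid

# One linear equation per point: sum over j <= k of A[j, k] * monomial(x) = 1.
idx = [(j, k) for j in range(d) for k in range(j, d)]
D = np.array([[x[j] * x[k] * (2.0 if j != k else 1.0) for j, k in idx] for x in X])
coef, *_ = np.linalg.lstsq(D, np.ones(n), rcond=None)

A_fit = np.zeros((d, d))
for c, (j, k) in zip(coef, idx):
    A_fit[j, k] = A_fit[k, j] = c
print(np.allclose(A_fit, A_true, atol=1e-6))   # True: quadratic form recovered
```

With about d²/2 unknowns, even this naive fit solves a linear system whose cost scales like (d²)³ = d⁶, matching the O(d^6) fitting barrier discussed later; at production hidden sizes (d in the thousands) that is exactly the practical hurdle.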
What did they find and why does it matter?
The big idea: the ellipsoid is a signature that has four special properties that make it particularly useful.
Here’s why it’s special:
- Forgery-resistant: Without the model’s secret internal parameters, it’s practically infeasible to generate new logprobs that exactly sit on the model’s ellipsoid. For big models, you’d need a huge number of queries and extreme computing to reverse-engineer the ellipsoid.
- Naturally occurring: You don’t need to modify the model to create this signature. It appears automatically because of the normalization layers used in practically all modern LMs.
- Self-contained: You can check the signature using only the output logprobs (plus some information from the model’s final layer), not the original prompt or the full model weights.
- Compact and redundant: You don’t need long text to verify it. Even a single generation step contains the signature.
Why this matters:
- Identification: If you get a set of logprobs, you can check which model’s ellipsoid they lie on. The paper shows that outputs from a model sit much closer to its own ellipsoid than to others, cleanly identifying the source (see the sketch after this list).
- Security and trust: Because it’s hard to fake this signature for large models, it can help verify whether a model provider actually generated a piece of output.
- Practicality: Verifying is cheap; forging is expensive. For large, commercial models, reverse-engineering the ellipsoid from an API would cost a ton of money and time, making forgery impractical.
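Here is a minimal sketch of the identification test referenced above, assuming the verifier holds each candidate model’s final-layer map (random stand-ins here) and reusing the pull-back check from the earlier block: score each candidate by how far the pulled-back point is from lying exactly on its ellipsoid, and attribute the output to the closest one.

```python
import numpy as np

rng = np.random.default_rng(2)
d, v = 8, 50
models = {name: rng.normal(size=(v, d)) for name in ["model_a", "model_b", "model_c"]}
C = np.eye(v) - np.ones((v, v)) / v

def sphere_distance(lp, W):
    """How far the pulled-back logprob point is from the unit sphere (0 = on-ellipsoid)."""
    CW = C @ W
    y = lp - lp.mean()
    h = np.linalg.pinv(CW) @ y
    # Penalize both the off-subspace residual and the off-sphere radius.
    return np.linalg.norm(CW @ h - y) + abs(np.linalg.norm(h) - 1.0)

# Generate an output from model_b and see which ellipsoid it sits on.
h = rng.normal(size=d)
h /= np.linalg.norm(h)
z = models["model_b"] @ h
lp = z - np.log(np.exp(z).sum())

scores = {name: sphere_distance(lp, W) for name, W in models.items()}
print(min(scores, key=scores.get))        # model_b: its own ellipsoid fits best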
What could this mean going forward?
The authors propose using the ellipsoid like a secret “stamp,” similar to a cryptographic Message Authentication Code (MAC):
- The model’s ellipsoid acts like a shared secret key.
- The logprob output is like a message that naturally contains a “tag” — its position on that secret ellipsoid.
- Anyone trusted who knows the ellipsoid can verify whether the output really came from that model by checking if the logprobs lie on the ellipsoid.
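Read as a protocol, the analogy could look like the hedged sketch below (function names like sign and verify are ours, not the paper’s). The final-layer map plays the role of the shared secret key, generating logprobs produces the “tag” for free, and a tampered vector is rejected because it falls off the ellipsoid:

```python
import numpy as np

rng = np.random.default_rng(3)
d, v = 8, 50
W_secret = rng.normal(size=(v, d))        # shared secret: the model's final-layer map
C = np.eye(v) - np.ones((v, v)) / v
CW = C @ W_secret                         # what signer and verifier both hold

def sign(x):
    """The model 'signs' for free: generating logprobs places them on the ellipsoid."""
    h = x / np.linalg.norm(x)
    z = W_secret @ h
    return z - np.log(np.exp(z).sum())

def verify(lp, tol=1e-8):
    y = lp - lp.mean()
    h = np.linalg.pinv(CW) @ y
    return np.allclose(CW @ h, y, atol=tol) and abs(np.linalg.norm(h) - 1.0) < tol

tag = sign(rng.normal(size=d))
print(verify(tag))                         # True: genuine output accepted
forged = tag + 1e-3 * rng.normal(size=v)   # attacker perturbs the logprobs
print(verify(forged))                      # False: tampered vector falls off the ellipsoid
```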
This could help with:
- Model forensics: Determining which model produced a harmful or disputed output.
- Regulation and accountability: A trusted third party could verify outputs without needing the full model or the original prompt.
- Ecosystem integrity: It’s a low-friction way to confirm authenticity without changing how models are run.
A few caveats:
- This method relies on access to logprobs, which some APIs don’t provide or only provide in limited ways.
- It’s not impossible (in the math sense) to forge — just very impractical for big models. Stronger cryptographic-style guarantees may require additional techniques.
- Changing the model’s final layers or output processing can remove or alter the ellipsoid signature.
Overall, the paper introduces a simple but powerful idea: modern LLMs naturally sign their outputs with a geometric shape that’s very hard to fake. That can be turned into a practical tool to verify and trust model outputs in the real world.
Knowledge Gaps
Knowledge gaps, limitations, and open questions
Below is a concise list of unresolved issues the paper leaves open; each item is framed to be concrete and actionable for future research.
- Formalize and quantify the deviation induced by the normalization “ε” term:
- Derive bounds for how far logprob outputs lie inside (rather than on) the ellipsoid surface under RMSNorm/LayerNorm as a function of hidden size d and ε.
- Provide thresholding rules for verification that are provably robust to ε-smoothing across model scales and architectures.
- Complete and validate the LayerNorm case:
- Finish the derivation for the (d−1)-dimensional sphere, the “lift” by the bias back to ℝ^v, and the exact recovery protocol.
- Empirically evaluate ellipsoid recovery and verification on LayerNorm models of various sizes with realistic ε values.
- Establish theoretical false-positive and false-negative guarantees:
- Characterize the distribution of “distance to the unit sphere” for in-model vs out-of-model logprobs under finite precision, ε-smoothing, and API noise.
- Provide provable bounds on the probability that a random output lies near the intersection of two (approximate) ellipsoids under practical tolerances.
- Verify signature robustness under real-world API transformations:
- Analyze the impact of temperature, top-p/top-k truncation, logprob rounding, calibration layers, logit clipping, and logit_bias on ellipse detectability.
- Develop verification methods that work with partial logprob sets (e.g., top-k only) and quantify accuracy vs k and precision.
- Clarify required verifier information and its IP/privacy implications:
- Precisely specify which final-layer parameters (e.g., W, b, U, singular values) must be shared to enable third-party verification.
- Analyze what proprietary information the shared ellipse may leak (e.g., vocabulary mapping, unembedding structure) and propose privacy-preserving sharing schemes.
- Strengthen the hardness claims for ellipse forgery:
- Provide formal lower bounds or reductions showing that fitting/forging a high-dimensional LM ellipsoid is at least super-cubic, or establish hardness under standard assumptions.
- Explore whether randomized/sketching methods, tensor techniques, or distributed solvers can break the O(d^6) time (and O(d^4) space) barrier; report empirical limits.
- Improve sample efficiency and prompt design for ellipse recovery:
- Determine minimal sample counts under noise and partial logprobs; design prefix/prompt strategies that guarantee points in “general position.”
- Quantify the token-length growth needed to generate O(d^2) independent samples as vocab limits are reached; propose practical query schedules.
- Develop verification without full-vocabulary logprobs:
- Create algorithms that can verify ellipse membership from a fixed subset of tokens (≤ d), including numerical reconstruction procedures and stability analysis.
- Provide guidance on token selection strategies that minimize conditioning and numerical ill-conditioning across time and API changes.
- Evaluate collision risk and signature uniqueness:
- Measure how often different models share indistinguishable or near-indistinguishable ellipses (e.g., when reusing tokenizers/unembedding matrices or training recipes).
- Propose tie-breaking or auxiliary signals when ellipses are similar (e.g., small affine offsets, per-layer diagnostics).
- Extend to complex final-layer architectures:
- Analyze MoE decoders, multi-branch/gated heads, mixture-of-softmax, tied embeddings, and adapters (LoRA/PEFT) to determine whether the signature becomes a union of ellipsoids or degrades.
- Build detectors capable of handling multi-ellipse unions and quantify their verification power.
- Study effects of fine-tuning and versioning:
- Track how the ellipse parameters change under continued pretraining, instruction tuning, and alignment; design version-aware verification with revocation/rotation plans.
- Quantify robustness under quantization and low-precision inference:
- Measure signature drift under INT8/FP8/quantization-aware training and define verification thresholds that remain effective.
- Provide mid- to large-scale empirical evidence:
- Demonstrate ellipse extraction and verification on ≥7B open-weight models, and characterize runtime, memory, accuracy, and failure modes.
- Attempt partial extraction or “verification-only” procedures on closed-weight APIs to assess feasibility given real rate limits and pricing.
- Develop sequence-level verification beyond per-step checks:
- Formalize replay/tampering resistance; evaluate inverter-based consistency checks and alternative mechanisms to detect stitched logprob sequences.
- Quantify detection power and error rates under adversaries assembling sequences from cached, signed logprobs.
- Define the MAC-like protocol precisely:
- Specify correctness, completeness, and soundness definitions; formal adversary models (forgery, replay, chosen-message attacks).
- Design key management (distribution, rotation, revocation), multi-model support, and audit logging; compare cost/performance vs zkLLM.
- Address ease of signature removal and propose hard-to-remove variants:
- Analyze how small changes (e.g., noise injection, alternate normalization) trade off accuracy vs signature erasure.
- Explore alternative naturally occurring constraints that are harder to remove yet remain self-contained and compact.
- Reduce dependence on logprobs and broaden applicability:
- Investigate whether ellipse-like signatures can be inferred from observable text alone (e.g., top-1 traces, rank patterns), with statistical tests and sample-complexity analysis.
- Provide rigorous cross-vocabulary verification procedures:
- Replace the ad hoc projection across shared tokens with a formal, stable mapping and quantify verification performance without full alignment across vocabularies.
- Strengthen numerical stability of the recovery pipeline:
- Analyze conditioning of pseudoinverses and Cholesky/SVD steps; propose regularization and robust solvers tailored to LM ellipsoid recovery.
- Formalize the cost model and sensitivity to API policy changes:
- Give end-to-end query/time/memory costs under realistic rate limits and pricing, with sensitivity analyses to top-k availability, logprob precision, and batching.
- Explore quantum or advanced linear algebra accelerations:
- Assess whether quantum algorithms, advanced matrix-multiplication techniques, or randomized numerical linear algebra could materially reduce forgery/extraction cost.
- Clarify limits of “self-contained” verification:
- Determine precisely what inputs or parameters are needed for reliable verification in practice when providers return only partial or transformed outputs; propose minimal disclosures.
Practical Applications
Immediate Applications
The paper’s findings enable practical workflows today wherever log-probabilities (logprobs) are accessible or can be captured inside a controlled environment.
- Output provenance checks for enterprise LLM deployments
- Sectors: software, finance, healthcare, legal/compliance.
- What: Verify that responses came from the approved internal model (and not a cheaper/older one) by checking on-ellipse distances for returned logprobs.
- Tools/products: “LLM-MAC Verifier” library or service; middleware that intercepts API responses and runs per-step ellipse checks; CI/CD guardrails for model routing.
- Assumptions/dependencies: Access to per-step logprobs for at least d tokens; provider shares final-layer ellipse parameters (or provides a verification oracle); stable inference stack (normalization present, no post-processing altering logits).
- Third-party audit of model usage in vendor contracts
- Sectors: procurement, finance, regulated industries.
- What: Buyers request logprobs for spot-checked calls; auditors verify on-ellipse to confirm vendor actually used the contracted model/version.
- Tools/products: Contract clauses mandating a “verification mode” exposing logprobs on demand; auditor-side verification scripts; model registries storing ellipse parameters and versioning.
- Assumptions/dependencies: Cooperative vendors; limited exposure of logprobs permitted; alignment on token subset used for checks.
- Internal red-teaming and incident triage
- Sectors: software security, trust & safety.
- What: When harmful output is reported, teams can rapidly determine if it originated from their model by validating the reported logprobs against the model ellipse.
- Tools/products: Incident-response playbooks; notebooks/CLI for batch verification of logged generations.
- Assumptions/dependencies: Systems must log per-token logprobs for sensitive pathways; availability of ellipse parameters to the internal T&S team.
- Reproducibility and benchmarking integrity in research
- Sectors: academia, evaluation platforms.
- What: Authors or eval hosts publish logprobs with benchmark submissions; reviewers/organizers confirm model identity via ellipse checks to prevent “model swap” inflation.
- Tools/products: Eval harness plugins (e.g., for HELM, lm-eval) that emit/verify logprob-based signatures; open registry of ellipse parameters for open-weight models.
- Assumptions/dependencies: Communities agree to include logprobs; open models expose last-layer parameters or are easily derived from weights.
- Multi-tenant inference integrity in model gateways
- Sectors: AI platforms, MLOps.
- What: Gateways that route to multiple backends verify that each backend’s responses match its declared ellipse to detect backend drift or misconfiguration.
- Tools/products: Router-integrated verifiers; monitoring dashboards showing on-ellipse deviation distributions over time.
- Assumptions/dependencies: Backends return logprobs; gateway stores ellipse metadata per backend/version.
- Lightweight brand protection for model APIs and resellers
- Sectors: software marketplaces, API aggregators.
- What: Model owners provide an “ellipse challenge” endpoint; resellers can prove they proxy the genuine model by returning challenge logprobs that pass on-ellipse checks.
- Tools/products: Challenge-response API; badge/attestation for storefronts.
- Assumptions/dependencies: Short verified interactions that include logprobs; willingness to share a minimal verification interface without broader parameter disclosure.
- Dataset leak and impersonation detection in open-model communities
- Sectors: open-source AI, research.
- What: Re-hosted checkpoints can be quickly identity-checked (ellipse distance trends) to confirm they are not modified forks posing as originals.
- Tools/products: “Model ID” scripts; community-maintained ellipse parameter snapshots tied to SHAs.
- Assumptions/dependencies: Open-weight models; stable final-layer parameters; reproducible tokenization.
- Policy pilots with trusted third-party verifiers
- Sectors: policy/regulatory sandboxes, journalism.
- What: Providers escrow only final-layer parameters (ellipse) with a trusted verifier who can adjudicate provenance disputes without revealing full models or prompts.
- Tools/products: Verifier APIs; standardized submission format including logprobs; case-management workflows.
- Assumptions/dependencies: Providers expose logprobs to disputants or verifiers; legal arrangements for limited-parameter escrow.
Long-Term Applications
As logprob access standardizes and ecosystems mature, ellipse signatures enable stronger provenance, compliance, and accountability systems at scale.
- Industry-wide LLM message authentication (LLM-MAC) standard
- Sectors: software, policy, cloud platforms.
- What: A standardized protocol where providers publish (or escrow) ellipse parameters for verification, analogous to MACs for messages; per-step verification acts as a cryptographic-style tag.
- Tools/products: RFC-like standard; open-source validator; certification programs.
- Assumptions/dependencies: Broad API support for logprobs; governance around parameter escrow and key rotation for model updates.
- Content provenance infrastructure for AI-generated media
- Sectors: social media, journalism, creative tools.
- What: Attach verifiable “generation receipts” that include sparse logprob proofs; platforms can check on-ellipse to assert model identity alongside C2PA-style metadata.
- Tools/products: Receipt generators; verifier plugins for CMS and social platforms.
- Assumptions/dependencies: Consumer-facing assistants expose provenance receipts; storage and privacy policies for logprobs; UI norms for provenance.
- Regulatory compliance and litigation-grade audit trails
- Sectors: healthcare, finance, public sector.
- What: High-stakes decisions include sealed logprob traces; regulators or courts verify output origin against registered ellipses when accountability is contested.
- Tools/products: Secure audit log formats; regulator-run verifiers; chain-of-custody procedures.
- Assumptions/dependencies: Legal mandates to retain and disclose logprobs; policies to prevent sensitive prompt leakage; alignment on retention windows.
- SLA enforcement in model marketplaces and RAG vendors
- Sectors: enterprise software, data platforms.
- What: Buyers verify that vendors did not silently downgrade models mid-session; marketplaces penalize misrepresentation using automated on-ellipse checks.
- Tools/products: SLA monitors; dispute automation; marketplace trust scores.
- Assumptions/dependencies: Uniform access to logprobs; versioned ellipse registries; low false-positive calibration under epsilon smoothing.
- Anti-forgery in adversarial environments (low-trust integrations)
- Sectors: fintech, ad-tech, cybersecurity.
- What: Use on-ellipse verification to prevent “proxy attacks” where a service blends outputs from multiple models; ensures quoted risk/compliance models are actually used.
- Tools/products: Inline verifiers in SDKs; anomaly detection on ellipse distances across sessions.
- Assumptions/dependencies: Access to enough token logprobs (≥ d per step or across steps) to robustly test; guardrails against replay of prior, valid logprob vectors.
- Educational integrity workflows (with platform support)
- Sectors: education.
- What: Assignment tools capture optional logprob traces; instructors or proctors verify that claimed outputs came from allowed models (e.g., “foundation-only,” no external tools).
- Tools/products: LMS plugins; proctoring extensions storing minimal provenance.
- Assumptions/dependencies: Participation and consent for logging; platform-level logprob support; policies for privacy-preserving storage.
- Safety-critical system provenance (robotics, healthcare devices)
- Sectors: robotics, healthcare, aviation.
- What: Devices using LLM planners log logprobs to enable post-incident forensic verification of the onboard model identity and version.
- Tools/products: Embedded verifiers; black-box recorders that store compressed logprob summaries.
- Assumptions/dependencies: Real-time logging overhead acceptable; stable final normalization layers; retention and encryption at the edge.
- Payment, billing, and cost-transparency verification
- Sectors: cloud/AI platforms, finance ops.
- What: Customers spot-check bills by verifying that billed calls were served by the charged model tier, using on-ellipse checks of sampled interactions.
- Tools/products: Billing auditors; customer-facing verification endpoints; receipts annotated with model-ellipse IDs.
- Assumptions/dependencies: Provider cooperation; sampling strategies that do not leak sensitive context.
- Hybrid cryptographic proofs combining zk and ellipses
- Sectors: advanced security, web3, privacy tech.
- What: Use the ellipse as a lightweight pre-filter plus zero-knowledge proofs (zkLLM) on a subset of steps to get stronger guarantees with reduced cost.
- Tools/products: zk circuits that verify on-ellipse constraints; probabilistic proof schedulers.
- Assumptions/dependencies: Advances in efficient zk for high-dimensional linear algebra; agreement on acceptable proof coverage.
- API policy design and threat modeling
- Sectors: AI providers, policy.
- What: Use the paper’s hardness results (O(d^3 log d) query cost, O(d^6) fitting) to guide safe defaults: rate-limit logprob access, expose enough for verification but not extraction, size models to raise forging cost.
- Tools/products: Provider policy playbooks; configurable “verification windows” (limited-time logprob access); automatic ellipse rotation on major releases.
- Assumptions/dependencies: Ongoing monitoring for algorithmic advances that could reduce extraction complexity; balancing verifiability with secrecy.
- Insurance and risk scoring for AI services
- Sectors: insurance, risk management.
- What: Underwriters require verifiable provenance controls (ellipse checks) as a condition for coverage or premium discounts.
- Tools/products: Control frameworks referencing LLM-MAC; attestations in underwriting questionnaires.
- Assumptions/dependencies: Market adoption of provenance controls; standardized evidence formats.
Notes on feasibility across applications
- The approach depends on access to logprobs for at least d tokens per verified step (or agreement on a fixed subset across steps). Many commercial APIs currently limit logprob access; provider cooperation or new API modes may be needed.
- Verification requires knowledge of the model’s final-layer ellipse parameters or a verification oracle. Providers can share only those parameters (not full weights), but policy and trust frameworks must govern sharing and rotation.
- Numerical thresholds must account for normalization “smoothing” (epsilon) and inference noise; calibration data is needed to set robust acceptance bands (see the sketch after this list).
- The signature authenticates model identity per step but does not bind to a specific input or prevent replay. Complementary measures (timestamps, nonces, cryptographic receipts, inversion-based coherence checks) are recommended when replay or splicing is a concern.
- Future architectural changes removing final normalization could weaken this method; today, virtually all widely used LLMs retain such layers.
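One way to set such acceptance bands, sketched under the same toy-model assumptions as the earlier blocks (random stand-in weights, simulated inference noise): measure on-ellipsoid distances for known-genuine outputs, take a high percentile as the threshold, and flag anything above it.

```python
import numpy as np

rng = np.random.default_rng(4)
d, v = 8, 50
W = rng.normal(size=(v, d))               # random stand-in for the final-layer map
C = np.eye(v) - np.ones((v, v)) / v
CW = C @ W
P = np.linalg.pinv(CW)                    # precompute the pull-back map

def distance(lp):
    y = lp - lp.mean()
    h = P @ y
    return np.linalg.norm(CW @ h - y) + abs(np.linalg.norm(h) - 1.0)

def genuine_logprobs():
    h = rng.normal(size=d)
    h /= np.linalg.norm(h)
    z = W @ h
    lp = z - np.log(np.exp(z).sum())
    return lp + 1e-9 * rng.normal(size=v)          # simulated rounding / inference noise

# Calibrate: 99.9th percentile of distances over known-genuine samples.
calib = np.array([distance(genuine_logprobs()) for _ in range(1000)])
threshold = np.quantile(calib, 0.999)

print(distance(genuine_logprobs()) <= threshold)                  # True: genuine accepted
print(distance(np.log(rng.dirichlet(np.ones(v)))) <= threshold)   # False: off-model rejected
```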
Glossary
- Affine transformation: A linear transformation followed by a translation; it preserves points, straight lines, and planes. "A typical LLM's final layers consist of normalization followed by a linear (or affine) transformation."
- API-protected LLMs: Models accessible through an API that restricts direct access to parameters or full outputs. "Ellipse extraction from API-protected LLMs is extremely difficult"
- Backdoors: Hidden behaviors intentionally trained into models that activate under specific inputs, often used to identify or manipulate models. "One common approach to fingerprinting is to train backdoors into LLMs"
- Centered logits: Logits adjusted by subtracting their mean, making their entries sum to zero. "Centered logits also lie on an ellipsoid"
- Centering: The operation of subtracting the mean from a vector so its entries sum to zero. "The centering operation on a vector subtracts the mean value of the vector entries"
- Chain-of-thought outputs: Model-generated reasoning steps expressed in text, whose patterns can serve as fingerprints. "These include analysis of patterns in chain-of-thought outputs"
- Cholesky decomposition: A matrix factorization for symmetric positive definite matrices into a product of a lower-triangular matrix and its transpose. "using Cholesky and singular value decomposition to find the latter."
- Column space: The subspace spanned by the columns of a matrix, relevant to the transformation of outputs. "different vocabularies and column spaces"
- Cross-entropy: A measure of difference between two probability distributions, often used to evaluate projections of outputs. "such that the cross-entropy between the original and projected outputs is minimized"
- Diagonal matrix: A square matrix with nonzero entries only on its main diagonal, often representing axis-wise scaling. "is a diagonal (i.e., scaling) matrix"
- Down-projection: Mapping higher-dimensional vectors into a lower-dimensional space to simplify analysis. "Choose a down-projection"
- Ellipse extraction: The process of recovering the parameters of a model’s output ellipsoid from observed outputs. "Ellipse extraction from API-protected LLMs is extremely difficult"
- Ellipse signature: A model-identifying property based on the constraint that outputs lie on a specific ellipsoid. "This ellipse signature has unique properties that distinguish it from existing model-output association methods like LLM fingerprints."
- Ellipsoid fitting: Estimating an ellipsoid that best fits a set of points, often via optimization methods. "We use fast algorithms for multidimensional ellipsoid fitting using semidefinite programming"
- Fingerprint (LLM fingerprinting): Methods that encode or discover identifiable signals in model outputs or behaviors. "LLM fingerprints"
- Forgery resistance: The property of being difficult to imitate or replicate convincingly without secret information. "ellipse signatures are forgery-resistant"
- Geodesic distance: The shortest distance between two points on a manifold; here used to compare rotations. "we use the geodesic distance …"
- Generalized inverse: A matrix inverse defined for non-square or singular matrices that satisfies certain properties (e.g., pseudoinverse). "… is a generalized inverse of …"
- Geometric constraint: Structural restrictions on outputs induced by model architecture or parameters. "geometric constraints imposed by the LLM architecture and parameters."
- Hidden size: The dimensionality of the model’s internal representations (embeddings), typically denoted as d. "vocabulary size much larger than their hidden size d"
- Hyperellipsoid: A high-dimensional generalization of an ellipse (ellipsoid). "a high-dimensional ellipse (a hyperellipsoid)"
- Hypothesis testing: Statistical testing used to detect signals such as watermarks across multiple samples. "use hypothesis testing to verify that text has been watermarked."
- Isometric transform: A distance-preserving linear transformation, used to simplify the geometry of normalized outputs. "we can apply an isometric transform that rotates … to align … with an axis"
- Layer norm: A normalization technique that centers and scales features across a layer’s dimensions. "Model architects tend to choose one of two normalization schemes: the root-mean-square (RMS) norm or the layer norm"
- Linear signature: A model-identifying property based on linear constraints in output space. "This differentiates the ellipse signature from previously known linear signatures"
- logit_bias parameter: An API feature allowing adjustments to the bias of specific token logits during inference. "the logit_bias parameter allows users to find the logprob of any token"
- Logits: Pre-softmax scores over the vocabulary produced by a model’s final linear layer. "LLM logits are subject to constraints that force them to lie on a high-dimensional ellipse."
- Log-probabilities (logprobs): Logarithms of predicted token probabilities, often exposed by APIs. "LLM APIs usually return log-probabilities (logprobs)"
- Message Authentication Code (MAC): A cryptographic tag that verifies message integrity and authenticity using a shared secret. "a signer sends a message along with a message authentication code (MAC) to a verifier."
- Positive definite: A property of symmetric matrices where all eigenvalues are positive, ensuring a valid ellipsoid. "the parameter … output by the fitting algorithm was not positive definite."
- Pseudoinverse: A specific generalized inverse (Moore–Penrose) used to solve least-squares problems and invert non-square matrices. "where … denotes a pseudoinverse."
- Quadric surface: A second-degree algebraic surface (including ellipsoids) described by a quadratic form. "The algorithm fits a quadric surface to the points"
- Query complexity: The number of API calls needed to perform an extraction or computation, measured in asymptotic terms. "with O(d^3 log d) query complexity in an OpenAI-like API"
- Root-Mean-Square (RMS) norm: A normalization that scales vectors by their root-mean-square magnitude. "the root-mean-square (RMS) norm"
- Rotation matrix: An orthogonal matrix representing a rotation in space; used to reparameterize output layers. "are unitary (i.e., rotation) matrices"
- Semidefinite programming: An optimization framework for problems with semidefinite matrix constraints, used in ellipsoid fitting. "ellipsoid fitting using semidefinite programming"
- Singular value decomposition (SVD): A factorization of a matrix into singular vectors and singular values, used to recover rotations and scales. "SVD-based ellipsoid fitting"
- Softmax invariance (to scalar addition): The property that adding a constant to all logits does not change softmax probabilities. "Since the softmax function is invariant to scalar addition"
- Strassen's algorithm: A fast matrix multiplication algorithm with sub-cubic complexity. "Faster methods, such as those based on Strassen's algorithm"
- Time complexity: The computational running time of an algorithm, expressed in big-O notation. "actually fitting the ellipse has O(d^6) time complexity."
- Trapdoor function: A function that is easy to compute in one direction but hard to invert without a secret, enabling authentication. "creates a type of trapdoor function"
- Unembedding matrix: The final linear transformation mapping hidden states to vocabulary logits. "the unembedding matrix"
- Unitary matrix: A matrix whose inverse equals its conjugate transpose; in real-valued cases, an orthogonal (rotation) matrix. "are unitary (i.e., rotation) matrices"
- Up-projection: Mapping lower-dimensional representations back into a higher-dimensional space. "Solve for up-projection"
- Watermarks: Intentional signals embedded in generation (often via decoding) to later identify model outputs. "Text-based watermarks, a subclass of fingerprint methods"
- Zero-knowledge proof (zkLLM): A protocol that proves model-generated outputs without revealing model details. "propose a zero-knowledge proof for LLMs (zkLLM)"