Accelerating mathematical research with language models: A case study of an interaction with GPT-5-Pro on a convex analysis problem
Abstract: Recent progress in LLMs has made them increasingly capable research assistants in mathematics. Yet, as their reasoning abilities improve, evaluating their mathematical competence becomes increasingly challenging. The problems used for assessment must be neither too easy nor too difficult, their performance can no longer be summarized by a single numerical score, and meaningful evaluation requires expert oversight. In this work, we study an interaction between the author and an LLM in proving a lemma from convex optimization. Specifically, we establish a Taylor expansion for the gradient of the biconjugation operator (that is, the operator obtained by applying the Fenchel transform twice) around a strictly convex function, with assistance from GPT-5-pro, OpenAI's latest model. Beyond the mathematical result itself, whose novelty we do not claim with certainty, our main contribution lies in documenting the collaborative reasoning process. GPT-5-pro accelerated our progress by suggesting relevant research directions and by proving some intermediate results. However, its reasoning still required careful supervision, particularly to correct subtle mistakes. While limited to a single mathematical problem and a single LLM, this experiment illustrates both the promise and the current limitations of LLMs as mathematical collaborators.
Explain it Like I'm 14
Overview
This paper is about two things:
- A small but tricky math result in convex analysis (a branch of math that studies “bowl-shaped” functions).
- A case study showing how a powerful LLM (GPT-5-pro) helped a human mathematician think through and prove that result.
The math result is a local “first-order” expansion for the gradient of a function after applying a special double transform called the biconjugate. It connects to optimal transport, which is about moving mass (like sand) from one place to another in the cheapest way.
Main Topic in Simple Terms
Think of a convex function φ as a smooth bowl. Its gradient ∇φ is like a map telling you the direction of steepest climb at each point. Now take a small "perturbation" t·h (like a tiny ripple added to the bowl), where t is a small number and h is a smooth test function. The paper studies what happens to the gradient after you:
- add the ripple, and then
- apply a double transform called the biconjugate, written (φ + t·h)**, which "convexifies" the function if needed.
The key statement they prove is that, at most points x where the bowl is nicely curved, the change in the gradient is exactly what you'd expect to first order: ∇(φ + t·h)**(x) = ∇φ(x) + t·∇h(x) + o(t), where o(t) means "an error that is much smaller than t as t goes to 0."
Key Questions and Objectives
The paper asks:
- If we add a tiny smooth bump t·h to a convex function φ and then take the biconjugate, how does the gradient change at a point x?
- Can we prove a clean first-order formula for that change, and is the "tiny error" term actually zero for t small enough at that point?
- How well can an LLM like GPT-5-pro help with discovering and proving such a math fact?
The main objective is to prove the formula ∇(φ + t·h)**(x) = ∇φ(x) + t·∇h(x) + o(t) under reasonable assumptions (especially that φ is strictly convex and "twice differentiable" at x, meaning it has a well-defined curvature there), and to document the collaboration with GPT-5-pro.
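In symbols, the target statement can be written as follows. This is a LaTeX sketch of the lemma as described in this summary (hypotheses paraphrased), not the paper's verbatim statement:

```latex
% Sketch of the target lemma, in the notation of this summary.
% Assume \varphi is convex, twice differentiable at x with \nabla^2\varphi(x) \succ 0,
% and h is a smooth, compactly supported perturbation.
\[
  \nabla(\varphi + t\,h)^{**}(x) \;=\; \nabla\varphi(x) + t\,\nabla h(x) + o(t), \qquad t \to 0,
\]
% and, more strongly, there is a threshold t_x > 0 (depending on x, \varphi, and h) such that
\[
  \nabla(\varphi + t\,h)^{**}(x) \;=\; \nabla\varphi(x) + t\,\nabla h(x)
  \qquad \text{for all } 0 \le t \le t_x .
\]
```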
Methods and Approach Explained Simply
Here’s how they tackled it:
- Convex analysis tools:
- A convex function is "bowl-shaped." At a good point x, you can approximate it by a quadratic bowl: φ(y) ≈ φ(x) + ⟨∇φ(x), y − x⟩ + ½⟨∇²φ(x)(y − x), y − x⟩.
- The Hessian ∇²φ(x) is a matrix describing the local curvature of the bowl. "Positive definite" means the bowl curves up in all directions (no flat directions).
- The biconjugate φ** is a way to take any function φ and replace it by the largest convex function lying below it (its convex envelope). If φ is already convex, φ** = φ everywhere.
- The core trick:
- Define φ_t = φ + t·h.
- Show that, at the specific point x, φ_t**(x) = φ_t(x) when t is small enough. In other words, "convexifying" doesn't change its value at x for small t.
- Use a careful local inequality: the true function stays above its tangent plane at x with a positive margin, both for points near x and for points far enough from x. This margin survives the tiny perturbation if t is small.
- Once φ_t**(x) = φ_t(x) is known, they use a standard convex analysis argument (subgradients) to conclude the gradient also matches at x: ∇φ_t**(x) = ∇φ_t(x) = ∇φ(x) + t·∇h(x) (a short sketch of this step appears at the end of this section).
- That gives the desired first-order formula for the gradient.
- Role of GPT-5-pro:
- The model suggested promising directions (like focusing on local behavior at the point x and seeking conditions under which convexification is "inactive" there).
- It helped prove intermediate inequalities (for example, local lower bounds).
- It made some mistakes (like assuming strong convexity on a whole neighborhood from pointwise information), and the human author corrected and refined the approach.
- Together, the process led to a clean and rigorous proof.
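To make the subgradient step above concrete, here is a short LaTeX sketch of the standard argument, writing φ_t = φ + t·h. It reconstructs the reasoning described in this summary rather than quoting the paper:

```latex
% Why equality of values at x forces equality of gradients at x.
% Biconjugation never increases a function, so \varphi_t^{**} \le \varphi_t everywhere,
% and by the previous step \varphi_t^{**}(x) = \varphi_t(x).
% Let p \in \partial\varphi_t^{**}(x) (nonempty, since \varphi_t^{**} is finite and convex).
% Then, for all y,
\[
  \varphi_t(y) \;\ge\; \varphi_t^{**}(y) \;\ge\; \varphi_t^{**}(x) + \langle p,\, y - x\rangle
  \;=\; \varphi_t(x) + \langle p,\, y - x\rangle .
\]
% Since \varphi_t is differentiable at x, the only vector p satisfying this support
% inequality is \nabla\varphi_t(x). Hence \partial\varphi_t^{**}(x) = \{\nabla\varphi_t(x)\} and
\[
  \nabla\varphi_t^{**}(x) \;=\; \nabla\varphi_t(x) \;=\; \nabla\varphi(x) + t\,\nabla h(x).
\]
```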
Main Findings and Why They Matter
- The main result:
- At almost every point x where φ has nice curvature, the gradient of the biconjugate of φ + t·h changes exactly as if you had just added t·h and taken the gradient:
- ∇(φ + t·h)**(x) = ∇φ(x) + t·∇h(x) + o(t).
- Even better, the o(t) error is actually zero for t small enough at that point x. So for small t, the equality is exact at x.
- Why it’s important:
- In optimal transport (moving mass as efficiently as possible), the optimal map is the gradient of a convex potential. Understanding how this gradient changes when you slightly tweak the potential helps study sensitivity and dynamics.
- The result is a neat, practical tool: if you gently perturb a convex potential, the immediate change in the gradient is exactly t·∇h, with no hidden surprise from the convexification step, at least at the point x and for small perturbations.
- About the collaboration:
- GPT-5-pro sped up problem solving by proposing the right kind of local argument and key conjecture.
- However, the human researcher still needed to check details, spot errors, and guide the proof to completion.
- This shows that LLMs can be valuable thought partners in math, but they are not yet fully reliable on their own.
Implications and Impact
- For mathematics and optimal transport:
- The result adds a simple, local sensitivity formula for gradients of convex potentials under small perturbations, useful in analysis and applications.
- For AI-assisted research:
- This case study shows the promise of advanced LLMs as math collaborators: they can suggest ideas and partial proofs that meaningfully speed up research.
- It also highlights present limits: careful human oversight is essential to validate and correct the reasoning.
- More systematic ways to evaluate and use such collaborations could help the math community benefit from AI tools while maintaining rigor and reliability.
Final Takeaway
If you add a tiny, smooth bump t·h to a nicely curved convex "bowl" and then do the standard "make it convex" transformation, the gradient at a good point x changes exactly by t·∇h(x), with no extra surprises to first order. And with GPT-5-pro's help, the author found and proved this result faster, while showing the importance of human guidance to ensure the math is correct.
Knowledge Gaps
Knowledge gaps, limitations, and open questions
Below is a single, prioritized list of what remains missing, uncertain, or unexplored in the paper, formulated so future researchers can act on it.
Mathematical gaps and extensions
- Formal literature placement: Clarify whether the first-order expansion for the gradient of (φ + t·h)** at a point x where ∇²φ(x) is positive definite is novel. Provide precise conditions and citations (e.g., in epi-derivative theory) under which biconjugation preserves first-order jets, and identify where existing results fall short of guaranteeing the pointwise identities proved here.
- Minimal regularity on the perturbation: The proof assumes h is C² with compact support and global bounds. Determine the weakest regularity and growth conditions on h (e.g., weaker smoothness classes, Lipschitz gradients, or non-compact support with local bounds) under which the gradient identity still holds.
- Minimal convexity assumptions on φ: Assess whether strict convexity and twice differentiability at x are necessary. Characterize the behavior at points where ∂φ(x) is not a singleton, or where ∇²φ(x) exists but is only positive semidefinite, and identify conditions yielding a first-order gradient expansion or counterexamples when it fails.
- Local versus pointwise equality: The paper establishes (φ + t·h)**(x) = (φ + t·h)(x) only at the base point x for small t. Provide criteria ensuring local (neighborhood) equality of (φ + t·h)** and φ + t·h (e.g., curvature thresholds, quantitative smallness of t, or continuity/modulus conditions on the second-order data), and delineate when convexification necessarily becomes active near x.
- Explicit control of the constants: Make the threshold t_x and the auxiliary radius δ quantitative in terms of local data (e.g., the smallest eigenvalue of ∇²φ(x), pointwise Taylor remainder bounds, and local norms of h). Establish computable bounds or a measurable selection of these thresholds suitable for integrating over sets or measures.
- Uniform-in-x bounds: Identify conditions (e.g., regularity on a compact set and a uniform lower bound on curvature) that yield a uniform threshold in t such that the gradient identity holds for all x in a given region, enabling measure-level differentiation without pointwise thresholds.
- Second-order sensitivity: The result gives exact first-order behavior for small t at x. Develop second-order expansions for ∇(φ + t·h)**(x) (and possibly for (φ + t·h)** itself), including explicit remainder bounds and conditions under which the map t ↦ ∇(φ + t·h)**(x) admits a controlled Taylor expansion in t.
- Epi-derivative framework: Provide a rigorous derivation connecting the paper's pointwise argument to epi-derivatives and proto-derivatives of subdifferentials, specifying exact hypotheses under which biconjugation preserves first-order derivatives at points with unique subgradients.
- Extension beyond Fenchel transforms: Investigate whether an analogous gradient expansion holds for c-transforms (general cost functions) and identify structural conditions on the cost c ensuring the same first-order behavior of the corresponding "double transform" operator.
- Non-Euclidean settings: Explore extensions to Banach spaces or Riemannian manifolds, and identify which parts of the argument rely on Euclidean geometry (e.g., Carathéodory-type representations, quadratic expansions) versus those that generalize.
- Stability for finite t: Characterize when convexification becomes active as t increases, and provide thresholds or bifurcation criteria for the onset of nontrivial convex envelope effects in (φ + t·h)**.
- Measure-level consequences: The optimal transport motivation is to study the curve of measures obtained by pushing a reference measure forward through ∇(φ + t·h)**. The paper does not analyze regularity, continuity, or differentiability in t of this curve, nor derivatives of integral functionals (e.g., entropy) along it. Establish such properties under the paper's local hypotheses, ideally with uniform-in-x controls.
Methodological limits in LLM-assisted math
- Generalizability of the case study: The paper evaluates one problem with one model. Propose a systematic protocol (problem selection, difficulty calibration, expert oversight criteria, error taxonomy) to assess LLM contributions across diverse mathematical domains and levels of rigor.
- Reproducibility and benchmarking: Provide reproducible artifacts (prompts, transcripts, environment details) and standardized benchmarks for the kind of local convex-analytic reasoning demonstrated, enabling independent validation and comparison across models.
- Quantifying impact: Develop metrics to measure how LLM suggestions change time-to-proof, error rates, or proof quality (e.g., number and severity of corrections, proportion of salvageable lemmas, novelty of key ideas), rather than purely qualitative narratives.
- Handling subtle errors: The interaction revealed recurring issues (e.g., conflating local quadratic lower bounds with strong convexity on a neighborhood). Create checklists or automated verifications for common pitfalls in convex analysis (uniform curvature, envelope localization, differentiability transfer through transforms).
Practical Applications
Immediate Applications
Below are actionable uses that can be deployed now, based on the paper’s mathematical result and the documented LLM–human collaboration workflow.
- Safe local linearization and step-size control for convex potential perturbations (software, ML, optimization)
- Application: When updating a convex potential φ by adding a small perturbation t·h (e.g., regularization, side constraints, or learned corrections), use the paper's local thresholds to guarantee that the convex envelope (biconjugate) is inactive at the point x and the gradient update is exact: ∇(φ + t·h)**(x) = ∇φ(x) + t·∇h(x).
- Tools/workflows: Implement a "local safety check" in Python/Julia OT libraries (e.g., POT, GeomLoss, CVXPY) that:
- Estimates λ = λ_min(∇²φ(x)) (or a numerical proxy) and bounds L = ||∇²h||_∞, M = ||h||_∞.
- Chooses t within t_x = min(λ/(4L), λδ²/(64M)) for a small δ determined by local Taylor control.
- Applies the exact gradient update without computing conjugates globally (a minimal sketch follows this entry).
- Assumptions/dependencies: φ twice differentiable at x with ∇²φ(x) ≻ 0; h ∈ C²_c; numerical estimation of λ, L, M, and δ available.
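A minimal numpy sketch of such a check is below. The function names (local_step_threshold, perturbed_gradient) are hypothetical, the constants mirror the threshold quoted above, and a production implementation would need robust Hessian/eigenvalue estimation and an explicit fallback when the check fails:

```python
import numpy as np

def local_step_threshold(hess_phi_x, hess_h_sup, h_sup, delta):
    """Hypothetical helper: largest safe t at x, per t_x = min(lam/(4L), lam*delta**2/(64*M)).

    hess_phi_x : (d, d) Hessian of phi at x (assumed positive definite).
    hess_h_sup : float, bound L on the operator norm of the Hessian of h.
    h_sup      : float, bound M on |h|.
    delta      : float, radius of local Taylor control around x.
    """
    lam = np.linalg.eigvalsh(hess_phi_x).min()  # smallest curvature of phi at x
    if lam <= 0.0:
        return 0.0  # assumptions violated: no safe step at this point
    return min(lam / (4.0 * hess_h_sup), lam * delta**2 / (64.0 * h_sup))

def perturbed_gradient(grad_phi_x, grad_h_x, t, t_max):
    """Exact first-order update when t is within the safe range; None otherwise."""
    if 0.0 <= t <= t_max:
        # Convexification is inactive at x, so the update is exact:
        # grad (phi + t h)**(x) = grad phi(x) + t * grad h(x)
        return grad_phi_x + t * grad_h_x
    return None  # caller should fall back to a full conjugation / envelope computation

# Toy usage on phi(x) = 0.5 * |x|^2 perturbed by a smooth bump h.
hess_phi = np.eye(2)                # Hessian of phi at x
grad_phi = np.array([1.0, -0.5])    # gradient of phi at x
grad_h = np.array([0.2, 0.1])       # gradient of h at x
t_max = local_step_threshold(hess_phi, hess_h_sup=3.0, h_sup=1.0, delta=0.5)
print(perturbed_gradient(grad_phi, grad_h, t=min(0.01, t_max), t_max=t_max))
```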
- First-order sensitivity analysis for Monge transport maps (ML, data science, geospatial, imaging)
- Application: In pipelines that use gradients of convex potentials as transport maps (Brenier maps), compute how the map changes under small perturbations of the potential using the identity ∂/∂t|_{t=0} ∇(φ + t·h)**(x) = ∇h(x) almost everywhere.
- Tools/workflows: Add a "sensitivity mode" to OT routines that outputs the local velocity field v_0(y) = ∇h(∇φ*(y)) (a worked toy sketch follows this entry) for:
- Domain adaptation and dataset alignment (ML).
- Image registration updates and shape morphing (healthcare imaging).
- Real-time geospatial mass balancing (energy/logistics).
- Assumptions/dependencies: Availability of φ* numerically (or ∇φ* via inverse map), a.e. differentiability of φ, and smooth h.
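As a worked toy example of the "sensitivity mode" output: when φ is a positive-definite quadratic, ∇φ* is an explicit linear map, so v_0(y) = ∇h(∇φ*(y)) can be evaluated directly. The sketch below uses hypothetical names (sensitivity_field, a Gaussian bump for h) and illustrates the formula only, not any library API:

```python
import numpy as np

# Toy potential phi(x) = 0.5 * <A x, x> with A symmetric positive definite,
# so grad phi(x) = A x and grad phi*(y) = A^{-1} y (the inverse Brenier map).
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])
A_inv = np.linalg.inv(A)

def grad_h(x):
    """Hypothetical smooth perturbation h(x) = exp(-|x|^2), so grad h(x) = -2 x exp(-|x|^2)."""
    return -2.0 * x * np.exp(-np.dot(x, x))

def sensitivity_field(y):
    """v_0(y) = grad h(grad phi*(y)): first-order velocity of the transport map at y."""
    x = A_inv @ y  # grad phi*(y), the point that the Brenier map grad phi sends to y
    return grad_h(x)

# Evaluate the local sensitivity of the map at a few target points y.
for y in [np.array([1.0, 0.0]), np.array([0.0, 2.0])]:
    print(y, sensitivity_field(y))
```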
- Faster gradient-based training in convex-potential models and OT-guided generative modeling (ML, software)
- Application: Use the exact local gradient expansion to compute parameter updates for potential-based models (e.g., normalizing flows with convex potentials, energy models with Fenchel structures) without needing full conjugation steps near well-behaved points.
- Tools/workflows: A PyTorch/JAX layer that wraps convex-potential modules and switches to the local formula when Hessian checks pass, falling back to standard methods otherwise (a minimal sketch follows this entry).
- Assumptions/dependencies: Reliable local Hessian estimation; compactly supported or bounded-curvature perturbations.
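A hedged PyTorch sketch of such a gating layer is below; hessian_gate_update is a hypothetical name, the curvature tolerance is illustrative, and a real module would batch the check and cache results rather than recompute per point:

```python
import torch

def hessian_gate_update(phi, h, x, t, curvature_tol=1e-6):
    """Hypothetical gate: if the Hessian of phi at x is safely positive definite,
    apply the exact local update grad(phi + t*h)**(x) = grad phi(x) + t * grad h(x);
    otherwise return None so the caller falls back to a full conjugation step.

    phi, h : callables mapping a 1-D tensor x to a scalar tensor.
    """
    x = x.detach().requires_grad_(True)
    hess = torch.autograd.functional.hessian(phi, x)       # (d, d) Hessian of phi at x
    if torch.linalg.eigvalsh(hess).min() <= curvature_tol:
        return None  # curvature check failed: convexification may be active near x
    grad_phi = torch.autograd.grad(phi(x), x)[0]
    grad_h = torch.autograd.grad(h(x), x)[0]
    return (grad_phi + t * grad_h).detach()

# Toy usage with a strongly convex potential and a smooth bump perturbation.
phi = lambda z: 0.5 * (z * z).sum()
h = lambda z: torch.exp(-(z * z).sum())
print(hessian_gate_update(phi, h, x=torch.tensor([1.0, -0.5]), t=0.01))
```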
- LLM-in-the-loop mathematical research protocol (academia, education)
- Application: Adopt the paper’s supervision-centric workflow to use frontier LLMs as math collaborators:
- Prompt LLMs to propose conjectures and intermediate lemmas.
- Require human oversight for checking subtle steps (e.g., strong convexity misclaims).
- Log and annotate interactions for reproducibility and peer review.
- Tools/workflows: “Research Diary” templates, prompt libraries for convex analysis, and versioned interaction repositories (e.g., Git + Markdown + PDF exports).
- Assumptions/dependencies: Expert oversight; institutional acceptance of documented LLM contributions; storage of interaction logs.
- Teaching modules on convex analysis and optimal transport with LLM assistance (education)
- Application: Use the case study to teach:
- Fenchel transforms, biconjugation, and convex envelopes.
- Alexandrov differentiability and local expansion techniques.
- How to critique and correct LLM-generated proofs.
- Tools/workflows: Interactive notebooks (Jupyter) with exercises that guide students through local inactivity of convexification and sensitivity of transport maps, coupled with constrained LLM prompts.
- Assumptions/dependencies: Classroom policies on LLM use; curated datasets of prompts and corrections.
- Lightweight policy guidance for evaluating LLMs as math assistants (policy, R&D ops)
- Application: Institute immediate guardrails and rubrics:
- Require local correctness checks and citations for claims.
- Mandate disclosure of LLM interaction logs in submissions involving LLM-derived insights.
- Evaluate models on realistic assistant tasks, not only benchmark problem sets.
- Tools/workflows: Checklists for reviewers; submission templates with “LLM contribution” sections.
- Assumptions/dependencies: Journal/conference buy-in; privacy-safe logging.
Long-Term Applications
The following items require further research, scaling, or development before broad deployment.
- Scalable, standardized evaluation frameworks for LLM mathematical collaboration (academia, policy, software)
- Application: Build a multi-institution platform to assess LLMs as math co-authors:
- Task banks in areas like convex analysis, OT, PDEs, combinatorics.
- Multi-dimensional scoring (novelty, correctness, utility, transparency).
- Human-in-the-loop protocols and longitudinal studies.
- Tools/products: Open datasets of annotated interactions; leaderboards; plugins for formal verification systems (Lean, Isabelle).
- Assumptions/dependencies: Community governance; funding; interoperability with proof assistants; ethical data policies.
- Automated theorem discovery pipelines with formal verification backends (software, academia)
- Application: Combine LLM conjecture generation with automated convex analysis checks and formal proof verifiers to reduce human burden and error rates.
- Tools/products: “ConvexProof” engines that:
- Detect local conditions (e.g., Hessian positivity at points).
- Enforce biconjugation identities and convex envelope characterizations.
- Interface with formal systems to certify proofs end-to-end.
- Assumptions/dependencies: Advances in formalization of convex analysis; robust LLM–theorem prover integration.
- High-dimensional OT solvers accelerated by local expansion heuristics (ML, robotics, energy, imaging)
- Application: Use local inactivity of convexification and first-order sensitivity to:
- Precondition large-scale OT computations.
- Warm-start iterative solvers with locally exact linearizations.
- Enable real-time adaptivity in robotics motion planning and geospatial resource flows.
- Tools/products: “Local OT Accelerator” modules for industrial solvers; APIs for dynamic map updates.
- Assumptions/dependencies: Reliable detection of “good” regions; robust fallback strategies when local assumptions fail; integration with existing PDE/OT stacks.
- Generalized first-order expansions for c-transforms and non-quadratic costs (academia, ML)
- Application: Extend the result beyond Fenchel biconjugation to broader cost structures (c-transforms), improving sensitivity analysis in more general OT settings (e.g., robust transport, Wasserstein variations).
- Tools/workflows: Research programs to derive analogous local inactivity conditions and gradient expansions for different c-costs; experimental comparison in ML applications.
- Assumptions/dependencies: New theoretical advances; stronger regularity conditions; empirical validation.
- Domain-specific LLMs for rigorous mathematical assistance with error-detection modules (software, education)
- Application: Train math-specialized LLMs that:
- Flag plausible-but-wrong reasoning steps (e.g., unjustified strong convexity).
- Suggest minimal counterexamples.
- Propose multiple proof routes with uncertainty quantification.
- Tools/products: “MathCopilot” with built-in convex analysis toolkits; examiner modes for classroom use.
- Assumptions/dependencies: High-quality training data; integration with symbolic computation; acceptance in classrooms and journals.
- Sensitivity-driven calibration in finance and risk (finance)
- Application: Use first-order map updates to calibrate transport-based scenario generators and risk transfers under small policy or market perturbations.
- Tools/products: “OT Risk Calibrator” that uses local expansions for rapid recalibration without full recomputation.
- Assumptions/dependencies: Mapping of financial constraints to convex potentials; accurate local curvature estimation; model risk controls.
- Medical imaging: robust diffeomorphic registration with local convexification guards (healthcare)
- Application: Improve registration pipelines by leveraging local inactivity of convexification to:
- Stabilize updates.
- Reduce compute on repeated biconjugation.
- Provide sensitivity maps for clinicians when tuning regularizers.
- Tools/products: Plugins for ITK/SimpleITK; research prototypes integrated in LDDMM-like frameworks.
- Assumptions/dependencies: Clinical validation; mapping of imaging objectives to convex potentials; regulatory compliance.
Notes on assumptions and dependencies that cut across applications:
- The key mathematical guarantees are local (pointwise or a.e.), not global; systems must detect and respect validity regions.
- φ must be twice differentiable at the point of interest with a positive-definite Hessian; h requires bounded curvature (C² with compact support or similar).
- Numerical estimation of Hessians and their eigenvalues is nontrivial in high dimensions; robust proxies and uncertainty-aware decisions are needed.
- LLM-driven research requires expert supervision, transparent logging, and institutional policies that balance innovation with rigor.
Glossary
- a.e. (almost everywhere): Measure-theoretic qualifier meaning a property holds except on a set of measure zero. Example: "for almost every (a.e.) x"
- absolutely continuous (a.c.): A measure is absolutely continuous with respect to another if it assigns zero mass to every set that the other does; for densities, this means having a Lebesgue density. Example: "since are a.c., we may pick a set of full -measure"
- Alexandrov Hessian: The almost-everywhere defined Hessian of a convex function (second derivative in the sense of Alexandrov). Example: "The Alexandrov Hessian ∇²φ(x) exists and is symmetric positive definite for a.e. x"
- Alexandrov’s second-order expansion: A pointwise quadratic expansion of a convex function at points of twice differentiability. Example: "By Alexandrov’s second-order expansion, for y close to x, φ(y) = φ(x) + ⟨∇φ(x), y − x⟩ + ½⟨∇²φ(x)(y − x), y − x⟩ + o(|y − x|²)"
- affine envelope: The supremum of all affine functions lying below a given function; equals the biconjugate for proper lower semicontinuous convex functions. Example: "Using the characterization of f** as the affine envelope of f (see~\cite[Section 12]{Roc70})"
- barycenter: The weighted average (center of mass) of points in a convex combination. Example: "any convex combination with barycenter x"
- biconjugate: The Fenchel biconjugate of a function, obtained by applying the Fenchel transform twice; denoted f**. Example: "The biconjugate of f is denoted f**."
- biconjugation operator: The operator mapping a function to its Fenchel biconjugate. Example: "the biconjugation operator—that is, the operator obtained by applying the Fenchel transform twice—around a strictly convex function"
- Carathéodory’s theorem (Carathéodory representation): In finite dimensions, points in a convex hull can be represented as convex combinations of at most d+1 points. Example: "By finite-dimensional convex analysis (Carathéodory), one has "
- c-transform: A generalized conjugation associated with a cost function c in optimal transport. Example: "This lemma was later extended to c-transforms by~\citet{gangbo1996geometry}"
- compact support: A function has compact support if it is zero outside a compact set. Example: "Let h ∈ C² with compact support"
- convex envelope: The largest convex function lying below a given function. Example: "on the function is already convex (indeed strongly convex), so its convex envelope coincides with itself."
- convexification: The process of replacing a function by its convex envelope or biconjugate. Example: "the convexification is inactive"
- distributional Hessian: The second derivative of a function in the sense of distributions, a matrix-valued measure for convex functions. Example: "For a convex function φ, the distributional Hessian is a symmetric positive matrix-valued measure."
- epi-convergence: A notion of convergence for functions based on convergence of epigraphs, central in variational analysis. Example: "Epi-convergence and epi-derivatives (Ch. 7, 13)."
- epi-derivative: A generalized derivative concept defined via epigraphs, used to study variational stability. Example: "Epi-convergence and epi-derivatives (Ch. 7, 13)."
- epi-differentiability: The property of having epi-derivatives; a framework for first- and second-order analysis of functionals. Example: "The right language is epi-differentiability and tilt-stability in variational analysis."
- epi-continuity: Continuity with respect to epigraphs; a stability property of function transformations. Example: "biconjugation is epi-continuous and preserves first-order epi-derivatives"
- Fenchel transform: The convex conjugate of a function, defined by f*(y) = sup_x ⟨x, y⟩ − f(x). Example: "we denote by f* its Fenchel transform"
- Fréchet differentiable: A strong notion of differentiability in normed spaces, implying linear approximation with remainder o(∥h∥). Example: "if a scalar function is twice Fréchet differentiable at a point x"
- Lipschitz (L-Lipschitz): A function whose gradient or value changes at most linearly with a constant L. Example: "for every , is –Lipschitz."
- Nesterov ODE: A continuous-time dynamical system modeling Nesterov’s accelerated gradient method. Example: "he could prove the pointwise convergence of the Nesterov ODE with help of GPT-5-pro."
- pushforward (measure-theoretic): The image measure induced by a map, denoted with the # symbol. Example: "where μ is a given probability measure and # the pushforward operation."
- strictly convex: A function whose epigraph has strictly supporting hyperplanes; line segments lie strictly above the graph except at endpoints. Example: "assuming φ is strictly convex:"
- subdifferential: The set of subgradients (supporting hyperplane slopes) of a convex function at a point. Example: "For every x, the subdifferential of φ at x, denoted ∂φ(x)"
- subgradient reciprocity: The duality relation between subgradients of a convex function and its conjugate at paired points. Example: "subgradient reciprocity + a.e. differentiability"
- tilt-stability: Stability of minimizers under linear perturbations (tilts) of the objective, analyzed via second-order variational tools. Example: "The right language is epi-differentiability and tilt-stability in variational analysis."
- variational analysis: The study of optimization and stability via generalized differentiation and epigraphical methods. Example: "Rockafellar–Wets, Variational Analysis (1998):"