Optimization in Theory and Practice (2510.15734v1)
Abstract: Algorithms for continuous optimization problems have a rich history of design and innovation over the past several decades, in which mathematical analysis of their convergence and complexity properties plays a central role. Beyond their theoretical properties, optimization algorithms are also interesting for their practical usefulness as computational tools for solving real-world problems. There are often gaps between the practical performance of an algorithm and what can be proved about it. These two facets of the field -- the theoretical and the practical -- interact in fascinating ways, each driving innovation in the other. This work focuses on the development of algorithms in two areas -- linear programming and unconstrained minimization of smooth functions -- outlining major algorithm classes in each area along with their theoretical properties and practical performance, and highlighting how advances in theory and practice have influenced each other in these areas. In discussing theory, we focus mainly on non-asymptotic complexity: upper bounds on the amount of computation required by a given algorithm to find an approximate solution of problems in a given class.
Explain it Like I'm 14
Overview
This paper is about “optimization,” which simply means finding the best choice among many possibilities. Think of picking the cheapest, fastest, or most efficient option that still follows the rules. The author explores two big types of optimization:
- Linear programming (LP): Problems where everything is “straight-line” (linear), like costs and rules.
- Unconstrained smooth optimization: Problems with a smooth “hill” or “valley” shaped function you want to go down (or up), without extra rules.
The paper explains how mathematicians design algorithms to solve these problems, prove that they work, and measure how fast they can find good answers. It also shows how theory and real-world practice influence each other.
Key Questions
The paper asks simple but deep questions like:
- Can we design algorithms that always get close to the best answer?
- How many steps or how much computer work do these algorithms need?
- Why do algorithms sometimes work much faster in practice than theory predicts?
- How can we bridge the gap between what we can prove and what we see in real-life problems?
Methods and Approach
The paper is a guided tour, not a single experiment. It explains:
- What the problems look like:
- Unconstrained smooth optimization: “Minimize f(x)” where f is a smooth function. You can imagine standing on a landscape and walking downhill to the lowest spot. The “gradient” is like the slope arrow telling you which way is steepest down.
- Linear programming (LP): “Minimize cᵀx” with rules Ax = b and x ≥ 0. Think of choosing amounts of different items (x) so you meet exact requirements (Ax = b), never pick negative amounts (x ≥ 0), and keep cost low (cᵀx).
- What “optimality conditions” are: These are tests that say, “You’re at the best point,” or “You’re close.” For smooth functions, being at a point where the gradient is zero means you’re at a flat spot — possibly the best point locally. For LP, special algebraic conditions involving both the main problem and a matching “dual” problem mark a true solution.
- How we measure algorithm speed:
- Iteration complexity: How many steps until you’re within a small error ε?
- Operation complexity: How many basic arithmetic operations (like +, −, ×, ÷) does it take? Think of this as total effort.
- Oracle complexity: How many times do we need to “ask” for information about the function (like “what’s f(x) and its gradient here?”). This is useful for problems where the main cost is evaluating the function.
- Why theory vs practice can differ:
- Worst-case scenarios are rare in real life.
- Real problems often have extra structure that algorithms can exploit.
- Smart algorithm tricks (like adapting step sizes) help in practice but are hard to capture neatly in proofs.
- Some nonconvex problems (which are “bumpy”) surprisingly behave nicely in many modern applications, like machine learning.
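The "walk downhill" picture above can be sketched in a few lines of code. This is a minimal illustration rather than an algorithm from the paper; the bowl-shaped objective, step size, and tolerance are invented for the example.

```python
import numpy as np

def gradient_descent(grad, x0, lr=0.1, tol=1e-8, max_iter=10_000):
    """Walk downhill: repeatedly step against the gradient (the slope arrow)."""
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= tol:   # flat enough: approximately stationary
            return x, k
        x = x - lr * g                 # take a small downhill step
    return x, max_iter

# Smooth bowl-shaped landscape f(x) = 0.5 * ||x - c||^2, lowest point at c.
c = np.array([3.0, -1.0])
x_star, iters = gradient_descent(lambda x: x - c, np.zeros(2))
```

The stopping test ‖∇f(x)‖ ≤ tol is exactly the kind of approximate optimality condition the paper uses so that algorithms can terminate finitely.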
Main Results and Why They Matter
Here are the big takeaways explained in everyday terms:
- Simplex method (for LP):
- Picture the allowed solutions as a many-sided shape (a polyhedron). Simplex walks from corner to corner to improve the objective.
- In the worst case, this walk can take a very long time (exponential). But in practice, it’s often fast.
- Smoothed analysis (adding tiny random noise to the data) shows that, on average, simplex behaves well — giving a more realistic view of why it works in the real world.
- Ellipsoid method (for LP):
- Imagine enclosing all possible solutions inside a big “bubble” (ellipsoid) and shrinking it repeatedly.
- It was the first method proven to run in polynomial time (good in theory), but it’s slow in practice.
- Karmarkar’s projective algorithm and interior-point methods (for LP):
- Instead of walking along the edges like simplex, these methods move smoothly through the inside of the allowed region, guided by math that keeps them away from the boundary.
- They have strong theory (polynomial-time guarantees) and are also fast in practice — a win-win.
- Primal-dual interior-point methods, especially Mehrotra’s predictor-corrector approach, became the standard in high-quality LP software.
- Unconstrained smooth optimization:
- For general “bumpy” (nonconvex) landscapes, finding the true global minimum is hard.
- But many modern problems (like in machine learning) have special shapes that make good solutions easier to find.
- Complexity ideas like oracle complexity help us compare algorithms fairly and understand their fundamental limits.
- Some algorithms are provably optimal within a certain class — for example, Nesterov’s accelerated gradient method for smooth convex problems.
- Complexity types clarified:
- Iteration bounds like “O(1/ε)” or “O(log(1/ε))” tell you how many steps are needed to bring the error below a tolerance ε.
- Operation bounds consider the cost per step (like solving large linear systems).
- Lower bounds say “no algorithm of a given type can do better than this,” helping identify truly optimal methods.
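The difference such rates make can be seen by racing plain gradient descent against Nesterov's accelerated method on an ill-conditioned convex quadratic. The problem data, step sizes, and tolerances below are illustrative choices, not from the paper.

```python
import numpy as np

# Ill-conditioned convex quadratic: f(x) = 0.5 * x @ (D * x), D = diag(1, 100).
D = np.array([1.0, 100.0])
grad = lambda x: D * x
L, mu = D.max(), D.min()    # gradient Lipschitz constant and strong convexity

def gd_iters(x, tol=1e-6, max_iter=5000):
    """Plain gradient descent with the standard 1/L step size."""
    for k in range(max_iter):
        if np.linalg.norm(grad(x)) <= tol:
            return k
        x = x - grad(x) / L
    return max_iter

def nesterov_iters(x, tol=1e-6, max_iter=5000):
    """Nesterov's accelerated gradient (strongly convex momentum variant)."""
    beta = (np.sqrt(L / mu) - 1) / (np.sqrt(L / mu) + 1)
    x_prev = x.copy()
    for k in range(max_iter):
        if np.linalg.norm(grad(x)) <= tol:
            return k
        y = x + beta * (x - x_prev)      # extrapolate using momentum
        x_prev, x = x, y - grad(y) / L   # gradient step from the extrapolated point
    return max_iter

it_gd = gd_iters(np.ones(2))
it_acc = nesterov_iters(np.ones(2))
```

On this example the accelerated method needs far fewer iterations, roughly in proportion to the square root of the condition number, which is what its optimality theory predicts.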
Why this matters: These insights guide how we design algorithms, choose the right tool for the job, and understand what’s possible and what’s not. They explain why certain methods dominate in software and why others remain mostly theoretical.
Implications and Impact
- Better algorithms and software: Interior-point methods transformed LP solving in the 1980s–1990s, and improvements continue today. Simplex also became much faster thanks to this competition.
- Smarter choices in practice: Complexity theory helps, but experience on similar problems is often a more reliable guide. Still, theory highlights limits and can inspire new ideas.
- Machine learning and modern applications: Optimization is everywhere — training models, fitting data, choosing features. Understanding when nonconvex problems are “benign” helps explain why training often works well.
- Ongoing dialogue between theory and practice: The paper reinforces the idea that practice inspires theory (by showing what works), and theory improves practice (by sharpening and systematizing methods).
In short, the paper shows how careful mathematical thinking and hands-on computing have together shaped powerful tools to solve real-world problems — and how that partnership keeps pushing optimization forward.
Knowledge Gaps
Knowledge gaps, limitations, and open questions
Below is a single, focused list of what remains missing, uncertain, or unexplored in the paper, stated concretely to guide future research.
- Polynomial worst-case bounds for the simplex method with practically used pivot rules remain unproven; determine whether any deterministic or randomized pivot rule achieves polynomial worst-case iteration complexity on all LP instances.
- Bridge the gap between simplex worst-case analyses and practice by developing theoretical models that reflect the sparsity, block structure, and conditioning typical of real-world LPs (beyond dense, rotationally symmetric random matrices).
- Extend smoothed analysis to more realistic perturbation models (e.g., sparse, correlated, or structured perturbations; scaling-invariant models) and to pivot rules used in modern simplex implementations; obtain high-probability bounds with sharp dependence on the problem dimensions, sparsity, and conditioning.
- Quantify the rarity and structure of worst-case LP instances for simplex under practically relevant instance distributions; identify generative models that reproduce practical difficulty and support average-case or instance-wise bounds.
- Provide rigorous iteration and operation complexity bounds for Mehrotra’s predictor-corrector primal-dual algorithm (including its heuristics for centering, step selection, and parameter tuning) that match observed practical performance.
- Close the gap between theoretical iteration bounds for primal-dual interior-point methods (e.g., O(√n log(1/ε)) for short-step vs. O(n log(1/ε)) for long-step variants) and empirical iteration counts that grow only weakly with problem size; derive refined, instance-aware bounds that capture sparsity and problem structure.
- Develop operation- and bit-complexity analyses for interior-point methods that account explicitly for sparsity, fill-in, factorization updates, caching/memory hierarchies, and communication costs in modern architectures.
- Establish stability and finite-precision guarantees (backward/forward error) for both simplex and interior-point methods under floating-point arithmetic, linking numerical conditioning to iteration complexity and termination accuracy.
- Analyze the effect of degeneracy on simplex and interior-point methods with guarantees (e.g., bounds parameterized by degeneracy measures); design pivot/centering rules robust to degeneracy with provable complexity.
- Provide rigorous stopping criteria that translate primal-dual surrogate measures (e.g., the duality measure μ = xᵀs/n) into explicit bounds on primal infeasibility, dual infeasibility, and optimality gap for LP, with certified tolerances.
- Derive lower bounds and optimality results for interior-point methods analogous to Nemirovski–Yudin style oracle lower bounds, clarifying whether common path-following schemes are optimal within well-defined algorithm classes.
- Unify oracle and operation complexity for nonlinear optimization: create models that convert variable oracle costs (e.g., due to backtracking/trust-region updates) into operation counts that reflect realistic evaluation costs and data access patterns.
- Provide tight lower bounds for modern adaptive algorithms in smooth nonconvex optimization (e.g., trust-region, cubic regularization, line-search with backtracking) beyond gradient-span classes, including second-order or mixed-order oracles.
- Characterize “benign nonconvexity” rigorously: identify structural properties and distributions underlying machine learning problems where global minima are efficiently found; quantify prevalence and provide instance-dependent guarantees.
- Develop average-case or smoothed analyses for nonconvex problems that incorporate data distributions, overparameterization, and strict-saddle-like structures common in ML losses, with explicit algorithmic implications.
- Analyze higher-order methods (e.g., quasi-Newton, cubic regularization) in nonconvex settings under non-asymptotic, instance-aware models that reflect practical performance, including variable batch sizes and stochastic estimates.
- Construct checkable and computationally meaningful surrogates for unobservable optimality measures (e.g., f(x) − f* or the distance to the solution set) to enable finite-termination guarantees and certified accuracy in nonconvex optimization.
- Extend barrier-function theory beyond classical self-concordant barriers: design new barrier families for structured convex sets (e.g., combinatorial polytopes, conic intersections) with improved complexity and implementability.
- Improve complexity bounds for path-following methods using central-path neighborhoods tighter than the standard wide neighborhoods while maintaining numerical robustness; analyze long-step variants with explicit step-size rules and their sparse linear-algebra costs.
- Provide comprehensive complexity analyses for factorization-reuse strategies in interior-point methods (e.g., iterative refinement, rank-one updates, partial refactorizations), with provable savings and stability guarantees.
- Investigate communication-avoiding and distributed algorithms for LP and smooth optimization, establishing iteration and communication complexity under realistic network and memory models; develop scalable certification of optimality in distributed settings.
- Formalize complexity models that incorporate data movement and hierarchical memory, enabling principled algorithm design for large-scale optimization beyond arithmetic counts.
- Explore practical preprocessing with guarantees: develop algorithms that detect and repair ill-posed or pathological LP formulations (e.g., rank deficiency, near infeasibility) with provable robustness and impact on downstream complexity.
- Update and expand benchmark suites beyond legacy sets (e.g., Netlib) to reflect modern applications; relate observed performance to measurable instance features and validate theoretical predictions across diverse, structured LP families.
- Address omissions noted in the paper (constrained optimization, finite-sum problems, parallel methods) by developing a parallel theory–practice synthesis: instance-aware complexity, smoothed/average-case results, and certified adaptive algorithms tailored to these prominent classes.
- Provide unified frameworks that connect worst-case, average-case, smoothed, and instance-dependent analyses, making complexity results predictive for real-world optimization workloads.
Practical Applications
Immediate Applications
The following bullets list concrete, deployable applications that leverage the paper’s findings on linear programming (LP), unconstrained smooth minimization, and the interplay of theory and practice. Each item notes sectors, possible tools/workflows, and key assumptions or dependencies.
- Interior-point LP for large-scale operations optimization
- Sectors: healthcare, energy, logistics/supply chain, finance, telecommunications
- What to deploy: Primal-dual path-following solvers (e.g., Mehrotra’s predictor–corrector), with central-path monitoring using μ = (xᵀs)/n as a progress metric; factorization reuse across iterations; sparse linear algebra and preconditioning tuned to problem structure.
- Use cases:
- Healthcare: nurse rostering, operating room scheduling, ICU bed assignment under resource constraints.
- Energy: economic dispatch, transmission planning, market clearing with thousands/millions of variables.
- Logistics: fleet routing, warehouse placement and capacity planning.
- Finance: portfolio optimization with linear risk and turnover constraints; real-time rebalancing.
- Assumptions/dependencies: An interior feasible point or reliable feasibility phase; matrix A has full row rank (or preprocessed to be so); problem sparsity is exploitable; high-quality LP solver (MOSEK, Gurobi/CPLEX for LP, open-source SDPT3/SeDuMi for conic relatives); numerically well-scaled data.
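As a toy version of the deployment pattern above, the sketch below solves a small equality-constrained LP with SciPy's linprog (whose "highs" method uses the HiGHS solver, standing in for the commercial packages named above); the three-variable cost data is invented.

```python
import numpy as np
from scipy.optimize import linprog

# Toy standard-form LP: minimize c @ x  subject to  A_eq @ x = b_eq, x >= 0.
c = np.array([2.0, 3.0, 1.0])          # per-unit costs (invented)
A_eq = np.array([[1.0, 1.0, 1.0]])     # total amount must equal 10
b_eq = np.array([10.0])

res = linprog(c, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * 3, method="highs")
# Cheapest item gets everything: x ≈ (0, 0, 10), objective 10.
```

For the large instances mentioned above, one would pass A_eq as a scipy.sparse matrix so the solver can exploit sparsity.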
- Smoothed preprocessing to improve simplex robustness
- Sectors: software (solver vendors), logistics, public-sector planning
- What to deploy: A data-conditioning step that injects tiny, controlled Gaussian perturbations to A and b (as in smoothed analysis) before simplex; a “robustification” toggle in commercial/open-source LP pipelines when degeneracy or cycling is detected.
- Use cases: LP instances with pathological degeneracy or near-degenerate pivots where simplex stalls; sensitive planning models with nearly collinear constraints.
- Assumptions/dependencies: Acceptability of minuscule random perturbations from domain stakeholders; tolerance controls to keep solution changes within policy bounds; documented data lineage for auditability.
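A minimal sketch of the "robustification" step, assuming tiny Gaussian perturbations of A and b in the spirit of smoothed analysis; the noise scale sigma, the fixed seed, and the toy LP are illustrative, and a production toggle would expose sigma as a policy-controlled tolerance.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

def smoothed_solve(c, A_eq, b_eq, sigma=1e-7):
    """Solve the LP after tiny Gaussian perturbations of A and b."""
    A_pert = A_eq + sigma * rng.standard_normal(A_eq.shape)
    b_pert = b_eq + sigma * rng.standard_normal(b_eq.shape)
    return linprog(c, A_eq=A_pert, b_eq=b_pert,
                   bounds=[(0, None)] * len(c), method="highs")

c = np.array([1.0, 2.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
res = smoothed_solve(c, A, b)   # optimum of the unperturbed LP is x = (1, 0)
```

Because sigma is tiny, the perturbed optimum stays within policy bounds of the nominal solution, which is the acceptability condition noted above.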
- Solver selection and formulation diagnostics
- Sectors: software, consulting, academia, government analytics teams
- What to deploy: An “Optimization Readiness Diagnostic” that inspects formulation quality (rank deficiency, scaling, sparsity, constraint redundancy) and recommends simplex vs interior-point (and variant), reformulations (e.g., add 1ᵀx = 1 and rescale), and barrier/step-size settings.
- Use cases: Project kickoff for large analytics engagements; automated CI pipelines for analytics codebases.
- Assumptions/dependencies: Availability of metadata and sample instances; buy-in to reformulation; versioned solver configurations.
- Benchmarked termination criteria and complexity-aware runbooks
- Sectors: software engineering, MLOps, operations research teams
- What to deploy: Standardized stopping rules aligned with the paper’s approximate optimality conditions (e.g., gradient norm thresholds for smooth minimization; μ ≤ ε for LP primal-dual), plus logging of iteration/operation counts matched to O(log(1/ε)) or O(1/ε) expectations.
- Use cases: Reproducible experiments; SLAs for analytics services; predictable runtime budgeting.
- Assumptions/dependencies: Clear accuracy tolerances tied to decision impact; instrumentation in solver interfaces; training for interpreting iteration vs operation complexity.
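One way to instrument such a runbook, sketched here with SciPy's BFGS on the standard Rosenbrock test function; the tolerance EPS and the budget BUDGET are invented placeholders for values tied to decision impact and past runs.

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der

EPS = 1e-6      # gradient-norm tolerance (illustrative)
BUDGET = 200    # iteration budget from experience on similar problems (illustrative)

iters = {"n": 0}
res = minimize(rosen, np.zeros(5), jac=rosen_der, method="BFGS",
               options={"gtol": EPS, "maxiter": BUDGET},
               callback=lambda xk: iters.__setitem__("n", iters["n"] + 1))

# Log the iteration count against the budget for runtime forecasting.
within_budget = res.success and iters["n"] <= BUDGET
```

Logging iters["n"] over many runs builds exactly the empirical record that makes complexity expectations predictive for future jobs.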
- Algorithm choice for smooth unconstrained optimization in engineering and ML
- Sectors: robotics, control, computer vision, scientific computing, ML
- What to deploy: For convex smooth objectives—accelerated gradient; for nonconvex but smooth—trust-region Newton or line-search quasi-Newton with approximate second-order checks (e.g., gradient norm ≤ ε and Hessian minimum eigenvalue ≥ −ε).
- Use cases:
- Robotics/control: trajectory optimization with real-time feasible local minima.
- Vision/graphics: bundle adjustment, shape fitting under smooth losses.
- ML: convex training (e.g., logistic regression) with accelerated methods; nonconvex training with robust local solvers for fine-tuning.
- Assumptions/dependencies: Lipschitz gradient or locally well-behaved curvature; reliable gradient/Hessian or Hessian-vector products; benign nonconvexity in domain tasks; protection against ill-conditioning.
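A hedged sketch of the nonconvex recipe above, using SciPy's trust-region Newton method ("trust-ncg") on a made-up smooth two-well objective, followed by the approximate second-order check (small gradient, nearly nonnegative Hessian curvature):

```python
import numpy as np
from scipy.optimize import minimize

# Made-up smooth nonconvex objective with two minima at (±1, 0):
# f(x, y) = (x^2 - 1)^2 + y^2
f = lambda v: (v[0] ** 2 - 1.0) ** 2 + v[1] ** 2
grad = lambda v: np.array([4.0 * v[0] * (v[0] ** 2 - 1.0), 2.0 * v[1]])
hess = lambda v: np.array([[12.0 * v[0] ** 2 - 4.0, 0.0], [0.0, 2.0]])

res = minimize(f, np.array([0.5, 1.0]), jac=grad, hess=hess,
               method="trust-ncg", options={"gtol": 1e-9})

# Approximate second-order stationarity check at the returned point.
g_small = np.linalg.norm(grad(res.x)) <= 1e-6
curvature_ok = np.linalg.eigvalsh(hess(res.x)).min() >= -1e-6
```

The two booleans certify only a local minimizer, not a global one, which is all that can be guaranteed for general nonconvex problems.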
- Hybrid LP workflows for integer programming back-ends
- Sectors: manufacturing, logistics, scheduling, energy planning
- What to deploy: Branch-and-bound/cut frameworks that call interior-point LP solvers for relaxations; warm starts and basis identification heuristics combining interior-point and simplex for rapid node solves.
- Use cases: Production planning with binary decisions; crew scheduling; unit commitment with discrete constraints.
- Assumptions/dependencies: Tight LP relaxations; effective cut generation; solver APIs supporting warm starts and matrix updates.
- Education and training modules based on theory–practice interplay
- Sectors: education, workforce development
- What to deploy: Course labs and interactive notebooks that illustrate central paths, barrier methods, simplex pivot rules, smoothed analysis effects, and oracle vs iteration complexity; capstone projects that reformulate real data problems.
- Use cases: University courses in optimization; internal training for analytics teams; bootcamps.
- Assumptions/dependencies: Curated datasets; solver licenses or open-source alternatives; instructional materials aligned to decision-making contexts.
- Policy analytics with LP-backed resource allocation
- Sectors: public policy, NGOs, emergency response
- What to deploy: LP-driven tools for allocating funds, staff, supplies under fairness and efficiency constraints; scenario analyses with complexity-aware run times.
- Use cases: Disaster relief logistics; school district budgeting; vaccine distribution prioritization.
- Assumptions/dependencies: Clean, up-to-date data; transparent modeling; stakeholder acceptance of linear approximations; governance for randomized conditioning if smoothed preprocessing is used.
- Personal and small-business decision aids using LP/smooth optimization
- Sectors: daily life, SMB tools
- What to deploy: Lightweight apps for budgeting, diet planning, simple scheduling; convex smooth optimizers for personalized fitness or learning plans.
- Use cases: Household budget allocation; small fleet scheduling; habit formation plans via smooth cost functions.
- Assumptions/dependencies: Simple, interpretable formulations; mobile-friendly solvers; guardrails for data entry and scaling.
Long-Term Applications
The following bullets identify applications that require further research, scaling, or productization to become broadly deployable.
- Instance-aware solver selection via learned meta-models
- Sectors: software, MLOps, operations research
- What could emerge: Automated frameworks that map problem features (sparsity, conditioning, geometry) to the best algorithm (simplex variant, interior-point family, first-/second-order smooth solver), using historical runs and structural diagnostics.
- Dependencies: Large corpora of labeled optimization instances; feature engineering for problem structure; robust generalization across domains.
- Smoothed analysis-inspired robust modeling in policy and markets
- Sectors: public policy, market design, energy markets
- What could emerge: Formal protocols for minimal randomization (noise injection) to stabilize planning models, with guarantees on solution quality and tractability, and privacy-preserving data conditioning.
- Dependencies: Regulatory approval; formal bounds on perturbation effects; stakeholder communication tools; sensitivity and fairness audits.
- Provably polynomial pivot rules or hybrid proofs for simplex
- Sectors: foundational algorithms, solver vendors
- What could emerge: New simplex pivot strategies with polynomial guarantees on practically relevant instance classes; hybrid simplex–interior proofs with de-randomized smoothed analysis.
- Dependencies: Breakthrough theory on instance distributions and structural properties; integration into industrial-strength codes without performance regression.
- Parallel and distributed interior-point methods at web scale
- Sectors: cloud platforms, energy, telecom, large retailers
- What could emerge: End-to-end parallel path-following with distributed linear algebra, streaming constraint updates, and online central-path tracking for very large LPs and conic programs.
- Dependencies: Advances in sparse distributed factorization; communication-avoidance algorithms; stability across asynchronous environments.
- Oracle-efficient frameworks for nonconvex optimization with guarantees
- Sectors: ML, robotics, scientific computing
- What could emerge: Algorithms that combine oracle complexity guarantees (e.g., second-order stationarity within oracle budgets) with practical heuristics (line search, trust regions), tailored to benign nonconvex subclasses common in ML.
- Dependencies: Better characterization of “benign nonconvexity” classes; scalable Hessian approximations; adaptive stopping tied to application risk.
- New self-concordant barriers and generalized cones in domain-specific modeling
- Sectors: finance (risk models), engineering (robust design), healthcare (clinical decision support)
- What could emerge: Barrier functions and conic formulations beyond LP/SOCP/SDP that encode domain constraints naturally while retaining path-following efficiency and complexity guarantees.
- Dependencies: Mathematical advances in barrier design; solver implementation; domain validation.
- Hybrid integer–continuous optimization with dynamic solver switching
- Sectors: manufacturing, mobility, smart grids
- What could emerge: Systems that dynamically switch between interior-point relaxations, simplex refinement, and cutting-plane phases, with live reuse of factorization artifacts and predictive runtime controls.
- Dependencies: Rich solver APIs; run-time orchestration; reliability engineering for live production systems.
- Complexity-aware governance and procurement standards
- Sectors: government, large enterprises
- What could emerge: Standards that require documented complexity analyses (iteration/oracle/operation) and reproducibility checks for optimization-based procurement, with risk controls for worst-case scenarios.
- Dependencies: Policy frameworks; audit tooling; upskilling of procurement teams.
- Educational ecosystems that bridge theory and practice at scale
- Sectors: education, professional certification
- What could emerge: Modular curricula and certifications emphasizing convergence/complexity, formulation craft, and solver engineering, with interactive cloud labs and industry-aligned case studies.
- Dependencies: Partnerships across academia–industry; sustained funding; evolving content as algorithms advance.
Each application’s feasibility depends on aligning algorithmic assumptions (convexity, smoothness, Lipschitz continuity of gradients, availability of interior feasible points, data scaling, sparsity) with the structure of the real problem, and on access to robust solver implementations and appropriate computational resources.
Glossary
- Approximate optimality conditions: Criteria that allow algorithms to stop once near-optimality is achieved rather than converging asymptotically. "we define {\em approximate} optimality conditions, allowing these algorithms to terminate finitely when such conditions are satisfied."
- Barrier function: A function added to an objective to enforce staying inside a convex feasible set, typically blowing up at the boundary; key to interior-point methods. "and φ is a {\em barrier function} whose domain is the relative interior of the feasible set, with the property that φ(x) → ∞ as x approaches the boundary of that set."
- Benign nonconvexity: A phenomenon where many nonconvex problems (notably in machine learning) are practically easy to solve to global optimality despite worst-case intractability. "One example is the ``benign nonconvexity'' phenomenon, which has been encountered in many problems (especially from machine learning) over the past 10 years, where global minima of nonconvex objectives are usually found easily \cite{Sun21}, despite global minimization of general nonconvex objectives being intractable."
- Bit-complexity model: A computational model counting bit operations on rational data, contrasting with real-number operation counts. "This model is closer to practical computation with floating-point numbers than the ``bit-complexity'' model, which assumes that problem data is rational and takes the unit of computation to be a bitwise operation."
- Blum–Shub–Smale (BSS) model: A computational complexity model over the reals where each arithmetic operation is one unit of cost. "Formally, we assume the Blum-Shub-Smale (BSS) model of complexity \cite{smale2000algorithms,blum2012complexity} in which the primitive objects are real numbers, and the arithmetic operations +, −, ×, ÷ (as well as comparisons such as <, =, and >) are each assumed to be a single unit of computation."
- Central path: The trajectory of strictly feasible primal-dual points where all complementarity products are equal; followed by path-following interior-point methods. "Path-following steps start by defining a central path, which is the set of strictly feasible points for which the products xᵢsᵢ, i = 1, 2, …, n, are all identical."
- Degeneracy: In LP, when multiple bases represent the same vertex or steps change the basis without changing the solution point. "there may exist multiple partitions that define the same vertex, a phenomenon known as degeneracy."
- Ellipsoid method: A polynomial-time algorithm for convex optimization and LP that iteratively shrinks an ellipsoid containing the feasible region. "In 1979, Khachiyan~\cite{Kha79} achieved a breakthrough when he showed that an adaptation of this approach to LP converged in polynomial time --- the first polynomial-time algorithm for LP."
- Epigraph: The set of points lying on or above a function’s graph; used to define convexity. "f is a convex function when its epigraph is a convex set, equivalently, f(αx + (1−α)y) ≤ αf(x) + (1−α)f(y) for all $x,y \in \dom f$ and α ∈ [0, 1]."
- First-order necessary condition: A condition stating that the gradient must vanish at a local minimizer for differentiable functions. "the {\em first-order necessary} condition for x* to be a local solution of \cref{eq:f} is ∇f(x*) = 0."
- Interior-point method: An algorithm that maintains strict feasibility (positivity) and moves through the interior of the feasible region. "a property that gave rise to the term ``interior-point method''."
- Iteration complexity: Bounds on the number of algorithmic iterations required to reach a specified accuracy. "Complexity analysis of this type is sometimes referred to as {\em iteration complexity}, since ``k'' refers to the iteration index of the algorithm."
- Lipschitz continuity: A regularity condition bounding how fast a function (or its gradient) can change, crucial in convergence rates. "an assumption that the gradient of f in \cref{eq:f} is Lipschitz continuous with some constant L is common in gradient-based methods for this problem."
- Log-barrier: A barrier term using logarithms to enforce positivity constraints, central in interior-point methods. "a log-barrier approach, in which the constraints are accounted for by subtracting a term μ Σᵢ log xᵢ (for some μ > 0) from the objective."
- Lower bounds: Fundamental limits showing the minimal number of oracle calls or operations required by any algorithm within a class. "Most complexity analyses are concerned with upper bounds on the relevant measure of computation. But there has also been much interest in {\em lower bounds}, which are usually defined in terms of both a class of algorithms and a class of problems."
- Mehrotra’s predictor-corrector primal-dual approach: A highly effective practical interior-point LP algorithm that alternates prediction and correction steps. "the algorithm underlying almost all interior-point software for LP has been Mehrotra's predictor-corrector primal-dual approach \cite{Meh92a}, which is a path-following method with clever heuristics to select certain critical parameters."
- Non-asymptotic analysis: Convergence analysis that provides explicit rates from the initial point rather than only asymptotic behavior. "In this paper, we focus mostly on {\em non-asymptotic} analysis, in which we ``globalize'' the local analysis and seek to say something about the rate of convergence of the algorithm from its initial point."
- Optimal algorithm: An algorithm whose upper complexity bound matches the lower bound up to constant factors for a given problem and algorithm class. "An algorithm for which the lower bound is within a constant multiple (not depending on ε) of the upper bound is called an {\em optimal algorithm}."
- Oracle complexity: A framework that counts the number of information queries (e.g., gradients) to an oracle needed to reach a target accuracy. "For nonlinear problems such as \cref{eq:f}, the {\em oracle complexity} model of Nemirovski and Yudin \cite{NemY83} is widely used to bound the amount of computation required by a certain algorithm on a given class of problems."
- Path-following methods: Interior-point strategies that track the central path by adjusting a barrier parameter and applying Newton-type steps. "The two major classes of methods in this area are primal-dual potential reduction methods (proposed by Tanabe, Todd, and Ye \cite{Tan87,TodY90}) and path-following methods."
- Positive semidefinite: A matrix property indicating nonnegative quadratic forms; crucial in second-order optimality and LP notation. "we use A ⪰ 0 to indicate that A is positive semidefinite."
- Potential reduction methods: Primal-dual interior-point algorithms that minimize a chosen potential function to drive complementarity products down. "The two major classes of methods in this area are primal-dual potential reduction methods (proposed by Tanabe, Todd, and Ye \cite{Tan87,TodY90}) and path-following methods."
- Projective method: Karmarkar’s rescaling-and-projection approach that maintains strict feasibility and reduces the objective. "This {\em projective} method (so named because of its use of the projection of the rescaled cost vector) was shown in \cite{Kar84} to require O(n) iterations to reduce the objective by a constant factor"
- Primal-dual methods: Algorithms that work simultaneously with primal and dual variables to satisfy optimality conditions, central in modern LP solvers. "Methods of the latter type, known as {\em primal-dual methods}, proved to be particularly fruitful as an area for development."
- Self-concordant: A property of barrier functions bounding third derivatives by a power of second derivatives, enabling robust Newton steps. "The barrier function satisfies an additional property of {\em self-concordance}, which (roughly speaking) allows its third derivatives to be bounded in terms of a $3/2$ power of the second derivatives, as in the function −log t for t > 0."
- Semidefinite programming: A class of convex optimization over positive semidefinite matrices; amenable to self-concordant barrier methods. "Barrier functions with the self-concordant property can be constructed explicitly for several convex optimization problems, including LP, convex quadratic programming, second-order cone programming, and semidefinite programming."
- Shadow-vertex simplex method: A variant of the simplex method analyzed under smoothed analysis to obtain polynomial expected steps. "Their result works with the dual form \cref{eq:lp.dual2} and a particular variant of the simplex method, known as the shadow-vertex simplex method."
- Smoothed analysis: A framework analyzing performance under slight random perturbations of instances, explaining typical efficiency of algorithms. "A breakthrough in theoretical understanding of the simplex method came in 2004 with the {\em smoothed analysis} of Spielman and Teng~\cite{SpeT04}."
- Stationary point: A point where the gradient is zero; a candidate for local optimality. "(Points satisfying this condition are termed {\em stationary}.)"
- Strong duality: Equality of optimal primal and dual objective values for LP under feasibility. "the optimal values of the primal and dual LPs are the same, a property known as {\em strong duality}."
- Subexponential lower bound: A complexity lower bound growing faster than polynomial but slower than exponential, e.g., exp(n^c) with c in (0, 1). "A subexponential lower bound is one in which the number of pivots is bounded below by C exp(c nᵃ), for constants C > 0, c > 0, and a ∈ (0, 1)."
- Sublinear rate: Convergence rate slower than linear, commonly O(1/k) or O(1/√k) in optimization. "Depending on the algorithm, convergence of the error measure to zero can occur at arithmetic, ``sublinear'' rates, such as O(1/√k), O(1/k), or O(1/k²)"
- Trust-region strategy: An adaptive mechanism constraining steps within a region where the model is trusted, improving robustness. "Algorithms may contain adaptive mechanisms (for example, line searches or trust-region strategies) that allow them to exploit variations in the properties of problems across the parameter space."
- Weak duality: Inequality relating primal and dual feasible objective values, guaranteeing the dual is a lower bound to the primal. "bᵀλ ≤ cᵀx for all primal-dual feasible pairs, a property known as {\em weak duality}."