
Diffusion Models as Constrained Samplers for Optimization with Unknown Constraints (2402.18012v3)

Published 28 Feb 2024 in cs.LG and cs.AI

Abstract: Addressing real-world optimization problems becomes particularly challenging when analytic objective functions or constraints are unavailable. While numerous studies have addressed the issue of unknown objectives, limited research has focused on scenarios where feasibility constraints are not given explicitly. Overlooking these constraints can lead to spurious solutions that are unrealistic in practice. To deal with such unknown constraints, we propose to perform optimization within the data manifold using diffusion models. To constrain the optimization process to the data manifold, we reformulate the original optimization problem as a sampling problem from the product of the Boltzmann distribution defined by the objective function and the data distribution learned by the diffusion model. Depending on the differentiability of the objective function, we propose two different sampling methods. For differentiable objectives, we propose a two-stage framework that begins with a guided diffusion process for warm-up, followed by a Langevin dynamics stage for further correction. For non-differentiable objectives, we propose an iterative importance sampling strategy using the diffusion model as the proposal distribution. Comprehensive experiments on a synthetic dataset, six real-world black-box optimization datasets, and a multi-objective molecule optimization dataset show that our method achieves better or comparable performance with previous state-of-the-art baselines.

Summary

  • The paper introduces a novel method that leverages diffusion models as constrained samplers by combining a Boltzmann density defined by the objective with the data density learned by the diffusion model.
  • The approach uses a structured two-stage framework, with a guided diffusion warm-up followed by Langevin dynamics, to home in on feasible, optimized solutions.
  • Experimental results show that the method matches or outperforms existing baselines on real-world black-box and multi-objective optimization tasks, including molecular design.

Diffusion Models as Constrained Samplers for Optimization with Unknown Constraints

Real-world optimization problems frequently involve feasibility constraints that are not given explicitly. Existing methods that handle unknown objective functions, broadly categorized as black-box optimization, have largely overlooked settings where the feasibility constraints themselves are unavailable. When these constraints are disregarded, the resulting solutions may not be viable in practice.

This paper proposes a novel approach that leverages diffusion models to perform optimization on the data manifold when constraints are unknown. By reframing the optimization problem as sampling from a product of densities, the authors obtain a methodology that combines the optimization objective with the data distribution learned by a diffusion model.

Methodology

The approach samples from the product of two densities: a Boltzmann distribution defined by the objective function and the data distribution learned by a diffusion model. Because diffusion models can capture intricate data distributions, the practical constraints implicit in the data samples are effectively reflected in the optimization process.
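Concretely, the sampling target is the product density p*(x) ∝ exp(-f(x)/T) · p_data(x), whose score splits into the scaled negative objective gradient and the learned data score. The snippet below is a minimal PyTorch sketch of that combined score, assuming a differentiable objective objective() and a trained unconditional score network score_net() approximating ∇_x log p_data; the names and the temperature handling are illustrative rather than the paper's exact implementation.

```python
import torch

def product_score(x, objective, score_net, temperature=1.0):
    """Score of the product density p*(x) ∝ exp(-f(x)/T) * p_data(x):
    ∇ log p*(x) = -∇f(x)/T + ∇ log p_data(x)."""
    x = x.detach().requires_grad_(True)
    f_val = objective(x).sum()            # objective maps (batch, d) -> (batch,)
    (grad_f,) = torch.autograd.grad(f_val, x)
    with torch.no_grad():
        data_score = score_net(x)         # approximates ∇ log p_data(x)
    return (-grad_f / temperature + data_score).detach()
```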

To make sampling efficient, the authors propose a structured two-stage framework, sketched in the code below. The first stage is a guided diffusion process that serves as a warm-up: it shifts the sampling distribution towards feasible, low-objective regions and provides a good initialization for the second stage. Langevin dynamics then refine the warm-up samples, correcting them so that the final samples respect the learned constraints while minimizing the objective within those bounds.
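The code below is a minimal sketch of this two-stage pipeline, assuming a variance-preserving diffusion with noise schedule betas, a noise-conditional score model score_net(x, t), a callable objective_grad(x) returning the objective gradient, and a single-argument product_score(x) (for example, the function above with the objective and score network bound via functools.partial). The guidance scale, step sizes, and iteration counts are placeholders; the sketch illustrates the structure of the procedure rather than reproducing the paper's exact algorithm.

```python
import math
import torch

def two_stage_sample(score_net, objective_grad, product_score, shape,
                     betas, guidance=1.0, langevin_steps=200, step_size=1e-3):
    """Stage 1: guided reverse diffusion (warm-up).
    Stage 2: Langevin dynamics on the product density for correction."""
    # Stage 1: ancestral sampling for a variance-preserving diffusion, with
    # the learned score biased by the objective gradient to steer the samples
    # toward low-objective regions of the data manifold.
    x = torch.randn(shape)
    for t in reversed(range(len(betas))):
        beta = float(betas[t])
        score = score_net(x, t) - guidance * objective_grad(x)
        x = (x + beta * score) / math.sqrt(1.0 - beta)
        if t > 0:
            x = x + math.sqrt(beta) * torch.randn_like(x)

    # Stage 2: unadjusted Langevin dynamics targeting
    # p*(x) ∝ exp(-f(x)/T) * p_data(x), using the combined score.
    for _ in range(langevin_steps):
        noise = torch.randn_like(x)
        x = x + step_size * product_score(x) + math.sqrt(2.0 * step_size) * noise
    return x
```

A typical call would bind the trained networks and a simple linear beta schedule, then draw a batch of candidate solutions in a single pass; the step sizes and number of Langevin steps would in practice be tuned per task.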

Experimental Results

Experiments validate the approach on a synthetic dataset, six real-world black-box optimization datasets, and a multi-objective molecular design task. Notably, on tasks such as Superconductor, which involves optimizing material properties, the proposed method surpassed existing baselines by significant margins.

The method also reports strong numerical results on the multi-objective molecule optimization task, where it not only attains high scores on individual objectives but also balances multiple objectives, showing higher validity rates and competitive optimization results compared to existing methods.

Theoretical Implications

From a theoretical perspective, the work argues that the guided diffusion stage effectively confines the optimization to the feasible region, since diffusion models have been shown to capture the underlying data manifold in image, video, and 3D generation. The proposed method thus extends diffusion models beyond generative tasks, positioning them as viable tools for complex constrained optimization problems.

Conclusions and Potential for Future Work

The research marks a substantial step forward in handling optimization under unknown constraints by employing diffusion models, and it provides a foundation for future work on real-world optimization problems where constraints are only implicitly available through data. Practical applications range from drug design to materials science, making the approach attractive for industries that need accurate optimization under incomplete information about constraints.

Looking ahead, improving manifold learning during the guided diffusion process and enforcing hard constraints directly within the diffusion space could yield further gains. Additionally, extending this framework to cover derivative-free optimization represents a promising direction for future research.