Diffusion Model for Data-Driven Black-Box Optimization (2403.13219v1)
Abstract: Generative AI has redefined artificial intelligence, enabling the creation of innovative content and customized solutions that drive business practices into a new era of efficiency and creativity. In this paper, we focus on diffusion models, a powerful generative AI technology, and investigate their potential for black-box optimization over complex structured variables. Consider the practical scenario where one wants to optimize some structured design in a high-dimensional space, based on massive unlabeled data (representing design variables) and a small labeled dataset. We study two practical types of labels: 1) noisy measurements of a real-valued reward function and 2) human preference based on pairwise comparisons. The goal is to generate new designs that are near-optimal and preserve the designed latent structures. Our proposed method reformulates the design optimization problem into a conditional sampling problem, which allows us to leverage the power of diffusion models for modeling complex distributions. In particular, we propose a reward-directed conditional diffusion model, to be trained on the mixed data, for sampling a near-optimal solution conditioned on high predicted rewards. Theoretically, we establish sub-optimality error bounds for the generated designs. The sub-optimality gap nearly matches the optimal guarantee in off-policy bandits, demonstrating the efficiency of reward-directed diffusion models for black-box optimization. Moreover, when the data admits a low-dimensional latent subspace structure, our model efficiently generates high-fidelity designs that closely respect the latent structure. We provide empirical experiments validating our model in decision-making and content-creation tasks.
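The reward-directed sampling idea in the abstract can be illustrated with a minimal toy sketch. Below, the design distribution is 1-D Gaussian and the reward label is a noisy linear measurement, so the guided (reward-conditioned) score is available in closed form; in the paper's actual method both the score and the reward model are learned from data. All variable names and values here are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (illustrative, not from the paper):
# designs x0 ~ N(0, 1); noisy reward label y = x0 + eps, eps ~ N(0, s2).
s2 = 0.25          # reward-noise variance
y = 2.0            # high reward value we condition the sampler on

# Variance-preserving forward process: x_t = sqrt(abar_t) x0 + sqrt(1-abar_t) z.
T = 300
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
abar = np.cumprod(alphas)

def guided_score(x, t):
    """Score of p_t(x) plus grad_x log p(y | x_t), exact in this Gaussian toy.

    With x0 ~ N(0, 1) the VP marginal p_t is N(0, abar_t + (1 - abar_t)) = N(0, 1),
    so its score is simply -x. The guidance term is what a reward-directed
    conditional diffusion model would learn from the small labeled dataset.
    """
    a = np.sqrt(abar[t])
    score = -x
    guidance = a * (y - a * x) / (s2 + 1.0 - abar[t])
    return score + guidance

# Ancestral (DDPM-style) reverse sampling driven by the guided score.
n = 20000
x = rng.standard_normal(n)          # initialize from the prior N(0, 1)
for t in range(T - 1, -1, -1):
    x = (x + betas[t] * guided_score(x, t)) / np.sqrt(alphas[t])
    if t > 0:
        x = x + np.sqrt(betas[t]) * rng.standard_normal(n)

# The exact conditional is N(y / (1 + s2), s2 / (1 + s2)).
print(x.mean(), x.var())            # mean near y/(1+s2) = 1.6, var near 0.2
```

Because the guided score here equals the score of the true conditional marginals, the samples concentrate near the posterior of high-reward designs rather than the data mean of 0, which is the mechanism the paper's sub-optimality analysis quantifies when the score and reward must instead be estimated from mixed labeled/unlabeled data.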
Authors: Zihao Li, Hui Yuan, Kaixuan Huang, Chengzhuo Ni, Yinyu Ye, Minshuo Chen, Mengdi Wang