
Large Language Model-Based Evolutionary Optimizer: Reasoning with elitism (2403.02054v1)

Published 4 Mar 2024 in cs.AI

Abstract: LLMs have demonstrated remarkable reasoning abilities, prompting interest in their application as black-box optimizers. This paper asserts that LLMs possess the capability for zero-shot optimization across diverse scenarios, including multi-objective and high-dimensional problems. We introduce a novel population-based method for numerical optimization using LLMs called Language-Model-Based Evolutionary Optimizer (LEO). Our hypothesis is supported through numerical examples, spanning benchmark and industrial engineering problems such as supersonic nozzle shape optimization, heat transfer, and windfarm layout optimization. We compare our method to several gradient-based and gradient-free optimization approaches. While LLMs yield comparable results to state-of-the-art methods, their imaginative nature and propensity to hallucinate demand careful handling. We provide practical guidelines for obtaining reliable answers from LLMs and discuss method limitations and potential research directions.
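
To make the idea of a population-based, LLM-driven optimizer with elitism concrete, below is a minimal sketch of such a loop. It is not the authors' exact LEO procedure: the objective (a sphere function), the population sizes, and the helper `llm_propose` are all illustrative assumptions, and the LLM step is stubbed with a random perturbation so the sketch runs offline. In the actual method, that step would prompt a language model with the current elite solutions and their objective values and parse its proposed candidates, with safeguards against malformed or hallucinated numbers.

```python
# Hedged sketch of a population-based "LLM as black-box optimizer" loop with
# elitism. The LLM call is stubbed so the file is self-contained; replace
# llm_propose with a real prompt/parse step to reproduce the intended setup.
import random

DIM, POP, ELITE, GENERATIONS = 2, 10, 3, 20   # illustrative sizes, not from the paper
BOUNDS = (-5.0, 5.0)

def objective(x):
    # Stand-in benchmark objective (sphere function, minimization).
    return sum(v * v for v in x)

def llm_propose(elites, n_new):
    # Placeholder for the LLM step (hypothetical helper).
    # A real implementation would serialize the elites and their scores into a
    # prompt, ask the model for n_new improved candidate vectors, and validate
    # the reply. Here we perturb random elites so the sketch stays runnable.
    children = []
    for _ in range(n_new):
        parent = random.choice(elites)
        child = [min(max(v + random.gauss(0.0, 0.5), BOUNDS[0]), BOUNDS[1])
                 for v in parent]
        children.append(child)
    return children

population = [[random.uniform(*BOUNDS) for _ in range(DIM)] for _ in range(POP)]
for gen in range(GENERATIONS):
    ranked = sorted(population, key=objective)
    elites = ranked[:ELITE]                     # elitism: best candidates survive unchanged
    population = elites + llm_propose(elites, POP - ELITE)

best = min(population, key=objective)
print("best solution:", best, "objective:", objective(best))
```

The elitism step is what guards against the model's "imaginative" proposals degrading the search: even if every LLM-generated candidate in a generation is poor or unparseable, the best solutions found so far are carried forward unchanged.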

Authors (9)
  1. Shuvayan Brahmachary (2 papers)
  2. Subodh M. Joshi (2 papers)
  3. Aniruddha Panda (5 papers)
  4. Kaushik Koneripalli (3 papers)
  5. Arun Kumar Sagotra (2 papers)
  6. Harshil Patel (4 papers)
  7. Ankush Sharma (12 papers)
  8. Ameya D. Jagtap (21 papers)
  9. Kaushic Kalyanaraman (4 papers)
Citations (12)