
When Large Language Model Meets Optimization (2405.10098v1)

Published 16 May 2024 in cs.NE

Abstract: Optimization algorithms and large language models (LLMs) enhance decision-making in dynamic environments by integrating artificial intelligence with traditional techniques. LLMs, with their extensive domain knowledge, facilitate intelligent modeling and strategic decision-making in optimization, while optimization algorithms refine LLM architectures and output quality. This synergy offers novel approaches for advancing general AI, addressing both the computational challenges of complex problems and the application of LLMs in practical scenarios. This review outlines the progress and potential of combining LLMs with optimization algorithms, providing insights for future research directions.
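To make the first direction of this synergy concrete (LLMs acting inside an optimization loop), below is a minimal sketch of the "LLM as evolutionary operator" pattern that much of the surveyed work builds on: the LLM replaces hand-coded crossover/mutation over candidate solutions, here text prompts. The function call_llm is a hypothetical stand-in for any chat-completion client, and the stub body (which merely shuffles words) exists only so the loop runs end to end; the fitness function is likewise a toy placeholder.

```python
# Sketch of an LLM-driven evolutionary optimizer over text candidates.
# `call_llm` is a hypothetical placeholder for a real LLM client; the
# surveyed methods differ in how they phrase the operator instruction.
import random

def call_llm(instruction: str) -> str:
    # Hypothetical stub: replace with a real chat-completion call.
    # Shuffling words keeps the example runnable without any API.
    words = instruction.split()
    random.shuffle(words)
    return " ".join(words)

def fitness(candidate: str) -> float:
    # Task-specific score; as a toy example, prefer shorter candidates.
    return -len(candidate)

def evolve(population: list[str], generations: int = 5) -> str:
    for _ in range(generations):
        # Elitism: pick the two best candidates as parents.
        parents = sorted(population, key=fitness, reverse=True)[:2]
        # The LLM acts as the crossover/mutation operator.
        child = call_llm(
            "Combine these two candidate prompts into one improved prompt:\n"
            f"1) {parents[0]}\n2) {parents[1]}"
        )
        # Replace the weakest individual with the LLM-generated offspring.
        population.sort(key=fitness)
        population[0] = child
    return max(population, key=fitness)

if __name__ == "__main__":
    seed = [
        "Summarize the document in three sentences.",
        "Give a concise three-sentence summary of the text.",
        "Write a brief summary, at most three sentences long.",
    ]
    print(evolve(seed))
```

The reverse direction, optimization algorithms refining LLMs, follows the same loop shape: the candidates become prompts, hyperparameters, or architecture encodings, and the fitness function becomes a model-quality metric.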

Authors (4)
  1. Sen Huang (9 papers)
  2. Kaixiang Yang (18 papers)
  3. Sheng Qi (4 papers)
  4. Rui Wang (996 papers)