In-Context Symbolic Regression: Leveraging Large Language Models for Function Discovery (2404.19094v2)

Published 29 Apr 2024 in cs.CL and cs.LG

Abstract: State of the art Symbolic Regression (SR) methods currently build specialized models, while the application of LLMs remains largely unexplored. In this work, we introduce the first comprehensive framework that utilizes LLMs for the task of SR. We propose In-Context Symbolic Regression (ICSR), an SR method which iteratively refines a functional form with an LLM and determines its coefficients with an external optimizer. ICSR leverages LLMs' strong mathematical prior both to propose an initial set of possible functions given the observations and to refine them based on their errors. Our findings reveal that LLMs are able to successfully find symbolic equations that fit the given data, matching or outperforming the overall performance of the best SR baselines on four popular benchmarks, while yielding simpler equations with better out of distribution generalization.

Authors (4)
  1. Matteo Merler (3 papers)
  2. Nicola Dainese (6 papers)
  3. Katsiaryna Haitsiukevich (5 papers)
  4. Pekka Marttinen (56 papers)
Citations (2)

Summary

  • The paper introduces an innovative in-context symbolic regression approach using LLMs and VLMs to generate initial seed functions and iteratively refine them based on prediction error.
  • The method, In-Context Symbolic Regression (ICSR), adapts the Optimization by Prompting (OPRO) paradigm and matches or outperforms strong genetic programming baselines on four popular benchmarks in terms of average R² values.
  • Incorporating visual data via VLMs enhances performance in complex cases, though its benefits vary compared to using text-only inputs.

Exploring Symbolic Regression with LLMs and Vision-Language Approaches

Understanding the Study: Symbolic regression through LLMs

Symbolic Regression (SR) traditionally relies on Genetic Programming (GP) to find mathematical models that describe data. This paper instead applies pre-trained LLMs and Vision-Language Models (VLMs) to SR tasks. In the proposed system, an LLM generates initial functional forms from the data observations, and these forms are then refined iteratively until a satisfactory fit is reached. A further novelty is the use of VLMs, which receive visual data (plots) alongside the textual data to enhance the model's understanding and performance.
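As a rough sketch of the proposal step, the observations can be serialized into text and handed to the LLM; the prompt wording below is illustrative and not the template used in the paper.

```python
import numpy as np

def build_seed_prompt(X, y, n_points=20):
    """Format a subset of the observations as text so an LLM can propose
    candidate functional forms. Illustrative prompt, not the paper's template."""
    idx = np.linspace(0, len(X) - 1, min(n_points, len(X))).astype(int)
    points = "\n".join(f"x={X[i]:.3f}, y={y[i]:.3f}" for i in idx)
    return (
        "Below are observations of an unknown function y = f(x).\n"
        f"{points}\n"
        "Propose five candidate expressions for f(x), using free coefficients "
        "c[0], c[1], ... that will be fitted numerically afterwards."
    )
```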

Methodology: How the approach works

The approach, In-Context Symbolic Regression (ICSR), follows an Optimization by Prompting (OPRO) style loop. The LLM first generates a range of possible mathematical functions from the initial data; these are the seed functions. Through iterative refinement, each function's performance (measured by its fitting error) then informs the next generation of proposals. The process pairs the generative capabilities of LLMs, which create new functional forms, with an external numerical optimizer that fits each form's coefficients, as sketched below.
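A minimal sketch of this loop follows. The LLM proposal step is replaced by hard-coded candidate expressions, and SciPy's BFGS routine stands in for the external coefficient optimizer; the actual prompts, models, and optimizer settings used in the paper may differ.

```python
import numpy as np
from scipy.optimize import minimize

# Toy observations, assumed purely for illustration: y = 2*sin(x) + 0.5*x + noise
rng = np.random.default_rng(0)
X = np.linspace(-3, 3, 50)
y = 2 * np.sin(X) + 0.5 * X + rng.normal(0, 0.05, X.shape)

def fit_and_score(expr, n_coeffs, X, y):
    """Fit the free coefficients c[0..n_coeffs-1] of a candidate expression
    (a numpy-compatible string) and return its mean squared error and coefficients."""
    def mse(c):
        pred = eval(expr, {"np": np, "x": X, "c": c})
        return float(np.mean((pred - y) ** 2))
    # A few restarts make the local optimizer less sensitive to initialization.
    fits = [minimize(mse, np.full(n_coeffs, s), method="BFGS") for s in (0.5, 1.0, 2.0)]
    best = min(fits, key=lambda r: r.fun)
    return best.fun, best.x

# In ICSR these candidates would come from the LLM, prompted with the observations
# and with the scored history of earlier guesses; here they are hard-coded stand-ins.
candidates = [
    ("c[0] * x + c[1]", 2),
    ("c[0] * np.sin(x) + c[1] * x", 2),
    ("c[0] * np.exp(c[1] * x)", 2),
]

history = []
for expr, k in candidates:
    err, coeffs = fit_and_score(expr, k, X, y)
    history.append((err, expr, coeffs))

# The ranked (error, expression) pairs would seed the next round of LLM proposals.
for err, expr, coeffs in sorted(history, key=lambda t: t[0]):
    print(f"MSE={err:.4f}  {expr}  coeffs={np.round(coeffs, 3)}")
```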

An intriguing extension involves integrating visual input. Here, plots of data and previous function guesses are fed into a VLM to potentially enhance the model's understanding and performance, particularly in more complex scenarios where textual information might be insufficient.
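One way to prepare such a plot for a multimodal prompt is sketched below; the VLM call itself and the accompanying text prompt are omitted, and none of this is taken verbatim from the paper.

```python
import base64, io
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt

def render_plot_for_vlm(X, y, guesses):
    """Plot the observations together with previous candidate functions and
    return the figure as a base64-encoded PNG, ready to attach to an
    image-plus-text prompt for a VLM."""
    fig, ax = plt.subplots(figsize=(4, 3))
    ax.scatter(X, y, s=10, label="observations")
    xs = np.linspace(X.min(), X.max(), 200)
    for label, fn in guesses:  # guesses: list of (label, callable) pairs
        ax.plot(xs, fn(xs), label=label)
    ax.legend(fontsize=7)
    buf = io.BytesIO()
    fig.savefig(buf, format="png", dpi=120)
    plt.close(fig)
    return base64.b64encode(buf.getvalue()).decode()

# Example usage with the toy data from the previous sketch:
# img_b64 = render_plot_for_vlm(X, y, [("2*sin(x)+0.5*x", lambda x: 2*np.sin(x) + 0.5*x)])
```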

Results: Strong performance indicated

The results reported in the paper are promising. The LLM-based approach matched or outperformed the best GP-based baselines on four popular benchmarks. The inclusion of visual data through VLMs also showed potential in complex cases, although it did not consistently beat the text-only variant. These results suggest that pre-trained LLMs, when equipped with an OPRO-style refinement loop, can effectively explore and optimize mathematical expressions that fit observed data.

Here's a breakdown of the key performance findings:

  • The LLM approach achieved average R² values that matched or exceeded those of the GP baselines across the benchmarks, while yielding simpler equations with better out-of-distribution generalization (the R² metric is sketched after this list).
  • Inclusion of visual inputs via VLMs helped on some complex benchmarks but was not universally superior to the text-only approach.
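For reference, R² here denotes the standard coefficient of determination; a minimal implementation of the metric (not the paper's evaluation code) is:

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot
```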

Implications and Future Prospects

Both theoretically and practically, integrating LLMs into SR tasks could pave the way for more versatile and powerful analytical tools that benefit from advances in natural language processing and machine learning. In particular, the ability of these models to generate and refine hypotheses iteratively could make them valuable in fields that require automated modeling of complex phenomena.

However, the practical application of this approach, especially in higher-dimensional spaces or with larger data sets, will likely need ongoing refinement. Advances in model capabilities, such as extended context windows or enhanced integration of multimodal data, could further improve performance.

Limitations and Challenges

While promising, the approach faces limitations primarily related to the handling of high-dimensional data and the finite size of the context window in current LLMs, which can constrain the amount of data that can be processed in one go. Future iterations of this technology will need to address these constraints to fully leverage the potential of LLMs in symbolic regression.

Conclusion

This paper provides a compelling look at how modern AI techniques can be extended beyond traditional application boundaries into areas like symbolic regression. As AI continues to evolve, particularly with improvements in LLMs and VLMs, we are likely to see even more innovative applications that can tackle increasingly complex tasks effectively.