In-Context Symbolic Regression: Leveraging Large Language Models for Function Discovery (2404.19094v2)
Matteo Merler, Nicola Dainese, Katsiaryna Haitsiukevich, Pekka Marttinen

Abstract: State-of-the-art Symbolic Regression (SR) methods currently build specialized models, while the application of Large Language Models (LLMs) remains largely unexplored. In this work, we introduce the first comprehensive framework that utilizes LLMs for the task of SR. We propose In-Context Symbolic Regression (ICSR), an SR method that iteratively refines a functional form with an LLM and determines its coefficients with an external optimizer. ICSR leverages the strong mathematical prior of LLMs both to propose an initial set of candidate functions given the observations and to refine them based on their errors. Our findings reveal that LLMs are able to successfully find symbolic equations that fit the given data, matching or outperforming the overall performance of the best SR baselines on four popular benchmarks, while yielding simpler equations with better out-of-distribution generalization.
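The abstract describes a propose-and-refine loop: an LLM suggests candidate functional forms, an external optimizer fits their numeric coefficients, and the resulting errors are fed back to the LLM. Below is a minimal sketch of that loop, assuming a hypothetical `query_llm` helper (stubbed here with fixed candidates so the script runs end to end) and using `scipy.optimize.minimize` for coefficient fitting; it illustrates the general idea and is not the authors' implementation.

```python
# Minimal sketch of an ICSR-style propose-and-refine loop.
# `query_llm` is a hypothetical stand-in for an actual LLM call; a real
# implementation would prompt the model with the observations and the
# errors of previous attempts.
import numpy as np
from scipy.optimize import minimize

def query_llm(prompt: str) -> list[str]:
    # Stubbed for illustration: return expression skeletons with
    # placeholder coefficients c0, c1, ...
    return ["c0 * x + c1", "c0 * np.sin(c1 * x) + c2"]

def fit_coefficients(skeleton: str, x: np.ndarray, y: np.ndarray):
    # Count the placeholder coefficients present in the skeleton.
    n_coeffs = sum(f"c{i}" in skeleton for i in range(10))

    def mse(c):
        # Evaluate the skeleton with the current coefficient values.
        env = {"np": np, "x": x, **{f"c{i}": c[i] for i in range(n_coeffs)}}
        pred = eval(skeleton, env)
        return float(np.mean((pred - y) ** 2))

    res = minimize(mse, x0=np.ones(n_coeffs), method="BFGS")
    return res.fun, res.x

# Toy observations: noisy samples from 2*sin(1.5x) + 0.5.
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 50)
y = 2.0 * np.sin(1.5 * x) + 0.5 + 0.05 * rng.standard_normal(50)

best = (np.inf, None, None)  # (error, skeleton, coefficients)
for step in range(3):  # a few refinement rounds
    prompt = f"best so far: {best[1]} (MSE={best[0]:.4f})"
    for skeleton in query_llm(prompt):
        err, coeffs = fit_coefficients(skeleton, x, y)
        if err < best[0]:
            best = (err, skeleton, coeffs)

print("Best expression:", best[1])
print("Coefficients:", best[2], "MSE:", best[0])
```

In this sketch the LLM proposes only symbolic skeletons, never numeric constants; delegating coefficient fitting to a continuous optimizer matches the division of labor the abstract describes.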