OmniPred: Language Models as Universal Regressors
Abstract: Regression is a powerful tool for accurately predicting the outcome metric of a system given a set of parameters, but it has traditionally been restricted to task-specific methods. In this paper, we propose OmniPred, a framework for training LLMs as universal end-to-end regressors over $(x,y)$ data in arbitrary formats. Using data sourced from Google Vizier, one of the largest proprietary blackbox optimization databases in the world, our extensive experiments demonstrate that LLMs are capable of very precise numerical regression using only textual representations of mathematical parameters and values, and, if given the opportunity to train at scale over multiple tasks, can significantly outperform traditional regression models.
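To make the idea of end-to-end textual regression concrete, here is a minimal sketch of how $(x,y)$ pairs might be serialized to text for an LLM regressor. The key/value layout for $x$ and the sign/digit/exponent token scheme for $y$ are illustrative assumptions, not the paper's actual encoding.

```python
from math import floor, log10

def serialize_x(params: dict) -> str:
    """Flatten a parameter dict into a stable textual representation
    (hypothetical format; keys are sorted for determinism)."""
    return ",".join(f"{k}:{params[k]}" for k in sorted(params))

def serialize_y(value: float, mantissa_digits: int = 4) -> str:
    """Encode a float as sign/digit/exponent tokens, a common trick for
    representing numbers of any magnitude with a small fixed vocabulary."""
    sign = "-" if value < 0 else "+"
    if value == 0:
        return f"{sign} " + " ".join("0" * mantissa_digits) + " E0"
    exp = floor(log10(abs(value)))
    mantissa = abs(value) / 10 ** exp
    digits = f"{mantissa:.{mantissa_digits - 1}f}".replace(".", "")
    return f"{sign} " + " ".join(digits) + f" E{exp}"

# A training example then becomes a plain (prompt, target) text pair:
x_text = serialize_x({"learning_rate": 0.01, "layers": 4})
y_text = serialize_y(0.8351)
# x_text → "layers:4,learning_rate:0.01"
# y_text → "+ 8 3 5 1 E-1"
```

Because both inputs and outputs are ordinary token sequences, a single model can in principle be trained over many tasks with heterogeneous parameter spaces, which is what allows the multi-task scaling described in the abstract.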