OmniPred: Language Models as Universal Regressors (2402.14547v4)

Published 22 Feb 2024 in cs.LG, cs.AI, cs.CL, and cs.DB

Abstract: Regression is a powerful tool to accurately predict the outcome metric of a system given a set of parameters, but has traditionally been restricted to methods which are only applicable to a specific task. In this paper, we propose OmniPred, a framework for training LLMs as universal end-to-end regressors over $(x,y)$ data from arbitrary formats. Using data sourced from Google Vizier, one of the largest proprietary blackbox optimization databases in the world, our extensive experiments demonstrate that LLMs are capable of very precise numerical regression using only textual representations of mathematical parameters and values, and if given the opportunity to train at scale over multiple tasks, can significantly outperform traditional regression models.

LLMs as Capable Predictors in Universal Regression Tasks

Introduction

Regression tasks have been integral to numerous scientific and industrial applications, aiming to predict continuous outcomes based on a set of input variables. Traditional regression models, while powerful in their specific domains, often require substantial customization and feature engineering to adapt to new tasks. OmniPred introduces a novel approach to regression, harnessing the flexibility and scalability of LLMs to serve as universal end-to-end regressors. By leveraging textual representations of experimental parameters and outcomes, OmniPred showcases the potential for LLMs to perform accurate metric predictions across a diverse array of real-world datasets, notably outperforming traditional models in many instances.

Technical Approach

The methodology in OmniPred recasts regression as a text processing task. Because experimental data comes in widely varied formats, the paper defines a textual representation of both input parameters and target metrics. This representation lets an LLM (here, a 200-million-parameter T5-based model) perform regression without the explicit feature engineering or normalization that traditional models typically require.

Key aspects of the methodology include:

  • Task Representation: Utilizing a key-value text format for input parameters and a custom tokenization for numerical outcomes.
  • Model Training: Adopting the standard cross-entropy loss used in conventional LLM training, applied here to sequences of numeric tokens rather than natural-language text.
  • Sampling and Decoding: Drawing multiple outputs via temperature-based sampling at decode time, so that the set of decoded values approximates the model's predictive distribution over outcomes (a minimal sketch of these ingredients follows this list).
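
To make these ingredients concrete, here is a minimal Python sketch of how an (x, y) pair might be rendered as text and how a metric value might be encoded as sign, mantissa-digit, and exponent tokens. The key-value layout, metadata field, and token scheme below are illustrative assumptions, not the paper's exact format.

```python
# Illustrative sketch only: the key-value layout, metadata string, and the
# sign/digit/exponent token scheme are assumptions, not OmniPred's exact format.
import math

def serialize_x(params: dict, metadata: str = "") -> str:
    """Render input parameters (and optional task metadata) as a key-value string."""
    kv = ",".join(f"{k}:{v}" for k, v in sorted(params.items()))
    return f"{metadata} | {kv}" if metadata else kv

def tokenize_y(y: float, mantissa_digits: int = 4) -> list[str]:
    """Encode a float as sign, mantissa-digit, and exponent tokens."""
    sign = "<+>" if y >= 0 else "<->"
    y = abs(y)
    exp = 0 if y == 0 else math.floor(math.log10(y))
    mantissa = 0.0 if y == 0 else y / (10 ** exp)
    digits = f"{mantissa:.{mantissa_digits - 1}f}".replace(".", "")
    return [sign] + [f"<{d}>" for d in digits] + [f"<E{exp}>"]

def detokenize_y(tokens: list[str]) -> float:
    """Invert tokenize_y: rebuild the float from its token sequence."""
    sign = 1.0 if tokens[0] == "<+>" else -1.0
    digits = "".join(t.strip("<>") for t in tokens[1:-1])
    exp = int(tokens[-1].strip("<>E"))
    mantissa = int(digits) / (10 ** (len(digits) - 1))
    return sign * mantissa * (10 ** exp)

# One (x, y) pair rendered as text-in / tokens-out.
x_text = serialize_x({"learning_rate": 1e-3, "layers": 4}, metadata="task:cifar10")
y_tokens = tokenize_y(0.8732)
print(x_text)                   # task:cifar10 | learning_rate:0.001,layers:4
print(y_tokens)                 # ['<+>', '<8>', '<7>', '<3>', '<2>', '<E-1>']
print(detokenize_y(y_tokens))   # 0.8732 (up to float rounding)
```

At inference time, sampling many y-token sequences at a fixed temperature and detokenizing them yields an empirical predictive distribution; its median or mean can then serve as the point prediction.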

Experimental Insights

OmniPred was evaluated on both synthetic and real-world datasets, demonstrating that it can learn and predict across tasks with widely varying input spaces and objective scales. A notable set of experiments drew on Google Vizier, a large proprietary database of blackbox optimization studies; on these tasks OmniPred outperformed traditional regression baselines, including on tasks unseen during training, underscoring the model's adaptability and capacity for generalization.
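
Because the objectives of different tasks live on very different scales, raw errors are not directly comparable across tasks. The sketch below shows one hedged way to aggregate them; the range-based normalization is an illustrative choice, not necessarily the exact metric reported in the paper.

```python
# Illustrative aggregation of regression error across tasks with different
# objective scales; the range-based normalization is an assumption for this sketch.
import numpy as np

def normalized_mae(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean absolute error divided by the spread of the task's true values."""
    spread = float(y_true.max() - y_true.min())
    if spread == 0.0:
        spread = 1.0  # degenerate task: all targets identical
    return float(np.mean(np.abs(y_true - y_pred)) / spread)

def aggregate_across_tasks(per_task_pairs) -> float:
    """Average per-task normalized errors so every task counts equally."""
    return float(np.mean([normalized_mae(t, p) for t, p in per_task_pairs]))
```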

Implications and Future Directions

The paper’s findings suggest significant implications for the field of experimental design and beyond:

  • Transfer Learning: Because tasks share a common textual representation, OmniPred benefits from transfer learning, markedly improving performance on tasks with little or no prior data (a hedged fine-tuning sketch follows this list).
  • Multi-Task Learning: Demonstrating superior performance in multi-task settings over single-task models, OmniPred paves the way for more efficient and scalable modeling approaches in data-rich environments.
  • Practical Applications: From hyperparameter tuning to complex system predictions, OmniPred’s framework suggests a shift towards more flexible and adaptive regression models, potentially reducing the reliance on domain-specific knowledge for feature engineering.
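
As one way to picture the transfer-learning point above, the hedged sketch below fine-tunes a generic text-to-text model on a handful of serialized (x, y) examples from a previously unseen task. The checkpoint name, token strings, and training hyperparameters are placeholders; OmniPred trains its own 200M-parameter model and would add dedicated numeric tokens to the vocabulary.

```python
# Hedged sketch: local fine-tuning on a small unseen task. "t5-small" is a
# placeholder checkpoint, and the serialized strings reuse the toy format from
# the earlier sketch; neither is the paper's actual setup.
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# A few (x, y) examples from the new task, already serialized to text.
few_shot = [
    ("task:new_study | learning_rate:0.01,layers:2", "<+> <6> <4> <1> <0> <E-1>"),
    ("task:new_study | learning_rate:0.001,layers:8", "<+> <8> <0> <2> <5> <E-1>"),
]

model.train()
for _ in range(10):  # a few gradient passes over the small set
    for x_text, y_text in few_shot:
        batch = tokenizer(x_text, return_tensors="pt")
        labels = tokenizer(y_text, return_tensors="pt").input_ids
        loss = model(**batch, labels=labels).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```

In the real system the custom sign, digit, and exponent tokens would be registered with the tokenizer before fine-tuning; here the placeholder tokenizer simply splits them into subwords, which is enough to illustrate the training loop.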

While OmniPred sets an exciting precedent, it also opens avenues for future research. Improvements could include exploring diverse input space representations, further refining the textual representation of numerical values, and investigating the utility of pre-trained models on regression tasks. Moreover, considering the computational overhead of LLMs, optimizing model efficiency without compromising on prediction accuracy remains a critical challenge.

Concluding Remarks

OmniPred represents a pioneering step towards universal regression models using LLMs. By successfully applying LLMs to a wide range of regression tasks, this work introduces a new paradigm in predictive modeling, blending the fields of natural language processing and quantitative prediction. While challenges remain, OmniPred's framework offers a compelling vision for the future of experimental design and predictive analytics, underlining the untapped potential of LLMs in quantitative domains.

Authors (7)
  1. Xingyou Song
  2. Oscar Li
  3. Chansoo Lee
  4. Daiyi Peng
  5. Sagi Perel
  6. Yutian Chen
  7. Bangding Yang