- The paper introduces FeatLLM, a method that uses LLMs to automatically engineer features for efficient few-shot tabular learning.
- It employs a dual-phase reasoning process and ensemble techniques to generate and refine binary features, reducing computational overhead.
- Experimental results on 13 diverse datasets show that FeatLLM outperforms conventional methods in low-sample, high-complexity settings.
Analysis of the Paper: LLMs Can Automatically Engineer Features for Few-Shot Tabular Learning
The paper introduces FeatLLM, an approach that leverages large language models (LLMs) to perform automated feature engineering tailored to few-shot tabular learning. The method is significant because tabular data is ubiquitous in real-world applications where labeling is often expensive and time-consuming. Rather than deploying LLMs for direct prediction, FeatLLM uses them to derive informative features from the data, which are then fed to simple machine learning models for efficient, scalable tabular learning.
Summary of the Approach
The core idea of FeatLLM is to harness the reasoning and generalization power of LLMs to automatically engineer features suited to the downstream tabular prediction task. Traditional LLM-based methods perform one LLM inference per test sample, which is computationally expensive; in FeatLLM, all LLM calls happen during training, making inference significantly more efficient. This sharply reduces the number of LLM queries and requires only API-level access, avoiding the extensive fine-tuning and computational costs associated with proprietary LLMs.
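A minimal sketch of this efficiency argument (all names below are hypothetical illustrations, not from the FeatLLM codebase): once LLM-derived rules have been compiled into plain Python predicates at training time, inference is a cheap per-sample transform with no LLM in the loop.

```python
# Suppose the LLM produced these rules for the class "income > 50K"
# during training (illustrative only -- not the paper's actual rules):
rules = [
    lambda row: row["age"] > 35,
    lambda row: row["education_years"] >= 13,
    lambda row: row["hours_per_week"] > 40,
]

def to_binary_features(row):
    """Apply every compiled rule; each rule becomes one 0/1 feature."""
    return [int(rule(row)) for rule in rules]

# Inference: a cheap per-sample transform feeding a simple model --
# no LLM call is needed here.
sample = {"age": 42, "education_years": 16, "hours_per_week": 50}
print(to_binary_features(sample))  # [1, 1, 1]
```

The point of the sketch is the asymmetry: the expensive LLM work (deriving `rules`) happens once, while the per-sample cost at inference is a handful of comparisons.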
Methodology and Implementation
At its core, FeatLLM uses a two-stage reasoning process driven by well-structured prompts: the LLM first interprets the problem context and deduces decision rules, and these rules are then converted into binary features indicating the likelihood of each class. To stay within prompt-size limits and to encourage feature diversity, the framework combines this process with bagging-style ensembling over repeated prompt samples. By extracting the criteria, or "rules", underlying LLM predictions rather than using LLM predictions end to end, FeatLLM substantially reduces inference latency and generalizes better across domains.
- The prompt architecture forms the backbone: it gives the LLM the task context and a handful of labeled examples, and asks it to formulate rules while remaining within prompt-size limitations.
- A secondary LLM-driven step parses the generated rules, translating them into executable code that transforms the tabular data into binary features.
- A low-complexity linear model then estimates class likelihoods from these binary indicators, keeping the learned feature weights interpretable.
- Repeating this procedure and combining the results via ensemble learning makes the approach robust across datasets of varying feature complexity.
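The steps above can be sketched end to end. In this illustrative version (function names, rule sets, and weights are our own, not the paper's), each ensemble member pairs its own LLM-derived rules with a simple linear model over the resulting binary features, and per-sample scores are averaged across members.

```python
import numpy as np

def binary_features(X, rules):
    """Turn raw rows into a 0/1 matrix, one column per rule."""
    return np.array([[int(rule(row)) for rule in rules] for row in X])

def member_scores(X, rules, weights):
    """Score = weighted sum of binary features (a simple linear model)."""
    return binary_features(X, rules) @ weights

def ensemble_predict(X, members):
    """Average scores over all (rules, weights) pairs in the ensemble."""
    return np.mean([member_scores(X, r, w) for r, w in members], axis=0)

# Two hypothetical ensemble members, each obtained from a different
# sampled prompt (rules and weights are invented for illustration).
members = [
    ([lambda r: r["age"] > 35, lambda r: r["hours"] > 40],
     np.array([0.7, 0.3])),
    ([lambda r: r["age"] > 30, lambda r: r["edu"] >= 13],
     np.array([0.5, 0.5])),
]
X = [{"age": 42, "hours": 50, "edu": 16}]
print(ensemble_predict(X, members))  # [1.]
```

Averaging over members sampled from different prompts is what gives the bagging effect: no single rule set has to be right, only the ensemble on average.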
Experimental Results and Implications
The experimental validation spans 13 datasets covering both binary and multi-class classification, and shows that FeatLLM outperforms baselines when few labeled samples are available. The improvements over existing approaches such as TabLLM are largely attributable to the feature extraction and generalization enabled by the LLM-derived rules. The results also highlight how conventional data-driven methods struggle in low-shot regimes unless they rely on extensive unlabeled data or frequent, inefficient LLM querying.
Discussions and Prospective Directions
The implications of FeatLLM stretch across both practical and theoretical dimensions. Practically, it paves the way for AI applications in sectors where labeled data is scarce, such as healthcare analytics or financial prediction. Theoretically, it opens an exciting avenue: model-agnostic, LLM-driven feature engineering could set a precedent for similar explorations in other data modalities.
Moreover, future extensions could explore adapting the method to settings with abundant labeled data and to broader types of extracted features, further improving interpretability. Building a systematic understanding of the biases introduced by LLM-provided prior knowledge, and of their effect on downstream inferences, is another avenue for continued research.
Overall, FeatLLM presents a transformative approach, improving the productivity and efficiency of few-shot learning setups by effectively leveraging the latent capabilities of LLMs. By reframing the LLM's role as that of a dynamic feature engineer rather than a predictor, it charts new directions for AI-driven data interaction, with optimization opportunities for both industry and academia.