LIFT: Language-Interfaced Fine-Tuning for Non-Language Machine Learning Tasks (2206.06565v4)

Published 14 Jun 2022 in cs.LG and cs.CL

Abstract: Fine-tuning pretrained language models (LMs) without making any architectural changes has become a norm for learning various language downstream tasks. However, for non-language downstream tasks, a common practice is to employ task-specific designs for input, output layers, and loss functions. For instance, it is possible to fine-tune an LM into an MNIST classifier by replacing the word embedding layer with an image patch embedding layer, the word token output layer with a 10-way output layer, and the word prediction loss with a 10-way classification loss, respectively. A natural question arises: Can LM fine-tuning solve non-language downstream tasks without changing the model architecture or loss function? To answer this, we propose Language-Interfaced Fine-Tuning (LIFT) and study its efficacy and limitations by conducting an extensive empirical study on a suite of non-language classification and regression tasks. LIFT does not make any changes to the model architecture or loss function, and it solely relies on the natural language interface, enabling "no-code machine learning with LMs." We find that LIFT performs comparably well across a wide range of low-dimensional classification and regression tasks, matching the performances of the best baselines in many cases, especially for the classification tasks. We also report experimental results on the fundamental properties of LIFT, including inductive bias, robustness, and sample complexity. We also analyze the effect of pretraining on LIFT and a few properties/techniques specific to LIFT, e.g., context-aware learning via appropriate prompting, calibrated predictions, data generation, and two-stage fine-tuning. Our code is available at https://github.com/UW-Madison-Lee-Lab/LanguageInterfacedFineTuning.

Summary of "LIFT: Language-Interfaced Fine-Tuning for Non-Language Machine Learning Tasks"

The paper "LIFT: Language-Interfaced Fine-Tuning for Non-Language Machine Learning Tasks" investigates the potential of using LLMs (LMs) to address non-language machine learning tasks without modifying the model architecture or loss function. The authors introduce Language-Interfaced Fine-Tuning (LIFT) as a method to fine-tune pretrained LMs on tasks typically solved through domain-specific network architectures.

Methodology

LIFT is a two-phase procedure:

  1. Dataset Conversion: Labeled datasets are transformed into sentence-format inputs. Two main prompting strategies are considered: one uses the available task and feature names explicitly, and the other uses generic placeholders (a minimal conversion sketch follows this list).
  2. Fine-Tuning: A pretrained LM is fine-tuned solely on the converted sentence inputs, with no changes to the original architecture or loss function.
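
Since the summary does not give the exact sentence templates, the following minimal Python sketch illustrates the dataset-conversion step under assumed templates and end-of-text markers ("###" for prompts, "@@@" for completions); these choices, and the helper names, are illustrative assumptions rather than the paper's exact format.

```python
import json

# Illustrative prompt templates (assumed for this sketch; the paper's exact
# wording and separators may differ).
FEATURE_NAME_TEMPLATE = "The Iris plant has {desc}. What is its species?"
GENERIC_TEMPLATE = "When we have {desc}, what is y?"

def to_example(features, label, feature_names=None):
    """Convert one labeled row into a (prompt, completion) pair of sentences."""
    if feature_names:  # strategy 1: use real task/feature names
        desc = ", ".join(f"{name} = {value}" for name, value in zip(feature_names, features))
        prompt = FEATURE_NAME_TEMPLATE.format(desc=desc)
    else:              # strategy 2: generic placeholders x1, x2, ...
        desc = ", ".join(f"x{i + 1} = {value}" for i, value in enumerate(features))
        prompt = GENERIC_TEMPLATE.format(desc=desc)
    # Fixed end-of-prompt / end-of-completion markers help the LM know when to stop.
    return {"prompt": prompt + " ###", "completion": f" {label} @@@"}

def write_finetuning_file(rows, labels, path, feature_names=None):
    """Dump the converted dataset as JSONL, the usual input format for LM fine-tuning APIs."""
    with open(path, "w") as f:
        for features, label in zip(rows, labels):
            f.write(json.dumps(to_example(features, label, feature_names)) + "\n")

# Example: two Iris rows, verbalized with explicit feature names.
rows = [[5.1, 3.5, 1.4, 0.2], [6.7, 3.0, 5.2, 2.3]]
labels = ["setosa", "virginica"]
names = ["sepal length", "sepal width", "petal length", "petal width"]
write_finetuning_file(rows, labels, "iris_train.jsonl", feature_names=names)
```

Either prompting strategy only changes how each row is verbalized; the fine-tuning phase that follows is identical, consuming the JSONL file through the LM provider's standard fine-tuning interface with no architectural or loss changes.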

Empirical Evaluation

The paper presents an extensive empirical analysis of LIFT across various non-language classification and regression tasks, highlighting its performance and properties:

  • Classification and Regression Performance: LIFT achieves comparable accuracies to strong baselines across low-dimensional tasks, underlining its adaptability despite the absence of architectural modifications.
  • Sample Complexity: Analysis indicates that LIFT efficiently learns tasks with a relatively small number of samples, though performance varies with task complexity.
  • Inductive Bias and Robustness: Visualizations of LIFT's decision boundaries suggest behavior similar to tree-based classifiers, with somewhat fractal boundary structure, and LIFT is robust to outliers and small input perturbations.
  • LM Pretraining Dependency: LIFT's effectiveness hinges on using LMs pretrained on natural language data, underscoring the importance of language pretraining for transfer to non-language tasks.

Enhancements and Implications

The authors explore improvements from incorporating context through feature names and descriptions, which improves sample efficiency on classification tasks. Two-stage fine-tuning and data augmentation are also proposed to strengthen LIFT on smaller training datasets; a rough sketch of how these pieces could fit together follows.
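
The summary does not describe the augmentation scheme or the first-stage data, so the sketch below is only a rough illustration: it assumes Gaussian feature jitter as the augmentation and uses hypothetical `convert` and `fine_tune` callables standing in for the dataset-conversion step and whatever LM fine-tuning API is used.

```python
import random

def jitter(features, scale=0.01):
    """Add small Gaussian noise to each numeric feature.
    This particular augmentation is an illustrative assumption, not necessarily the paper's scheme."""
    return [x + random.gauss(0.0, scale * (abs(x) + 1e-8)) for x in features]

def augment(rows, labels, copies=3):
    """Expand a small training set with noisy copies of each labeled row."""
    aug_rows, aug_labels = list(rows), list(labels)
    for features, label in zip(rows, labels):
        for _ in range(copies):
            aug_rows.append(jitter(features))
            aug_labels.append(label)
    return aug_rows, aug_labels

def two_stage_lift(pretext_file, target_rows, target_labels, convert, fine_tune):
    """Two-stage fine-tuning sketch.
    Stage 1 adapts the pretrained LM to the sentence interface on auxiliary data;
    Stage 2 fine-tunes the result on the (augmented) target task.
    `convert(rows, labels, path)` is a dataset-to-JSONL converter (e.g., the
    write_finetuning_file sketch above) and `fine_tune(model, jsonl_path)` is a
    hypothetical wrapper around the chosen LM fine-tuning API."""
    stage1_model = fine_tune("pretrained-lm", pretext_file)          # stage 1
    rows, labels = augment(target_rows, target_labels)               # data augmentation
    convert(rows, labels, "target_train.jsonl")
    return fine_tune(stage1_model, "target_train.jsonl")             # stage 2
```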

Moreover, LIFT's potential applications extend to generative tasks: for example, it yields high-quality completions of image data on well-established datasets such as MNIST.
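
The exact image encoding is not described in this summary; one plausible way to phrase MNIST completion as a LIFT-style task, sketched below under an assumed comma-separated pixel encoding, is to show the model the first portion of the flattened image and ask it to emit the remaining pixel values.

```python
def image_completion_example(pixels, observed_fraction=0.5):
    """Phrase image completion as a prompt/completion pair.
    `pixels` is a flattened grayscale image (e.g., a 28x28 MNIST digit as 784
    integers in 0-255). The comma-separated encoding and the "first half
    observed" split are illustrative assumptions, not the paper's exact scheme."""
    cut = int(len(pixels) * observed_fraction)
    observed = ", ".join(str(int(p)) for p in pixels[:cut])
    missing = ", ".join(str(int(p)) for p in pixels[cut:])
    prompt = f"The image starts with pixel values {observed}. Complete the remaining pixels: ###"
    completion = f" {missing} @@@"
    return {"prompt": prompt, "completion": completion}
```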

Discussion

While the results are compelling, LIFT has limitations, notably on high-dimensional data and on tasks with a large number of classes. The LM's limited context length also poses challenges; exploring more memory-efficient transformer architectures could help address these issues.

Conclusion

The research solidifies the notion of LMs as versatile, general-purpose solutions for diverse tasks beyond traditional NLP applications. The findings prompt further inquiry into refining LIFT for handling a broader spectrum of machine learning tasks, underscoring its potential contribution to reducing the barrier to entry for non-specialists in machine learning.

This paper invites exploration of the broader implications for building universal models capable of adapting to any data modality or domain, heralding a new frontier in machine learning research.

Authors (9)
  1. Tuan Dinh
  2. Yuchen Zeng
  3. Ruisu Zhang
  4. Ziqian Lin
  5. Michael Gira
  6. Shashank Rajput
  7. Jy-yong Sohn
  8. Dimitris Papailiopoulos
  9. Kangwook Lee
Citations (106)