Introduction
The paper introduces Aligner, a Parameter-Efficient Fine-Tuning (PEFT) method designed for alignment tasks in large language models (LLMs). Aligner employs a globally shared set of tunable tokens that modulates the attention mechanism in every layer of a transformer-based model. Notably, Aligner demonstrates that even a minimal number of tokens (a single token, representing roughly 5,000 parameters) can effectively align LLMs that contain billions of parameters.
Methodology
Aligner is inspired by previous PEFT methods such as LoRA and LLaMA-Adapter, but diverges by introducing globally shared prefix tokens: the same set of tokens is shared across all layers, in contrast to the layer-specific tokens employed by prior methods. Aligner computes a separate attention from the input queries to these prefix tokens, and this prefix attention, modulated by a layer-specific gating factor, is added back to the output of the original attention.
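The mechanism described above can be sketched in a few lines of PyTorch. This is a minimal illustration rather than the authors' implementation; the class and variable names are invented, and the zero-initialized, tanh-squashed gate is an assumption modeled on similar gated-adapter designs such as LLaMA-Adapter:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AlignerAttention(nn.Module):
    """One attention layer augmented with globally shared prefix tokens.

    The prefix tokens are created once outside the model and passed to
    every layer; each layer contributes only a scalar gate of its own.
    Illustrative sketch, not the paper's code.
    """
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model, bias=False)  # frozen base weights
        self.out = nn.Linear(d_model, d_model, bias=False)
        # Layer-specific gating factor, zero-initialized (assumed) so that
        # training starts from the unmodified base model.
        self.gate = nn.Parameter(torch.zeros(1))

    def _split(self, t: torch.Tensor) -> torch.Tensor:
        b = t.shape[0]
        return t.view(b, -1, self.n_heads, self.d_head).transpose(1, 2)

    def forward(self, x: torch.Tensor, global_prefix: torch.Tensor) -> torch.Tensor:
        B, T, D = x.shape
        q, k, v = map(self._split, self.qkv(x).chunk(3, dim=-1))
        # Standard causal self-attention over the input tokens.
        base = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        # Separate attention from the same queries to the shared prefix tokens.
        p = global_prefix.unsqueeze(0).expand(B, -1, -1)   # (B, n_prefix, D)
        _, pk, pv = map(self._split, self.qkv(p).chunk(3, dim=-1))
        prefix = F.scaled_dot_product_attention(q, pk, pv)
        # Gated addition back onto the original attention output.
        y = base + torch.tanh(self.gate) * prefix
        return self.out(y.transpose(1, 2).reshape(B, T, D))

# A single global token of dimension 32: the only shared tunable parameters.
torch.manual_seed(0)
global_prefix = nn.Parameter(torch.randn(1, 32))
layer = AlignerAttention(d_model=32, n_heads=4)
out = layer(torch.randn(2, 5, 32), global_prefix)
```

With a hidden size of about 4,096, a single such token accounts for roughly the 5,000 tunable parameters mentioned above, since every layer reuses the same token and adds only one gate scalar.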
Evaluation
The evaluation of Aligner focused on two form alignment tasks: instruction following and value alignment with human preferences. Evaluations judged by GPT-4, using the Vicuna benchmark for instruction following and the Beaver benchmark for safety preference, indicated that even a single global token could rival leading PEFT methods such as LLaMA-Adapter and LoRA. Aligner's efficiency makes it practical to fit over a million Aligner instances alongside a 7-billion-parameter LLM on a single 24GB GPU, with implications for customized, user-specific models in industrial applications.
Conclusion and Discussion
Aligner’s success provides insights into the inner workings of LLMs, illustrating that “form” can indeed function orthogonally to “knowledge” or “ability.” The architecture's exceptional efficiency and performance on form alignment tasks suggest a distinct global component for "form," influencing how knowledge is applied within LLMs. This finding holds promise for future research into LLM mechanisms and aligning AI models with human values.
Ultimately, the paper positions Aligner not only as a highly efficient PEFT method for form alignment tasks but also as a useful probe into the operational mechanisms of LLMs. The findings establish Aligner as a competitive option for these tasks and pave the way for further exploration in AI safety and control.