- The paper introduces a two-stage framework that employs LLMs to generate and rerank candidate labels for zero-shot extreme multi-label classification.
- It leverages in-context demonstrations to condense an expansive label space into an effective shortlist for improved prediction efficiency.
- Empirical evaluations on large-scale benchmarks show that ICXML outperforms baselines without relying on a pre-built input candidate corpus.
Overview
Yaxin Zhu and Hamed Zamani's work, ICXML (In-Context Learning Framework for Zero-Shot Extreme Multi-Label Classification), introduces a two-stage framework for extreme multi-label classification (XMC) in a zero-shot setting. The core challenge in XMC is predicting multiple relevant labels for each instance from a vast label space, with practical applications spanning domains such as text categorization, recommendation systems, and image tagging.
Methodology
In the first stage, an LLM is prompted with a support set of generated demonstrations to produce a set of candidate labels. These free-form generations are then mapped onto the actual large-scale label space, distilling it into a manageable shortlist of candidates. In the second stage, the LLM reranks the shortlist in the context of the test instance, exploiting its ability to assess multiple labels at once.
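The two stages described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `call_llm` is a hypothetical stand-in for any LLM API (stubbed here with canned output so the sketch runs end to end), and the tiny label space, the lexical-similarity mapping via `difflib`, and all prompts are assumptions for demonstration purposes.

```python
# Minimal sketch of a two-stage generate-then-rerank pipeline for
# zero-shot XMC. All names and prompts here are illustrative.
from difflib import SequenceMatcher

# A toy stand-in for the real (very large) label space.
LABEL_SPACE = ["wireless mouse", "mechanical keyboard", "usb hub", "laptop stand"]

def call_llm(prompt: str) -> str:
    # Hypothetical LLM call, stubbed with canned output so the sketch runs.
    if "Generate" in prompt:
        return "wireless mice\nusb splitter"   # free-form label generations
    return "usb hub\nwireless mouse"           # reranked candidate order

def shortlist(instance: str, demonstrations: list[tuple[str, str]], k: int = 10) -> list[str]:
    """Stage 1: prompt the LLM with demonstrations, then map its free-form
    generations onto the real label space (here: by lexical similarity)."""
    demo_text = "\n".join(f"Input: {x}\nLabels: {y}" for x, y in demonstrations)
    prompt = f"{demo_text}\nGenerate labels for: {instance}"
    generated = [g for g in call_llm(prompt).splitlines() if g.strip()]
    scored: dict[str, float] = {}
    for g in generated:
        for label in LABEL_SPACE:
            score = SequenceMatcher(None, g, label).ratio()
            scored[label] = max(scored.get(label, 0.0), score)
    return sorted(scored, key=scored.get, reverse=True)[:k]

def rerank(instance: str, candidates: list[str]) -> list[str]:
    """Stage 2: ask the LLM to reorder the shortlist given the test instance."""
    prompt = ("Rank these labels for the input.\n"
              f"Input: {instance}\nCandidates: {', '.join(candidates)}")
    ranked = [l.strip() for l in call_llm(prompt).splitlines()]
    # Keep only candidates the LLM actually returned, in its order,
    # then append any it dropped.
    ordered = [l for l in ranked if l in candidates]
    return ordered + [c for c in candidates if c not in ordered]
```

In a real system, the demonstration pairs would themselves be generated by the LLM, and the mapping from generations to the label space would typically use a dense retriever rather than string similarity.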
Contributions
The core contributions of the paper are a two-stage framework pairing generation-based label shortlisting with label reranking, a generation-driven approach to synthesizing high-quality input-label demonstration pairs, and a significant advance of the state of the art in zero-shot XMC. In empirical evaluations, ICXML is effective on large public benchmarks such as LF-Amazon-131K and LF-WikiSeeAlso-320K without requiring a corpus of input candidates, which existing baselines depend on. The authors make their implementation available for research purposes.
Related Work and Future Directions
The paper situates ICXML within the landscape of existing research, acknowledging prior methods that leverage dense retrieval models, pseudo annotations for training, and LLMs for label generation. The in-context learning capabilities exhibited by LLMs set the backdrop against which ICXML operates. Given the framework's performance, the authors suggest it could be enriched by incorporating diverse knowledge sources and domain-specific adaptation, a promising direction for future work. Its potential to scale to real-world scenarios with sparse or no available annotations positions ICXML as a significant step in XMC research.
Evaluation and Analysis
An extensive evaluation shows ICXML outperforming a range of baseline methods on two datasets that differ in their correlation patterns between labels and instances. The framework proves robust, performing strongly without resorting to an input corpus. A comparative analysis of content-based and label-centric demonstration generation reveals how their effectiveness differs across label spaces. Moreover, a detailed ablation study quantifies the impact of each component of the proposed framework, demonstrating the importance of LLM-based reranking.
In conclusion, the ICXML framework presents a compelling solution for zero-shot XMC, combining the generation and retrieval paradigms to bridge the gap between test instances and an extensive label space. The research sets a precedent for more advanced, generation-centric XMC systems that perform well even in challenging scenarios with minimal supervision.