LLM-Ensemble: Optimal Large Language Model Ensemble Method for E-commerce Product Attribute Value Extraction (2403.00863v2)
Abstract: Product attribute value extraction is a pivotal component in NLP and the contemporary e-commerce industry. The provision of precise product attribute values is fundamental in ensuring high-quality recommendations and enhancing customer satisfaction. The recently emerging LLMs have demonstrated state-of-the-art performance in numerous attribute extraction tasks, without the need for domain-specific training data. Nevertheless, varying strengths and weaknesses are exhibited by different LLMs due to the diversity in data, architectures, and hyperparameters. This variation makes them complementary to each other, with no single LLM dominating all others. Considering the diverse strengths and weaknesses of LLMs, it becomes necessary to develop an ensemble method that leverages their complementary potentials. In this paper, we propose a novel algorithm called LLM-ensemble to ensemble different LLMs' outputs for attribute value extraction. We iteratively learn the weights for different LLMs to aggregate the labels with weights to predict the final attribute value. Not only can our proposed method be proven theoretically optimal, but it also ensures efficient computation, fast convergence, and safe deployment. We have also conducted extensive experiments with various state-of-the-art LLMs, including Llama2-13B, Llama2-70B, PaLM-2, GPT-3.5, and GPT-4, on Walmart's internal data. Our offline metrics demonstrate that the LLM-ensemble method outperforms all the state-of-the-art single LLMs on Walmart's internal dataset. This method has been launched in several production models, leading to improved Gross Merchandise Volume (GMV), Click-Through Rate (CTR), Conversion Rate (CVR), and Add-to-Cart Rate (ATC).
- Large language models are few-shot clinical information extractors. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. 1998–2022.
- Generative Models for Product Attribute Extraction. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track. 575–585.
- Product Attribute Value Extraction using Large Language Models. arXiv preprint arXiv:2310.12537 (2023).
- Language models are few-shot learners. Advances in neural information processing systems 33 (2020), 1877–1901.
- Knowledge Graph Completion Models are Few-shot Learners: An Empirical Study of Relation Labeling in E-commerce with LLMs. arXiv preprint arXiv:2305.09858 (2023).
- Label augmented and weighted majority voting for crowdsourcing. Information Sciences 606 (2022), 397–409.
- Palm: Scaling language modeling with pathways. Journal of Machine Learning Research 24, 240 (2023), 1–113.
- Alexander Philip Dawid and Allan M Skene. 1979. Maximum likelihood estimation of observer error-rates using the EM algorithm. Journal of the Royal Statistical Society: Series C (Applied Statistics) 28, 1 (1979), 20–28.
- Thomas G Dietterich. 2000. Ensemble methods in machine learning. In International workshop on multiple classifier systems. Springer, 1–15.
- Text mining for product attribute extraction. ACM SIGKDD Explorations Newsletter 8, 1 (2006), 41–48.
- Applied logistic regression. Vol. 398. John Wiley & Sons.
- Large language models are zero-shot rankers for recommender systems. arXiv preprint arXiv:2305.08845 (2023).
- LLM-Blender: Ensembling Large Language Models with Pairwise Ranking and Generative Fusion. arXiv preprint arXiv:2306.02561 (2023).
- Txtract: Taxonomy-aware knowledge extraction for thousands of product categories. arXiv preprint arXiv:2004.13852 (2020).
- Recognizing salient entities in shopping queries. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 107–111.
- Hongwei Li and Bin Yu. 2014. Error rate bounds and iterative weighted majority voting for crowdsourcing. arXiv preprint arXiv:1411.4086 (2014).
- LLM-TAKE: Theme-Aware Keyword Extraction Using Large Language Models. In 2023 IEEE International Conference on Big Data (BigData). IEEE, 4318–4324.
- Ajinkya More. 2016. Attribute extraction from product titles in ecommerce. arXiv preprint arXiv:1608.04670 (2016).
- GENEVA: Benchmarking Generalizability for Event Argument Extraction with Hundreds of Event Types and Argument Roles. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 3664–3686.
- Duangmanee Putthividhya and Junling Hu. 2011. Bootstrapped named entity recognition for product attribute extraction. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing. 1557–1567.
- Accurate product attribute extraction on the field. In 2019 IEEE 35th International Conference on Data Engineering (ICDE). IEEE, 1862–1873.
- Tian Tian and Jun Zhu. 2015. Max-margin majority voting for learning from crowds. Advances in neural information processing systems 28 (2015).
- Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023).
- Faceted product search powered by the semantic web. Decision Support Systems 53, 3 (2012), 425–437.
- Knowledge fusion of large language models. arXiv preprint arXiv:2401.10491 (2024).
- Learning to extract attribute value from product via question answering: A multi-task approach. In Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery & data mining. 47–55.
- Code4struct: Code generation for few-shot event structure prediction. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 3640–3663.
- Emergent abilities of large language models. arXiv preprint arXiv:2206.07682 (2022).
- Scalable attribute-value extraction from semi-structured text. In 2009 IEEE international conference on data mining workshops. IEEE, 302–307.
- Scaling up open tagging from tens to thousands: Comprehension empowered attribute value extraction from product title. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 5214–5223.
- AdaTag: Multi-attribute value extraction from product profiles with adaptive decoding. arXiv preprint arXiv:2106.02318 (2021).
- Mixpave: Mix-prompt tuning for few-shot product attribute value extraction. In Findings of the Association for Computational Linguistics: ACL 2023. 9978–9991.
- MAVE: A product dataset for multi-source attribute value extraction. In Proceedings of the fifteenth ACM international conference on web search and data mining. 1256–1265.
- A Framework for an Ontology-based E-commerce Product Information Retrieval System. J. Comput. 4, 6 (2009), 436–443.
- Oa-mine: open-world attribute mining for e-commerce products with weak supervision. In Proceedings of the ACM Web Conference 2022. 3153–3161.
- Opentag: Open attribute value extraction from product profiles. In Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining. 1049–1058.