Z-BERT-A: A Zero-Shot Pipeline for Unknown Intent Detection (2208.07084v3)
Abstract: Intent discovery is a crucial task in natural language processing, and it is increasingly relevant for various industrial applications. Identifying novel, unseen intents from user inputs remains one of the biggest challenges in this field. Herein, we propose Zero-Shot-BERT-Adapters, a two-stage method for multilingual intent discovery relying on a Transformer architecture fine-tuned with Adapters. We train the model for Natural Language Inference (NLI) and later perform unknown intent classification in a zero-shot setting for multiple languages. In our evaluation, we first analyze the quality of the model after adaptive fine-tuning on known classes. Second, we evaluate its performance when casting intent classification as an NLI task. Lastly, we test the zero-shot performance of the model on unseen classes, showing that Zero-Shot-BERT-Adapters can effectively perform intent discovery by generating intents that are semantically similar, if not identical, to the ground-truth ones. Our experiments show that Zero-Shot-BERT-Adapters outperforms various baselines in two zero-shot settings: known intent classification and unseen intent discovery. The proposed pipeline holds potential for broad application in customer care. It enables automated dynamic triage using a lightweight model that can be easily deployed and scaled in various business scenarios, unlike large language models (LLMs). Zero-Shot-BERT-Adapters represents an innovative multi-language approach for intent discovery, enabling the online generation of novel intents. A Python package implementing the pipeline and the new datasets we compiled are available at the following link: https://github.com/GT4SD/zero-shot-bert-adapters.
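The core idea of casting intent classification as NLI is that each candidate intent becomes a hypothesis (e.g. "This text is about <intent>") scored against the utterance as premise, and the highest-entailment intent wins. A minimal sketch of this control flow, with a keyword-overlap stub standing in for the fine-tuned NLI model (the function and template names here are illustrative, not the package's actual API):

```python
# Sketch: zero-shot intent classification cast as NLI.
# A real implementation would score P(entailment | premise, hypothesis)
# with an NLI model; the keyword-overlap stub below only makes the
# pipeline runnable end to end for illustration.

def entailment_score(premise: str, hypothesis: str) -> float:
    """Placeholder for an NLI model's entailment probability."""
    p = set(premise.lower().split())
    h = set(hypothesis.lower().split())
    return len(p & h) / max(len(h), 1)

def zero_shot_intent(utterance: str, candidate_intents: list[str]) -> str:
    """Rank candidate intents by entailment of a hypothesis template."""
    scores = {
        intent: entailment_score(utterance, f"this text is about {intent}")
        for intent in candidate_intents
    }
    return max(scores, key=scores.get)

print(zero_shot_intent(
    "I want to check my account balance",
    ["balance inquiry", "card activation", "loan request"],
))  # → balance inquiry
```

Because the candidate intents enter only through the hypothesis template, new intents can be scored online without retraining, which is what enables the unseen-intent discovery stage of the pipeline.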
Authors: Daniele Comi, Dimitrios Christofidellis, Pier Francesco Piazza, Matteo Manica