CarExpert: Leveraging Large Language Models for In-Car Conversational Question Answering (2310.09536v1)

Published 14 Oct 2023 in cs.CL, cs.IR, and cs.LG

Abstract: LLMs have demonstrated remarkable performance at following natural language instructions without fine-tuning on domain-specific tasks and data. However, leveraging LLMs for domain-specific question answering suffers from severe limitations. Generated answers tend to hallucinate because of the training data cutoff (when using off-the-shelf models), complex user utterances, and incorrect retrieval (in retrieval-augmented generation). Furthermore, lacking awareness of the domain and the expected output, such LLMs may generate unexpected and unsafe answers that are not tailored to the target domain. In this paper, we propose CarExpert, an in-car retrieval-augmented conversational question-answering system that leverages LLMs for different tasks. Specifically, CarExpert employs LLMs to control the input, provide domain-specific documents to the extractive and generative answering components, and control the output to ensure safe and domain-specific answers. A comprehensive empirical evaluation shows that CarExpert outperforms state-of-the-art LLMs in generating natural, safe, and car-specific answers.
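
The abstract sketches a pipeline with four roles: an LLM-based input filter, a retriever that supplies domain-specific documents, extractive and generative answering components, and an output filter that enforces safe, domain-specific answers. Below is a minimal Python sketch of that control flow; every implementation detail here (the keyword heuristics, the tiny CAR_MANUAL corpus, all function names) is a hypothetical stand-in for illustration, not the paper's actual models.

```python
# Hypothetical sketch of a CarExpert-style pipeline as described in the
# abstract: input control -> retrieval -> extractive/generative answering
# -> output control. All components are toy stand-ins, not the paper's models.
from dataclasses import dataclass

# Tiny in-memory stand-in for a corpus of car-manual passages.
CAR_MANUAL = [
    "To activate cruise control, press the button on the steering wheel.",
    "Check tire pressure monthly; recommended values are on the door frame sticker.",
]

@dataclass
class Answer:
    text: str
    source: str  # retrieved passage the answer is grounded in ("" if none)

def input_control(utterance: str) -> bool:
    """Gate off-domain or unsafe questions. The paper uses an LLM; this is a keyword heuristic."""
    return any(w in utterance.lower() for w in ("car", "tire", "cruise", "engine", "brake"))

def retrieve(utterance: str, corpus: list[str]) -> list[str]:
    """Stand-in retriever: return passages sharing at least one word with the question."""
    words = set(utterance.lower().split())
    return [p for p in corpus if words & set(p.lower().split())]

def extractive_answer(passages: list[str]) -> Answer | None:
    """Quote a retrieved passage verbatim, so the answer stays grounded."""
    return Answer(passages[0], passages[0]) if passages else None

def generative_answer(passages: list[str]) -> Answer:
    """Stand-in for an LLM generator conditioned on the retrieved passages."""
    context = " ".join(passages)
    return Answer(f"Based on the manual: {context}", passages[0] if passages else "")

def output_control(answer: Answer) -> Answer:
    """Reject answers that are not grounded in a retrieved manual passage."""
    if not answer.source:
        return Answer("Sorry, I can only answer questions about this car.", "")
    return answer

def car_expert(utterance: str) -> str:
    if not input_control(utterance):
        return "Sorry, I can only answer questions about this car."
    passages = retrieve(utterance, CAR_MANUAL)
    answer = extractive_answer(passages) or generative_answer(passages)
    return output_control(answer).text

print(car_expert("How do I activate cruise control?"))
```

In this sketch the extractive path is tried first because quoting a retrieved passage verbatim cannot hallucinate, with the generative path as a fallback; the abstract does not specify how the paper orders or combines its two answering components.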

Authors (9)
  1. Md Rashad Al Hasan Rony (10 papers)
  2. Christian Suess (1 paper)
  3. Sinchana Ramakanth Bhat (3 papers)
  4. Viju Sudhi (3 papers)
  5. Julia Schneider (8 papers)
  6. Maximilian Vogel (2 papers)
  7. Roman Teucher (2 papers)
  8. Ken E. Friedl (3 papers)
  9. Soumya Sahoo (1 paper)
Citations (7)