Verif.ai: Towards an Open-Source Scientific Generative Question-Answering System with Referenced and Verifiable Answers (2402.18589v1)

Published 9 Feb 2024 in cs.IR, cs.AI, cs.CL, and cs.LG

Abstract: In this paper, we present the current progress of the project Verif.ai, an open-source scientific generative question-answering system with referenced and verified answers. The components of the system are (1) an information retrieval system combining semantic and lexical search techniques over scientific papers (PubMed), (2) a fine-tuned generative model (Mistral 7B) taking top answers and generating answers with references to the papers from which the claim was derived, and (3) a verification engine that cross-checks the generated claim against the abstract or paper from which the claim was derived, verifying whether there may have been any hallucinations in generating the claim. We reinforce the generative model by providing the abstract in context; in addition, an independent set of methods and models verifies the answer and checks for hallucinations. Therefore, we believe that by using our method, we can make scientists more productive, while building trust in the use of generative LLMs in scientific environments, where hallucinations and misinformation cannot be tolerated.
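
The abstract says the retrieval component combines semantic and lexical search but does not say how the two signals are merged. One common way to do this, shown here purely as an illustrative sketch and not as the Verif.ai implementation, is to min-max normalize each score set and take a weighted sum. The function names (`min_max_normalize`, `hybrid_rank`), the `lexical_weight` parameter, and the sample document IDs and scores are all hypothetical.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so lexical and semantic scores are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal; treat every document as equally relevant.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def hybrid_rank(
    lexical: Dict[str, float],
    semantic: Dict[str, float],
    lexical_weight: float = 0.5,  # assumed weighting, not taken from the paper
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Fuse two score sets by weighted sum and return the top_k documents."""
    lex = min_max_normalize(lexical)
    sem = min_max_normalize(semantic)
    combined = {}
    for doc_id in set(lex) | set(sem):
        # A document missing from one result list contributes 0 for that signal.
        combined[doc_id] = (
            lexical_weight * lex.get(doc_id, 0.0)
            + (1.0 - lexical_weight) * sem.get(doc_id, 0.0)
        )
    # Sort by descending score; break ties by document ID for determinism.
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


# Hypothetical scores: lexical values resemble BM25 scores,
# semantic values resemble cosine similarities.
lexical_scores = {"a": 10.0, "b": 5.0, "c": 0.0}
semantic_scores = {"a": 0.2, "b": 0.9, "d": 0.5}
ranking = hybrid_rank(lexical_scores, semantic_scores)
print([doc_id for doc_id, _ in ranking])
```

In this sketch the top-ranked documents would then be passed as context to the generative model, and each generated claim would be checked against the document it cites. Other fusion schemes, such as reciprocal rank fusion, would fit the same description equally well.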
