DaVinci at SemEval-2024 Task 9: Few-shot prompting GPT-3.5 for Unconventional Reasoning

Published 19 May 2024 in cs.CL and cs.AI (arXiv:2405.11559v1)

Abstract: While significant work has been done in NLP on vertical thinking, which relies primarily on logical reasoning, little attention has been paid to lateral thinking, which involves approaching problems from an unconventional perspective and defying existing conceptions and notions. To this end, SemEval-2024 introduces the BRAINTEASER task, comprising two question types, Sentence Puzzles and Word Puzzles, that defy conventional common-sense reasoning and constraints. In this paper, we tackle both question types using few-shot prompting on GPT-3.5 and gain insights into how the two types differ in nature. Our prompting strategy placed us 26th on the Sentence Puzzle leaderboard and 15th on the Word Puzzle task.
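The few-shot prompting setup the abstract describes can be sketched as follows. This is a minimal illustration of assembling a few-shot prompt for a BRAINTEASER-style multiple-choice puzzle, not the authors' actual prompt: the instruction wording, exemplar puzzles, and function names here are all assumptions for illustration.

```python
# Hypothetical sketch of few-shot prompt construction for a BRAINTEASER-style
# multiple-choice puzzle. The instruction text and exemplars are illustrative,
# not the prompt used in the paper.

def format_puzzle(question, choices):
    """Render one puzzle with lettered answer options."""
    letters = "ABCD"
    lines = [f"Puzzle: {question}"]
    lines += [f"{letters[i]}) {c}" for i, c in enumerate(choices)]
    return "\n".join(lines)

def build_fewshot_prompt(exemplars, question, choices):
    """Assemble solved exemplars followed by the unsolved target puzzle."""
    parts = ["Solve the brainteaser by picking the lettered choice. "
             "The answer may defy conventional common sense.\n"]
    for ex in exemplars:
        parts.append(format_puzzle(ex["question"], ex["choices"]))
        parts.append(f"Answer: {ex['answer']}\n")
    parts.append(format_puzzle(question, choices))
    parts.append("Answer:")  # the model is expected to complete with a letter
    return "\n".join(parts)

exemplar = {
    "question": "A man shaves several times a day, yet he still has a beard. How?",
    "choices": ["He is a barber", "He uses a blunt razor",
                "He shaves his arms", "None of the above"],
    "answer": "A",
}
prompt = build_fewshot_prompt(
    [exemplar],
    "What can you catch but not throw?",
    ["A ball", "A cold", "A fish", "None of the above"],
)
# The resulting prompt string would then be sent to GPT-3.5 via the
# chat completions API, and the model's completion parsed for the letter.
```

In a few-shot setup like this, the solved exemplars anchor the expected output format, so the model's completion can be parsed for a single letter rather than free-form text.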
