Large Language Models as Sous Chefs: Revising Recipes with GPT-3 (2306.13986v1)

Published 24 Jun 2023 in cs.CL

Abstract: With their remarkably improved text generation and prompting capabilities, large language models (LLMs) can adapt existing written information into forms that are easier to use and understand. In our work, we focus on recipes as an example of complex, diverse, and widely used instructions. We develop a prompt grounded in the original recipe and ingredients list that breaks recipes down into simpler steps. We apply this prompt to recipes from various world cuisines, and experiment with several LLMs, finding best results with GPT-3.5. We also contribute an Amazon Mechanical Turk task that is carefully designed to reduce fatigue while collecting human judgment of the quality of recipe revisions. We find that annotators usually prefer the revision over the original, demonstrating a promising application of LLMs in serving as digital sous chefs for recipes and beyond. We release our prompt, code, and MTurk template for public use.
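The authors release their actual prompt, code, and MTurk template; the sketch below is not that release, but a minimal illustration of the setup the abstract describes: prompting GPT-3.5 with a recipe and its ingredient list and asking for simpler steps. It assumes the OpenAI Python client (openai>=1.0) and the gpt-3.5-turbo model; the prompt wording, the revise_recipe helper, and the sample recipe are all hypothetical.

```python
# Hypothetical sketch of the recipe-revision setup described in the abstract.
# The prompt text below is illustrative, not the authors' released prompt.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def revise_recipe(title: str, ingredients: list[str], steps: list[str]) -> str:
    """Ask the model to break a recipe into simpler steps, grounding the
    request in the original recipe text and its ingredient list."""
    prompt = (
        f"Recipe: {title}\n"
        "Ingredients:\n" + "\n".join(f"- {i}" for i in ingredients) + "\n"
        "Original steps:\n"
        + "\n".join(f"{n}. {s}" for n, s in enumerate(steps, 1)) + "\n\n"
        "Rewrite these instructions as a numbered list of short, simple "
        "steps, using only the ingredients listed above."
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # the paper reports best results with GPT-3.5
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Example call on a (made-up) recipe with one long, compound step.
print(revise_recipe(
    "Garlic butter pasta",
    ["200 g spaghetti", "3 cloves garlic", "50 g butter", "salt"],
    ["Boil the spaghetti in salted water until al dente, meanwhile melting "
     "the butter and gently frying the minced garlic, then toss together."],
))
```

Grounding the prompt in the ingredient list, as the abstract describes, constrains the revision so the model simplifies the wording rather than inventing new ingredients or steps.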

Authors (4)
  1. Alyssa Hwang (10 papers)
  2. Bryan Li (17 papers)
  3. Zhaoyi Hou (3 papers)
  4. Dan Roth (222 papers)
Citations (1)