
Exploring New Frontiers in Agricultural NLP: Investigating the Potential of Large Language Models for Food Applications (2306.11892v1)

Published 20 Jun 2023 in cs.CL

Abstract: This paper explores new frontiers in agricultural natural language processing by investigating the effectiveness of using food-related text corpora for pretraining transformer-based LLMs. In particular, we focus on the task of semantic matching, which involves establishing mappings between food descriptions and nutrition data. To accomplish this, we fine-tune a pre-trained transformer-based LLM, AgriBERT, on this task, utilizing an external source of knowledge, such as the FoodOn ontology. To advance the field of agricultural NLP, we propose two new avenues of exploration: (1) utilizing GPT-based models as a baseline and (2) leveraging ChatGPT as an external source of knowledge. ChatGPT has been shown to be a strong baseline in many NLP tasks, and we believe it has the potential to improve our model's performance on semantic matching and to deepen its understanding of food-related concepts and relationships. Additionally, we experiment with other applications, such as cuisine prediction based on food ingredients, and expand the scope of our research to include other NLP tasks beyond semantic matching. Overall, this paper provides promising avenues for future research in this field, with potential implications for improving the performance of agricultural NLP applications.
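The semantic-matching task described above pairs a free-text food description with candidate entries from a nutrition database. Below is a minimal sketch of that setup as a BERT-style cross-encoder using Hugging Face transformers. The model name, label convention, and example strings are illustrative assumptions, not the authors' released AgriBERT pipeline, and the classification head would need fine-tuning on labeled (description, entry) pairs before its scores are meaningful.

```python
# Minimal cross-encoder sketch for food-description / nutrition-entry matching.
# "bert-base-uncased" is a stand-in for the paper's AgriBERT checkpoint.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # assumption: any BERT-style encoder works here

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
# Two labels: 0 = no match, 1 = match (label convention is our assumption).
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

def match_score(food_description: str, nutrition_entry: str) -> float:
    """Probability that the nutrition entry matches the food description."""
    # Encode the pair as a single sequence: [CLS] description [SEP] entry [SEP]
    inputs = tokenizer(food_description, nutrition_entry,
                       truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()

# Hypothetical usage: rank candidate database entries for one description.
description = "grilled chicken breast, no skin"
candidates = [
    "Chicken, broilers or fryers, breast, meat only, cooked",
    "Chicken, skin only, raw",
]
ranked = sorted(candidates, key=lambda c: match_score(description, c), reverse=True)
print(ranked[0])
```

A cross-encoder scores every (description, candidate) pair jointly, which is accurate but costly at scale; a bi-encoder that embeds descriptions and entries separately is the usual trade-off when the candidate set is large.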
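The abstract also proposes ChatGPT as an external source of knowledge. One plausible realization, purely our assumption since the prompting scheme is not described here, is to fetch a short gloss of a food concept and concatenate it with the raw input before matching, in the same spirit as the FoodOn-based knowledge infusion:

```python
# Hedged illustration (our assumption, not the paper's published pipeline) of
# using a chat model as an external knowledge source for food concepts.
from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def food_knowledge(term: str) -> str:
    """Fetch a one-sentence definition of a food term from a chat model."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # illustrative model choice
        messages=[{
            "role": "user",
            "content": f"In one sentence, define the food concept '{term}'.",
        }],
    )
    return response.choices[0].message.content.strip()

# The gloss can then be prepended to the food description before tokenization,
# so the matcher sees both the surface text and the retrieved definition:
# augmented = food_knowledge("chicken breast") + " " + description
```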
