AiGen-FoodReview: A Multimodal Dataset of Machine-Generated Restaurant Reviews and Images on Social Media (2401.08825v1)
Abstract: Online reviews, a form of user-generated content (UGC), significantly influence consumer decision-making. However, fake content, whether written by humans or generated by machines, undermines UGC's reliability. Recent advances in large language models (LLMs) may make it possible to fabricate fake content that is indistinguishable from authentic content at a much lower cost. Leveraging OpenAI's GPT-4-Turbo and DALL-E-2 models, we craft AiGen-FoodReview, a multimodal dataset of 20,144 restaurant review-image pairs divided into authentic and machine-generated subsets. We explore unimodal and multimodal detection models, achieving 99.80% multimodal accuracy with FLAVA. We use attributes from readability and photographic theories to score reviews and images, respectively, and demonstrate their utility as hand-crafted features in scalable and interpretable detection models with comparable performance. The paper contributes by open-sourcing the dataset and releasing fake review detectors, recommending the dataset's use in unimodal and multimodal fake review detection tasks, and evaluating linguistic and visual features in synthetic versus authentic data.
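To illustrate what readability-based hand-crafted features for review text might look like, here is a minimal Python sketch that computes two standard readability scores, Flesch Reading Ease and the Automated Readability Index. The abstract does not say which readability attributes the paper scores, so the choice of these two formulas is an assumption, as are the function names, the regex-based sentence and word splitting, and the vowel-group syllable heuristic. It is not the authors' feature pipeline.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): one syllable per run of vowels, minimum 1."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def readability_features(text: str) -> dict:
    """Compute simple readability scores usable as hand-crafted features."""
    # Naive sentence split on terminal punctuation; empty fragments are dropped.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are runs of letters, digits, and apostrophes.
    words = re.findall(r"[A-Za-z0-9']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    n_sentences = len(sentences)
    n_words = len(words)
    # Count only letters and digits as characters (apostrophes excluded).
    n_chars = sum(ch.isalnum() for w in words for ch in w)
    n_syllables = sum(count_syllables(w) for w in words)

    words_per_sentence = n_words / n_sentences
    # Flesch Reading Ease: higher values indicate easier text.
    flesch = 206.835 - 1.015 * words_per_sentence - 84.6 * (n_syllables / n_words)
    # Automated Readability Index: approximates a grade level.
    ari = 4.71 * (n_chars / n_words) + 0.5 * words_per_sentence - 21.43

    return {
        "n_sentences": n_sentences,
        "n_words": n_words,
        "flesch_reading_ease": flesch,
        "ari": ari,
    }


if __name__ == "__main__":
    review = "The pasta was fresh and the service was quick. We will come back soon!"
    print(readability_features(review))
```

Scores like these could be concatenated into a feature vector and passed to an interpretable classifier. How the paper actually combines its features is not specified in the abstract.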