Generative AI for Synthetic Data Generation: Methods, Challenges and the Future (2403.04190v1)

Published 7 Mar 2024 in cs.LG, cs.AI, and cs.CL

Abstract: The recent surge in research focused on generating synthetic data from LLMs, especially for scenarios with limited data availability, marks a notable shift in Generative AI. Their ability to perform comparably to real-world data positions this approach as a compelling solution to low-resource challenges. This paper delves into advanced technologies that leverage these gigantic LLMs for the generation of task-specific training data. We outline methodologies, evaluation techniques, and practical applications, discuss the current limitations, and suggest potential pathways for future research.

Generative AI for Synthetic Data Generation: A Professional Overview

The paper "Generative AI for Synthetic Data Generation: Methods, Challenges and the Future" by Xu Guo and Yiqiang Chen explores the domain of using Generative AI, specifically LLMs, to create synthetic data. This research is positioned at the intersection of data generation and AI, highlighting methodologies, challenges, and potential applications that leverage LLMs for improved synthetic data generation.

Methodologies

The paper details advances in generating synthetic data with LLMs, emphasizing several methodological threads. Central among them is prompt engineering, in which prompts are refined to guide LLMs more effectively toward producing task-specific data. Techniques such as attribute-controlled prompts and verbalizers are employed to improve the relevance and diversity of the generated data.
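
As a minimal sketch of attribute-controlled prompting with a verbalizer, the code below composes prompts from sampled attribute values and hands them to a caller-supplied `llm_generate` function that wraps whatever LLM API is in use. The attribute values, verbalizer mapping, and function names are illustrative assumptions, not artifacts of the paper.

```python
import random

# Illustrative attribute values and label verbalizer; a real setup would derive
# these from the target task rather than hard-coding them.
ATTRIBUTES = {
    "topic": ["electronics", "home appliances", "books"],
    "length": ["one sentence", "two to three sentences"],
}
VERBALIZER = {"positive": "glowing", "negative": "disappointed"}


def build_prompt(label: str) -> str:
    """Compose an attribute-controlled prompt for one synthetic example."""
    topic = random.choice(ATTRIBUTES["topic"])
    length = random.choice(ATTRIBUTES["length"])
    tone = VERBALIZER[label]
    return (
        f"Write a {tone} customer review about {topic}, "
        f"{length} long. Return only the review text."
    )


def generate_dataset(llm_generate, labels=("positive", "negative"), n_per_label=100):
    """llm_generate is any callable that sends a prompt to an LLM and returns text."""
    data = []
    for label in labels:
        for _ in range(n_per_label):
            text = llm_generate(build_prompt(label))  # e.g., a thin wrapper over an API
            data.append({"text": text, "label": label})
    return data
```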

Another crucial aspect is parameter-efficient task adaptation. Methods like FewGen apply parameter-efficient tuning strategies to align a general-purpose LLM with a specific task, using few-shot data to update only a small subset of model parameters rather than the entire network. This allows effective task adaptation while keeping computational costs low.
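
The general pattern can be sketched in a few lines of PyTorch: freeze every parameter of the pretrained backbone and learn only a short sequence of soft-prompt embeddings. The `embed` and `forward_embeds` methods on the backbone are placeholder names for illustration; this is a generic parameter-efficient wrapper, not FewGen's exact procedure.

```python
import torch
import torch.nn as nn


class SoftPromptWrapper(nn.Module):
    """Freeze a pretrained backbone and learn only a short soft prompt.

    `backbone.embed` and `backbone.forward_embeds` are placeholder method names;
    real model libraries expose different entry points.
    """

    def __init__(self, backbone: nn.Module, embed_dim: int, prompt_len: int = 20):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad = False  # the general-purpose LLM stays untouched
        # The only trainable parameters: a short sequence of "virtual token" embeddings.
        self.soft_prompt = nn.Parameter(torch.randn(prompt_len, embed_dim) * 0.02)

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        token_embeds = self.backbone.embed(input_ids)            # (batch, seq, dim)
        prompt = self.soft_prompt.unsqueeze(0).expand(input_ids.size(0), -1, -1)
        full_embeds = torch.cat([prompt, token_embeds], dim=1)   # prepend the prompt
        return self.backbone.forward_embeds(full_embeds)


# During few-shot adaptation, only the soft prompt receives gradient updates:
# optimizer = torch.optim.AdamW(
#     [p for p in model.parameters() if p.requires_grad], lr=1e-3)
```

Because only the prompt embeddings are trained, the same frozen backbone can be reused across many tasks.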

The paper also discusses how to ensure the quality of synthetic data through metrics for diversity, correctness, and naturalness. Approaches such as quality-estimation modules applied during generation illustrate how high-fidelity synthetic outputs can be prioritized.
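
As an illustration of what a quality-estimation step might look like, the sketch below filters synthetic samples using a caller-supplied correctness scorer (for example, a small classifier trained on seed data) together with a simple distinct-n-gram proxy for repetitiveness. The interface and thresholds are assumptions for illustration, not metrics taken from a specific method in the survey.

```python
def distinct_ngrams(text: str, n: int = 2) -> float:
    """Proxy for repetitiveness: fraction of unique n-grams within one example."""
    tokens = text.lower().split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return len(set(ngrams)) / max(len(ngrams), 1)


def filter_synthetic(samples, correctness_fn, min_conf=0.8, min_diversity=0.5):
    """Keep samples that the quality-estimation step deems correct and non-repetitive.

    `samples` is a list of {"text": ..., "label": ...} records; `correctness_fn`
    is any scorer (e.g., a small classifier trained on seed data) returning the
    probability that the text actually matches its intended label.
    """
    kept = []
    for s in samples:
        if (correctness_fn(s["text"], s["label"]) >= min_conf
                and distinct_ngrams(s["text"]) >= min_diversity):
            kept.append(s)
    return kept
```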

Finally, the paper considers strategies for training downstream models on synthetic data, including regularization techniques and related methodological choices that mitigate the noise and biases inherent in generated training sets.
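
One commonly used regularizer for noisy or imperfect labels is label smoothing, which spreads a small amount of probability mass off the hard target so the model does not over-commit to mislabeled synthetic examples. The PyTorch training step below is a generic sketch under that assumption, not the paper's prescribed procedure.

```python
import torch.nn as nn

# Soft targets instead of hard one-hot labels: each class keeps a little
# probability mass, so the model does not over-commit to mislabeled examples.
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)


def train_step(model, optimizer, batch):
    """One gradient step on a batch of synthetic (inputs, labels)."""
    inputs, labels = batch
    optimizer.zero_grad()
    loss = criterion(model(inputs), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```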

Applications and Implications

The paper outlines several applications of synthetic data generation, notably for low-resource and long-tail problems where real data is sparse or unevenly distributed. In these scenarios, synthetic data supplies training sets that would otherwise be unavailable, supporting models that generalize better and making task-specific AI more accessible.

In practical deployment contexts, synthetic data enables the training of lightweight models suited to environments where computational resources are constrained, which allows faster inference and easier integration into real-world applications.
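
As a hypothetical illustration of how lightweight such a downstream model can be, the scikit-learn pipeline below fits a TF-IDF plus logistic-regression classifier on the generated examples; none of these component choices come from the paper.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


def train_lightweight_classifier(synthetic_data):
    """Fit a small, CPU-friendly model on LLM-generated {"text", "label"} records."""
    texts = [d["text"] for d in synthetic_data]
    labels = [d["label"] for d in synthetic_data]
    model = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2)),
        LogisticRegression(max_iter=1000),
    )
    model.fit(texts, labels)
    return model
```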

The use cases extend to specialized domains such as medicine, where data privacy concerns limit the availability of real data. Synthetic data facilitates meaningful advancements in medical AI tasks, offering enhanced training opportunities without compromising confidentiality.

Challenges and Future Directions

Despite the potential benefits, the paper rightly acknowledges ongoing challenges associated with synthetic data generation. Ensuring the quality and diversity of synthetic data remains an open problem, particularly when addressing hallucinations and inaccuracies inherent in LLM outputs. Furthermore, the ethical and privacy implications of synthetic data utilization are scrutinized, with calls for robust policy and technical frameworks to safeguard individual rights and data integrity.

These implications point to the need for further research on aligning synthetic data with real-world requirements. The trajectory for future work involves developing more sophisticated generation techniques, mitigating bias, and fostering inclusive AI development strategies.

Conclusion

The paper by Guo and Chen provides a comprehensive examination of the state of synthetic data generation using LLMs, presenting both the potential and the hurdles that define this evolving field. By outlining current methodologies and envisioning future directions, the authors contribute significantly to advancing understanding and fostering innovation in AI-driven synthetic data generation. This work underscores the necessity of collaborative efforts in bridging the gap between technological possibilities and practical implementations, ensuring ethical, efficient, and inclusive progress in AI research.
