
Large Language Models for Forecasting and Anomaly Detection: A Systematic Literature Review (2402.10350v1)

Published 15 Feb 2024 in cs.LG and cs.AI

Abstract: This systematic literature review comprehensively examines the application of LLMs in forecasting and anomaly detection, highlighting the current state of research, inherent challenges, and prospective future directions. LLMs have demonstrated significant potential in parsing and analyzing extensive datasets to identify patterns, predict future events, and detect anomalous behavior across various domains. However, this review identifies several critical challenges that impede their broader adoption and effectiveness, including the reliance on vast historical datasets, issues with generalizability across different contexts, the phenomenon of model hallucinations, limitations within the models' knowledge boundaries, and the substantial computational resources required. Through detailed analysis, this review discusses potential solutions and strategies to overcome these obstacles, such as integrating multimodal data, advancements in learning methodologies, and emphasizing model explainability and computational efficiency. Moreover, this review outlines critical trends that are likely to shape the evolution of LLMs in these fields, including the push toward real-time processing, the importance of sustainable modeling practices, and the value of interdisciplinary collaboration. In conclusion, this review underscores the transformative impact LLMs could have on forecasting and anomaly detection while emphasizing the need for continuous innovation, ethical considerations, and practical solutions to realize their full potential.

Leveraging LLMs for Forecasting and Anomaly Detection: A Comprehensive Review

LLMs have increasingly become central to advancements in forecasting and anomaly detection across various domains. Initially honed on a vast array of natural language processing tasks, these models are now being applied to parse extensive datasets, predict future events, and pinpoint anomalies with significant accuracy. This systematic literature review explores the current state of LLM applications in these areas, shedding light on methodologies, inherent challenges, and the promising horizon that lies ahead.

Current Methodologies and Applications

LLMs offer a robust framework for understanding and generating predictions based on historical data. In areas such as time series forecasting, event sequence prediction, traffic flow forecasting, and clinical prediction in healthcare, LLMs have demonstrated strong performance. These applications leverage LLMs' ability to process and analyze massive amounts of data, identifying patterns and deviations that might elude traditional analysis methods.
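As a concrete illustration of the prompt-based approach common to this line of work (e.g., PromptCast- or LLMTime-style serialization), the sketch below shows how a numeric series might be rendered as text for an LLM, how a completion could be parsed back into numbers, and how those forecasts could screen new observations for anomalies. The function names and the fixed-tolerance anomaly rule are illustrative assumptions, not an implementation from any reviewed paper; a real pipeline would send `prompt` to an actual model rather than a stubbed reply.

```python
def serialize(series, sep=", "):
    # Render the series as plain text with fixed precision,
    # as prompt-based forecasters typically do.
    return sep.join(f"{x:.1f}" for x in series)

def parse_completion(text, sep=","):
    # Parse numbers out of a model completion, stopping at the
    # first token that is not numeric.
    values = []
    for token in text.split(sep):
        try:
            values.append(float(token.strip()))
        except ValueError:
            break
    return values

def flag_anomalies(observed, forecast, tol):
    # Flag points where the observation deviates from the forecast
    # by more than a fixed tolerance (an illustrative rule only).
    return [abs(o - f) > tol for o, f in zip(observed, forecast)]

prompt = serialize([10.0, 10.2, 10.1, 10.3])
# A real system would send `prompt` to an LLM; here the reply is stubbed.
completion = "10.4, 10.5, and so on"
forecast = parse_completion(completion)
print(prompt)     # "10.0, 10.2, 10.1, 10.3"
print(forecast)   # [10.4, 10.5]
print(flag_anomalies([10.4, 18.0], forecast, tol=2.0))  # [False, True]
```

The serialization step matters in practice: zero-shot numeric forecasting with LLMs is sensitive to digit formatting and separators, which is why the surveyed methods devote attention to prompt design.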

Challenges Facing LLM Adoption

Despite their potential, the deployment of LLMs in forecasting and anomaly detection faces several hurdles. A heavy reliance on extensive historical datasets raises challenges of data availability, quality, and inherent bias. Moreover, ensuring the generalizability of these models across diverse contexts remains a formidable task. Issues such as model hallucinations, where LLMs generate plausible but incorrect or misleading information, along with concerns over the robustness and computational efficiency of these models, further complicate their widespread adoption.

The Road Ahead: Future Directions

Emerging trends promise to address these challenges, broadening the scope and enhancing the performance of LLMs in forecasting and anomaly detection. Notably, the integration of multimodal data sources and advancements in transfer and meta-learning are poised to improve model adaptability and learning efficiency. Emphasis on model explainability and the push toward real-time processing underscore the growing need for LLMs that are not only accurate but also transparent and capable of operating in dynamic environments. Furthermore, sustainable modeling practices reflect an increasing awareness of the environmental and ethical considerations surrounding LLM deployment.

Conclusion

The application of LLMs in forecasting and anomaly detection offers a glimpse into a future where predictive analytics is more sophisticated, accurate, and impactful. While challenges remain, the pathways to overcoming these obstacles are becoming increasingly clear, thanks to ongoing research and technological innovations. As we move forward, it is imperative that the development and deployment of LLMs continue to be guided by principles of ethical consideration, ensuring that these advances benefit society as a whole.

This comprehensive review underscores the transformative potential of LLMs in forecasting and anomaly detection, marking a significant step toward harnessing the full capabilities of advanced computational models to navigate the complexities of the modern world.

  184. Robust anomaly detection for multivariate time series through stochastic recurrent neural network. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD ’19, pages 2828–2837, New York, NY, USA, 2019. Association for Computing Machinery.
  185. Jinghua Sheng. An augmentable domain-specific models for financial analysis. In 2023 16th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), pages 1–4, 2023.
  186. Symlm: Predicting function names in stripped binaries via context-sensitive execution-aware code embeddings. In Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, pages 1631–1645, 2022.
  187. Understanding iot security from a market-scale perspective. In Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, pages 1615–1629, 2022.
  188. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology, 143(1):29–36, 1982.
  189. LLM Multimodal Traffic Accident Forecasting. Sensors, 23(22):9225, January 2023.
  190. Transformer hawkes process. In International conference on machine learning, pages 11692–11702. PMLR, 2020.
  191. Self-attentive hawkes process. In International conference on machine learning, pages 11183–11193. PMLR, 2020.
  192. Transformer embeddings of irregularly spaced events and their participants. In Proceedings of the Tenth International Conference on Learning Representations (ICLR), 2022.
  193. Prompt-augmented temporal point process for streaming event sequence. In Thirty-Seventh Conference on Neural Information Processing Systems, 2023.
  194. Using llm for improving key event discovery: Temporal-guided news stream clustering with event summaries. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 4162–4173, 2023.
  195. Learning human driving behaviors with sequential causal imitation learning. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 36, pages 4583–4592, 2022.
  196. Cvlight: Decentralized learning for adaptive traffic signal control with connected vehicles. Transportation research part C: emerging technologies, 141:103728, 2022.
  197. Adam Szirmai. The dynamics of socio-economic development: an introduction. Cambridge University Press, 2005.
  198. Dual-graph learning convolutional networks for interpretable alzheimer’s disease diagnosis. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 406–415. Springer, 2022.
  199. Graph convolutional network with sample and feature weights for alzheimer’s disease diagnosis. Information Processing & Management, 59(4):102952, 2022.
  200. Health system-scale language models are all-purpose prediction engines. Nature, 619(7969):357–362, July 2023.
  201. Log Parsing: How Far Can ChatGPT Go? In 2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE, 2023.
  202. Anomaly Detection Between Judicial Text-Based Documents. In 2020 IEEE 14th International Conference on Application of Information and Communication Technologies (AICT). IEEE, 2020.
  203. ADARMA auto-detection and auto-remediation of microservice anomalies by leveraging large language models. In Proceedings of the 33rd Annual International Conference on Computer Science and Software Engineering, CASCON ’23, pages 200–205, USA, 2023. IBM Corp.
  204. Generative AI for self-healing systems. In 2023 18th International Joint Symposium on Artificial Intelligence and Natural Language Processing (iSAI-NLP), pages 1–6, 2023.
  205. Automated data validation: An industrial experience report. Journal of Systems and Software, 197:111573, 2023.
  206. Making existing software quantum safe: A case study on ibm db2. Information and Software Technology, 161:107249, 2023.
  207. Migrating gis big data computing from hadoop to spark: an exemplary study using twitter. In 2016 IEEE 9th International Conference on Cloud Computing (CLOUD), pages 351–358. IEEE, 2016.
  208. Modeling human trust and reliance in ai-assisted decision making: a markovian approach. In Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence. AAAI Press, 2023.
  209. Variational autoencoder for anti-cancer drug response prediction, 2021.
  210. Deepgi: An automated approach for gastrointestinal tract segmentation in mri scans. arXiv preprint arXiv:2401.15354, 2024.
  211. Junhong Lin. Deep-learning Enabled Accurate Bruch’s Membrane Segmentation in Ultrahigh-Resolution Spectral Domain and Ultrahigh-Speed Swept Source Optical Coherence Tomography. PhD thesis, Massachusetts Institute of Technology, 2022.
  212. High-speed, long-range swept-source optical coherence tomography for the anterior segment of the eye. Investigative Ophthalmology & Visual Science, 62(11):75–75, 2021.
  213. High speed, long range, deep penetration swept source oct for structural and angiographic imaging of the anterior eye. Scientific reports, 12(1):992, 2022.
  214. Ultrahigh resolution oct markers of normal aging and early age-related macular degeneration. Ophthalmology Science, 3(3):100277, 2023.
  215. Prior: Prototype representation joint learning from medical images and reports. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023.
  216. Unveiling patterns: A study on semi-supervised classification of strip surface defects. IEEE Access, 11:119933–119946, 2023.
  217. Edgegym: A reinforcement learning environment for constraint-aware nfv resource allocation. In 2023 IEEE 2nd International Conference on AI in Cybersecurity (ICAIC), pages 1–7. IEEE, 2023.
  218. Indoor localization based on weighted surfacing from crowdsourced samples. Sensors, 18(9):2990, 2018.
  219. Learning transferable visual models from natural language supervision. In International conference on machine learning, pages 8748–8763. PMLR, 2021.
  220. Segment anything. arXiv:2304.02643, 2023.
  221. Self-supervised random mask attention gan in tackling pose-invariant face recognition. Available at SSRN 4583223, 2023.
  222. Dog image generation using deep convolutional generative adversarial networks. In 2020 5th international conference on universal village (UV), pages 1–6. IEEE, 2020.
  223. Approximate mean value analysis for multi-core systems. In 2015 International Symposium on Performance Evaluation of Computer and Telecommunication Systems (SPECTS), pages 1–8. IEEE, 2015.
  224. Hengyi Zang. Precision calibration of industrial 3d scanners: An ai-enhanced approach for improved measurement accuracy. Global Academic Frontiers, 2(1):27–37, 2024.
  225. Low-cost sensor-enabled freehand 3d ultrasound. In 2019 IEEE International Ultrasonics Symposium (IUS), pages 498–501. IEEE, 2019.
  226. Particle filter slam for vehicle localization. arXiv preprint arXiv:2402.07429, 2024.
  227. Are two heads better than one in ai-assisted decision making? comparing the behavior and performance of groups and individuals in human-ai collaborative recidivism risk assessment. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, pages 1–18, 2023.
  228. Smartfix: Leveraging machine learning for proactive equipment maintenance in industry 4.0. In The 2nd International scientific and practical conference “Innovations in education: prospects and challenges of today”(January 16-19, 2024) Sofia, Bulgaria. International Science Group. 2024. 389 p., page 313, 2024.
  229. Application of machine learning in financial risk early warning and regional prevention and control: A systematic analysis based on shap. WORLD TRENDS, REALITIES AND ACCOMPANYING PROBLEMS OF DEVELOPMENT, page 331, 2023.
  230. Illumicore: Optimization modeling and implementation for efficient vnf placement. In 2021 International Conference on Software, Telecommunications and Computer Networks (SoftCOM), pages 1–7. IEEE, 2021.
  231. Strategic application of ai intelligent algorithm in network threat detection and defense. Journal of Theory and Practice of Engineering Science, 4(01):49–57, 2024.
  232. M-gcn: Multi-scale graph convolutional network for 3d point cloud classification. In 2023 IEEE International Conference on Multimedia and Expo (ICME), pages 924–929. IEEE, 2023.
  233. Effects on performance of analytical tools for visually demanding tasks through direct and indirect touch interaction in an immersive visualization. In 2014 International Conference on Virtual Reality and Visualization, pages 186–193. IEEE, 2014.
  234. Strategic adversarial attacks in ai-assisted decision making to reduce human trust and reliance. In Edith Elkind, editor, Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pages 3020–3028. International Joint Conferences on Artificial Intelligence Organization, 8 2023. Main Track.
  235. Evaluating the social impact of ai in manufacturing: A methodological framework for ethical production. Academic Journal of Sociology and Management, 2(1):21–25, 2024.
  236. What makes a turing award winner? In Social, Cultural, and Behavioral Modeling: 14th International Conference, SBP-BRiMS 2021, Virtual Event, July 6–9, 2021, Proceedings 14, pages 310–320. Springer, 2021.
  237. A survey of large language models, 2023.
  238. Harnessing the power of LLMs in practice: A survey on ChatGPT and beyond, 2023.
  239. A survey on evaluation of large language models. ACM Transactions on Intelligent Systems and Technology, January 2024.
  240. Language model behavior: A comprehensive survey, 2023.
  241. Jie Huang and Kevin Chen-Chuan Chang. Towards reasoning in large language models: A survey. In Anna Rogers, Jordan Boyd-Graber, and Naoaki Okazaki, editors, Findings of the Association for Computational Linguistics: ACL 2023, pages 1049–1065, Toronto, Canada, July 2023. Association for Computational Linguistics.
Authors (10)
  1. Jing Su
  2. Chufeng Jiang
  3. Xin Jin
  4. Yuxin Qiao
  5. Tingsong Xiao
  6. Hongda Ma
  7. Rong Wei
  8. Zhi Jing
  9. Jiajun Xu
  10. Junhong Lin