
Near-real-time Earthquake-induced Fatality Estimation using Crowdsourced Data and Large-Language Models

Published 4 Dec 2023 in cs.CL, cs.AI, cs.CY, and cs.LG | (2312.03755v1)

Abstract: When a damaging earthquake occurs, immediate information about casualties is critical for time-sensitive decision-making by emergency response and aid agencies in the first hours and days. Systems such as Prompt Assessment of Global Earthquakes for Response (PAGER) by the U.S. Geological Survey (USGS) were developed to provide a forecast within about 30 minutes of any significant earthquake globally. Traditional systems for estimating human loss in disasters often depend on manually collected early casualty reports from global media, a process that is labor-intensive and slow. Recently, some systems have employed keyword matching and topic modeling to extract relevant information from social media. However, these methods struggle with the complex semantics of multilingual texts and with the challenge of interpreting ever-changing, often conflicting reports of death and injury numbers from various unverified sources on social media platforms. In this work, we introduce an end-to-end framework to significantly improve the timeliness and accuracy of global earthquake-induced human loss forecasting using multilingual, crowdsourced social media. Our framework integrates (1) a hierarchical casualty extraction model built upon LLMs, prompt design, and few-shot learning to retrieve quantitative human loss claims from social media, (2) a physical constraint-aware, dynamic-truth discovery model that discovers the truthful human loss from massive noisy and potentially conflicting human loss claims, and (3) a Bayesian updating loss projection model that dynamically updates the final loss estimation using discovered truths. We test the framework in real time on a series of global earthquake events in 2021 and 2022 and show that our framework streamlines casualty data retrieval, achieving speed and accuracy comparable to the manual methods used by USGS.
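To make component (2) more concrete, here is a minimal sketch of generic iterative truth discovery over conflicting casualty claims. It is not the authors' model: the function name `discover_truth`, the weighted-average update, the relative-error reweighting, and the specific physical constraint (a cumulative death toll never decreases over time) are all assumptions chosen for illustration.

```python
from collections import defaultdict


def discover_truth(claims, n_iter=10, eps=1e-6):
    """Estimate a death toll per time window from conflicting claims.

    claims: list of (source, time_window, reported_deaths) tuples.
    Returns ({time_window: estimate}, {source: normalized_weight}).
    Hypothetical illustration only, not the method from the paper.
    """
    sources = sorted({s for s, _, _ in claims})
    windows = sorted({t for _, t, _ in claims})
    weights = {s: 1.0 for s in sources}  # start by trusting all sources equally
    truth = {}

    for _ in range(n_iter):
        # Step 1: weighted average of the claims made in each time window.
        for t in windows:
            num = sum(weights[s] * v for s, w, v in claims if w == t)
            den = sum(weights[s] for s, w, _ in claims if w == t)
            truth[t] = num / den

        # Step 2 (assumed physical constraint): a cumulative death toll
        # cannot go down, so enforce a running maximum over time.
        running = 0.0
        for t in windows:
            running = max(running, truth[t])
            truth[t] = running

        # Step 3: sources whose claims sit far from the current estimate
        # receive lower weight in the next round.
        errors = defaultdict(list)
        for s, t, v in claims:
            errors[s].append(abs(v - truth[t]) / (truth[t] + 1.0))
        weights = {s: 1.0 / (sum(e) / len(e) + eps) for s, e in errors.items()}
        total = sum(weights.values())
        weights = {s: w / total for s, w in weights.items()}

    return truth, weights


if __name__ == "__main__":
    # Made-up claims: sources "a" and "b" roughly agree, "c" is erratic.
    example = [
        ("a", 0, 10), ("b", 0, 12), ("c", 0, 100),
        ("a", 1, 20), ("b", 1, 22), ("c", 1, 5),
    ]
    estimates, source_weights = discover_truth(example)
    print(estimates)
    print(source_weights)
```

A weighted mean is sensitive to extreme outliers; a weighted median would be a more robust choice for the first step, at the cost of slightly more code.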
