CReMa: Crisis Response through Computational Identification and Matching of Cross-Lingual Requests and Offers Shared on Social Media (2405.11897v2)
Abstract: During times of crisis, social media platforms play a crucial role in facilitating communication and coordinating resources. Amid chaos and uncertainty, communities rely on these platforms to share urgent pleas for help, extend support, and organize relief efforts. However, the volume of conversations during such periods can escalate to unprecedented levels, necessitating the automated identification and matching of requests and offers to streamline relief operations. Moreover, few studies have addressed multi-lingual settings, even though any geographical area can have a linguistically diverse population. We therefore propose CReMa (Crisis Response Matcher), a systematic approach that integrates textual, temporal, and spatial features to identify and match requests and offers shared on social media during emergencies. Our approach uses a crisis-specific pre-trained model and a multi-lingual embedding space. We emulate human decision-making to compute temporal and spatial features and non-linearly weight the textual features. Our experimental results are promising, outperforming strong baselines. We also introduce a novel multi-lingual dataset simulating help-seeking and assistance-offering posts in 16 languages and conduct comprehensive cross-lingual experiments. Furthermore, we analyze a million-scale geotagged global dataset to understand patterns in seeking help and offering assistance on social media. Overall, these contributions advance the field of crisis informatics and provide benchmarks for future research in the area.
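The abstract does not specify the exact scoring formulas, but the idea of combining textual similarity with temporal and spatial proximity, and weighting the textual signal non-linearly, can be sketched roughly as follows. All parameter names (`alpha`, `tau_hours`, `rho_km`) and the particular decay forms here are illustrative assumptions, not the paper's actual method; in practice the textual similarity would come from a multi-lingual sentence encoder rather than the toy cosine shown.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def match_score(text_sim, hours_apart, km_apart,
                alpha=2.0, tau_hours=48.0, rho_km=50.0):
    """Hypothetical combined request-offer score.

    - text_sim ** alpha: a non-linear (power-law) weighting of textual
      similarity, so weak matches are suppressed more than strong ones.
    - exp(-hours_apart / tau_hours): temporal decay (stale posts score lower).
    - exp(-km_apart / rho_km): spatial decay (distant posts score lower).
    All three choices are assumptions for illustration.
    """
    sim = max(text_sim, 0.0)
    return (sim ** alpha) * math.exp(-hours_apart / tau_hours) * math.exp(-km_apart / rho_km)

def rank_offers(request, offers):
    """Rank candidate offers for one request; each item is a dict with
    'emb' (embedding), 'hours' (post time in hours), 'lat', 'lon'."""
    scored = []
    for o in offers:
        s = match_score(
            cosine(request["emb"], o["emb"]),
            abs(request["hours"] - o["hours"]),
            haversine_km(request["lat"], request["lon"], o["lat"], o["lon"]),
        )
        scored.append((s, o))
    return sorted(scored, key=lambda t: t[0], reverse=True)
```

A perfect textual match at the same time and place scores 1.0, and the score falls off smoothly as any of the three gaps grows, which is one simple way to emulate how a human coordinator discounts far-away or outdated offers.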
Authors: Rabindra Lamsal, Maria Rodriguez Read, Shanika Karunasekera, Muhammad Imran