
CMNER: A Chinese Multimodal NER Dataset based on Social Media (2402.13693v2)

Published 21 Feb 2024 in cs.CL

Abstract: Multimodal Named Entity Recognition (MNER) is a pivotal task designed to extract named entities from text with the support of pertinent images. Nonetheless, a notable paucity of data for Chinese MNER has considerably impeded the progress of this natural language processing task within the Chinese domain. Consequently, in this study, we compile a Chinese Multimodal NER dataset (CMNER) utilizing data sourced from Weibo, China's largest social media platform. Our dataset encompasses 5,000 Weibo posts paired with 18,326 corresponding images. The entities are classified into four distinct categories: person, location, organization, and miscellaneous. We perform baseline experiments on CMNER, and the outcomes underscore the effectiveness of incorporating images for NER. Furthermore, we conduct cross-lingual experiments on the publicly available English MNER dataset (Twitter2015), and the results substantiate our hypothesis that Chinese and English multimodal NER data can mutually enhance the performance of the NER model.

Authors
  1. Yuanze Ji
  2. Bobo Li
  3. Jun Zhou
  4. Fei Li
  5. Chong Teng
  6. Donghong Ji
