
Ultra Low-Cost Two-Stage Multimodal System for Non-Normative Behavior Detection (2403.16151v1)

Published 24 Mar 2024 in cs.MA and cs.IR

Abstract: The online community has increasingly been inundated by a toxic wave of harmful comments. In response to this growing challenge, we introduce a two-stage, ultra-low-cost multimodal harmful behavior detection method designed to identify harmful comments and images with high precision and recall. In the first stage, we use the CLIP-ViT model to transform tweets and images into embeddings, capturing the interplay of semantic meaning and subtle contextual clues within texts and images. In the second stage, the system feeds these embeddings into a conventional machine learning classifier such as an SVM or logistic regression, which allows the system to be trained rapidly and to perform inference at ultra-low cost. By converting tweets into rich multimodal embeddings with CLIP-ViT and using them to train conventional classifiers, our system not only detects harmful textual information with near-perfect performance, achieving precision and recall above 99%, but also detects harmful images zero-shot, without additional training, thanks to its multimodal embedding input. This capability lets the system identify unseen harmful images without requiring extensive and costly image datasets. Additionally, the system adapts quickly to new harmful content: if a new harmful content pattern is identified, we can fine-tune the classifier with the corresponding tweets' embeddings to promptly update the system. This makes it well suited to the ever-evolving nature of online harmfulness, providing online communities with a robust, generalizable, and cost-effective tool to safeguard their members.
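
The second stage described in the abstract, a conventional classifier trained on fixed embeddings, can be illustrated with a minimal sketch. This is not the authors' implementation. The embeddings below are synthetic stand-ins rather than real CLIP-ViT output, and the names `fake_embedding`, `make_split`, `train_logreg`, and `evaluate`, as well as the dimension and hyperparameters, are hypothetical choices made only for illustration.

```python
import math
import random

DIM = 8  # hypothetical size; real CLIP-ViT embeddings are much larger


def fake_embedding(rng, harmful):
    """Synthetic stand-in for a stage-one embedding (NOT real CLIP output)."""
    center = 1.0 if harmful else -1.0
    return [rng.gauss(center, 1.0) for _ in range(DIM)]


def make_split(rng, n_per_class):
    """Build a shuffled, balanced set of (embedding, label) examples."""
    data = [(fake_embedding(rng, bool(label)), label)
            for label in (0, 1) for _ in range(n_per_class)]
    rng.shuffle(data)
    return [x for x, _ in data], [t for _, t in data]


def predict_proba(w, b, x):
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    z = max(-30.0, min(30.0, z))  # clamp so exp() stays in range
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(X, y, lr=0.05, epochs=30):
    """Stage two: logistic regression fitted by stochastic gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            err = predict_proba(w, b, x) - t
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def evaluate(w, b, X, y):
    """Return (precision, recall) for the harmful class (label 1)."""
    preds = [1 if predict_proba(w, b, x) >= 0.5 else 0 for x in X]
    tp = sum(p == 1 and t == 1 for p, t in zip(preds, y))
    fp = sum(p == 1 and t == 0 for p, t in zip(preds, y))
    fn = sum(p == 0 and t == 1 for p, t in zip(preds, y))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


rng = random.Random(0)
X_train, y_train = make_split(rng, 100)
X_test, y_test = make_split(rng, 50)
w, b = train_logreg(X_train, y_train)
precision, recall = evaluate(w, b, X_test, y_test)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Because only the small classifier is trained while the embedding model stays frozen, updating the system for a new harmful pattern would amount to rerunning `train_logreg` on the extended set of embeddings. The scores this toy prints reflect the synthetic data only and say nothing about the paper's reported results.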

Authors (2)
  1. Albert Lu
  2. Stephen Cranefield
