
Mitigating Negative Transfer with Task Awareness for Sexism, Hate Speech, and Toxic Language Detection (2307.03377v1)

Published 7 Jul 2023 in cs.CL and cs.LG

Abstract: This paper proposes a novel approach to mitigate the negative transfer problem. In machine learning, the common strategy is Single-Task Learning: training a supervised model to solve one specific task. Training a robust model this way requires a lot of data and significant computational resources, which makes the solution infeasible when data are unavailable or expensive to gather. Another solution, based on sharing information between tasks, has therefore been developed: Multi-Task Learning (MTL). Despite recent developments in MTL, the problem of negative transfer remains unsolved. Negative transfer is a phenomenon that occurs when noisy information is shared between tasks, resulting in a drop in performance. The approach proposed here mitigates negative transfer through the concept of task awareness. It reduces negative transfer and improves performance over the classic MTL solution. Moreover, the proposed approach has been implemented in two unified architectures to detect Sexism, Hate Speech, and Toxic Language in text comments. The proposed architectures set a new state of the art on both the EXIST-2021 and HatEval-2019 benchmarks.
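To make the "classic MTL" baseline mentioned in the abstract concrete, the following is a minimal sketch of hard parameter sharing: one encoder shared by all tasks, plus one small head per task. It is not the paper's task-aware method, which the abstract does not describe in detail. The function names (`init_mtl`, `forward`, `mtl_loss`), the layer sizes, and the task labels are all hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mtl(n_features, n_hidden, task_names):
    """Hypothetical helper: one shared encoder and one linear head per task."""
    return {
        "shared": rng.normal(scale=0.1, size=(n_features, n_hidden)),
        "heads": {t: rng.normal(scale=0.1, size=(n_hidden,)) for t in task_names},
    }

def forward(params, x, task):
    """Return binary-class probabilities for `task` on a batch `x`."""
    h = np.tanh(x @ params["shared"])        # representation shared by all tasks
    logits = h @ params["heads"][task]       # task-specific head
    return 1.0 / (1.0 + np.exp(-logits))     # sigmoid

def bce(p, y):
    """Mean binary cross-entropy, with a small epsilon for numerical safety."""
    eps = 1e-12
    return float(-np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)))

def mtl_loss(params, batches):
    """Unweighted sum of per-task losses; `batches` maps task -> (x, y)."""
    return sum(bce(forward(params, x, t), y) for t, (x, y) in batches.items())

# Usage with made-up data: two tasks trained against the same shared encoder.
params = init_mtl(n_features=5, n_hidden=3, task_names=["sexism", "hate_speech"])
x = np.ones((4, 5))
batches = {
    "sexism": (x, np.array([0, 1, 0, 1])),
    "hate_speech": (x, np.array([1, 1, 0, 0])),
}
total = mtl_loss(params, batches)
```

Because every task's gradient flows into the same `shared` matrix, a noisy or conflicting task can degrade the representation the others depend on. That is the negative transfer the abstract refers to.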

Authors (3)
  1. Angel Felipe Magnossão de Paula (10 papers)
  2. Paolo Rosso (41 papers)
  3. Damiano Spina (29 papers)
Citations (2)
