
Unfair TOS: An Automated Approach using Customized BERT (2401.11207v2)

Published 20 Jan 2024 in cs.CL and cs.CY

Abstract: Terms of Service (ToS) form an integral part of any agreement, as they define the legal relationship between a service provider and an end-user. They not only establish and delineate reciprocal rights and responsibilities, but also inform users about essential aspects of contracts governing the use of digital spaces, covering a wide range of topics such as limitation of liability and data protection. Users tend to accept the ToS without reading them before using an application or service, and this ignorance leaves them in a potentially weaker position should any action be required. Existing methodologies for detecting or classifying unfair clauses are, however, outdated and show only modest performance. In this paper, we present state-of-the-art (SOTA) results on unfair clause detection from ToS documents, based on custom BERT fine-tuning in conjunction with a Support Vector Classifier (SVC). The approach achieves a macro F1-score of 0.922 on unfair clause detection and also performs strongly on classifying unfair clauses by tag. Further, a comparative analysis of the Transformer models used is performed by answering a set of research questions. To support further research and experimentation, the code and results are available at https://github.com/batking24/Unfair-TOS-An-Automated-Approach-based-on-Fine-tuning-BERT-in-conjunction-with-ML.
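The headline metric above is a macro-averaged F1-score, which averages per-class F1 with equal weight so that rare unfair-clause tags count as much as the dominant "fair" class. As a self-contained illustration (the labels and predictions below are hypothetical, not from the paper), macro F1 can be computed as:

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: compute F1 per class, then average with
    equal weight, so minority classes matter as much as the majority."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        f1_scores.append(f1)
    return sum(f1_scores) / len(f1_scores)

# Hypothetical binary labels: 1 = unfair clause, 0 = fair clause
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 0, 0, 1, 1]
print(round(macro_f1(y_true, y_pred), 3))  # → 0.733
```

Because the macro average weights the minority "unfair" class equally, it penalizes a classifier that scores well only on the abundant "fair" class, which is why it is the natural choice for imbalanced clause-detection datasets.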

