Reducing Privacy Risks in Online Self-Disclosures with Language Models (2311.09538v3)

Published 16 Nov 2023 in cs.CL and cs.HC

Abstract: Self-disclosure, while being common and rewarding in social media interaction, also poses privacy risks. In this paper, we take the initiative to protect the user-side privacy associated with online self-disclosure through detection and abstraction. We develop a taxonomy of 19 self-disclosure categories and curate a large corpus consisting of 4.8K annotated disclosure spans. We then fine-tune an LLM for detection, achieving over 65% partial span F$_1$. We further conduct an HCI user study, with 82% of participants viewing the model positively, highlighting its real-world applicability. Motivated by the user feedback, we introduce the task of self-disclosure abstraction, which is rephrasing disclosures into less specific terms while preserving their utility, e.g., "Im 16F" to "I'm a teenage girl". We explore various fine-tuning strategies, and our best model can generate diverse abstractions that moderately reduce privacy risks while maintaining high utility according to human evaluation. To help users decide which disclosures to abstract, we present a task of rating their importance for context understanding. Our fine-tuned model achieves 80% accuracy, on par with GPT-3.5. Given safety and privacy considerations, we will only release our corpus and models to researchers who agree to the ethical guidelines outlined in the Ethics Statement.
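As a rough illustration of the "partial span F$_1$" evaluation mentioned in the abstract, the sketch below scores predicted disclosure spans against gold annotations, counting a prediction as correct if it overlaps any gold span. This lenient overlap criterion is an assumption for illustration; the paper's exact partial-matching rule may differ, and this is not the authors' released evaluation code.

```python
# Minimal sketch of a partial-span F1 metric for disclosure span detection.
# Assumption: a predicted span gets credit if it overlaps any gold span
# (the paper's precise partial-matching criterion may differ).

from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def overlaps(a: Span, b: Span) -> bool:
    """True if the two half-open spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def partial_span_f1(pred: List[Span], gold: List[Span]) -> float:
    """F1 where precision/recall give credit for any overlap (illustrative only)."""
    if not pred or not gold:
        return 0.0
    tp_pred = sum(any(overlaps(p, g) for g in gold) for p in pred)  # matched predictions
    tp_gold = sum(any(overlaps(g, p) for p in pred) for g in gold)  # covered gold spans
    precision = tp_pred / len(pred)
    recall = tp_gold / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Example: one predicted span partially covers a gold disclosure span.
print(partial_span_f1(pred=[(0, 6)], gold=[(0, 8)]))  # 1.0 under this lenient criterion
```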
