A Study of the Landscape of Privacy Policies of Smart Devices (2308.05890v2)
Abstract: As smart devices permeate all aspects of our lives, user privacy concerns have become more pertinent than ever. Privacy policies outline the data handling practices of these devices. Prior work in the domains of websites and mobile apps has shown that privacy policies are rarely read and understood by users. In these domains, automatic analysis of privacy policies has been shown to help give users appropriate insights. However, such an analysis is lacking in the domain of smart device privacy policies. This paper presents a comprehensive study of the landscape of privacy policies of smart devices. We introduce a methodology that addresses the unique challenges of smart devices by finding information about them, their manufacturers, and their privacy policies on the Web. Our methodology uses state-of-the-art analysis techniques to assess the readability and privacy of smart device policies and compares them with the policies of e-commerce websites and mobile applications. Overall, we analyzed 4,556 smart devices, 2,211 manufacturers, and 819 privacy policies. Despite smart devices having access to more intrusive data about their users (using sensors such as cameras and microphones), more than half (1,167 of 2,211) of the analyzed manufacturers did not have policies available. The study highlights that significant improvement is required in communicating the data management practices of smart devices.
Authors: Aamir Hamid, Hemanth Reddy Samidi, Tim Finin, Primal Pappachan, Roberto Yus