Re-pseudonymization Strategies for Smart Meter Data Are Not Robust to Deep Learning Profiling Attacks (2404.03948v1)
Abstract: Smart meters, devices measuring the electricity and gas consumption of a household, are currently being deployed at a fast rate throughout the world. The data they collect are extremely useful, including in the fight against climate change. However, these data and the information that can be inferred from them are highly sensitive. Re-pseudonymization, i.e., the frequent replacement of random identifiers over time, is widely used to share smart meter data while mitigating the risk of re-identification. We here show how, in spite of re-pseudonymization, households' consumption records can be pieced together with high accuracy in large-scale datasets. We propose the first deep learning-based profiling attack against re-pseudonymized smart meter data. Our attack combines neural network embeddings, which are used to extract features from weekly consumption records and are tailored to the smart meter identification task, with a nearest neighbor classifier. We evaluate six neural network architectures as the embedding model. Our results suggest that the Transformer and CNN-LSTM architectures vastly outperform previous methods as well as other architectures, successfully identifying the correct household 73.4% of the time among 5139 households based on electricity and gas consumption records (54.5% for electricity only). We further show that the features extracted by the embedding model maintain their effectiveness when transferred to a set of users disjoint from the one used to train the model. Finally, we extensively evaluate the robustness of our results. Taken together, our results strongly suggest that even frequent re-pseudonymization strategies can be reversed, severely limiting their ability to prevent re-identification in practice.
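The matching step described in the abstract, embeddings of weekly records followed by a nearest neighbor classifier, can be illustrated with a minimal sketch. This is not the authors' implementation: the `embed` function is a hypothetical stand-in (mean-centering and L2 normalization) for the trained neural network, the data are synthetic, and the half-hourly resolution of 336 readings per week is an assumption made only for the example.

```python
import numpy as np

def embed(records: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a trained embedding model.

    Each weekly record is only mean-centered and L2-normalized here;
    the paper instead learns the embedding with a neural network.
    """
    centered = records - records.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(centered, axis=1, keepdims=True)
    return centered / np.maximum(norms, 1e-12)

def identify(reference: np.ndarray, target: np.ndarray) -> np.ndarray:
    """For each target week, return the index of the most similar reference household."""
    sims = embed(target) @ embed(reference).T  # cosine similarity matrix
    return sims.argmax(axis=1)                 # 1-nearest-neighbor decision

# Synthetic data (assumption): 50 households, 336 readings per week.
rng = np.random.default_rng(0)
profiles = rng.random((50, 336))                           # habitual pattern per household
week_a = profiles + 0.1 * rng.standard_normal((50, 336))   # week under known pseudonyms
week_b = profiles + 0.1 * rng.standard_normal((50, 336))   # week under fresh pseudonyms

predicted = identify(week_a, week_b)
accuracy = float((predicted == np.arange(50)).mean())
print(f"Linked {accuracy:.0%} of re-pseudonymized weeks to the right household")
```

On real smart meter data the weekly patterns of a household vary far more than in this toy setting, which is why the choice of embedding model matters for the accuracy reported in the abstract.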
Authors: Ana-Maria Cretu, Miruna Rusu, Yves-Alexandre de Montjoye