
Secrets Revealed in Container Images: An Internet-wide Study on Occurrence and Impact (2307.03958v1)

Published 8 Jul 2023 in cs.CR and cs.NI

Abstract: Containerization allows bundling applications and their dependencies into a single image. The containerization framework Docker eases the use of this concept and enables sharing images publicly, gaining high momentum. However, it can lead to users creating and sharing images that include private keys or API secrets-either by mistake or out of negligence. This leakage impairs the creator's security and that of everyone using the image. Yet, the extent of this practice and how to counteract it remains unclear. In this paper, we analyze 337,171 images from Docker Hub and 8,076 other private registries unveiling that 8.5% of images indeed include secrets. Specifically, we find 52,107 private keys and 3,158 leaked API secrets, both opening a large attack surface, i.e., putting authentication and confidentiality of privacy-sensitive data at stake and even allow active attacks. We further document that those leaked keys are used in the wild: While we discovered 1,060 certificates relying on compromised keys being issued by public certificate authorities, based on further active Internet measurements, we find 275,269 TLS and SSH hosts using leaked private keys for authentication. To counteract this issue, we discuss how our methodology can be used to prevent secret leakage and reuse.


Summary

  • The paper reveals that 8.5% of scanned Docker images contain sensitive information like private keys and API secrets, indicating widespread secret leakage.
  • The study scanned 337,171 images from Docker Hub and 8,076 private registries using regular expressions and filtering, and validated the matches through static analysis and ethically limited testing.
  • Findings underscore significant security risks from these leaks and recommend integrating secret-scanning tools into Docker pipelines and increasing user awareness to mitigate the problem.

Internet-wide Analysis of Secret Leakage in Container Images

The proliferation of containerization technologies, particularly Docker, has significantly facilitated application deployment by encapsulating all necessary software dependencies within single images. This advancement has also inadvertently introduced critical security issues related to the inclusion of sensitive data in these images. The in-depth analysis conducted by Dahlmanns et al. explores the scale and impact of secret leakage in Docker images, providing important insights into a largely underexplored vector of vulnerability in modern software distribution.

Key Findings and Statistical Insights

The paper's empirical investigation covered 337,171 images drawn from Docker Hub and 8,076 private registries and found that 8.5% of these images contained secrets, including 52,107 private keys and 3,158 API secrets. This points to a significant security exposure and to widespread negligence or oversight among image creators. Notably, the authors show that leaked keys are in active use: public certificate authorities issued 1,060 certificates that rely on compromised keys, and active Internet measurements found 275,269 TLS and SSH hosts authenticating with leaked private keys.

Methodological Approach

The authors scan publicly available images from Docker Hub and other Internet-accessible registries for embedded secrets. Their approach combines regular-expression matching with extensive filtering to separate genuine leaks from test data commonly shipped in software libraries. They validate the results through static analysis of reliably parsable key formats and by verifying the functionality of potential API keys to the extent ethically permissible.
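The match-then-filter idea can be illustrated with a minimal Python sketch. It is not the authors' rule set: the two patterns, the path hints in `TEST_PATH_HINTS`, and the function names are assumptions chosen for illustration, and the paper's filtering is described as considerably more extensive.

```python
import re

# Hypothetical patterns for illustration; not the paper's actual rules.
PATTERNS = {
    "private_key": re.compile(
        r"-----BEGIN (?:RSA |EC |DSA |OPENSSH )?PRIVATE KEY-----"
        r".+?"
        r"-----END (?:RSA |EC |DSA |OPENSSH )?PRIVATE KEY-----",
        re.DOTALL,
    ),
    # AWS access key IDs: "AKIA" followed by 16 uppercase letters or digits.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}

# Assumption: paths like these usually hold test fixtures, not real secrets.
TEST_PATH_HINTS = ("/test/", "/tests/", "/testdata/", "/examples/")


def looks_like_test_data(path):
    """Crude path-based filter for keys shipped as library test fixtures."""
    lowered = path.lower()
    return any(hint in lowered for hint in TEST_PATH_HINTS)


def scan_file(path, text):
    """Return (kind, matched_text) pairs for one file, after filtering."""
    if looks_like_test_data(path):
        return []
    return [
        (kind, match.group(0))
        for kind, pattern in PATTERNS.items()
        for match in pattern.finditer(text)
    ]
```

A path-based filter alone would miss fixtures stored elsewhere and would hide real secrets placed under a `tests` directory, which is one reason further validation of the matches is needed.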

Implications and Recommendations

The findings of this research raise significant concerns for developers and security professionals alike. Secrets embedded in Docker images can serve as launching pads for attackers to compromise systems, allowing unauthorized access and potentially leading to data breaches. The paper's revelation that both Docker Hub and private registries are affected suggests systemic issues that must be addressed through better practices and tooling improvements.

Dahlmanns et al. propose several mitigation strategies, emphasizing the need for increased awareness among Docker users about the inclusion of sensitive information in images. They advocate for integrating secret-scanning tools into the Docker pipeline to catch and prevent such mistakes at image creation or upload phases. Additionally, enhancing Docker's ecosystem by providing robust features that facilitate the secure handling of secrets within image filesystems could be pivotal.
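As a rough illustration of a pre-upload check, the following Python sketch scans the files of a single image layer (a tar archive) for private key headers and returns an exit code that a build pipeline could use to block the push. The function names, the single rule `PRIVATE_KEY_HEADER`, and the size cap are assumptions for this example, not a tool described in the paper; how layer archives are obtained from an image is left out.

```python
import io
import re
import tarfile

# Hypothetical single rule; a real scanner would carry many more.
PRIVATE_KEY_HEADER = re.compile(rb"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----")

# Assumption: skip very large files to keep the check fast.
MAX_FILE_BYTES = 1_000_000


def scan_layer(layer_fileobj):
    """Return names of regular files in one layer tar that contain a key header."""
    findings = []
    with tarfile.open(fileobj=layer_fileobj, mode="r:*") as layer:
        for member in layer:
            if not member.isfile() or member.size > MAX_FILE_BYTES:
                continue
            extracted = layer.extractfile(member)
            if extracted is None:
                continue
            if PRIVATE_KEY_HEADER.search(extracted.read()):
                findings.append(member.name)
    return findings


def gate(layer_fileobjs):
    """Return 0 if no layer contains a finding, else 1 (usable as an exit code)."""
    findings = []
    for layer_fileobj in layer_fileobjs:
        findings.extend(scan_layer(layer_fileobj))
    for name in findings:
        print(f"possible private key in layer file: {name}")
    return 1 if findings else 0
```

Scanning every layer rather than only the final filesystem matters because a file deleted in a later layer still exists in the earlier layer that introduced it.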

Future Directions

The research opens pathways for future work in several directions. A notable area for development is the refinement and broad implementation of tools capable of detecting and managing secrets efficiently across various stages of the Docker container lifecycle. Further, exploring automated methods for correcting identified vulnerabilities without impeding operational functions could significantly advance container security. Another promising avenue is the exploration of alternative containerization paradigms that inherently separate secret management from application code and configurations, reducing the likelihood of accidental exposure.

Conclusion

This study of inadvertent secret leakage in Docker images uncovers serious vulnerabilities and provides a foundation for improving container security. As Docker and containerized applications become increasingly entrenched in modern software practice, addressing these challenges through stronger security measures and user education will be essential to mitigate risks and safeguard sensitive information. The findings underscore the need for a collective response from the developer community to integrate security considerations into the design and operation of containerization frameworks.
