
DePLOI: Applying NL2SQL to Synthesize and Audit Database Access Control (2402.07332v4)

Published 11 Feb 2024 in cs.DB and cs.CR

Abstract: In every enterprise database, administrators must define an access control policy that specifies which users have access to which tables. Access control straddles two worlds: policy (organization-level principles that define who should have access) and process (database-level primitives that actually implement the policy). Assessing and enforcing process compliance with a policy is a manual and ad-hoc task. This paper introduces a new access control model called Intent-Based Access Control for Databases (IBAC-DB). In IBAC-DB, access control policies are expressed using abstractions that scale to high numbers of database objects, and are traceable with respect to implementations. This paper proposes DePLOI (Deployment Policy Linter for Organization Intents), an LLM-backed system leveraging access control-specific task decompositions to accurately synthesize and audit access control implementation from IBAC-DB abstractions. As DePLOI is the first system of its kind to our knowledge, this paper further proposes IBACBench, the first benchmark for evaluating the synthesis and auditing capabilities of DePLOI. IBACBench leverages a combination of current NL2SQL benchmarks, real-world role hierarchies and access control policies, and LLM-generated data. We find that DePLOI achieves high synthesis accuracies and auditing F1 scores overall, and greatly outperforms other LLM prompting strategies (e.g., by 10 F1 points).
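
To make the policy/process distinction concrete, the following is a minimal sketch of what "synthesizing" and "auditing" an access control implementation could look like once a policy has been reduced to explicit (role, table, privilege) triples. It is not DePLOI's method: the abstract does not describe the IBAC-DB abstractions or the LLM pipeline in detail, so the `Grant` type, the `synthesize` and `audit` functions, and the example roles and tables are all hypothetical names chosen for illustration.

```python
from typing import NamedTuple


class Grant(NamedTuple):
    """One access rule: a role holds a privilege on a table (hypothetical type)."""
    role: str
    table: str
    privilege: str


def synthesize(policy: set[Grant]) -> list[str]:
    """Turn policy triples into SQL GRANT statements, in a stable sorted order."""
    return [f"GRANT {g.privilege} ON {g.table} TO {g.role};" for g in sorted(policy)]


def audit(policy: set[Grant], implemented: set[Grant]) -> dict[str, list[Grant]]:
    """Compare the intended policy against what the database actually grants."""
    return {
        "missing": sorted(policy - implemented),  # intended but not granted
        "excess": sorted(implemented - policy),   # granted but not intended
    }


# Illustrative data only; these roles and tables are assumptions.
policy = {
    Grant("analyst", "sales", "SELECT"),
    Grant("manager", "sales", "UPDATE"),
}
implemented = {
    Grant("analyst", "sales", "SELECT"),
    Grant("analyst", "payroll", "SELECT"),
}

statements = synthesize(policy)
report = audit(policy, implemented)
```

Here `report["missing"]` lists privileges the policy requires but the database lacks, and `report["excess"]` lists privileges the database grants beyond the policy. The hard part that the paper addresses, mapping natural-language organizational intent to such triples, is outside this sketch.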
