Intent-Based Access Control: Using LLMs to Intelligently Manage Access Control (2402.07332v3)
Abstract: In every enterprise database, administrators must define an access control policy that specifies which users have access to which assets. Access control straddles two worlds: policy (organization-level principles that define who should have access) and process (database-level primitives that actually implement the policy). Assessing and enforcing process compliance with a policy is a manual and ad-hoc task. This paper introduces a new paradigm for access control called Intent-Based Access Control for Databases (IBAC-DB). In IBAC-DB, access control policies are expressed more precisely using a novel format, the natural language access control matrix (NLACM). Database access control primitives are synthesized automatically from these NLACMs. These primitives can be used to generate new DB configurations and/or evaluate existing ones. This paper presents a reference architecture for an IBAC-DB interface, an initial implementation for PostgreSQL (which we call LLM4AC), and initial benchmarks that evaluate the accuracy and scope of such a system. We further describe how to extend LLM4AC to handle other types of database deployment requirements, including temporal constraints and role hierarchies. We propose RHieSys, a requirement-specific method of extending LLM4AC, and DePLOI, a generalized method of extending LLM4AC. We find that our chosen implementation, LLM4AC, vastly outperforms other baselines, achieving high accuracies and F1 scores on our initial Dr. Spider benchmark. On all systems, we find overall high performance on expanded benchmarks, which include state-of-the-art NL2SQL data requiring external knowledge, and real-world role hierarchies from the Amazon Access dataset.
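To make the NLACM idea concrete, here is a minimal Python sketch of turning a small natural language access control matrix into PostgreSQL `GRANT` statements. It is an illustration under stated assumptions, not the LLM4AC implementation: the paper uses an LLM for the translation step, whereas this sketch swaps in a hypothetical keyword lookup (`KEYWORD_TO_PRIVILEGE`), and the role names, table names, and cell wording are invented for the example.

```python
# Hypothetical stand-in for the LLM translation step: a fixed keyword lookup
# from words in an NLACM cell to SQL privileges. Not the paper's method.
KEYWORD_TO_PRIVILEGE = {
    "read": "SELECT",
    "view": "SELECT",
    "add": "INSERT",
    "insert": "INSERT",
    "modify": "UPDATE",
    "update": "UPDATE",
    "remove": "DELETE",
    "delete": "DELETE",
}


def privileges_from_text(cell: str) -> list[str]:
    """Return the sorted SQL privileges implied by one natural-language cell."""
    words = cell.lower().replace(",", " ").split()
    return sorted({KEYWORD_TO_PRIVILEGE[w] for w in words if w in KEYWORD_TO_PRIVILEGE})


def synthesize_grants(nlacm: dict[str, dict[str, str]]) -> list[str]:
    """Emit one GRANT statement per (role, table) cell that implies a privilege."""
    statements = []
    for role, row in nlacm.items():
        for table, cell in row.items():
            privileges = privileges_from_text(cell)
            if privileges:  # cells such as "no access" produce no statement
                statements.append(
                    f"GRANT {', '.join(privileges)} ON {table} TO {role};"
                )
    return statements


# Illustrative NLACM: rows are roles, columns are tables, cells are free text.
nlacm = {
    "analyst": {"orders": "can read rows", "customers": "no access"},
    "clerk": {"orders": "can read, add and modify rows", "customers": "can view rows"},
}

for statement in synthesize_grants(nlacm):
    print(statement)
```

The same list of synthesized statements could, in principle, be compared against the grants already present in a database to check an existing configuration against the policy; that comparison is left out here to keep the sketch short.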