
Is Your AI Truly Yours? Leveraging Blockchain for Copyrights, Provenance, and Lineage (2404.06077v1)

Published 9 Apr 2024 in cs.CR, cs.AI, and cs.CY

Abstract: As AI integrates into diverse areas, particularly in content generation, ensuring rightful ownership and ethical use becomes paramount. AI service providers are expected to prioritize responsibly sourcing training data and obtaining licenses from data owners. However, existing studies primarily center on safeguarding static copyrights, which simply treats metadata/datasets as non-fungible items with transferable/trading capabilities, neglecting the dynamic nature of training procedures that can shape an ongoing trajectory. In this paper, we present IBis, a blockchain-based framework tailored for AI model training workflows. IBis integrates on-chain registries for datasets, licenses and models, alongside off-chain signing services to facilitate collaboration among multiple participants. Our framework addresses concerns regarding data and model provenance and copyright compliance. IBis enables iterative model retraining and fine-tuning, and offers flexible license checks and renewals. Further, IBis provides APIs designed for seamless integration with existing contract management software, minimizing disruptions to established model training processes. We implement IBis using Daml on the Canton blockchain. Evaluation results showcase the feasibility and scalability of IBis across varying numbers of users, datasets, models, and licenses.
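
To make the described workflow concrete, the following is a minimal, illustrative Python sketch of the kind of registry-and-license-check logic such a framework could enforce: datasets and licenses are registered, and each (re)training step is recorded as a lineage entry only if every dataset used is covered by a currently valid license. All class names, fields, and methods here are hypothetical; the actual IBis system implements these registries as Daml templates on the Canton blockchain, not as Python objects.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class License:
    """A usage license granted by a data owner to a training party (hypothetical)."""
    license_id: str
    dataset_id: str
    licensee: str    # party permitted to train on the dataset
    expires: date    # licenses can be re-checked and renewed over time

    def is_valid(self, party: str, on: date) -> bool:
        return party == self.licensee and on <= self.expires


@dataclass
class Registry:
    """Toy stand-in for on-chain dataset, license, and model registries."""
    datasets: dict = field(default_factory=dict)   # dataset_id -> owner
    licenses: dict = field(default_factory=dict)   # license_id -> License
    lineage: list = field(default_factory=list)    # model provenance records

    def register_dataset(self, dataset_id: str, owner: str) -> None:
        self.datasets[dataset_id] = owner

    def grant_license(self, lic: License) -> None:
        if lic.dataset_id not in self.datasets:
            raise ValueError(f"unknown dataset {lic.dataset_id}")
        self.licenses[lic.license_id] = lic

    def record_training(self, model_id: str, trainer: str,
                        dataset_ids: list, parent_model, on: date) -> None:
        # Every dataset used in this (re)training step must be covered by a
        # currently valid license held by the trainer.
        for ds in dataset_ids:
            if not any(l.dataset_id == ds and l.is_valid(trainer, on)
                       for l in self.licenses.values()):
                raise PermissionError(f"no valid license for dataset {ds}")
        # The lineage record links the new model to its parent, so iterative
        # retraining and fine-tuning leave an auditable provenance trail.
        self.lineage.append({"model": model_id, "trainer": trainer,
                             "datasets": dataset_ids, "parent": parent_model,
                             "date": on.isoformat()})


if __name__ == "__main__":
    reg = Registry()
    reg.register_dataset("ds-1", owner="Alice")
    reg.grant_license(License("lic-1", "ds-1", licensee="Bob",
                              expires=date(2025, 12, 31)))
    reg.record_training("model-v1", trainer="Bob", dataset_ids=["ds-1"],
                        parent_model=None, on=date(2024, 4, 9))
    reg.record_training("model-v2", trainer="Bob", dataset_ids=["ds-1"],
                        parent_model="model-v1", on=date(2024, 6, 1))
    print(reg.lineage)
```

The key design point the sketch tries to mirror is that license validity is checked at the time of each training event and the resulting model is linked to its parent, which is what distinguishes this dynamic, lineage-aware approach from treating datasets as static, one-time tradable assets.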
