SocialGenPod: Privacy-Friendly Generative AI Social Web Applications with Decentralised Personal Data Stores (2403.10408v1)
Abstract: We present SocialGenPod, a decentralised and privacy-friendly way of deploying generative AI Web applications. Unlike centralised Web and data architectures that keep user data tied to application and service providers, we show how one can use Solid -- a decentralised Web specification -- to decouple user data from generative AI applications. We demonstrate SocialGenPod using a prototype that allows users to converse with different LLMs, optionally leveraging Retrieval Augmented Generation to generate answers grounded in private documents stored in any Solid Pod that the user is allowed to access, directly or indirectly. SocialGenPod makes use of Solid access control mechanisms to give users full control over who has access to data stored in their Pods. SocialGenPod keeps all user data (chat history, app configuration, personal documents, etc.) securely in the user's personal Pod, separate from specific model or application providers. Besides better privacy controls, this approach also enables portability across different services and applications. Finally, we discuss challenges, posed by the large compute requirements of state-of-the-art models, that future research in this area should address. Our prototype is open-source and available at: https://github.com/Vidminas/socialgenpod/.
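The idea of access-controlled Retrieval Augmented Generation described in the abstract can be illustrated with a minimal sketch. This is not the SocialGenPod implementation: the `PodDocument` class, the `retrieve` and `build_prompt` helpers, the example URLs, and the in-memory `readers` set are all hypothetical stand-ins. In a real deployment, the Solid Pod server would enforce access control and documents would be fetched over HTTP, and a real retriever would likely use learned embeddings rather than the word-count similarity assumed here.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class PodDocument:
    """Hypothetical stand-in for a document stored in a Pod."""
    url: str
    text: str
    # Agents (identified by WebID-like strings) allowed to read the document.
    readers: set = field(default_factory=set)


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, agent, k=2):
    """Return up to k documents readable by `agent`, most similar first."""
    query_vec = Counter(tokenize(query))
    # Filter by access first, so unreadable documents never reach the model.
    readable = [d for d in documents if agent in d.readers]
    scored = [(cosine(query_vec, Counter(tokenize(d.text))), d) for d in readable]
    scored = [(s, d) for s, d in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:k]]


def build_prompt(query, context_docs):
    """Assemble a prompt that grounds the answer in the retrieved documents."""
    context = "\n".join(f"[{d.url}] {d.text}" for d in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Illustrative data only; the identifiers and documents are invented.
ALICE = "https://alice.example/profile/card#me"
BOB = "https://bob.example/profile/card#me"
docs = [
    PodDocument("https://alice.example/notes/trip.txt",
                "The team trip to Lisbon is planned for May.", {ALICE, BOB}),
    PodDocument("https://alice.example/notes/salary.txt",
                "Salary review for the team is in May.", {ALICE}),
    PodDocument("https://bob.example/notes/recipes.txt",
                "Bob's soup recipe uses leeks and potatoes.", {BOB}),
]

query = "When is the team trip?"
bob_hits = retrieve(query, docs, BOB)
alice_hits = retrieve(query, docs, ALICE)
prompt_for_bob = build_prompt(query, bob_hits)
```

In this sketch Bob's query can only be grounded in the shared trip note, while Alice's retrieval also sees her private salary note. The resulting prompt would then be passed to whichever LLM the user has chosen.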