Scope of EU DSM TDM Exceptions for AI Model Training

Determine whether Article 3 of Directive (EU) 2019/790 on copyright in the Digital Single Market (DSM Directive) grants research organisations a text and data mining exception that extends beyond pattern extraction to permit training AI models on works and other subject matter to which they have lawful access, and specify any conditions or limitations that apply under EU copyright and database law.

Background

The paper explains that EU intellectual property law includes text and data mining (TDM) exceptions intended to enable scientific research. Judicial signals (e.g., LAION v. Kneschke) indicate that dataset creation by qualifying research organisations falls within the DSM Directive's exceptions. Even so, the authors note persistent ambiguity about whether these exceptions cover the actual training of AI models rather than only the extraction of patterns.

Clarifying whether the Article 3 exception of the DSM Directive encompasses AI model training is critical for academic and non-profit research institutions that rely on lawfully accessed online content, including social media data. The answer determines the legality of end-to-end pipelines, from data acquisition to model development, and shapes compliance strategies for copyright and database rights as well as for platform terms of service.

References

Similarly, intellectual property law presents ambiguity -- whilst the EU's Text and Data Mining (TDM) exceptions permit researchers to extract patterns from protected data, it remains unclear whether these protections extend to training AI models.

PETLP: A Privacy-by-Design Pipeline for Social Media Data in AI Research (2508.09232 - Oh et al., 12 Aug 2025) in Section 2.2, Subsection 2.2.2 "Intellectual Property and Contract Law"