EU AI Act: How Far Will EU Copyright Principles Extend?

February 12, 2024

The new EU AI Act contains provisions that potentially reach beyond EU borders and that could conflict with US copyright law, among other laws, when proprietary datasets are used for AI training. US companies doing business in the European Union will therefore want to follow the adoption of the EU AI Act and its implementing regulations to avoid issues with their business models in the EU.

On January 24, 2024, an agreed text of the landmark EU AI Act (AI Act) was forwarded to the EU member states for technical examination and adoption. The text is close to final. Over the course of the 33 months since the first EU Commission proposal (COM/2021/206 final), the European Union’s approach has remained the same: While the AI Act generally does not discourage the use of AI, it prohibits the use of AI in numerous application scenarios or makes its use dependent on various technical, organizational, and legal requirements. In fact, in the Trilogue—the recent negotiations among the three EU bodies (the EU Commission, the Parliament, and the Council)—many of the requirements of the AI Act were tightened even further than they had been in prior drafts.

The European Union’s ambitious goal is to adopt the AI Act in March or April—well before the EU elections this summer.

One crucial distinction in the AI Act is between a “provider” and a “deployer” of AI.

While most rules in the AI Act target the deployer, some target the provider, including in cases where a third party’s infrastructure is used for providing the service.

  • A provider is—largely as defined in earlier drafts—a natural or legal person, public authority, institution, or other body that develops an AI system (or a GPAI model), or has one developed, in order to place it on the market, put it into circulation, or put it into operation under its own name or brand—whether for payment or free of charge—like a manufacturer in the sense of product law.
  • A deployer of an AI system (still misleadingly referred to as a “user” in some EU documents) is a natural or legal person, public authority, institution, or other body that uses an AI system under its own responsibility, unless the AI system is used in the course of a purely personal and non-professional activity.


The AI Act's potentially far-reaching and stringent requirements for high-risk AI systems (a large group of AI systems) received comparatively little attention in the Trilogue, with recent amendments primarily focused on details, many of which are cosmetic.

The European Union’s insertion of undefined legal terms (e.g., “relevant” and “appropriate”) weakens the law and further increases legal uncertainty at the expense of AI users. For instance, the requirement that training data be “relevant, representative, free of errors and complete”—one of the most critically examined requirements of the earlier drafts—has been softened to “relevant, sufficiently representative, and to the best extent possible, free of errors and complete in view of the intended purpose.”

There remains a great need to eliminate ambiguities and make the AI Act more user-friendly.


Art. 3 (44e) of the current draft of the EU AI Act contains a short definition of a “general-purpose AI system,” as follows: “an AI system which is based on a general purpose AI model, that has the capability to serve a variety of purposes, both for direct use as well as for integration in other AI systems.” However, the Article contains no comparably clear definition of the term general-purpose AI (GPAI) “model.”

Recital 60c states that “large generative AI models are a typical example for a general-purpose AI model, given that they allow for flexible generation of content (such as in the form of text, audio, images or video) that can readily accommodate a wide range of distinctive tasks.”

Recital 60i of the EU AI Act contains potentially expansive obligations for GPAI “models”, such as the obligation to obtain the “authorization of the [individual] rightholder concerned” for AI training:

General-purpose models, in particular large generative models, capable of generating text, images, and other content, present unique innovation opportunities but also challenges to artists, authors, and other creators and the way their creative content is created, distributed, used and consumed. The development and training of such models require access to vast amounts of text, images, videos, and other data. Text and data mining techniques may be used extensively in this context for the retrieval and analysis of such content, which may be protected by copyright and related rights. Any use of copyright protected content requires the authorization of the rightholder concerned unless relevant copyright exceptions and limitations apply.

This Recital does not state clearly whether an exemption under US copyright law, such as a finding of fair use, would be sufficient. One possible interpretation of this ambiguous provision is that only authorization of the rightholder as recognized under EU copyright law will count, particularly if the provision is read alongside Recital 60j, which, in turn, potentially seeks to apply EU copyright law outside of the EU. If this interpretation were to be accepted by courts and regulators in the EU, it could impose heavy burdens on the applicable “provider” and/or “deployer” with respect to their AI-related business activities, whether occurring inside or outside the EU. The text of Recital 60j states (emphasis added):

Providers that place general purpose AI models on the EU market should ensure compliance with the relevant obligations in this Regulation. For this purpose, providers of general purpose AI models should put in place a policy to respect Union law on copyright and related rights, in particular to identify and respect the reservations of rights expressed by rightholders pursuant to Article 4(3) of Directive (EU) 2019/790. Any provider placing a general purpose AI model on the EU market should comply with this obligation, regardless of the jurisdiction in which the copyright-relevant acts underpinning the training of these general purpose AI models take place. This is necessary to ensure a level playing field among providers of general purpose AI models where no provider should be able to gain a competitive advantage in the EU market by applying lower copyright standards than those provided in the Union.

In addition, Recital 60k requires that

providers of such models draw up and make publicly available a sufficiently detailed summary of the content used for training the general purpose model. While taking into due account the need to protect trade secrets and confidential business information, this summary should be generally comprehensive in its scope instead of technically detailed to facilitate parties with legitimate interests, including copyright holders, to exercise and enforce their rights under Union law, for example by listing the main data collections or sets that went into training the model, such as large private or public databases or data archives, and by providing a narrative explanation about other data sources used.

US copyright law does not currently require a developer or a deployer of AI systems to disclose its sources of training data, and ongoing litigation is examining whether US copyright law treats the use of copyrighted works for training purposes as a “fair use,” such that permission from the copyright owner is not required. Any such decision will be based on the facts before the particular US court in that case and may or may not provide guidance on the use of other copyrightable materials for training purposes in other situations.


As noted, the AI Act does not clearly define what a GPAI “model” is. For instance, a chatbot used by a business could potentially qualify as such a model. In any event, drafting and compiling this information may require significant financial and human resources—including for businesses that rely on an underlying AI provider for their own services.

The AI Act does not clearly state that such a business can rely on the technical description provided by the underlying AI provider. The new EU AI Office may provide templates and further guidance, but the Office may not become operational any time soon, which could create risks for businesses in the meantime. The legal effect of the AI Act’s Recitals (as compared to its Articles) may also be subject to further debate.

Further, the AI Act potentially requires compliance with EU copyright law even for AI training conducted abroad, where EU copyright law would not otherwise apply. If this interpretation were accepted, it would suggest that the EU is seeking to export its IP laws worldwide with respect to the AI sector.

The legislature’s aim in Recital 60j appears to be to protect EU providers by giving them an advantage over non-EU providers. The implications of the AI Act’s breach of these principles are potentially enormous and could have a major impact in favor of domestic providers of GPAI models, e.g., in the public procurement of certain AI services.

Furthermore, the EU Commission and the new AI Office can—and likely will—adopt delegated acts (such as ordinances) and amend the annexes to the AI Act in light of technological innovations, among other things. Until the EU bodies take these measures, however, businesses using AI may remain in limbo about their obligations with respect to IP rights.


Businesses are well advised to be mindful of these potential challenges and uncertainties. We expect further guidance from the courts and regulators, but guidance in the United States and in the European Union may not align. The fines under the EU AI Act are set to be potentially even higher than those under the GDPR, and violations could also lead to private litigation in addition to EU regulatory action.


If you have any questions or would like more information on the issues discussed in this LawFlash, please contact any of the following:

Boston/Washington, DC