Update IP, Media & Technology No. 90 & Update Data Protection No. 168
AI Act: Preliminary result of the trilogue negotiations leaked
There has been speculation for almost three years about what the final version of the EU regulation on artificial intelligence ("AI Act") will look like; that is how long it has been since the EU Commission published its initial draft of the regulation (we reported). In December, the EU institutions reached a provisional agreement, but until now the available information was limited to the details announced at the press conference (see here). The content of the provisional agreement has now been leaked, and even though the final version has not yet been published, only minor changes are still to be expected. Compared with earlier drafts, there are nevertheless significant changes from both a compliance and a copyright perspective. Companies that have already started implementation, however, can largely build on their previous efforts.
A. New content
It is striking that the preliminary version now emphasises in many places that the AI Act is intended to ensure the trustworthiness of AI, and that it also highlights issues such as environmental protection, although this is likely to be primarily a political signal. AI systems used exclusively for scientific research and development, by contrast, have been excluded from the scope of application.
In addition, the rules on so-called regulatory sandboxes, in which innovative AI systems can be tested under regulatory supervision and, where possible, under real-world conditions, have been further developed.
It is also striking that the term "user" has been replaced by "deployer", although this does not change the substance. The most important changes in the current draft concern the following points:
1. More AI with unacceptable risk
The list of AI with unacceptable risk, which is prohibited in any case, has been adjusted once again. AI for predictive policing, for example, remains prohibited, but not where there is already concrete evidence that the person concerned was involved in the offence. In addition, AI systems that create or expand facial recognition databases through untargeted scraping of facial images from the internet or from CCTV footage are to be prohibited, as are systems for recognising emotions in the workplace, unless the latter are used for medical or safety reasons (e.g. monitoring for fatigue).
2. Biometric identifications
The most contentious issue concerning additional prohibited AI, however, was the use of AI to collect and analyse biometric data. Here, Parliament had pursued an approach intended to protect privacy and personal data far more strongly than the current draft actually does.
AI for biometric categorisation with the aim of drawing conclusions about the race, political opinions, trade union membership, religious or philosophical beliefs, sex life or sexual orientation of natural persons is now prohibited. However, an exception applies to the labelling and filtering of lawfully acquired biometric datasets and to uses in the area of law enforcement (Art. 5 (1) lit. ba) AI Act).
The ban on real-time remote biometric identification systems in publicly accessible areas for law enforcement purposes has also been narrowed: such systems are permitted, for example, in the targeted search for victims of kidnapping, human trafficking and sexual exploitation as well as for missing persons. Moreover, the subsequent use of such systems (at least 48 hours after recording) is almost unrestricted. Real-time remote biometric identification, however, may only be used to confirm the identity of a specific wanted individual. In addition, the law enforcement authority must first carry out a "fundamental rights impact assessment" in accordance with Art. 29a AI Act and register the AI system (with exceptions for urgent cases). Furthermore, a binding decision adverse to the person concerned may not be made solely on the basis of the output of the real-time remote biometric identification system.
3. General Purpose AI instead of foundation models
The term General Purpose AI (GPAI) was already familiar from the press reports on the agreement. It is now clear that GPAI replaces the foundation models (AI base models) originally envisaged. GPAI is AI that has been trained on a large amount of data, is capable of performing a wide range of different tasks and can be integrated into a variety of downstream systems or applications; examples include ChatGPT and image generators. The EU now pursues a tiered approach for GPAI. The following obligations apply to all providers of GPAI:
- Provision of technical documentation (including on training and testing);
- Information for AI providers who wish to integrate the GPAI into their own systems;
- Compliance with EU copyright law (more on this below);
- Information about the training data used.
These obligations are supplemented for GPAI with systemic risk at Union level. These are in particular models trained with very high computing power, but also other AI systems that the EU Commission has categorised accordingly. In addition to a reporting obligation, the following applies to these systems in particular:
- Carrying out model evaluations;
- Assessing and mitigating systemic risks at Union level;
- Ensuring cybersecurity, including of the physical infrastructure.
4. High-risk AI systems
There have also been further changes with regard to high-risk AI, which entails the most comprehensive obligations for providers in particular. The exemptions from high-risk AI have been further specified. In addition, the Commission is to provide a list of examples of high-risk and non-high-risk AI no later than 18 months after the regulation comes into force, making categorisation easier in future. The exemption criteria can also be expanded or restricted by means of delegated acts.
There are simplifications in that the AI Act now explicitly states that companies can integrate the processes and documentation required for high-risk AI into existing documents or processes under other harmonised regulations. In addition, risk mitigation measures are to be geared even more closely to the risk posed by the respective system, and the risk analysis under Art. 9 AI Act is to consider only those risks to health, safety or fundamental rights that the AI system entails when used in accordance with its intended purpose. The risk analysis must, however, consider the potential impact of such systems not only on "children" but on all persons under the age of 18 as well as on members of other vulnerable groups.
Other requirements relate to informing employees in advance about high-risk AI in the workplace and the accessibility of high-risk AI.
Finally, the exemptions for SMEs with regard to technical documentation have been extended. In future, SMEs will be allowed to provide this in a simplified form. The Commission will publish a (mandatory) template for this.
5. Retention periods
The latest draft now specifies minimum retention periods for many documents, such as the following:
- Documentation for high-risk AI (technical documentation, quality management documentation, documentation of changes authorised by the notified body, other documents from the notified body, declaration of conformity): 10 years after the AI system has been placed on the market;
- Logs automatically generated by high-risk AI: at least 6 months (applies to providers and deployers);
- Documents relating to the verification of subcontractors: at least 5 years;
- Documents held by importers of AI systems (technical documentation, declaration of conformity, etc.): at least 10 years after the AI system has been placed on the market.
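For companies that need to build a retention concept on this basis (see the outlook below), it can help to capture these minimum periods in machine-readable form. The following is a minimal sketch in Python; the categories, field names and trigger events are our own illustration, not terms of the AI Act:

```python
from dataclasses import dataclass

@dataclass
class RetentionRule:
    records: str   # category of documents covered
    minimum: str   # minimum retention period under the draft AI Act
    trigger: str   # event that starts the retention clock

# Illustrative encoding of the minimum periods listed above.
RETENTION_SCHEDULE = [
    RetentionRule("High-risk AI documentation (technical documentation, quality "
                  "management, notified-body documents, declaration of conformity)",
                  "10 years", "placing on the market"),
    RetentionRule("Logs automatically generated by high-risk AI",
                  "6 months", "creation of the log"),
    RetentionRule("Subcontractor verification documents",
                  "5 years", "completion of the verification"),
    RetentionRule("Importer documents (technical documentation, "
                  "declaration of conformity, etc.)",
                  "10 years", "placing on the market"),
]
```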
6. Labelling obligations
There have also been some changes to the labelling obligations for certain AI. The labelling obligation for deepfakes, for example, has been narrowed: it does not apply where their use is authorised for the prosecution and prevention of criminal offences, or where the content forms part of an evidently artistic, creative, satirical, fictional or analogous work or programme. Conversely, there is now a labelling obligation for artificially generated audio, image, video or text content produced by GPAI.
A labelling obligation is also planned for texts generated by AI systems if they are published without human review in order to inform the public about matters of public interest.
Otherwise, the planned labelling obligations for emotion recognition systems and for systems for biometric categorisation as well as for the disclosure of systems intended for interaction with natural persons remain in place.
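How such a label is to be implemented technically is not prescribed by the AI Act. Purely as an illustration, an AI-generated image could carry a machine-readable marker in its file metadata; the following sketch uses the Pillow library, and the key names and values are our own assumptions:

```python
from PIL import Image
from PIL.PngImagePlugin import PngInfo

# Illustrative only: one conceivable way to attach a machine-readable
# "AI-generated" marker to a PNG file. The AI Act does not mandate this
# particular technique; key names and values are assumptions.
image = Image.open("generated.png")
metadata = PngInfo()
metadata.add_text("ai-generated", "true")
metadata.add_text("generator", "example-gpai-model-v1")  # hypothetical model name
image.save("generated_labelled.png", pnginfo=metadata)
```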
B. Consequences for copyright law
The changes in the AI Act also affect the regulations on copyright and related rights, which we present below. Above all: What transparency obligations should now apply to AI models?
1. What became of the proposal for Art. 28b AI Act?
In June 2023, the European Parliament published a compromise proposal on the AI Act which included, among other things, a newly inserted Art. 28b AI Act. This provision was intended in particular to create transparency obligations, under which providers of generative AI models would have to document a sufficiently detailed summary of the use of copyright-protected training data and make it publicly accessible.
Most recently, the American newspaper The New York Times brought legal action against the ChatGPT provider OpenAI and against Microsoft, which uses ChatGPT in its search engine Bing. According to the New York Times, the companies used several million of its articles to train the AI and thereby infringed copyright.
It is therefore all the more surprising that Art. 28b has not been adopted.
Instead, the AI Act now contains recitals 60f to 60ka, which are particularly relevant from a copyright perspective and which in turn take up parts of the Art. 28b proposal.
The structure is as follows:
- Recital 60f: Open source AI models
- Recital 60g: AI models for non-commercial or scientific research purposes
- Recital 60i: Large generative AI models
- Recital 60j: Compliance with the EU Copyright Directive and related rights
- Recital 60k: List of criteria for the development and training of AI models
- Recital 60ka: Monitoring of obligations by the AI Office
In the leaked draft, it remains unclear where these rules will ultimately be anchored in the AI Act. As mere recitals, the transparency obligations could hardly be enforced; their core is therefore likely to be transferred to a "GPAI Chapter" of the AI Act.
2. The regulations in detail
In future, a tiered regulatory concept will apply to the various AI models, which provides for strict regulation, particularly for large generative AI models.
When considering the regulatory concept, the objective of the AI Act pursued by the EU should always be kept in mind:
"The purpose of this Regulation is to improve the functioning of the internal market by laying down a uniform legal framework in particular for the development, placing on the market, putting into service and the use of artificial intelligence systems in the Union in conformity with Union values, to promote the uptake of human centric and trustworthy artificial intelligence while ensuring a high level of protection of health, safety, fundamental rights enshrined in the Charter, including democracy and rule of law and environmental protection, against harmful effects of artificial intelligence systems in the Union and to support innovation.“ (Excerpt from recital 1)
3. Open source AI models and AI models for non-commercial or scientific research purposes
For providers of AI models made available under an open-source licence or for non-commercial or scientific research purposes, exceptions to the transparency obligations will apply in future; the requirements for these models are far less strict.
For example, the newly inserted recital 60g now states:
"Without prejudice to Union Copyright law, compliance with these obligations should take due account of the size of the provider and allow simplified ways of compliance for SMEs including start-ups, that should not represent an excessive cost and not discourage the use of such models.“ (Excerpt from recital 60g)
It is clear that the EU particularly wants to promote the deployment and use of AI models in these areas.
4. Large generative AI models (ChatGPT and co.)
The obligations for large generative AI models are much more extensive. The EU is aware that such models require access to large amounts of text, images, videos and other data.
The AI Act addresses the problems associated with this:
"Text and data mining techniques may be used extensively in this context for the retrieval and analysis of such content, which may be protected by copyright and related rights.“ (Excerpt from recital 60i)
Any use of copyright-protected content generally requires the permission of the rights holder concerned, unless exceptions and limitations under Directive (EU) 2019/790 (Directive on Copyright in the Digital Single Market) apply. Under certain conditions, the Directive permits reproductions and extractions of lawfully accessible works for the purposes of text and data mining.
Nevertheless, rights holders can reserve their rights in order to prevent mining; for content made publicly available online, Art. 4 (3) of the Directive requires this reservation to be expressed in an appropriate manner, such as by machine-readable means. In these cases, the providers of AI models can obtain permission for use from the respective rights holders. OpenAI, the provider of ChatGPT, and a German publisher have already concluded an initial agreement to this effect.
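In practice, one common way to express such a machine-readable reservation is a website's robots.txt file, which crawlers collecting training data can check before mining a page. The following is a minimal sketch using Python's standard library; the crawler name "GPTBot" and the URLs are illustrative examples only:

```python
from urllib import robotparser

# Minimal sketch: check whether a site's robots.txt reserves its content
# against a given crawler before any text and data mining takes place.
rp = robotparser.RobotFileParser()
rp.set_url("https://example.com/robots.txt")
rp.read()  # fetch and parse the robots.txt file

if rp.can_fetch("GPTBot", "https://example.com/article.html"):
    print("No machine-readable reservation found for this crawler.")
else:
    print("Access reserved; this page must not be mined without permission.")
```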
However, the Directive provides an exception for mining for the purposes of scientific research. Here, a reservation by the rights holder is generally excluded.
5. Compliance with the EU Copyright Directive and related rights
To ensure that providers of AI models also comply with the obligations of the AI Act, they must in future put in place measures to guarantee compliance with EU law on copyright and related rights. The focus must be in particular on how rights holders can express their reservations of rights in accordance with Art. 4 (3) of the Directive on Copyright in the Digital Single Market. This applies notwithstanding any contrary rules in the legal systems of the Member States.
"This is necessary to ensure a level playing field among providers of general purpose AI models where no provider should be able to gain a competitive advantage in the EU market by applying lower copyright standards than those provided in the Union.“ (Excerpt from recital 60j)
The obligation to introduce such measures is found, apart from the recitals, in the operative text of the regulation in Art. 52c (1) lit. c AI Act.
6. Catalogue of criteria for the development and training of AI models
Providers who train their AI models with (copyright-protected) data and texts must in future create a sufficiently detailed summary of the content used to train the model and make it publicly accessible. The aim is to utilise the resulting transparency to make it easier for copyright holders in particular to exercise and enforce their rights.
"While taking into due account the need to protect trade secrets and confidential business information, this summary should be generally comprehensive in its scope instead of technically detailed […].“ (Excerpt from recital 60k)
The point is rather to list, for example, the major data collections used to train the model, and to state whether and how large private or public databases or archives were used.
In future, a template will be made available to providers for this purpose.
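Until that template is published, its structure remains open. Purely as an illustration of what a summary along the lines of recital 60k might contain, consider the following sketch; all field names and entries are hypothetical assumptions, not requirements of the AI Act:

```python
# Hypothetical sketch of a "sufficiently detailed summary" of training content.
# Every field name and entry is an assumption for illustration only; the
# binding structure will follow from the Commission's future template.
training_data_summary = {
    "model": "example-gpai-model-v1",
    "data_collections": [
        {"name": "Public web crawl (example)", "type": "public", "period": "2020-2023"},
        {"name": "Licensed news archive (example)", "type": "licensed / private"},
    ],
    "databases_and_archives": ["Example public-domain book archive"],
    "rights_reservations_respected": True,  # e.g. machine-readable opt-outs honoured
}
```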
7. Monitoring of obligations by the AI Office
The AI Office, which has yet to be established, will monitor compliance with the aforementioned obligations. How much autonomy the AI Office will actually have is not yet apparent from the AI Act; a corresponding decision clarifying this issue remains to be awaited.
The AI Office's task will be to monitor whether the providers actually implement the required strategies and measures to comply with copyright law and whether they make a summary of the content used for the training publicly available.
However, it will not be the AI Office's task to review or individually assess the training data for compliance with copyright law. Nor does the regulation prevent the enforcement of copyright provisions.
C. Prospects
Although the current draft still leaves some questions unanswered and the formal approval of the EU member states is still pending, companies can already take action on the basis of the current text. Providers of high-risk AI in particular should now press ahead with the necessary documentation and start developing a retention concept.