The EU law-making bodies initially focused on regulating AI systems developed for a more or less specific purpose (e.g. autonomous driving). Once AI tools such as ChatGPT reached the general public, attention turned to "AI models with a general purpose".
General-purpose AI models – also referred to as foundation models – are AI models (not to be confused with AI systems, see recital 97) that can handle a wide range of tasks rather than being optimised for a specific task or application. These models are often able to process large amounts of unstructured data such as text, images, audio and video, and to perform tasks such as classification, generation and prediction. Due to their flexibility and adaptability, these models are used in a variety of cases and across different industries. They also often form the basis for fine-tuning specific AI systems. Examples of GPAI models include the large language models (LLMs) based on transformer architectures from OpenAI ("GPT-3.5/GPT-4"), Meta ("Llama") or Mistral ("Mixtral"). However, GPAI models are not limited to language models; other models, for example for classification, can also fall under this definition.
In order to address possible risks which are specific to GPAI models (such as undesirable outputs, or copyright and data protection violations during development), providers are subject to documentation and information obligations (see Art. 53 AIA).
A clear distinction needs to be made between the terms "AI systems" and "AI models". Not all AI models fall within the scope of the AI Act; only GPAI models do. Although GPAI models can be part of an AI system, they do not form an AI system in isolation. For a GPAI model to become an AI system, additional components – such as a user interface – must be added. In this case we speak of a general-purpose AI system (or GPAI system) in accordance with Art. 3 para. 66 AIA.
GPAI and generative AI are similar concepts, but they are not exactly the same. GPAI models are designed for a wide range of applications, whereas generative AI systems focus on generating text, images, videos, music and other content (GPAI systems such as ChatGPT for generating text, or Midjourney and Dall-E 2 for generating images). Generative AI is therefore a specific sub-area of GPAI models.
In short, GPAI models are general concepts for a wide range of applications, while generative AI systems refer specifically to the ability to generate data or content.
GPAI models with systemic risks have a special status. This is the case when the model has "high-impact capabilities". By "systemic risk", the Union legislator refers to a risk that is specific to the high-impact capabilities of a GPAI model (Art. 3 para. 65 subpara. 1 AIA). This is considered to be the case when the capabilities of the GPAI model in question match or exceed the capabilities recorded in the most advanced GPAI models. Such models have a significant impact on the Union market due to their reach, or due to actual or reasonably foreseeable negative effects on public health, safety, public security, fundamental rights or society as a whole, that can be propagated at scale across the value chain (see Art. 3 para. 65 subpara. 2 AIA).
A systemic risk GPAI model is one that meets one of the following criteria:
The AI Act defines a floating point operation ("FLOP") in Art. 3 (67) as any arithmetic operation with floating point numbers. This includes basic arithmetic operations such as addition and multiplication. A simple calculation such as "42 * 42 + 17.32" would therefore count as two FLOPs (42 * 42 = 1,764; 1,764 + 17.32).
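The counting rule can be made concrete with a small sketch. The instrumented helpers below (`fmul`, `fadd`) are purely illustrative, not part of the AI Act or any standard library; they simply increment a counter for each floating point operation, as in the "42 * 42 + 17.32" example above.

```python
# Illustrative sketch: counting FLOPs in the sense of Art. 3 (67) AIA,
# where each floating point addition or multiplication is one operation.
# fmul/fadd are hypothetical helpers invented for this example.

ops = 0  # running FLOP counter

def fmul(a: float, b: float) -> float:
    global ops
    ops += 1  # one multiplication = one FLOP
    return a * b

def fadd(a: float, b: float) -> float:
    global ops
    ops += 1  # one addition = one FLOP
    return a + b

# 42 * 42 + 17.32  ->  one multiplication plus one addition = 2 FLOPs
result = fadd(fmul(42.0, 42.0), 17.32)
print(result, ops)
```

Real training runs count FLOPs the same way, just across trillions of such operations per parameter update.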
The AI Act uses the number of floating point operations that were necessary in the training phase as a proxy for the power of a model. The AIA therefore assumes that the currently defined threshold of 10^25 FLOPs (written out: 10,000,000,000,000,000,000,000,000 arithmetic operations) results in a highly capable model. Current open source models already exceed this limit: the Llama 3 405B model published by Meta as open source in July 2024 used 3.8 x 10^25 FLOPs of training compute.
To put this number into perspective: a current smartphone chip manages on the order of several 10^12 FLOPs per second ("teraflop/s"), a current home graphics card more than forty times that. Current data centre GPUs already achieve almost 2,000 teraflop/s for simple computing operations. A current smartphone would have to compute continuously for roughly 100,000 years to reach the limit of 10^25 operations.
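The orders of magnitude above can be checked with a short back-of-envelope calculation. The per-device speeds are the assumed figures from the text (several teraflop/s for a smartphone, forty times that for a home GPU, ~2,000 teraflop/s for a data centre GPU), not benchmark results.

```python
# Back-of-envelope check: how long would different hardware need to
# perform 10**25 FLOPs, the AI Act's systemic-risk presumption threshold?
# Device speeds are rough assumptions taken from the text above.

THRESHOLD_FLOPS = 10**25
SECONDS_PER_YEAR = 365 * 24 * 3600

devices = {
    "smartphone chip (~3 teraflop/s, assumed)": 3e12,
    "home graphics card (~40x smartphone, assumed)": 120e12,
    "data centre GPU (~2,000 teraflop/s, assumed)": 2000e12,
}

for name, flops_per_s in devices.items():
    years = THRESHOLD_FLOPS / flops_per_s / SECONDS_PER_YEAR
    print(f"{name}: about {years:,.0f} years of continuous computing")

years_smartphone = THRESHOLD_FLOPS / devices[
    "smartphone chip (~3 teraflop/s, assumed)"] / SECONDS_PER_YEAR
```

At an assumed ~3 teraflop/s, the smartphone figure lands in the region of 100,000 years, consistent with the estimate in the text.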
The ratios are summarised in the graphic below.
In order to assess whether a GPAI model poses a systemic risk, the following parameters in accordance with Annex XIII must be taken into account:
Due to this risk potential, the providers of GPAI models with systemic risks are subject to obligations that go beyond Art. 53 AIA. In particular, providers must take measures to identify, assess and mitigate systemic risks (see Art. 55 AIA).
European Parliament: General-purpose artificial intelligence (EN)