LIST PUTS AI TO THE TEST—IN LUXEMBOURGISH

Published on 29/04/2025

The Luxembourg Institute of Science and Technology (LIST) has launched a new linguistic leaderboard to assess how effectively large language models (LLMs) understand and generate content in Luxembourgish. The development is part of the LLMs4EU project, a €40 million pan-European initiative coordinated by the Alliance for Language Technologies (ALT-EDIC).

The leaderboard is a new feature of the LIST AI Sandbox, a platform designed to test, benchmark, and improve AI technologies in a controlled, transparent environment. This module is part of the work assigned to the institute within the Luxembourg AI Factory, where LIST is helping develop its AI sandboxes into a national, multi-purpose testing platform that supports the validation of trustworthy AI across sectors and use cases.

LLMs’ capabilities remain skewed toward high-resource languages like English. This puts several European languages—including Luxembourgish—at risk of being digitally sidelined.

“Artificial intelligence is not neutral. The way we train and test AI determines who benefits from it,” said Jordi Cabot, Head of the Software Engineering RDI Unit at LIST. “With this leaderboard, we’re making sure Luxembourgish is actively part of the next generation of language technologies.”

Official language tests used to measure AI model performance

Unlike most benchmarks designed around English, LIST’s tool evaluates models using real language exams developed by the Institut National des Langues Luxembourg (INLL), which have been strictly anonymized. These standardised tests span the Common European Framework of Reference for Languages levels from A1 to B2 and offer a rigorous, real-world assessment framework.

“We tested 54 language models in total,” said Cédric Lothritz, Research & Technology Associate at LIST. “Many could handle basic levels like A1 or A2, but performance dropped sharply at B1 and B2. Only a handful of large models passed at the higher levels—models that many European SMEs can’t embed due to cost or infrastructure constraints.”

Common mistakes reveal blind spots in current AI systems

The results revealed recurring issues. Many models made the same kinds of mistakes: misunderstanding context, failing grammar rules, and even making simple logical errors. These patterns underline how current AI systems still struggle with languages that haven't been central to model training.

“We’re proud that our exams are being used in such an innovative way,” said Luc Schmitz, Deputy Director of INLL. “It’s a powerful example of how education and research can work together to expand the reach of AI in smaller language communities.”

Luxembourg’s contribution to a more inclusive AI future

Luxembourg’s contribution to the LLMs4EU initiative also includes the University of Luxembourg and the Zenter fir d'Lëtzebuerger Sprooch (ZLS), who are working together to ensure that AI development reflects the country’s cultural and linguistic identity.

“For Luxembourgish to remain a living, evolving language, it must be part of the conversations shaping our digital future—including generative AI,” said Pierre Reding, Commissioner for the Luxembourgish language. “LIST’s work shows that technology can serve linguistic diversity, not erase it.”

Alexandre Ecker, Director of ZLS, shares this assessment, adding that “benchmarks evaluating how well AI models handle Luxembourgish are an important step toward improving the performance of future AI technologies with regard to the language.”

With partnerships in 20 countries, LLMs4EU embodies a collaborative approach to digital inclusion. LIST’s contribution reinforces Luxembourg’s role in shaping AI that is more responsible, more equitable—and more European.

Find out more about the linguistic leaderboard, included in the LIST AI Sandbox.

Contact

Jordi CABOT SAGRERA
Cédric LOTHRITZ
Alessio BUSCEMI