Lapa LLM, the most effective Ukrainian large language model, developed specifically for deep reasoning and aligned with national values, was presented on the eve of the Day of Ukrainian Language and Writing. The model is named after Valentyn Lapa, who, together with Oleksii Ivakhnenko, created the Group Method of Data Handling (GMDH), a precursor of deep learning.
The project is the result of many months of work by a team of Ukrainian artificial intelligence researchers from 91Ƶ, the Ukrainian Catholic University, AGH University of Krakow, and Ihor Sikorsky Kyiv Polytechnic Institute, who joined forces to create the best model for Ukrainian language processing.
The world is entering the era of artificial intelligence, in which those who master language models wield knowledge, information, and influence. That is why it is critically important for Ukraine to have its own national LLM – and Lapa LLM is a clear demonstration that Ukrainian science is ready to create the future.
No global model understands Ukrainian as well as we do. Lapa LLM is trained on Ukrainian texts and preserves linguistic nuances, context, history, and phraseology – the foundation of our linguistic and technological independence.
Developing an LLM in Ukraine means building our own school of artificial intelligence. Teams from leading universities, including Lviv Polytechnic, are forming a community of researchers, engineers, and innovators that strengthens the scientific potential of the state.
Lapa LLM will serve as the foundation for Ukrainian state services, educational and scientific platforms, media systems, and security technologies – areas where the use of foreign models is limited or undesirable.
Having its own LLM places Ukraine among the states with a full cycle of AI technology creation, strengthening our digital sovereignty and our position on the world stage.
According to its developers, Lapa LLM stands out because its tokenizer was extensively redesigned for the Ukrainian language: 80,000 of its 250,000 tokens were replaced, significantly improving the processing of Ukrainian text. As a result, the model requires roughly one and a half times fewer tokens for the same tasks, reducing computational load. In processing speed on Ukrainian text, it outperforms the original Gemma model as well as most other closed models in its class.
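The token-efficiency claim is straightforward to verify in practice: count the tokens each tokenizer produces for the same Ukrainian text. Below is a minimal sketch using the Hugging Face transformers library; the repository IDs are illustrative placeholders, since the article does not name the published checkpoints, and should be replaced with the actual model identifiers.

```python
# A minimal sketch comparing how many tokens two tokenizers need for the
# same Ukrainian sentence. The repo IDs are hypothetical placeholders and
# must be replaced with the actual published model identifiers.
from transformers import AutoTokenizer

TEXT = (
    "Лапа LLM навчена на українських текстах і зберігає мовні нюанси, "
    "контекст, історію та фразеологію."
)

for repo_id in ("google/gemma-3-12b-it", "lapa-llm/lapa"):  # placeholders
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    token_ids = tokenizer.encode(TEXT, add_special_tokens=False)
    print(f"{repo_id}: {len(token_ids)} tokens")
```

If the developers' figure holds, the Lapa tokenizer should emit roughly two-thirds as many tokens as the original Gemma tokenizer on typical Ukrainian text, which translates directly into shorter sequences and lower compute per request.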