AMALIA is a fully open Large Language Model for European Portuguese.
This repository serves as a central entry point for resources related to the paper:
"AMALIA: A Fully Open Large Language Model for European Portuguese"
Accepted at PROPOR 2026
https://aclanthology.org/2026.propor-1.38/
Despite recent advances in open Large Language Models (LLMs), European Portuguese (pt-PT) remains underrepresented in both training data and evaluation benchmarks. Existing evaluations often rely on machine-translated datasets, which fail to capture important linguistic and cultural nuances of the language.
AMALIA addresses this gap by:
- Prioritizing high-quality pt-PT data during all training stages
- Providing a fully open LLM tailored specifically for European Portuguese
- Introducing new evaluation benchmarks for pt-PT
Experimental results show that AMALIA remains competitive with strong baselines while achieving substantial improvements on pt-PT-specific evaluations, highlighting the importance of targeted training and native benchmarking for underrepresented language variants.
For implementation details, refer to the official organization repositories:
https://github.com/orgs/AMALIA-LLM/repositories
If you use AMALIA in your work, please cite:
@inproceedings{simplicio-etal-2026-amalia,
title = "{AMALIA}: A Fully Open Large Language Model for {E}uropean {P}ortuguese",
author = "Simpl{\'i}cio, Afonso and
Vinagre, Gon{\c{c}}alo and
Ramos, Miguel Moura and
Tavares, Diogo and
Ferreira, Rafael and
Attanasio, Giuseppe and
Alves, Duarte M. and
Calvo, In{\^e}s and
Vieira, In{\^e}s and
Guerra, Rui and
Furtado, James and
Canaverde, Beatriz and
Paulo, Iago and
Ramos, Vasco and
Gl{\'o}ria-Silva, Diogo and
Faria, Miguel and
Treviso, Marcos and
Gomes, Daniel and
Gomes, Pedro and
Semedo, David and
Martins, Andr{\'e} and
Magalh{\~a}es, Jo{\~a}o",
booktitle = "Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1",
month = apr,
year = "2026",
address = "Salvador, Brazil",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2026.propor-1.38/",
pages = "380--391",
ISBN = "979-8-89176-387-6"
}