Sovereign AI: why almost no country wants "its own ChatGPT" anymore (and almost all choose open source)

🔬 In-depth analysis · Researched from 8 sources · ~6 min read · our take · July 2, 2026 · 14:23

Yesterday Portugal launched Amália, its first large model for European Portuguese, fully open and built on top of EuroLLM. The fashionable question (should every country have a national LLM?) is the wrong one. The right one isn't "yes or no," but "which layer of the stack do you want to own, and what for?" And there, open source has stopped being an ideological preference and become the only rational strategy for almost everyone not named the United States or China.

On July 1, 2026, a consortium of Portuguese universities (NOVA de Lisboa, Instituto Superior Técnico, Coimbra, Porto and Minho, with the Foundation for Science and Technology) unveiled Amália, the first large language model designed specifically for European Portuguese. It is open top to bottom (model, data and code), multimodal, and built on Europe's EuroLLM-9B. It aspires to be not a consumer chatbot but a platform on which public bodies, firms and research groups can build their own applications. The name honors Amália Rodrigues, the voice of fado. With €7 million committed through 2027, it is not a pharaonic bet: it is, deliberately, a modest and well-aimed one.

OUR THESIS: the question that resonated with the audience, "should every country have its own LLM?", is badly framed, which is why it invites cartoonish answers (national pride versus wasteful spending). The useful question is different: which layer of the AI stack is worth being sovereign over, and at what cost? Sovereignty is not having "your ChatGPT." It is controlling what actually matters: training data, model weights, the infrastructure that runs it, and the rules by which it answers. Broken down that way, the answer for nearly every mid-sized country converges: you don't need to train a frontier model from scratch; you need to own an open one and adapt it to your language, your values and your legal framework. That is why Amália, EuroLLM, Switzerland's Apertus and, after a painful start, Spain's ALIA all point the same way: open source.

The contrast between Amália and ALIA is the season's best lesson. In January 2025 Spain unveiled the ALIA family, headlined by ALIA-40B, trained FROM SCRATCH on 9.2 trillion tokens with support for 35 languages, including the co-official ones, under an Apache 2.0 license and coordinated by the Barcelona Supercomputing Center on MareNostrum 5. The ambition was admirable; the execution, according to the specialist press, was erratic. Xataka and Genbeta documented that the model was presented before finishing its initial training, that in tests it performed at the level of 2023 open models (comparable to Llama-2-34B), and that, after €10.2 million of public spending, it logged barely 174 downloads a month in early 2026. Its own creators later clarified that "the goal is not to compete with ChatGPT," and months on, with training completed, ALIA claimed to be the best model in Basque and second in Catalan and Galician. That may well be true. But the sequence (political announcement first, finished model later) is exactly the mistake to avoid.

OUR READING: ALIA and Amália neither fail nor succeed because they are "sovereign," but because of method. Training a frontier model from scratch is hugely expensive, slow, and exposes you to comparisons you will almost always lose against Anthropic, OpenAI or the big Chinese open models. Building on a shared open base (as Amália does on EuroLLM and Switzerland does with its own pipeline) is cheaper, faster, and concentrates effort where a state does have comparative advantage: its language, its corpora, its regulation. Switzerland has taken this to the limit with Apertus (EPFL, ETH and CSCS), released in September 2025 under Apache 2.0 in 8-billion and 70-billion-parameter versions, trained across more than 1,000 languages and, crucially, with EVERYTHING open: weights, training recipes and even the scripts to reconstruct the data, honoring rights-holders' opt-outs. That is Europe's real moat: not raw power, but transparency and compliance built in by design.

The global map confirms open source won the debate decisively. The Bangkok Declaration, signed by more than 100 countries in February 2026, enshrines AI sovereignty as a shared goal; McKinsey projects a $600 billion sovereign-AI market by 2030. Within it, very different models coexist: the UAE with Falcon (TII), which boasts of topping the open Arabic LLM rankings; Saudi Arabia with HUMAIN and its ALLaM, backed by tens of thousands of GPUs; India with BharatGen, anchored at IIT Bombay and already covering 22 Indian languages; Singapore with SEA-LION, Japan with LLM-jp, Ukraine with its national model in beta and Malaysia with ILMU. Almost all open. Against them, two instructive exceptions: China, which has turned open-weight into a geopolitical weapon (DeepSeek, Qwen, GLM, Kimi), and France, which is betting on a PRIVATE champion, Mistral. The overwhelming majority, though, has grasped that open weights are the great equalizer: sovereignty without having to win the capital race only a handful of giants can run.

Now the short term, unsweetened. Three real risks. The first is "sovereign washing": wrapping in a flag a project that still depends on Nvidia GPUs, borrowed architectures and uncontrolled scraped data; server geography is not sovereignty if the rest of the stack is on loan. The second is vanity waste: the ALIA case shows what happens when the calendar is set by a government event rather than the actual state of training, and when success is measured in headlines rather than adoption (174 downloads a month is a damning number). The third is fragmentation: a hundred mediocre, disconnected national models serve citizens worse than a few well-maintained, shared open ones, which is why the collaborative European route (EuroLLM feeding Amália) has a better future than a hundred isolated efforts.

And the long term, where we place our (qualified) optimism. An open model in your language is no nationalist toy: it is digital public infrastructure, of the same order as roads or the power grid. It preserves minoritized languages that English-first models treat as statistical noise (that ALIA performs better in Basque, or Apertus understands Romansh, is not trivia: it is culture surviving algorithmic homogenization). It gives a hospital, a court or a school the ability to deploy AI without sending sensitive data to a foreign server or depending on a price someone else sets. And it fits the underlying direction we defend: the open frontier is rising fast and democratizing access, and that democratization is precisely the path to abundance, toward a world where eradicating disease, extending life and freeing human time are not the privilege of whoever controls a closed model, but a capacity installed in every country.

IMPLICATIONS. For governments: stop asking "do we have our LLM?" and start asking "which layer do we own, and for which use case?" Build on shared open bases, invest in data and evaluation in your own language, and measure success in adoption and public services deployed, not in launch events. For Europe: its edge won't be raw power (that race is led by the US and China) but trust: auditable models, compliance by design, and real sovereignty over data and weights. And for the citizen: the underlying good news is that AI sovereignty, properly understood, is not about flags but about the technology already changing the world speaking your language, respecting your rules, and being impossible for anyone abroad to switch off. Amália, with its fado-singer name and its open code, understands that better than many projects with ten times its budget.

Sources & references