Apertus: Open & Multilingual LLMs
Script
If a chef serves you a meal but refuses to share the recipe or the ingredients, can you really call it open source? This is the paradox facing modern AI, where current definitions of openness often hide crucial details about data and compliance.
Most models today engage in what the researchers call 'open washing': they release the final weights but keep the training data and code locked away, creating legal uncertainty for anyone who builds on them and leaving non-English speakers underserved.
To fix this, the authors introduce Apertus. Unlike typical models that share only their weights, Apertus opens the entire pipeline, from raw data curated with retroactive privacy and opt-out controls to the full training code, with pretraining data that spans more than one thousand eight hundred languages.
They also rethought the training objective itself with something called the Goldfish Objective. By excluding a pseudorandom subset of tokens from the loss during training, the model still learns general patterns but is mathematically discouraged from memorizing exact sequences, like copyrighted text or personal data.
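To make that concrete, here is a minimal PyTorch sketch of a goldfish-style loss. It is an illustration under simplifying assumptions, not the Apertus implementation: the drop rate `k` and the seeded random mask are stand-ins, whereas the actual method derives the drop mask from a hash of the local token context so the same passage is always dropped at the same positions.

```python
import torch
import torch.nn.functional as F

def goldfish_loss(logits, labels, k=4, seed=0):
    """Goldfish-style loss sketch: drop roughly 1-in-k tokens from the
    next-token loss so exact sequences are never fully reinforced.

    logits: (batch, seq_len, vocab_size), labels: (batch, seq_len)
    """
    batch, seq_len, vocab_size = logits.shape

    # Per-token cross-entropy, kept unreduced so we can mask it.
    token_loss = F.cross_entropy(
        logits.reshape(-1, vocab_size),
        labels.reshape(-1),
        reduction="none",
    ).reshape(batch, seq_len)

    # Drop mask: 1 keeps a token's loss, 0 removes it. The real method
    # hashes the local token context; a seeded RNG stands in for that here.
    gen = torch.Generator().manual_seed(seed)
    keep = torch.rand(batch, seq_len, generator=gen) >= 1.0 / k
    keep = keep.to(token_loss.device, token_loss.dtype)

    # Dropped tokens contribute no gradient, so the model never learns
    # to reproduce those exact continuations verbatim.
    return (token_loss * keep).sum() / keep.sum().clamp(min=1.0)
```

In a training loop this would simply replace the standard cross-entropy call; the rest of the pipeline stays the same.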
This isn't just a small experiment; it scales to production levels. This chart illustrates training efficiency on the Alps supercomputer, showing that these compliant, privacy-preserving methods hold up even when training massive seventy-billion-parameter models across thousands of GPUs.
Apertus proves we don't have to choose between high performance and strict compliance. For more details on this blueprint for transparent AI, find the full paper on Emergent Mind dot com.