How modular is a frontier Mixture-of-Experts?
A pre-registered causal test on Command A+ (218B). Of six candidate expert modules, only one survived a change of corpus, metric, and statistical bar.
When I first learned about Mixture-of-Experts models, the image that popped into my mind was the early-2000s game show Beat the Geeks. ICYMI: three contestants faced off against a panel of resident "geeks," each an obsessive specialist in one domain (the Simpsons Geek, the Beatles Geek, the Star Wars Geek, etc.). The geeks would routinely beat the generalist contestants, even when given much harder questions.
With that image in mind, I started to speculate about the possibilities of MoE modularity. A sparse Mixture-of-Experts model like Command A+ has 128 internal experts but activates only about 8 on any given token, so most of the experts sit idle at any moment. If experts naturally specialize (e.g. a math geek, an Arabic geek, a code geek), you'd have a powerful handle on the model's behavior:
- Safety: monitor or remove a capability by acting on the experts that carry it.
- Editing: change one behavior without disturbing the rest.
- Interpretability: explain what a model does by explaining what its parts do.
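
To make the routing picture concrete, here is a minimal sketch of generic top-k gating with an optional "remove this expert" switch. It is an illustration under stated assumptions, not Command A+'s actual router: the `route` function, the `ablated` argument, the random logits, and the choice to softmax over only the selected experts are all hypothetical. Only the numbers 128 and 8 come from the description above.

```python
import math
import random

NUM_EXPERTS = 128  # expert count quoted in the text
TOP_K = 8          # "about 8" in the text; fixed at 8 here for illustration


def route(logits, k=TOP_K, ablated=frozenset()):
    """Return {expert_index: weight} for the k highest-scoring experts.

    Experts listed in `ablated` are excluded before selection, which is one
    simple (assumed) way to model removing an expert from the model.
    """
    candidates = [i for i in range(len(logits)) if i not in ablated]
    chosen = sorted(candidates, key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the selected logits only, so the weights sum to 1.
    peak = max(logits[i] for i in chosen)
    exps = {i: math.exp(logits[i] - peak) for i in chosen}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


# Stand-in router scores for a single token (random, not real model output).
rng = random.Random(0)
token_logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

active = route(token_logits)                          # 8 of 128 experts fire
top_expert = max(active, key=active.get)              # the strongest one
rerouted = route(token_logits, ablated={top_expert})  # same token without it
```

In this toy version, ablating an expert simply hands its slot to the next-best candidate. Whether the rerouted token still gets the same behavior is exactly the kind of question a causal test of modularity has to answer.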

