LLMs · Medium impact · For Dev · arXiv LLMs · June 10, 2026

Redesign Mixture-of-Experts Routers with Manifold Power Iteration

This paper proposes a new design for Mixture-of-Experts (MoE) routers using Manifold Power Iteration to align router rows with principal singular directions of expert matrices, improving MoE model effectiveness.
Signal strength: 3.4/5 · arXiv LLMs

What happened

Researchers introduced Manifold Power Iteration (MPI), a router redesign for MoE models that aligns router rows with the principal singular directions of the expert matrices. The method enforces a norm constraint for stability and efficiency during training. In experiments on MoE models from 1B to 11B parameters, the authors report improved performance, which they attribute to this alignment.
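To make the idea of aligning a router row with a principal singular direction concrete, here is a minimal sketch of classical power iteration with a unit-norm constraint. It is not the paper's algorithm: the summary does not give the MPI update rule, so the function names (`power_step`, `align_router_row`), the choice of the right singular vector, the dot-product affinity, and the toy matrix are all assumptions made for illustration. It uses only the Python standard library.

```python
import math


def matvec(W, v):
    """Return W @ v for a matrix stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rmatvec(W, u):
    """Return W.T @ u without building the transpose."""
    return [sum(row[j] * u_i for row, u_i in zip(W, u))
            for j in range(len(W[0]))]


def normalize(v):
    """Rescale a nonzero vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def power_step(W, r):
    """One power-iteration step on W.T @ W, then renormalize.

    The renormalization plays the role of the norm constraint here
    (an assumption; the paper's exact constraint is not in the summary).
    """
    return normalize(rmatvec(W, matvec(W, r)))


def align_router_row(W, r, steps=50):
    """Iterate a router row toward the top right singular vector of W."""
    r = normalize(r)
    for _ in range(steps):
        r = power_step(W, r)
    return r


def affinity(r, x):
    """Token-to-expert affinity as a plain dot product (assumed form)."""
    return sum(a * b for a, b in zip(r, x))


if __name__ == "__main__":
    # Hypothetical 2x3 "expert" matrix whose top right singular
    # vector is the first coordinate axis (singular values 2 and 1).
    W = [[2.0, 0.0, 0.0],
         [0.0, 1.0, 0.0]]
    row = align_router_row(W, [1.0, 1.0, 1.0])
    print("aligned router row:", row)
    print("affinity of token [1, 0, 0]:", affinity(row, [1.0, 0.0, 0.0]))
    print("affinity of token [0, 0, 1]:", affinity(row, [0.0, 0.0, 1.0]))
```

In this toy setup the row converges toward the first coordinate axis, so tokens pointing along the expert's dominant input direction receive the highest affinity. A real router would hold one such row per expert and update it alongside training instead of running a fixed number of steps.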

Why it matters

The router computes token-to-expert affinity, so its design determines which experts each token reaches. With both theoretical and empirical support for the redesign, the approach could make routing more efficient and use model capacity better, both of which are critical for scaling large models.
