Key facts
- Google released the Gemma 4 12B large language model.
- The model is designed to run on consumer laptops with 16GB of system RAM or VRAM.
- Gemma 4 12B is more capable than previous mobile-optimized Gemma models.
- The new model has about half the memory footprint of the Gemma 4 26B MoE model.
- Google claims Gemma 4 12B is nearly as capable as the 26B model based on benchmarks.
- Gemma 4 12B is the first mid-sized model from Google to support native audio input.
Google has introduced Gemma 4 12B, a large language model designed for efficiency and accessibility. The model aims to address the high memory demands driven by the generative AI boom by making advanced AI capabilities available on standard consumer hardware. It sits between Google's previously released mobile-optimized options (E2B and E4B) and its more powerful 26B Mixture of Experts (MoE) and 31B Dense models. Google states that the 12-billion-parameter model can run on laptops with 16GB of system RAM or VRAM, about half the memory footprint of the 26B MoE model. According to Google's benchmarks, Gemma 4 12B is nearly as capable as its larger counterpart despite the reduced resource requirements. The model is also multimodal: it is the first mid-sized model from Google to support native audio input, and it uses an encoder-free architecture intended to improve performance.
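The memory figures above can be sanity-checked with a rough back-of-envelope calculation. The sketch below assumes that weight memory is simply parameter count times bytes per parameter, and it ignores activations, the KV cache, and runtime overhead. The precisions shown (16, 8, and 4 bits) are illustrative assumptions, since the source does not say which precision Google's 16GB figure refers to, and `weight_memory_gb` is a hypothetical helper rather than part of any Google tooling.

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in GB (10^9 bytes).

    Assumption: every parameter is stored at the same precision and
    nothing else (activations, KV cache, overhead) is counted.
    """
    total_bytes = params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


# Compare the 12B model with the 26B MoE model at a few assumed precisions.
for bits in (16, 8, 4):
    small = weight_memory_gb(12, bits)
    large = weight_memory_gb(26, bits)
    print(f"{bits:>2}-bit weights: 12B ~ {small:.1f} GB, 26B ~ {large:.1f} GB")

# At equal precision, the ratio depends only on the parameter counts.
ratio = weight_memory_gb(12, 16) / weight_memory_gb(26, 16)
print(f"12B / 26B weight-memory ratio: {ratio:.2f}")
```

Under these assumptions the ratio is simply 12/26, which fits the "about half" description. Whether the weights alone fit in 16GB depends on the storage precision, which this estimate treats as an open parameter.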