Google's Gemma 4 E2B and E4B models are now in AICore Developer Preview - promising 4x faster inference, 60% less battery, and multimodal capabilities for on-device Android apps.
Google has opened a developer preview of Gemma 4, its latest family of open models, through the AICore platform for Android. The two edge-optimized variants - E2B and E4B - deliver up to 4x faster inference and 60% lower battery consumption compared to prior on-device models, while adding multimodal support for text, images, and audio. Android developers can start prototyping today, with code that will be forward-compatible with Gemini Nano 4 when it ships on 2026 flagship devices later this year.
On April 2, 2026, Google published two announcements simultaneously - one on the Android Developers Blog and one on the Google Developers Blog - introducing the Gemma 4 model family and making the E2B and E4B variants available via AICore Developer Preview. This is an early access program, not a production release, but it gives Android developers working hardware today to test against.
The preview runs on AICore-enabled devices - hardware with Google, MediaTek, or Qualcomm AI accelerators - and is accessible through the Android Studio ML Kit Prompt API or a dedicated AICore UI. Developers without qualifying hardware can use the AI Edge Gallery app to explore the models. Per 9to5Google, any code written today for Gemma 4 will automatically work on Gemini Nano 4-enabled devices when those ship - making this preview a direct investment in 2026 production apps, not a throwaway prototype exercise.
Google released two edge variants under Gemma 4. They share the same multimodal architecture but trade off speed against reasoning depth:
| Model | Effective Params | Speed | Best For | Context Window |
|---|---|---|---|---|
| E2B | ~2B | 3x faster than E4B | OCR, handwriting, low-latency tasks | 128K tokens |
| E4B | ~4B | Up to 4x faster than prior models | Reasoning, agentic workflows, planning | 128K tokens |
Both models support text, image, and audio inputs - a meaningful upgrade over text-only edge predecessors. They cover 140+ languages and carry a 128K token context window, which is unusually large for on-device inference. Both are released under the Apache 2.0 license - commercial use is permitted with no royalties or usage fees.
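The tradeoffs in the table above can be sketched as a simple routing heuristic: perception-style, low-latency tasks go to E2B, deeper reasoning goes to E4B. This is an illustration of the "Best For" column, not an official recommendation, and the task categories are hypothetical:

```kotlin
// Hypothetical task categories, drawn from the "Best For" column above.
enum class EdgeTask { OCR, HANDWRITING, VOICE_COMMAND, REASONING, AGENT_PLANNING }

/**
 * Routing sketch: low-latency perception tasks -> E2B (3x faster),
 * reasoning and agentic workflows -> E4B (deeper model).
 */
fun chooseVariant(task: EdgeTask): String = when (task) {
    EdgeTask.OCR, EdgeTask.HANDWRITING, EdgeTask.VOICE_COMMAND -> "E2B"
    EdgeTask.REASONING, EdgeTask.AGENT_PLANNING -> "E4B"
}
```

Because both variants share the same multimodal architecture and context window, a router like this can switch between them without changing prompt format.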
The Gemma 4 family also includes larger non-edge models - a 26B mixture-of-experts (MoE) and a 31B variant - targeting workstations and servers. Google says the 31B currently ranks third among open models on the Arena AI text leaderboard, beating rivals twenty times its size. Those larger models are not part of the AICore preview.
What 'Effective' Parameters Means

The E2B and E4B names refer to effective parameters: roughly, the memory footprint the model occupies at inference time rather than its raw weight count. As with Google's earlier edge Gemma releases, architectural techniques can keep part of the weights out of accelerator memory, so E2B runs with the RAM and latency profile of a ~2B-parameter model even if its total parameter count is higher.
Why On-Device Inference Matters

The core value here is cost and latency. Every call to a cloud LLM API carries a price tag and a round-trip delay. For latency-sensitive features - OCR, voice commands, real-time translation, document summarization - running inference on-device eliminates both.
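To make the cost half of that concrete, here is a back-of-envelope sketch. All numbers are hypothetical placeholders, not any provider's actual pricing:

```kotlin
/**
 * Back-of-envelope monthly cost of a cloud LLM feature.
 * Every input here is a hypothetical placeholder - substitute your
 * provider's real per-token pricing and your app's real traffic.
 */
fun monthlyCloudCostUsd(
    requestsPerDay: Int,
    tokensPerRequest: Int,
    pricePerMillionTokensUsd: Double,
): Double =
    requestsPerDay * 30.0 * tokensPerRequest / 1_000_000 * pricePerMillionTokensUsd

// Example: 50,000 requests/day at 1,000 tokens each, $0.50 per million tokens
// -> 1.5 billion tokens/month -> $750/month.
// The on-device equivalent has zero marginal API cost per request.
```

The point is not the specific dollar figure but that cloud cost scales linearly with usage, while on-device inference under Apache 2.0 carries no per-request fee at all.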
For Android-focused agencies and small dev teams, the clearest opportunities right now are exactly those latency-sensitive features: OCR and handwriting recognition, voice commands, real-time translation, and on-device document summarization.
Techzine.eu also notes that Gemma 4 runs on Raspberry Pi and Jetson hardware, opening the door for IoT and edge computing use cases well beyond mobile - kiosk applications, field service tools, or on-premise document processing.
Preview Limitations to Know Before You Build
Gemma 4 is free and open under Apache 2.0: no usage fees, no enterprise gating, and no rate limits on on-device inference. The AICore Developer Preview is available now on devices with Google, MediaTek, or Qualcomm AI accelerators; as noted above, developers without qualifying hardware can fall back to the AI Edge Gallery app.
Integration runs through the ML Kit Prompt API in Android Studio - standard tooling that most Android developers already use. There is no separate SDK to learn.
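Exact class names and signatures in a developer preview can shift before production, so it is worth putting a thin seam between app code and the SDK. The sketch below is entirely hypothetical naming - the point is the pattern, not the API surface:

```kotlin
/**
 * Thin seam between feature code and the on-device model.
 * All names here are hypothetical: features depend only on this
 * interface, and a single adapter class touches the ML Kit Prompt
 * API, so preview-era signature changes stay contained in one file.
 */
interface PromptSession {
    fun generate(prompt: String): String
}

/** Fake implementation for unit tests and for devices without AICore hardware. */
class EchoPromptSession : PromptSession {
    override fun generate(prompt: String): String = "echo: $prompt"
}

// Feature code targets the interface, never the SDK directly.
fun summarize(session: PromptSession, document: String): String =
    session.generate("Summarize in one sentence: $document")
```

With this shape, swapping in the real AICore-backed session later (or a Gemini Nano 4 session on 2026 hardware) is a one-class change.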
Gemma 4 lands in a crowded edge AI space. Meta's Llama 3.2 (1B and 3B) and Microsoft's Phi-3.5 Mini are the most direct comparisons, but independent side-by-side testing on identical Android hardware does not exist yet. Google's performance claims - 4x speed improvement and 60% battery savings - are self-reported and unverified by third parties as of this writing. Community discussion has been limited since launch, which is expected for a developer preview only two days old.
What sets Gemma 4 apart structurally is its Android ecosystem integration. By tying the preview to AICore - which coordinates directly with Qualcomm and MediaTek NPUs - Google creates a distribution moat. Developers writing against AICore today get automatic compatibility with every future Gemini Nano 4 flagship device. That forward-compatibility argument is stronger than the raw benchmark numbers right now.
The prior Gemma series accumulated 400M+ downloads and spawned 100,000+ community variants. If Gemma 4 follows that trajectory, the fine-tuning and tooling ecosystem will matter as much as the base model performance. Google has not confirmed a production timeline for Gemini Nano 4 beyond "later in 2026."
Building on AICore creates a meaningful Android dependency. If your target users are split between Android and iOS, the on-device story is incomplete. Google supports iOS and other platforms at the framework level, but AICore itself is Android-specific. Teams building cross-platform apps should treat Gemma 4 on AICore as an Android enhancement, not a full replacement for cloud inference in mixed-platform products.
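For teams in that mixed-platform position, the pragmatic shape is a backend router: prefer on-device Gemma 4 via AICore where the hardware qualifies, and fall back to cloud inference everywhere else. A minimal sketch, with hypothetical names and a capability check you would wire to your own device detection:

```kotlin
/** Where a given inference request should run. */
enum class Backend { ON_DEVICE_AICORE, CLOUD }

/**
 * Hypothetical router: on-device only when the app is on Android
 * AND the device has a qualifying AI accelerator; cloud otherwise
 * (iOS, older Android hardware, web). How you detect the accelerator
 * is up to your AICore integration - this function just encodes the
 * decision from the paragraph above.
 */
fun pickBackend(isAndroid: Boolean, hasAiCoreAccelerator: Boolean): Backend =
    if (isAndroid && hasAiCoreAccelerator) Backend.ON_DEVICE_AICORE
    else Backend.CLOUD
```

This keeps Gemma 4 as an Android-side cost and latency optimization while the cloud path remains the product's baseline, which matches the "enhancement, not replacement" framing above.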