Apple’s latest hardware is doing something pretty unexpected on the AI side, though it comes with a clear catch. The iPhone 17 Pro has been shown running a 400-billion-parameter language model locally, which sounds almost unreal for a phone.

The demo comes from an open-source project called Flash-MoE, shared by developer @anemll. Models of this size usually need well over 200GB of memory just to load, so running one on a device with 12GB of RAM shouldn’t be possible in the usual sense.
What’s happening here is a bit different. Instead of loading the whole model into memory, the system streams pieces in from storage as they’re needed. It also relies on a Mixture of Experts (MoE) architecture, where only a small fraction of the model’s parameters is active for any given token. That combination is what makes it run at all.
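To make the trick concrete, here’s a minimal Python/NumPy sketch of the general technique. This is illustrative only, not Flash-MoE’s actual code, and the sizes are toy values: each expert’s weights sit in their own file standing in for flash storage, a router scores every expert, and only the top-k winners are memory-mapped in for the matmul.

```python
import numpy as np, tempfile, os

rng = np.random.default_rng(0)
NUM_EXPERTS, TOP_K, D_MODEL = 8, 2, 64   # toy sizes for the demo

# Write each expert's weights to its own file, as if they lived in flash storage.
tmp = tempfile.mkdtemp()
expert_paths = []
for i in range(NUM_EXPERTS):
    path = os.path.join(tmp, f"expert_{i}.npy")
    np.save(path, rng.standard_normal((D_MODEL, D_MODEL), dtype=np.float32))
    expert_paths.append(path)

def moe_layer(x, router_w):
    # The router scores every expert, but only the top-k are ever read from disk.
    logits = x @ router_w
    top = np.argsort(logits)[-TOP_K:]
    gate = np.exp(logits[top] - logits[top].max())
    gate /= gate.sum()                       # softmax over the selected experts

    out = np.zeros_like(x)
    for g, idx in zip(gate, top):
        # mmap_mode="r" maps the file lazily: only the pages the matmul
        # actually touches get paged in from storage.
        w = np.load(expert_paths[idx], mmap_mode="r")
        out += g * (x @ w)
    return out

x = rng.standard_normal(D_MODEL, dtype=np.float32)
router_w = rng.standard_normal((D_MODEL, NUM_EXPERTS), dtype=np.float32)
print(moe_layer(x, router_w).shape)   # (64,) — computed with 2 of 8 experts
```

With top-2 routing over dozens of experts per layer, only a few percent of the expert weights are ever touched for a given token, which is why a model whose full weights dwarf the phone’s RAM can still make forward progress, just very slowly.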
The problem is speed. Or rather, the lack of it. The model generates at about 0.6 tokens per second, and since an English word usually spans more than one token, that works out to a couple of seconds per word. It’s slow enough that even simple prompts start to feel like a test of patience. Battery drain is another likely issue here, though that’s expected with this kind of workload.
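For a sense of what 0.6 tokens per second means in practice, here’s the back-of-the-envelope arithmetic (the ~1.3 tokens-per-word ratio is a rough average for English text, not a figure from the demo):

```python
tokens_per_second = 0.6          # reported generation speed
tokens_per_word = 1.3            # rough English average (assumption)

seconds_per_word = tokens_per_word / tokens_per_second
print(f"~{seconds_per_word:.1f} s per word")                  # ~2.2 s

reply_words = 100
minutes = reply_words * tokens_per_word / tokens_per_second / 60
print(f"~{minutes:.1f} min for a {reply_words}-word reply")   # ~3.6 min
```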
Still, it’s interesting to see. Not because it’s usable right now, but because it shows where things might be heading. Not long ago, running something this large entirely on-device, without relying on the cloud, wasn’t even part of the conversation.
For now, though, there’s a clear gap between what’s possible and what actually makes sense to use. Smaller models are still the practical choice. But experiments like this do give a glimpse of what future phones might eventually handle more comfortably.
(Source: @anemll on X)