
Apple just gave developers access to its new local AI models, here’s how they perform

One of the very first announcements at this year’s WWDC was that, for the first time, third-party developers will get to tap directly into Apple’s on-device AI via the new Foundation Models framework. But how do these models actually compare against what’s already out there?

With the new Foundation Models framework, third-party developers can now build on the same on-device AI stack used by Apple’s native apps.

In other words, developers will now be able to integrate AI features like summarizing documents, pulling key info from user text, or even generating structured content, entirely offline and with zero API cost.
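For a sense of what that looks like in practice, here is a minimal sketch of calling the framework from Swift, based on the Foundation Models API Apple showed at WWDC (a LanguageModelSession and its respond(to:) method); treat the exact names as assumptions drawn from that presentation rather than verified against shipping documentation, and the summarize function as a hypothetical example.

```swift
import FoundationModels

// Minimal sketch: ask the on-device model for a summary.
// Runs entirely on device, with no network request and no API key.
func summarize(_ noteText: String) async throws -> String {
    let session = LanguageModelSession()   // default on-device language model
    let response = try await session.respond(
        to: "Summarize the following note in one sentence:\n\(noteText)"
    )
    return response.content                // plain-text result
}
```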

But how good are Apple’s models, really?

Competitive where it counts

Based on Apple’s own human evaluations, the answer is: pretty solid, especially when you consider the balance (which some might call ‘tradeoff’) between size, speed, and efficiency.

In Apple’s testing, its ~3B parameter on-device model outperformed similar lightweight vision-language models like InternVL-2.5 and Qwen-2.5-VL-3B in image tasks, winning over 46% and 50% of prompts, respectively.

And in text, it held its ground against larger models like Gemma-3-4B, even edging ahead in some international English locales and multilingual evaluations (Portuguese, French, Japanese, etc.).

In other words, Apple’s new local models seem set to deliver consistent results for many real-world uses without resorting to the cloud or requiring data to leave the device.

When it comes to Apple’s server model (which, unlike the local models, won’t be accessible to third-party developers), it compared favorably to LLaMA-4-Scout and even outperformed Qwen-2.5-VL-32B in image understanding. That said, GPT-4o still comfortably leads the pack overall.

The “free and offline” part really matters

The real story here isn’t just that Apple’s new models are better. It’s that they’re built in. With the Foundation Models framework, developers no longer need to bundle heavy language models in their apps for offline processing. That means leaner app sizes and no need to fall back on the cloud for most tasks.

The result? A more private experience for users and no API costs for developers, savings that can ultimately benefit everyone.

Apple says the models are optimized for structured outputs using a Swift-native “guided generation” system, which allows developers to constrain model responses directly into app logic. For apps in education, productivity, and communication, this could be a game-changer, offering the benefits of LLMs without the latency, cost, or privacy tradeoffs.
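To make the “guided generation” idea concrete, here is a minimal sketch of constraining the model’s output to a Swift type, using the @Generable and @Guide macros and the respond(to:generating:) call Apple demonstrated at WWDC. The types here (Flashcard, FlashcardSet, makeFlashcards) are hypothetical app code, and the API details reflect that presentation rather than independently verified documentation.

```swift
import FoundationModels

// Hypothetical app types: the model's output is constrained to this shape,
// so the app receives typed Swift values instead of free-form text to parse.
@Generable
struct Flashcard {
    @Guide(description: "A short question about the source text")
    var question: String
    @Guide(description: "A concise answer to the question")
    var answer: String
}

@Generable
struct FlashcardSet {
    var cards: [Flashcard]
}

func makeFlashcards(from lessonText: String) async throws -> FlashcardSet {
    let session = LanguageModelSession()
    let response = try await session.respond(
        to: "Create study flashcards from this lesson:\n\(lessonText)",
        generating: FlashcardSet.self   // guided generation: typed, structured output
    )
    return response.content
}
```

Because the response arrives as a value of the declared type, an app can feed it straight into its UI or data model, which is what “constraining model responses directly into app logic” amounts to in practice.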

Ultimately, Apple’s models aren’t the most powerful in the world, but they don’t need to be. They’re good, they’re fast, and now they’re available to every developer for free, on-device, and offline.

That might not grab the same headlines as more powerful models do, but in practice, it could lead to a wave of genuinely useful AI features in third-party iOS apps that don’t require the cloud. And for Apple, that may very well be the point.




Author

Marcus Mendes

Marcus Mendes is a Brazilian tech podcaster and journalist who has been closely following Apple since the mid-2000s.

He began covering Apple news in Brazilian media in 2012 and later broadened his focus to the wider tech industry, hosting a daily podcast for seven years.