
11:29 AM PDT · May 1, 2025
‘Tis the week for tiny AI models, it seems.
On Thursday, Ai2, the nonprofit AI research institute, released Olmo 2 1B, a 1-billion-parameter model that Ai2 claims beats similarly sized models from Google, Meta, and Alibaba on several benchmarks. Parameters, sometimes referred to as weights, are the internal components of a model that guide its behavior.
Olmo 2 1B is available under a permissive Apache 2.0 license on the AI dev platform Hugging Face. Unlike most models, Olmo 2 1B can be replicated from scratch; Ai2 has provided the code and data sets (Olmo-mix-1124, Dolmino-mix-1124) used to develop it.
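For readers who want to kick the tires, a minimal sketch of loading the model with standard tooling might look like the following. It assumes the Hugging Face repo is named allenai/OLMo-2-0425-1B and that a recent transformers release with Olmo 2 support is installed:

```python
# Minimal sketch: load Olmo 2 1B from Hugging Face and generate text.
# The repo name below is an assumption; check Ai2's Hugging Face page.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-2-0425-1B"  # assumed Hugging Face repo name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Small language models are useful because"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```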
Small models might not be as capable as their behemoth counterparts, but importantly, they don't require beefy hardware to run. That makes them much more accessible for developers and hobbyists contending with the limitations of lower-end and consumer machines.
There's been a raft of small model launches over the past few days, from Microsoft's Phi 4 reasoning family to Qwen's 2.5 Omni 3B. Most of these, Olmo 2 1B included, can easily run on a modern laptop or even a mobile device.
Ai2 says that Olmo 2 1B was trained on a data set of 4 trillion tokens from publicly available, AI-generated, and manually created sources. Tokens are the raw bits of data models ingest and generate; 1 million tokens is equivalent to about 750,000 words.
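To make that tokens-to-words ratio concrete, here's a rough sketch that counts tokens in a short sentence using the (assumed) Olmo 2 tokenizer; exact counts will vary by tokenizer and text:

```python
# Rough illustration of the tokens-to-words ratio described above.
from transformers import AutoTokenizer

# Assumed Hugging Face repo name for Olmo 2 1B's tokenizer.
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-2-0425-1B")

text = "Tokens are the raw bits of data that models ingest and generate."
n_tokens = len(tokenizer(text)["input_ids"])
n_words = len(text.split())
print(f"{n_words} words -> {n_tokens} tokens")
```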
On GSM8K, a benchmark measuring arithmetic reasoning, Olmo 2 1B scores better than Google's Gemma 3 1B, Meta's Llama 3.2 1B, and Alibaba's Qwen 2.5 1.5B. Olmo 2 1B also eclipses the performance of those three models on TruthfulQA, a test for evaluating factual accuracy.
This model was pretrained on 4T tokens of high-quality data, following the same standard pretraining into high-quality annealing of our 7, 13, & 32B models. We upload intermediate checkpoints from every 1000 steps in training.
Access the base model: https://t.co/xofyWJmo85
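Assuming those intermediate checkpoints are published as Hugging Face revisions, as with earlier OLMo releases, pulling one might look like the sketch below; the revision name shown is hypothetical, so check the repo's branch list for the actual naming scheme:

```python
# Hedged sketch: load an intermediate training checkpoint, assuming
# checkpoints are exposed as revisions of the (assumed) base repo.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMo-2-0425-1B",                 # assumed base repo
    revision="stage1-step140000-tokens294B",  # hypothetical checkpoint tag
)
```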
Ai2 warns that Olmo 2 1B carries risks, however. Like all AI models, it can produce “problematic outputs” including harmful and “sensitive” content, the organization says, as well as factually inaccurate statements. For these reasons, Ai2 recommends against deploying Olmo 2 1B in commercial settings.
Kyle Wiggers is TechCrunch’s AI Editor. His writing has appeared in VentureBeat and Digital Trends, as well as a range of gadget blogs including Android Police, Android Authority, Droid-Life, and XDA-Developers. He lives in Manhattan with his partner, a music therapist.