Models·3 min read·The Decoder

MiniMax M3 Lands as an Open-Weight, Million-Token Coding Model That Claims to Edge Out GPT-5.5

The Chinese lab says its new open-weight model pairs frontier coding, a 1M-token context window and native multimodality on a sparse-attention architecture that cuts long-context compute 20x — but the weights and technical report are still days away, so every number is company-reported.

[Infographic: MiniMax M3 (June 1), open-weight, native multimodal, MSA attention; about 1/20 the compute at 1M context, 9× faster prefill, 15× faster decode. MiniMax-reported benchmark scorecard: SWE-Bench Pro 59%, Terminal-Bench 2.1 66%, SWE-fficiency 34.8%, BrowseComp 83.5 (vs. 79.3 for Opus 4.7). Weights and technical report promised within 10 days; scores not yet independently verified. Source: The Decoder, MiniMax.]

Chinese AI lab MiniMax unveiled M3 on June 1, 2026, calling it the first open-weight model to combine top-tier coding performance, a one-million-token context window and native multimodality in a single system — a bundle of capabilities the company says had until now been the exclusive domain of proprietary frontier models such as Anthropic's Claude Opus 4.7, OpenAI's GPT-5.5 and Google's Gemini 3.1 Pro.

The headline engineering claim is a new attention mechanism called MiniMax Sparse Attention, or MSA. Standard attention compares every token against every other token, a quadratic cost that makes long contexts expensive. MSA instead first narrows the context to the key-value blocks relevant to each query, then works through those blocks one at a time, batching all the queries that need a given block so that the block is fetched in a single contiguous memory read. MiniMax says the result is roughly one-twentieth the per-token compute at a million-token context compared with its previous generation, more than 9x faster prefill, more than 15x faster decoding, and an implementation that runs over four times faster than competing open-source alternatives, which the company does not name.
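
To make the description above concrete, here is a minimal NumPy sketch of generic block-sparse attention. It is not MiniMax's implementation, which had not been published at launch. The function name, the mean-key block scoring, and the `block_size` and `top_k` parameters are all assumptions chosen for illustration, and the sketch covers a single head with no causal mask.

```python
import numpy as np

def block_sparse_attention(q, k, v, block_size=4, top_k=2):
    """Illustrative block-sparse attention (hypothetical, not MSA itself).

    q: (n_q, d) queries; k: (n_kv, d) keys; v: (n_kv, d_v) values.
    Each query attends only to its top_k highest-scoring key-value blocks.
    """
    n_q, d = q.shape
    n_kv = k.shape[0]
    assert n_kv % block_size == 0, "sketch assumes n_kv divides evenly"
    n_blocks = n_kv // block_size
    kb = k.reshape(n_blocks, block_size, d)
    vb = v.reshape(n_blocks, block_size, -1)

    # Step 1 (pre-filter): score each query against a cheap per-block
    # summary. Using the mean key is an assumption of this sketch.
    summaries = kb.mean(axis=1)                      # (n_blocks, d)
    block_scores = q @ summaries.T                   # (n_q, n_blocks)
    top_k = min(top_k, n_blocks)
    selected = np.argsort(-block_scores, axis=1)[:, :top_k]

    # Running softmax statistics, so blocks can be visited one at a time.
    running_max = np.full(n_q, -np.inf)
    running_sum = np.zeros(n_q)
    acc = np.zeros((n_q, vb.shape[-1]))
    scale = 1.0 / np.sqrt(d)

    # Step 2: walk the blocks sequentially. Each block is read once and
    # shared by every query that selected it.
    for b in range(n_blocks):
        rows = np.nonzero((selected == b).any(axis=1))[0]
        if rows.size == 0:
            continue                                 # no query needs this block
        s = (q[rows] @ kb[b].T) * scale              # (len(rows), block_size)
        new_max = np.maximum(running_max[rows], s.max(axis=1))
        correction = np.exp(running_max[rows] - new_max)
        p = np.exp(s - new_max[:, None])
        running_sum[rows] = running_sum[rows] * correction + p.sum(axis=1)
        acc[rows] = acc[rows] * correction[:, None] + p @ vb[b]
        running_max[rows] = new_max

    return acc / running_sum[:, None]
```

When `top_k` equals the number of blocks, every query sees every block and the function reproduces ordinary dense softmax attention. The savings come from keeping `top_k` small relative to the number of blocks, so most key-value blocks are never touched for a given query.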

On benchmarks, MiniMax reports M3 scoring 59% on SWE-Bench Pro — ahead of GPT-5.5 and Gemini 3.1 Pro and just behind Opus 4.7 — and 83.5 on the BrowseComp web-search test, edging past Opus 4.7's 79.3. The company also published three long-horizon autonomy experiments: M3 reproduced an ICLR 2025 fine-tuning paper over about 12 hours, producing 18 commits and 23 figures for a reproduction score of 0.650; it optimized an FP8 GEMM kernel on Nvidia Hopper GPUs from a broken 7.6% hardware utilization up to 71.3% across roughly 24 hours and 147 attempts; and on PostTrainBench it trained four base models end to end, landing just behind Opus 4.7 and GPT-5.5.

The crucial caveat is that none of this can yet be checked. At launch MiniMax had released neither the weights nor a technical report, promising both within ten days on Hugging Face and GitHub, along with open-sourcing its in-house MiniMax Code agent. Token plans run from about $20 a month for roughly 1.7 billion tokens up to $120 for around 9.8 billion, with a toggleable thinking mode. Until independent engineers can reproduce the architecture and rerun the benchmarks, M3's frontier and open-weight claims remain a company commitment rather than a verified fact.
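
As a back-of-the-envelope check on those plan prices, the short calculation below converts each plan into an effective cost per million tokens. The plan labels `entry` and `top` are hypothetical names used only here, and because the article's figures are approximate ("about", "roughly"), the results are approximate too.

```python
def usd_per_million_tokens(monthly_usd, monthly_tokens):
    """Effective price per one million tokens for a flat monthly plan."""
    return monthly_usd / (monthly_tokens / 1_000_000)

# Approximate figures from the article; the plan labels are made up.
plans = {
    "entry": (20, 1.7e9),   # about $20 for roughly 1.7 billion tokens
    "top": (120, 9.8e9),    # $120 for around 9.8 billion tokens
}

for name, (usd, tokens) in plans.items():
    print(f"{name}: ${usd_per_million_tokens(usd, tokens):.4f} per million tokens")
# entry: $0.0118 per million tokens
# top: $0.0122 per million tokens
```

On these numbers, both tiers work out to a little over one cent per million tokens, so the larger plan buys more volume rather than a lower unit price.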
