We evaluated Devstral 2 against DeepSeek V3.2 and Claude Sonnet 4.5 using human evaluations conducted by an independent annotation provider, with tasks scaffolded through Cline. Devstral 2 shows a clear advantage over DeepSeek V3.2, with a 42.8% win rate versus 28.6% loss rate. However, Claude Sonnet 4.5 remains significantly preferred, indicating a gap with closed-source models persists.
yogthos in technology
Introducing: Devstral 2 and Mistral Vibe CLI. | Mistral AI
https://mistral.ai/news/devstral-2-vibe-cliThank you for being honest about performance