
Google’s Gemini 2.5 Pro I/O Edition: The New AI Coding King Dethrones Claude 3.7 Sonnet
A new contender has seized the throne in the realm of AI coding models. Google DeepMind has unveiled the Gemini 2.5 Pro I/O Edition, an upgraded iteration of the multimodal LLM it released in March. According to DeepMind CEO Demis Hassabis, this version is "the best coding model we’ve ever built!" Initial benchmarks suggest that Google has surpassed its competitors in coding, which would mark its first lead since the generative AI surge began with ChatGPT's launch in late 2022.

The new model, dubbed "gemini-2.5-pro-preview-05-06," succeeds the previous 03-25 release and is now accessible to independent developers via Google AI Studio, enterprise users through Vertex AI, and individual users within the Gemini app. Google's statement indicates that the enhanced model also powers features like Canvas within the Gemini mobile app.
According to Google, the update streamlines feature development in apps such as Gemini 95 by automatically matching visual styles across components. It also supports converting YouTube videos into full learning applications and creating stylized components. Enterprises using the model through Google's web services will pay $1.25 per million input tokens and $10 per million output tokens, lower than Claude 3.7 Sonnet’s pricing.
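To make the quoted rates concrete, here is a minimal sketch of the per-request arithmetic. It assumes the two prices apply as flat rates to every token, which the article does not confirm; the function name and the example workload are hypothetical.

```python
# Prices quoted in the article, in USD per million tokens.
INPUT_PRICE_PER_MILLION = 1.25
OUTPUT_PRICE_PER_MILLION = 10.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request, assuming flat per-token rates."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


# Hypothetical workload: a 20,000-token prompt and a 4,000-token response.
print(f"${estimate_cost(20_000, 4_000):.4f}")
```

At these rates the output side dominates quickly: in the example, 4,000 output tokens cost more than 20,000 input tokens.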
The release comes shortly before Google's annual I/O developer conference (May 20-21), timing that appears strategic. Logan Kilpatrick, Senior Product Manager for Gemini API and Google AI Studio, highlighted that the update addresses developer feedback on function calling, reducing errors and improving reliability.
The WebDev Arena Leaderboard, an independent ranking that evaluates models by human preference for the web applications they generate, places Gemini 2.5 Pro Preview (05-06) ahead of Anthropic’s Claude 3.7 Sonnet. Gemini scored 1499.95, surpassing Sonnet’s 1377.10. The jump is notable because even OpenAI’s GPT-4o had previously failed to unseat Sonnet.
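For a rough sense of what that gap means, the sketch below converts the two scores into an expected head-to-head preference rate. It assumes the leaderboard scores sit on a standard Elo-style scale (base-10 logistic with a 400-point divisor), which the article does not state; the function name is hypothetical.

```python
def expected_win_rate(score_a: float, score_b: float) -> float:
    """Expected share of pairwise votes won by A, under a standard Elo-style scale."""
    return 1.0 / (1.0 + 10 ** ((score_b - score_a) / 400.0))


# Scores reported in the article.
gemini_score = 1499.95
sonnet_score = 1377.10

rate = expected_win_rate(gemini_score, sonnet_score)
print(f"Expected preference rate for Gemini: {rate:.2f}")
```

Under that assumption, a gap of about 123 points corresponds to Gemini's output being preferred in roughly two out of three direct comparisons.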
Developers have lauded Gemini's increased reliability in real-world use, reporting successful complex refactoring tasks that reflect senior developer-level decision-making. They have also praised the reduction in tool call failures and the model's overall effectiveness.
One standout feature is the model's ability to construct complete, interactive web applications or simulations from a single text prompt. This ability aligns with DeepMind’s vision of simplifying prototyping and development. While the underlying architectural details remain undisclosed, the emphasis is on speeding up and streamlining development.

With its focus on enhanced code generation and multimodal input, Gemini 2.5 Pro is emerging as a practical tool for complex coding, not just a research concept. The early release signifies Google DeepMind's ambition to meet developer demands and sustain momentum prior to its conference announcements.
On the reported benchmarks, the Gemini 2.5 Pro I/O Edition has set a new bar for AI coding. Will the improved model spur a new range of innovative web applications and interfaces? Let us know your thoughts in the comments below.