Gemini 3.5 is here! Tonight, Google personally eliminates Google

By: rootdata|2026/05/20 15:10:34

Author: Xinzhi Yuan

Google I/O 2026 conference, full throttle!

Just now, Pichai and Demis Hassabis appeared on stage together and revealed, in one go, all the big surprises they have been preparing for the past half year.

Without a hint of suspense, tonight's biggest star, Gemini Omni, officially debuted!

As a truly "versatile" large model, Omni can accept any form of input and generate any content. It also supports video output for the first time, making it the "video version of Nano Banana."

Another highlight of tonight belongs to Gemini 3.5 Flash.

In almost all benchmark tests, 3.5 Flash outperforms the previous flagship, Gemini 3.1 Pro. Its output speed has also doubled, making it more than four times faster than GPT-5.5 and Opus 4.7. The more powerful 3.5 Pro will be released next month.

In addition, a wave of heavyweight new products was also unveiled:

  • Antigravity 2.0: A brand new standalone desktop application, evolving from an IDE to an Agent development platform.

  • Gemini Spark: A personal AI agent, running 24/7 in the cloud.

  • Gemini App Revamp: Codenamed Neural Expressive, switching to compute billing.

  • AI Ultra Subscription Plan: A new $100 version added, with the highest tier reduced from $250 to $200.

  • The biggest upgrade to Google Search in 25 years: Integrating 3.5 Flash, adding smart search boxes, automatically generating mini-apps, etc.

    ......

It is no exaggeration to say that this I/O packed in more substance than any before it.

Gemini Omni Debut: A "Versatile" AI is Born

As the teaser video heavily hinted, the highly anticipated Gemini Omni has finally arrived. Hassabis personally announced on stage, "We are taking the next important step: Gemini Omni, a new model that can create content from any input."

The lineup on stage says it all. What Google aims to build this time is a "versatile" AI creation engine. It combines Gemini's intelligence with Google's strongest generative models and pushes three dimensions to the limit: world understanding, multimodality, and editing. Put simply, given any combination of images, audio, video, and text, it can generate a high-quality video. It can also edit videos conversationally.

More importantly, Omni does not just produce output that looks right; it actually understands the physical world. In Hassabis's words, previous systems often failed when simulating concepts like gravity and kinetic energy, but Omni has made a "leap change." It injects Gemini's "world knowledge" and "reasoning ability" into video generation.

  • Give it a prompt like "Explain protein folding with clay animation," and the generated video accurately depicts each step of the amino acid chain folding into α helices and β sheets, visually presented as exquisite stop-motion animation.

  • Ask it to assign an object to each of the 26 letters of the English alphabet, and C becomes a capybara, D a disco ball, L a lava lamp. Omni is not just collaging material; it is genuinely connecting language, images, and semantics.

The leap from merely realistic to genuinely meaningful is enormous.

On stage, Hassabis pulled up a video of himself and began editing it live. A circle drawn on his palm turned into a black hole, and a street scene from an evening walk became a cyberpunk cityscape. One sentence rewrites the scene, and anything can become a canvas for a new reality, whether that means playing with fire in a selfie or turning a circle sketched on paper into a black hole.

Moreover, this is not a one-time generation. You can continue to chat. The videos output by Gemini Omni maintain consistent characters, logical physics, and coherent scene memory.

  • The conversation starts from an original performance scene. In the second round, the prompt "Teleport the violinist to the environment of this picture," with a reference image of a snowy mountain meadow attached, instantly switches the scene, and the motion and lighting adapt to the new environment.

  • In the third round, the prompt "Cut the camera to the shoulder of the violinist" rotates the perspective, while the playing and the music remain completely continuous.

No matter how the scene changes, the main subject stays intact.

Even more striking is Omni's input flexibility. Images, text, video, and audio can be mixed freely as reference material to produce one coherent output. You can even create your own avatar, so that an AI version of yourself appears in any scene, speaking in your voice and doing things you have never done.

Omni Flash has officially launched, and the API version will be available in the coming weeks. An even stronger Omni Pro is also on the way. Thanks to Google's integration capabilities, Omni arrives already built into the Gemini App, Google Flow, and YouTube Shorts, and YouTube Shorts users can even use it for free.

Flash Overwhelms Pro: 3.5 Redefines "Flagship"

After Gemini Omni, another highlight of the I/O conference is the release of the new flagship Gemini 3.5 Flash. Google defines it as the strongest coding and agent model to date.

On stage, Pichai personally announced, "3.5 Flash has comprehensively outperformed Gemini 3.1 Pro in almost all benchmark tests!" Bear in mind that Google launched 3.1 Pro only three months ago, and a Flash-tier model has already overtaken it.

Unexpectedly, Google has delivered impressive results in such a short time:

  • Terminal-Bench 2.1 (coding): 76.2%

  • GDPval-AA (real-world agent tasks): 1656 Elo

  • MCP Atlas (large-scale tool usage): 83.6%

  • CharXiv Reasoning (multimodal understanding): 84.2%

These four benchmarks show that 3.5 Flash is a significant leap over Gemini 3.1 Pro. On speed, 3.5 Flash sits in a quadrant of its own at 289 tokens/second, more than four times faster than other leading models. In some benchmarks, its performance also rivals or even surpasses GPT-5.5 and Claude Opus 4.7. In short, 3.5 Flash is both fast and powerful, with almost no competitors.

Benchmark numbers are abstract, so consider the live demonstrations. In moments, 3.5 Flash can digest a dense academic paper and turn it into a fully interactive visual website. In agent tasks, working through Antigravity, it can complete multi-step workflows, automatically classifying and naming displayed assets. In another demo, two agents replicated the AlphaZero paper and wrote a fully functional game in just six hours.
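To make the throughput figure concrete, here is a minimal back-of-the-envelope sketch. It uses only the 289 tokens/second and "four times" figures quoted above; the 2,000-token response length is a hypothetical value chosen for illustration, and the "other model" speed is simply derived from the stated ratio rather than measured.

```python
# Rough generation-time comparison based on the quoted throughput.
FLASH_TOKENS_PER_SECOND = 289   # figure quoted in the article
SPEEDUP_OVER_OTHERS = 4         # "more than four times faster", treated as a lower bound
RESPONSE_TOKENS = 2_000         # hypothetical response length, for illustration only


def seconds_to_generate(tokens: int, tokens_per_second: float) -> float:
    """Time needed to stream `tokens` at a constant output rate."""
    return tokens / tokens_per_second


flash_seconds = seconds_to_generate(RESPONSE_TOKENS, FLASH_TOKENS_PER_SECOND)
# A model four times slower emits tokens at a quarter of the rate.
other_seconds = seconds_to_generate(
    RESPONSE_TOKENS, FLASH_TOKENS_PER_SECOND / SPEEDUP_OVER_OTHERS
)

print(f"3.5 Flash: {flash_seconds:.1f} s, 4x slower model: {other_seconds:.1f} s")
```

Under these assumptions, a response that takes about 7 seconds on 3.5 Flash would take close to half a minute on a model running at a quarter of the speed.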

93 Agents Create an OS in Just 12 Hours

All of these 3.5 Flash capabilities are delivered through the brand-new Antigravity 2.0. Google's agent development platform has been upgraded today from an IDE into a standalone desktop application that fully embraces agent-first design.

Varun took the stage to present a breathtaking demo. He had Antigravity equipped with 3.5 Flash build an operating system from scratch. 93 sub-agents worked in parallel, making over 15,000 model requests and processing 2.6 billion tokens. After 12 hours, a completely blank project turned into a fully functional OS kernel. The scheduler, memory management, file system—every line of code was written, tested, and audited by agents. The API cost was less than $1000.

Then, he tried to run DOOM on this AI-written operating system. The first attempt failed due to missing video and keyboard drivers. So he entered a fix command in Antigravity 2.0 on the spot, and the agents began to automatically write the driver code. After a while, the DOOM screen appeared on the monitor, and the audience erupted in excitement.
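The figures quoted for the operating-system demo can be turned into rough per-request and per-token numbers. The sketch below uses only the values stated above and treats "over 15,000" requests and "less than $1000" as exact, which is an assumption, so the per-request and cost results are upper bounds rather than measurements.

```python
# Derived statistics for the OS-building demo, from the quoted figures.
MODEL_REQUESTS = 15_000      # "over 15,000", treated as exact here
TOKENS_PROCESSED = 2.6e9     # 2.6 billion tokens
RUN_HOURS = 12
API_COST_USD = 1_000         # "less than $1000", treated as an upper bound

# Average tokens handled per model request (upper bound, since requests > 15,000).
tokens_per_request = TOKENS_PROCESSED / MODEL_REQUESTS

# Aggregate token throughput across all agents over the whole run.
tokens_per_second = TOKENS_PROCESSED / (RUN_HOURS * 3600)

# Blended cost per million tokens (upper bound, since cost < $1000).
cost_per_million_tokens = API_COST_USD / (TOKENS_PROCESSED / 1e6)

print(f"tokens per request: at most {tokens_per_request:,.0f}")
print(f"aggregate throughput: {tokens_per_second:,.0f} tokens/s")
print(f"blended cost: at most ${cost_per_million_tokens:.2f} per million tokens")
```

On these numbers, each request averaged no more than roughly 173,000 tokens, and the blended cost stayed below about $0.39 per million tokens.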

In summary, the core upgrades brought by Antigravity 2.0 include:

  • Sub-agents can be dynamically generated, with the main agent breaking tasks into sub-tasks for parallel execution without interference;

  • Asynchronous task management prevents long-running operations from blocking the main thread;

  • Scheduled Tasks let agents run automatically on a schedule, such as checking PR status daily or running health check scripts hourly;

  • New slash commands: /goal allows agents to complete tasks in one go, /grill-me prompts agents to clarify requirements before acting, and /browser explicitly controls browser usage.
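The first two points, parallel sub-agents and non-blocking task management, follow a familiar fan-out pattern. The sketch below illustrates that pattern with Python's standard asyncio library. It is not Antigravity's API: the function names, the sub-task list, and the sleep calls standing in for model requests are all hypothetical.

```python
import asyncio


async def run_sub_agent(name: str, seconds: float) -> str:
    """Hypothetical sub-agent; the sleep stands in for a long-running model call."""
    await asyncio.sleep(seconds)
    return f"{name}: done"


async def main_agent(goal: str) -> list[str]:
    # Hypothetical decomposition of the goal; a real agent would plan this itself.
    sub_tasks = {"scheduler": 0.03, "memory": 0.02, "filesystem": 0.01}

    # Start every sub-agent as its own task so none blocks the others.
    jobs = [
        asyncio.create_task(run_sub_agent(name, seconds))
        for name, seconds in sub_tasks.items()
    ]

    # gather() returns results in submission order, not completion order.
    return await asyncio.gather(*jobs)


results = asyncio.run(main_agent("build an OS kernel"))
print(results)
```

Because the sub-agents run concurrently, the total wall time is close to that of the slowest sub-task rather than the sum of all three.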

These capabilities have all already been deployed successfully inside Google. The volume of tokens Google processes internally with Antigravity was 500 billion per day in March; it has since jumped sixfold to 3 trillion per day. Moreover, this 12x accelerated version of Flash is available in Antigravity starting today.

3.5 Flash has also become the default model for the Gemini App and Google Search AI Mode, available to users worldwide. Developers can access it through Antigravity 2.0, the Gemini API, and Google AI Studio. Bigger still, 3.5 Pro is currently in internal testing and will be released next month.

24/7 Personal Assistant: Gemini Spark Has Finally Arrived

The third major release tonight belongs to Gemini Spark! Pichai's positioning for it is very clear: your personal AI agent. Even when you close your laptop, it doesn't stop. It runs on a dedicated virtual machine in the cloud, capable of being online 24/7.

Gemini Spark is powered by Gemini 3.5 and the Antigravity framework and is deeply integrated with Google's "office suite." Product VP Josh Woodward demonstrated two scenarios that drove the audience wild.

  • The first is a work scenario: Input a command, "Help me draft an email to the team summarizing all information about the Gemini Live release from the past week." Spark automatically pulls information across Gmail, Docs, and chat records, and even calls a "ghostwriter" skill that Woodward wrote himself, allowing the email to automatically match his personal tone. The entire process is completed in the background, with humans only needing to review and send. Yes, Spark supports custom skills, allowing it to learn your tone, preferences, and working style.

  • The second is a life scenario: planning a block party. After receiving the task, Spark executes it step by step. It creates an RSVP tracking sheet in Google Sheets that connects directly to Gmail and automatically updates who has responded. For neighbors who haven't signed up, Spark drafts reminder emails, to be sent only after confirmation. It then generates a promotional deck in Google Slides, including information about inflatable castles for the block party. The entire process was completed without opening any apps.

Spark also has powerful voice input capabilities. On stage, Woodward took out his phone and issued three tasks by voice: "Find all meetings with Sundar marked in bright pink," "Write an invitation letter for new neighbor John to join the block party list," "Create a document listing things to do for the kids before the end of the school year, sorted by deadline."

The voice was directly converted into text commands, and Spark automatically split a continuous voice input into three independent task threads, executing them in parallel in the background.

On pricing, Spark Beta is available with the new AI Ultra subscription at $100 per month, and the highest Ultra plan has been reduced from $250 to $200. The Spark Beta trial will open first to US AI Ultra users next week.

On This Night, Google Ripped Open the ASI Entrance

Looking back at this I/O, what truly sends chills down the spine is not a single product. It is the simultaneous arrival of all capabilities.

Full-modal understanding, full-modal generation, 24/7 online agents—these three puzzle pieces have all been put together by Google in one night. Omni turns a sentence into a world without needing humans to provide any materials; 93 agents built an operating system from scratch without requiring humans to write a single line of code; Spark works for you 24/7 without needing humans to open any apps.

When AI no longer needs humans to "feed it," but instead understands, decides, executes, and iterates on its own—this path's endpoint is called ASI (Artificial Superintelligence).

No one can provide a precise timeline. But tonight's Google I/O made everyone realize one thing: the barrier of "technically impossible" on the road to superintelligence has already been removed. What remains is just the speed of engineering deployment. Six months ago, we were still debating whether AGI was a bubble. Six months later, Google is already using agents to write operating systems. The acceleration in this industry has exceeded the range that human intuition can perceive.

References:

  • https://youtu.be/wYSncx9zLIU

  • https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-5/

  • https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-omni/

  • https://antigravity.google/blog/introducing-google-antigravity-2-0

  • https://antigravity.google/blog/google-io-2026-feature-deep-dive
