Gemini 2.5 Flash Native Audio improves live voice agents and translation

Source: Google DeepMind News

Gemini 2.5 Flash Native Audio has been updated to improve live voice agents and real-time translation. The model now delivers sharper function calling, more reliable instruction following and smoother multi-turn conversations. It is rolling out across Google AI Studio and Vertex AI, as well as Gemini Live and Search Live.

A live speech-translation beta is also available in the Google Translate app on Android in the US, Mexico and India. The update makes function calling more reliable, so the model can fetch real-time information and weave it into audio responses without breaking the flow of the conversation. The model scored 71.5% on ComplexFuncBench Audio.
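For readers unfamiliar with function calling, the general pattern is that the model emits a structured request naming a function and its arguments, the application runs that function, and the result is handed back to the model to fold into its spoken reply. The following is a minimal sketch of the application side only, under stated assumptions: the `get_weather` tool, the `TOOLS` registry, the `dispatch` helper and the shape of the call dictionary are all hypothetical illustrations and are not taken from the Gemini API.

```python
import json
from typing import Any, Callable, Dict


# Hypothetical tool: a real agent would query a live data source here.
def get_weather(city: str) -> Dict[str, Any]:
    return {"city": city, "forecast": "sunny", "high_c": 24}


# Hypothetical registry mapping function names to local handlers.
TOOLS: Dict[str, Callable[..., Dict[str, Any]]] = {"get_weather": get_weather}


def dispatch(call: Dict[str, Any]) -> str:
    """Run the handler named in a model-issued call and return a JSON string.

    The {"name": ..., "args": {...}} shape is an assumption for illustration.
    """
    name = call.get("name")
    handler = TOOLS.get(name)
    if handler is None:
        # Report unknown functions back instead of raising, so the
        # conversation can continue.
        return json.dumps({"error": f"unknown function: {name}"})
    try:
        result = handler(**call.get("args", {}))
    except TypeError as exc:
        # Missing or unexpected arguments surface as a structured error.
        return json.dumps({"error": str(exc)})
    return json.dumps({"name": name, "result": result})


# Example: the kind of request a model might issue mid-conversation.
reply = dispatch({"name": "get_weather", "args": {"city": "Mexico City"}})
```

Returning errors as data rather than raising exceptions is a design choice in this sketch: it lets the application pass a failure back to the model, which can then ask the user to rephrase rather than dropping the turn.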

Adherence to instructions rose from 84% to 90%, which improves the completeness of responses. The model is also better at retrieving context from earlier turns, making multi-turn dialogues more cohesive. Live speech-to-speech translation preserves the speaker's intonation, pacing and pitch, and supports more than 70 languages and 2,000 language pairs.

United States, Mexico, India

gemini 2.5, native audio, live agents, real-time translation, function calling, instruction following, vertex ai, gemini live, google translate, speech translation