Explain the Gemini Live API streaming capabilities in Google ADK for real-time voice/video.
#google-adk #streaming #gemini-live #voice #video #real-time
Answer
Streaming & Gemini Live API in Google ADK

Google ADK integrates with the Gemini Live API for low-latency, bidirectional voice and video streaming, enabling real-time conversational AI.
Streaming Architecture
Capabilities
| Feature | Description |
|---|---|
| Audio Input | Real-time microphone streaming |
| Audio Output | Voice response generation |
| Video Input | Camera/screen sharing analysis |
| Bidirectional | Full-duplex communication |
| Interruption | User can interrupt mid-response |
| Tool Calling | Tools work during streaming |
| Transcription | Real-time audio transcription |
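To make the audio input row more concrete, here is a minimal sketch of how captured microphone audio is typically cut into small chunks before being streamed. It assumes raw 16-bit, 16 kHz mono PCM input (check the current Live API documentation for the expected format); the 100 ms chunk size and the `pcm_chunks` helper are illustrative choices, not part of ADK.

```python
from typing import Iterator

SAMPLE_RATE_HZ = 16_000   # assumed input rate (16 kHz mono)
BYTES_PER_SAMPLE = 2      # 16-bit PCM
CHUNK_MS = 100            # arbitrary chunk duration for illustration


def pcm_chunks(pcm: bytes, chunk_ms: int = CHUNK_MS) -> Iterator[bytes]:
    """Yield consecutive fixed-duration slices of raw PCM audio."""
    chunk_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * chunk_ms // 1000
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]


# One second of silence stands in for captured microphone audio.
one_second = bytes(SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)
chunks = list(pcm_chunks(one_second))
print(len(chunks), len(chunks[0]))
```

Smaller chunks reduce latency at the cost of more messages; the right size depends on the client and network.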
Audio Streaming Agent

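```python
from google.adk.agents import Agent
from google.adk.agents.run_config import RunConfig, StreamingMode
from google.genai import types

agent = Agent(
    name="voice_assistant",
    # Bidirectional (Live) streaming requires a model that supports the
    # Gemini Live API; substitute a Live-capable model ID if this one does not.
    model="gemini-2.5-flash",
    instruction="You are a voice assistant. Respond naturally and conversationally.",
    tools=[get_weather, search_web],  # tool functions defined elsewhere
)

# Enable bidirectional (Live) streaming with audio output
run_config = RunConfig(
    streaming_mode=StreamingMode.BIDI,
    response_modalities=["AUDIO"],
    # Transcribe the model's spoken output alongside the audio
    output_audio_transcription=types.AudioTranscriptionConfig(),
    speech_config=types.SpeechConfig(
        voice_config=types.VoiceConfig(
            prebuilt_voice_config=types.PrebuiltVoiceConfig(voice_name="Charon")
        )
    ),
)
```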
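The agent above references `get_weather` and `search_web` without defining them. ADK accepts plain Python functions as tools, so a minimal sketch might look like the following. Both function bodies, their return shapes, and the canned data are hypothetical stand-ins, not something taken from the original answer.

```python
def get_weather(city: str) -> dict:
    """Return the current weather for a city (hypothetical canned data)."""
    # Placeholder lookup table; a real tool would call a weather service.
    fake_reports = {
        "london": "Cloudy, 15 C",
        "tokyo": "Sunny, 24 C",
    }
    report = fake_reports.get(city.strip().lower())
    if report is None:
        return {"status": "error", "error_message": f"No weather data for {city!r}."}
    return {"status": "success", "report": report}


def search_web(query: str) -> dict:
    """Return search results for a query (hypothetical stub, no network access)."""
    # Placeholder: echoes the query instead of calling a search backend.
    return {"status": "success", "results": [f"Stub result for: {query}"]}
```

Type hints and docstrings matter here because the model relies on them to decide when and how to call each tool.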
Running Streaming in Web UI

```bash
# Launch with streaming support
adk web my_agent
# Click the microphone icon to start voice streaming
```
Text Streaming (SSE)
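```python
from google.adk.agents.run_config import RunConfig, StreamingMode
from google.genai import types

config = RunConfig(streaming_mode=StreamingMode.SSE)


# `runner` and `session` are assumed to exist already: a Runner for the
# agent and a session created through its session service.
async def stream_text(runner, session):
    # run_async expects the user message as a Content object, not a bare string
    message = types.Content(
        role="user",
        parts=[types.Part(text="Explain quantum computing")],
    )
    async for event in runner.run_async(
        user_id="user-1",
        session_id=session.id,
        new_message=message,
        run_config=config,
    ):
        # Events arrive as they're generated. If the text shows up twice,
        # a final aggregated event is likely being printed as well; filter
        # on `event.partial` in that case.
        if event.content and event.content.parts:
            for part in event.content.parts:
                if part.text:
                    print(part.text, end="", flush=True)
```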
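The event loop above can be tried without an API key by swapping in a fake stream. The sketch below is only an illustration: `FakePart`, `FakeContent`, `FakeEvent`, and `fake_event_stream` are hypothetical stand-ins that mimic the `event.content.parts[...].text` shape used above, and `collect_text` gathers the chunks into one string instead of printing them.

```python
import asyncio
from dataclasses import dataclass, field
from typing import AsyncIterator, List, Optional


@dataclass
class FakePart:
    text: Optional[str] = None


@dataclass
class FakeContent:
    parts: List[FakePart] = field(default_factory=list)


@dataclass
class FakeEvent:
    content: Optional[FakeContent] = None


async def fake_event_stream() -> AsyncIterator[FakeEvent]:
    """Stand-in for runner.run_async(...): yields text chunks one at a time."""
    for chunk in ["Quantum ", "computing ", "uses qubits."]:
        await asyncio.sleep(0)  # yield control, as a real network stream would
        yield FakeEvent(content=FakeContent(parts=[FakePart(text=chunk)]))
    # An event with no content, to exercise the guard in the loop below.
    yield FakeEvent(content=None)


async def collect_text(events: AsyncIterator[FakeEvent]) -> str:
    """Concatenate the text of every part in every event."""
    pieces = []
    async for event in events:
        if event.content and event.content.parts:
            for part in event.content.parts:
                if part.text:
                    pieces.append(part.text)
    return "".join(pieces)


full_reply = asyncio.run(collect_text(fake_event_stream()))
print(full_reply)
```

Collecting into a list and joining once at the end avoids repeated string concatenation and makes the handler easy to unit test.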
Streaming with Tools
Streaming vs Non-Streaming
| Feature | Non-Streaming | Streaming (SSE) | Gemini Live |
|---|---|---|---|
| Response | Full at once | Token by token | Real-time audio/video |
| Latency | High (wait for full) | Low (incremental) | Lowest (bidirectional) |
| Modality | Text | Text | Text + Audio + Video |
| Interruption | Not possible | Not possible | Supported |
Learn more in the Streaming section of the ADK documentation.