The problem: agents speak text, but users want UI
Most agent frameworks today return strings. That’s fine for short answers, but it breaks down quickly:
-
Multi-turn slot filling (date, time, party size) burns turns and patience.
-
Choices among options (which restaurant? which insurance plan?) become long bulleted lists the user has to copy-paste back.
-
Spatial information (locations, routes, floor plans) is reduced to addresses.
Developers have tried to patch this by sending HTML or JavaScript fragments, but that introduces real risks: cross-site scripting, UI injection from a remote agent you don’t fully control, and visual drift from the host app’s design system. What’s needed is a way to transmit UI that’s safe like data and expressive like code.
What A2UI is
A2UI is an open protocol, introduced by Google and co-developed with the Flutter team and product teams behind Gemini Enterprise. Instead of returning text or HTML, an agent returns a JSON payload that describes a UI: a tree of components (Card, Text, Button, ChoicePicker, Image, …) and a separate data model holding the values those components display.
Three properties make this useful in practice:
-
Declarative, not executable. The payload is data. The client only renders components from a pre-approved catalog, so a remote agent can’t inject arbitrary code or steal credentials through a UI widget.
-
Streaming-friendly. The format is a flat list of small JSON messages, so the LLM can emit them incrementally and the client can paint as they arrive.
-
Framework-agnostic. The same agent response renders through Lit, Angular, Flutter, or native mobile. The agent doesn’t know — or care — what’s on the other end.
A2UI is also transport-agnostic. The messages ride inside whatever pipe you already use: A2A JSON-RPC, AG-UI, WebSockets, SSE. In our reference implementation, A2UI rides inside the A2A protocol as DataPart objects with the MIME type application/json+a2ui.
Where A2UI sits in the stack
A2UI is one piece of a four-layer stack. Confusion usually comes from conflating these layers — they’re each doing a different job:
|
Layer |
Owns |
Examples |
|---|---|---|
|
App experience |
Client shell and conversation state — chat window, input box, message history |
CopilotKit, AG-UI |
|
Pixel drawing |
Turning component descriptions into actual rendered UI |
Lit, Flutter, Angular |
|
Conversation pipeline |
Client–server transport — sending messages, receiving responses |
A2A Protocol |
|
Cargo (data format) |
The thing flowing through the pipeline that describes the UI |
A2UI |
Read top to bottom: CopilotKit/AG-UI owns the app experience. Lit/Flutter/Angular own the rendering. While CopilotKit and AG-UI provide valuable abstractions, they remain strictly optional for implementing A2UI. In this architecture, A2A serves as the underlying conversation pipeline, while A2UI represents the structured cargo that actually traverses that pipe.
That separation is why the same A2UI payload renders identically in three very different deployment shapes:
-
Bespoke web app — a custom client shell (like the reference repo’s Lit frontend/) plus a custom A2UI renderer.
-
CopilotKit / AG-UI app — CopilotKit owns the chat shell, an A2UI renderer is registered inside it for rich cards.
-
Gemini Enterprise — GE is the shell, the renderer, and the transport client. You only build the agent.
So for the GE path, the stack collapses to two layers you control: the A2A endpoint (your agent) and the A2UI cargo it emits. The other two layers are GE’s responsibility. CopilotKit and AG-UI are great if you’re building a standalone product UI elsewhere — they’re just out of scope for embedding an agent inside Gemini Enterprise.
Pattern revisions
The protocol evolves quickly, and different clients support different revisions. Two patterns are common today:
-
Inline pattern — the agent sends a component tree with the data baked into each component (the pattern Gemini Enterprise renders today).
-
Decoupled pattern — the agent sends the component tree and the data model as separate messages, so subsequent turns can update one without re-sending the other. This reduces tokens and latency for long-running conversations and is the direction the protocol is heading.
The reference repo serves both patterns from one backend, picking which to emit per request based on the client’s X-A2A-Extensions header. As new revisions ship, you add another catalog and the same negotiation pattern keeps working.
How A2UI works inside Gemini Enterprise
Gemini Enterprise ships with a built-in A2UI renderer. For the developer, that means the integration story is short:
-
Build your A2A agent, embedding an A2UI catalog and example payloads alongside the regular tool definitions.
-
Register the agent with Gemini Enterprise as an A2A endpoint. (Use make register-gemini-enterprise in the reference repo.)
-
A GE admin shares the agent with employees, just like any other agent in the GE catalog.
At runtime, the flow looks like this:
-
The user types a request in the GE chat. GE calls your agent’s A2A endpoint and sends along GE’s own A2UI catalog — the list of UI components GE knows how to render.
-
Your agent decides whether a UI widget is the right response. If yes, it emits an A2UI JSON message (e.g., a ChoicePicker of restaurant options). If no, it falls back to text. Both can coexist in the same response.
-
GE receives the JSON, validates it against its catalog, and renders the widget natively in GE’s own design language — so it visually matches the rest of the chat surface.
-
When the user interacts with the widget (selects three options, picks a date), GE serializes the interaction back into JSON and sends it to your agent as the next turn. Your agent processes structured input, not free-form text.
One thing worth flagging: because your agent doesn’t ship its own renderer for GE, you don’t need to choose a frontend framework to start. Your A2A endpoint can run anywhere — Cloud Run, GKE, on-prem — and GE handles the rendering.
High-level architecture example
The reference implementation is an ADK backend on Cloud Run designed to plug seamlessly into Gemini Enterprise.






