The Factory Floor
Building a Local Food Tour Agent
Timestamp: 5:29
We showcased a food tour agent powered by Gemma 4 using the Agent Development Kit (ADK) and a Google Maps Model Context Protocol (MCP) server. We demonstrated how a local model can handle complex, multi-step reasoning tasks.
- The agent identified the best ramen spots in Seattle under a $30 budget.
- It verified that the locations were within walking distance of each other.
- It processed search results to provide specific tips on what to order and what to avoid.
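The episode doesn't show the agent's internal code, but the two verification steps above (budget filtering, then a pairwise walking-distance check) can be sketched in plain Python. Everything here is hypothetical: the restaurant data, the $30 threshold applied per spot, and the 1.5 km walking-distance cutoff are illustrative stand-ins for what the agent would pull from the Google Maps MCP server.

```python
import math

# Hypothetical search results the agent might have collected via the
# Google Maps MCP server (names, prices, and coordinates are made up).
ramen_spots = [
    {"name": "Spot A", "price": 18, "lat": 47.6097, "lon": -122.3331},
    {"name": "Spot B", "price": 24, "lat": 47.6131, "lon": -122.3421},
    {"name": "Spot C", "price": 42, "lat": 47.6205, "lon": -122.3493},
]

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Step 1: filter by the $30 budget.
affordable = [s for s in ramen_spots if s["price"] <= 30]

# Step 2: verify every pair of remaining spots is within walking distance.
def all_walkable(spots, max_km=1.5):
    return all(
        haversine_km(a["lat"], a["lon"], b["lat"], b["lon"]) <= max_km
        for i, a in enumerate(spots)
        for b in spots[i + 1:]
    )

print([s["name"] for s in affordable], all_walkable(affordable))
```

In the demo, the model itself orchestrated these checks by calling MCP tools; the sketch only illustrates the reasoning steps the recap describes.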
Autonomous Python Code Execution
Timestamp: 8:03
In this demo, we pushed Gemma 4’s coding capabilities to the limit by asking it to express itself through animation. Using a sandbox execution environment, the model performed the following:
- Wrote Python code using the Matplotlib library.
- Attempted to build a physics engine to simulate a bouncing ball.
- Self-corrected when the initial execution environment lacked certain CPU features, finding an alternative path to successfully generate the animation.
- Demonstrated a deep understanding of real-world physics and gravity through code.
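The model's actual code isn't reproduced in the recap, but the core of a bouncing-ball physics engine is a short integration loop. The sketch below is a minimal, illustrative version (constants, restitution value, and step count are all assumptions, not the model's output); in the demo, frames like these would be rendered with a Matplotlib animation.

```python
# Minimal bouncing-ball physics loop, in the spirit of the demo.
GRAVITY = -9.81      # m/s^2, acceleration due to gravity
RESTITUTION = 0.8    # fraction of speed kept after each bounce
DT = 0.01            # simulation time step, in seconds

def simulate(y=2.0, vy=0.0, steps=1000):
    """Semi-implicit Euler integration with a floor at y = 0."""
    heights = []
    for _ in range(steps):
        vy += GRAVITY * DT       # gravity accelerates the ball downward
        y += vy * DT
        if y < 0.0:              # hit the floor: reflect with energy loss
            y = 0.0
            vy = -vy * RESTITUTION
        heights.append(y)
    return heights

heights = simulate()
```

Feeding `heights` frame by frame into something like Matplotlib's `FuncAnimation` yields the animation; the physics lives entirely in the loop above.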
The Shift to Apache 2.0 Licensing
Timestamp: 4:05
A major theme of the conversation was the community-driven decision to move Gemma 4 to an Apache 2.0 license. This change gives developers and startups maximum flexibility to build, modify, and commercialize applications. Omar emphasized that this was a direct response to developer feedback, aiming to unlock a new wave of innovation in the open models ecosystem.
Developer Q&A
Architectural Decisions and Mixture of Experts (MoE)
Timestamp: 17:23
Omar explained the technical shifts that make Gemma 4 so efficient. For the first time, the Gemma family includes a Mixture of Experts (MoE) architecture, which optimizes for extremely low latency in production. Additionally, the smaller E2B and E4B models utilize per-layer embeddings to remain “cheap” to run on GPUs. For vision tasks, the model now supports variable aspect ratios, allowing it to understand images of various sizes more accurately than previous fixed-resolution versions.
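The episode doesn't detail Gemma's specific routing scheme, but the efficiency win of an MoE layer comes from a generic mechanism: a router scores all experts per token, only the top-k run, and their outputs are mixed by renormalized gate weights. Here is a toy, stdlib-only sketch of that idea (the scalar "experts" and router logits are made up for illustration):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, router_scores, experts, top_k=2):
    """Run only the top_k experts by router score and mix their outputs
    by renormalized gate weights. Skipping the rest is what keeps
    MoE inference latency and cost low."""
    ranked = sorted(range(len(experts)), key=lambda i: router_scores[i], reverse=True)
    chosen = ranked[:top_k]
    gates = softmax([router_scores[i] for i in chosen])
    return sum(g * experts[i](x) for g, i in zip(gates, chosen))

# Toy "experts": each is just a scalar function here.
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: x ** 2, lambda x: -x]
scores = [0.1, 2.0, 1.5, -1.0]   # hypothetical router logits for one token
y = moe_forward(3.0, scores, experts, top_k=2)
```

With `top_k=2`, only experts 1 and 2 execute; the other two contribute no compute at all, which is the latency advantage Omar describes, scaled up to transformer feed-forward blocks.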
Comparing Gemma to Gemini
Timestamp: 19:51
When asked how Gemma stacks up against its larger sibling, Gemini, Omar clarified that they serve different purposes. While Gemini excels at massive-scale tasks and deep “world knowledge” due to its size, Gemma is the “best open model that can run on a single consumer GPU.” It is specifically optimized for instruction following, coding, and agentic use cases where local deployment or fine-tuning is required.
Fine-Tuning for Specialized Industries
Timestamp: 21:10
The conversation touched on the importance of “Sovereign AI” and privacy. Because Gemma is an open model, developers in regulated industries, like healthcare or finance, can fine-tune the model on their private data and deploy it within their own air-gapped infrastructure. This gives developers full control over their data and the model’s specialized expertise.
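The episode doesn't prescribe a fine-tuning method, but a common approach for adapting an open model on private data is a low-rank adapter (LoRA): the pretrained weight matrix stays frozen, and only two small matrices are trained, whose product is added back as an update. The numbers below are purely illustrative, using a tiny 2x2 weight and rank 1:

```python
def matmul(A, B):
    """Naive matrix multiply for small illustrative matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

# Frozen pretrained weight (values made up).
W = [[1.0, 0.0],
     [0.0, 1.0]]

# Low-rank adapters with rank r = 1: only these are trained on the
# private data, so the base weights never leave your infrastructure
# unchanged-and far fewer parameters are updated.
B = [[0.5], [0.2]]        # shape: d_out x r
A = [[0.1, 0.3]]          # shape: r x d_in
scale = 2.0               # hypothetical alpha / r scaling factor

delta = [[scale * v for v in row] for row in matmul(B, A)]
W_adapted = add(W, delta)
```

Because the adapter matrices are tiny relative to the base model, this style of fine-tuning fits comfortably on the kind of single-GPU, air-gapped setups the conversation highlights.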
Conclusion
Gemma 4 marks a turning point for agentic development, proving that you don’t always need a massive cloud cluster to build something smart. Whether it’s running a physics simulation on a laptop or a travel guide on a phone, the barrier to entry for high-performance AI has never been lower. We are entering an era where the “conductor” of the AI orchestra can be any developer with a single GPU and a great idea.
Your turn to build
Now that you’ve seen what Gemma 4 can do, it’s time to start building. Check out the resources in our show notes, including the food tour agent and the coding agent, explore the ADK support, and try running Gemma 4 on your local machine or on Cloud Run. We can’t wait to see what agents you create!
Watch more of The Agent Factory → Reinforcement learning & fine-tuning on TP…
Subscribe to Google Cloud Tech → https://goo.gle/GoogleCloudTech






