Thursday, February 19, 2026
Technology Tutorials & Latest News | ByteBlock

Provisioned Throughput (PT) on Vertex AI

February 18, 2026
in Google

When AI agents make thousands of decisions a day, consistent performance isn’t just a technical detail — it’s a business requirement. 

Provisioned Throughput (PT) solves this by giving you reserved resources that guarantee capacity and predictable performance. To help you scale, we are updating PT on Vertex AI with three key improvements:

  • Model diversity: Run the right model for the right job.

  • Multimodal innovation: Process text, images, and video seamlessly.

  • Operational flexibility: Adapt your resources as your agents grow.

In this post, we’ll share the resources available to you today on Vertex AI, and how you can get started. 

Expanding support for a diverse model portfolio

A mature AI strategy requires selecting the right model for the specific task. Vertex AI Model Garden, our curated set of 200+ first-party, third-party, and open-source models, makes it easy to use the best resource for your business needs. 

We standardized the PT experience across this portfolio so that your capacity strategy stays consistent regardless of the model you deploy.

  • Anthropic integration (private preview): You can now purchase and manage PT for Anthropic models directly from the Vertex AI console, bringing one of the industry’s leading third-party providers into your primary capacity workflow.

  • Open model ecosystem: We have extended PT support to the most popular open-source models, including Llama 4, Qwen3, GLM-4.7, and DeepSeek-OCR, all from the same console experience.

  • Unified governance: Because PT now covers all types of models under a single framework, engineering teams no longer need to design separate reservation or procurement strategies for different model providers.

Powering multimodal innovation

The next wave of AI agents is seeing, hearing, and acting in real time. This movement toward native audio, high-definition video, and complex reasoning creates a massive, non-negotiable demand for reliable compute.

We are ensuring that PT supports these advanced modalities as soon as they reach your production environment.

  • Gemini 3 and Nano Banana: You can now secure dedicated PT for our most capable Gemini 3 models and Nano Banana, our state-of-the-art model for high-fidelity image generation and editing.

  • Gemini Live API: By using PT with Gemini Live API, you get the guaranteed throughput required for high-bandwidth multimodal streams – whether your agents are processing live video feeds or providing real-time audio responses.

  • Veo 3 and 3.1: For video workloads, PT GSU (Generative AI Scale Unit) minimums and incremental limits have been removed for Veo 3 and Veo 3.1. This allows you to purchase the exact amount of capacity you need, making it easier to scale video generation without being forced into high entry-level commitments.
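
To make the Veo change concrete, here is a minimal sketch of the sizing arithmetic. It assumes capacity is bought in whole GSUs. The per-GSU throughput, the minimum, and the increment used in the example are hypothetical placeholder values, not published Vertex AI figures, and the function names are illustrative only.

```python
import math


def gsus_exact(required_per_sec: float, per_gsu_per_sec: float) -> int:
    """Smallest whole number of GSUs that covers the required throughput."""
    if required_per_sec <= 0 or per_gsu_per_sec <= 0:
        raise ValueError("throughput values must be positive")
    return math.ceil(required_per_sec / per_gsu_per_sec)


def gsus_with_limits(required_per_sec: float, per_gsu_per_sec: float,
                     minimum: int, increment: int) -> int:
    """GSUs to buy when a minimum order and a fixed purchase increment apply."""
    needed = gsus_exact(required_per_sec, per_gsu_per_sec)
    if needed <= minimum:
        return minimum
    # Round the amount above the minimum up to the next full increment.
    steps = math.ceil((needed - minimum) / increment)
    return minimum + steps * increment


# Hypothetical workload: 130 units/sec needed, 20 units/sec per GSU.
exact = gsus_exact(130, 20)                                     # 7 GSUs
limited = gsus_with_limits(130, 20, minimum=10, increment=5)    # 10 GSUs
print(f"exact purchase: {exact} GSUs, with limits: {limited} GSUs")
```

With these placeholder numbers, the limits would force a purchase of 10 GSUs for a workload that needs 7. Without minimums and increments, the purchase matches the requirement.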

Increasing operational flexibility

Scaling for global production shouldn’t mean sacrificing agility. We provide levers to treat AI compute as a dynamic resource that aligns with actual business cycles.

  • Flexible term lengths: We now offer 1-week PT terms for select models. This allows you to secure guaranteed capacity for high-impact, short-term windows – like a holiday traffic spike or a product launch – without a monthly or yearly commitment.

  • Proactive capacity planning: You can now schedule change orders for your PT requests up to two weeks in advance for select models. This enables your team to automate the ramp-up of resources for known peak events, shifting your strategy from reactive scaling to proactive planning.

  • Maximizing token value: For agentic workloads with long, repetitive contexts, PT now integrates with explicit caching for select models. This delivers reserved performance alongside the significant input cost reductions of caching, ensuring the price of your reservation aligns with actual business value.
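
The term and scheduling windows described above can be sketched as simple date checks, for example in an internal planning script. This is an illustration only: the function names are hypothetical, it does not call any Vertex AI API, and it assumes the two-week window is inclusive of both today and the fourteenth day, which the post does not specify.

```python
from datetime import date, timedelta

# Window sizes taken from the post; inclusivity is an assumption.
MAX_SCHEDULE_AHEAD = timedelta(weeks=2)   # change orders: up to two weeks ahead
ONE_WEEK_TERM = timedelta(weeks=1)        # shortest term mentioned in the post


def can_schedule_change(today: date, effective: date) -> bool:
    """True if the effective date falls within the scheduling window."""
    return today <= effective <= today + MAX_SCHEDULE_AHEAD


def term_end(start: date, term: timedelta = ONE_WEEK_TERM) -> date:
    """Date on which a term that begins on `start` ends."""
    return start + term


# Hypothetical product launch on 10 March 2026.
launch = date(2026, 3, 10)
print(can_schedule_change(date(2026, 2, 27), launch))  # 11 days ahead
print(can_schedule_change(date(2026, 2, 20), launch))  # 18 days ahead
print(term_end(launch))
```

Under these assumptions, a ramp-up for the launch could be scheduled from 24 February onward, and a 1-week term starting on launch day would end on 17 March.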

How customers are scaling with confidence on Vertex AI

© 2024 Byte Block - Tech Insight: Tutorials, Reviews & Latest News. Made By Huwa.
