As enterprise storage footprints scale to billions of objects, AI applications and agentic workloads are fundamentally shifting the role of storage from a passive repository to the foundation of the data platform. This is driven by a surge in unstructured model data and the billions of actions performed on those objects, including session logs and audit trails. To manage this and answer questions about cost, operations, and security, storage and platform admins need to go beyond knowing what data they have and understand exactly how it is being accessed, moved, and modified.
To help, we’re excited to announce activity insights within Storage Insights datasets. Now generally available, these new views provide visibility into the operational details of your Google Cloud Storage assets, enabling data-driven cost optimization and faster troubleshooting. For example, with activity insights, you can answer questions like:
- Are my objects located in the right storage classes within my buckets?
- What regions is my bucket interacting with the most so I can assess if it is optimally located?
- Where are there errors across operations on my storage estate and why?
Answering these questions confidently is the key to unlocking cost optimizations and reclaiming engineering time. Storage Insights datasets, a feature of Storage Intelligence for Cloud Storage, provide daily metadata and frequent activity insights (typically within four hours of the activity) so you have better visibility into your storage estate. While Storage Intelligence is a unified management product with capabilities like Bucket relocation, Batch operations, and Gemini Cloud Assist, this blog focuses on how you can leverage Storage Insights datasets for operational optimization.
What are Storage Insights datasets?
Storage Insights datasets deliver an automated, query-ready BigQuery index of your entire storage estate, complete with raw metadata and activity insights, replacing manual, error-prone data collection. Storage Insights datasets can be customized in scope: create a dataset for your entire org, a specific folder, one or more projects, or even specific buckets. The dataset then refreshes with regular updates, giving you a comprehensive view of your storage.
From static metadata to live intelligence
Storage Insights datasets are your go-to tool for understanding your storage metadata. They act as an inventory management tool, scanning object metadata (storage class, location, age, custom metadata) and organizing it into a powerful, queryable BigQuery-linked dataset. This is crucial for knowing what data you have (learn more about how to optimize storage spend with Storage Insights datasets here).
But what if you also knew how and when that data is being used?
Storage Insights datasets now offer a set of new views that capture:
- Object-level activity, including writes, updates, deletes, and errors
- Bucket-level aggregate activity, including total object operations, a breakdown by type of operations, total errors, and most active prefixes
- Bucket-level regional traffic activity, including ingress and egress bytes per region that interact with your bucket
- Project-level aggregate activity, including total object operations, a breakdown by type of operations, and total errors
This data flows directly into new BigQuery views within your dataset so you can run analytics queries for specific insights, interact with the data via Gemini or simply connect it to powerful Looker dashboards for visualization.
This moves you from a static snapshot to a dynamic, queryable analysis of your data’s entire lifecycle. It’s the difference between knowing what’s in your warehouse and knowing what’s used and when.
Three ways to use activity insights immediately
Here’s what you can do, starting today, with activity insights in Storage Insights datasets.
1. Right-size your storage estate
- The challenge: You have terabytes of data in Standard or Nearline class storage that you believe is cold. But without proof, moving it to Coldline or Archive class is risky. What if a critical process still needs to read it once per quarter?
- The solution: With the new Storage Intelligence views that surface activity insights, you can now identify buckets that have had minimal read/write activity over the last 30, 60, or 90 days.
- The outcome: Apply or fine-tune lifecycle policies to transition this data to more cost-effective storage classes.
For example, here’s a SQL query that lists the buckets in your estate with little to no activity in the last six months, least active first:
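A query of the kind described above could look roughly like the following. This is only a sketch, not the actual Storage Insights schema: the `buckets` and `bucket_activity` tables, their columns, the 180-day window, and the 100-operation threshold are all hypothetical stand-ins. To keep it runnable anywhere, the SQL is executed against an in-memory SQLite database with made-up rows rather than against BigQuery.

```python
import sqlite3
from datetime import date, timedelta

# Hypothetical stand-ins for a bucket inventory and a daily bucket-level
# activity view. These names are NOT the real Storage Insights schema.
SCHEMA = """
CREATE TABLE buckets (bucket_name TEXT PRIMARY KEY);
CREATE TABLE bucket_activity (
    bucket_name   TEXT,
    activity_date TEXT,     -- ISO-8601 date string
    read_ops      INTEGER,
    write_ops     INTEGER
);
"""

# The LEFT JOIN keeps buckets that have no activity rows in the window,
# which are the strongest candidates for a colder storage class.
LEAST_ACTIVE_SQL = """
SELECT b.bucket_name,
       COALESCE(SUM(a.read_ops + a.write_ops), 0) AS total_ops
FROM buckets AS b
LEFT JOIN bucket_activity AS a
       ON a.bucket_name = b.bucket_name
      AND a.activity_date >= :cutoff
GROUP BY b.bucket_name
HAVING COALESCE(SUM(a.read_ops + a.write_ops), 0) <= :max_ops
ORDER BY total_ops ASC, b.bucket_name ASC
"""


def least_active_buckets(conn, as_of, lookback_days=180, max_ops=100):
    """Return (bucket_name, total_ops) pairs, least active first."""
    cutoff = (as_of - timedelta(days=lookback_days)).isoformat()
    params = {"cutoff": cutoff, "max_ops": max_ops}
    return conn.execute(LEAST_ACTIVE_SQL, params).fetchall()


def demo_connection():
    """Build an in-memory database holding made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO buckets VALUES (?)",
        [("logs-archive",), ("ml-training",), ("web-assets",)],
    )
    conn.executemany(
        "INSERT INTO bucket_activity VALUES (?, ?, ?, ?)",
        [
            ("logs-archive", "2024-11-15", 500, 20),     # outside the window
            ("logs-archive", "2025-03-01", 3, 0),
            ("ml-training", "2025-06-20", 40000, 1200),
        ],
    )
    return conn
```

With the sample rows, `least_active_buckets(demo_connection(), date(2025, 7, 1))` returns `web-assets` with 0 operations followed by `logs-archive` with 3, while the busy `ml-training` bucket is filtered out. To adapt the idea to a real dataset, you would swap in the actual view and column names from your linked BigQuery dataset and compute the cutoff in SQL instead of passing it in as a parameter.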






