In the world of generative AI and Retrieval-Augmented Generation (RAG), embeddings are the “secret sauce” that allows machines and AI agents to understand the semantic meaning of data. As BigQuery extends its autonomous data-to-AI platform, embeddings unlock valuable multimodal use cases. However, for many data engineers, managing embeddings is a headache. Traditionally, users have had to build and run their own pipelines to pick up source content updates, generate embeddings, and store the results.
To help BigQuery users with their AI workloads, we’re introducing autonomous embedding generation. This feature allows BigQuery to automatically maintain an embedding column on a table based on a source column. No more manual pipelines, no more synchronization issues, just easy, AI-ready data.
Managing embeddings, the old way
Before autonomous generation, the process of updating your vector search database usually looked like this:
- Detect new rows in your source table.
- Generate embeddings via functions like AI.EMBED.
- Handle rate limits and retries.
- Update the destination table with the new vectors.
- Monitor the progress of your embedding generations.
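To make the cost of that workflow concrete, here is a minimal Python sketch of the manual loop, with plain dictionaries standing in for the source and destination tables. Everything in it is hypothetical and for illustration only: `toy_embed` is a stand-in for a real embedding call, `RateLimitError` is an invented exception, and `sync_embeddings` is not a BigQuery API. It also treats edited rows as needing new embeddings, which is an assumption beyond the steps listed above.

```python
import time


class RateLimitError(Exception):
    """Hypothetical error a throttled embedding service might raise."""


def toy_embed(text):
    """Stand-in for a real embedding call; returns a tiny deterministic vector."""
    return [float(len(text)), float(sum(ord(c) for c in text) % 97)]


def embed_with_retry(embed, text, max_retries=3, base_delay_s=0.0):
    """Call embed(text), retrying with exponential backoff on rate limits."""
    for attempt in range(max_retries + 1):
        try:
            return embed(text)
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries: surface the failure to the caller
            time.sleep(base_delay_s * (2 ** attempt))


def sync_embeddings(source, destination, embed=toy_embed, **retry_opts):
    """Bring destination (row id -> text + embedding) in line with source.

    source:      dict mapping row id -> text
    destination: dict mapping row id -> {"text": ..., "embedding": ...}
    """
    # Step 1: detect rows that are new, or whose text changed (assumption).
    pending = [
        row_id
        for row_id, text in source.items()
        if row_id not in destination or destination[row_id]["text"] != text
    ]
    for row_id in pending:
        # Steps 2 and 3: generate the embedding, handling rate limits.
        vector = embed_with_retry(embed, source[row_id], **retry_opts)
        # Step 4: write the new vector to the destination.
        destination[row_id] = {"text": source[row_id], "embedding": vector}
    # Step 5: report progress so the run can be monitored.
    return {"embedded": len(pending), "total": len(destination)}


if __name__ == "__main__":
    products = {1: "red toy car", 2: "blue kite"}
    vectors = {}
    print(sync_embeddings(products, vectors))
```

Even this toy version leaves out scheduling, deleted rows, partial failures, and cost controls, all of which a production pipeline would also have to own.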
If your data changes frequently, keeping these vectors in sync can become a full-time job for a user or administrator. With this as the backdrop, we set out to enhance BigQuery with the following capabilities.
1. Help users work directly with their data
We want to simplify the search experience so that users can do simple things like AI.SEARCH(TABLE mydataset.products, 'product_description', "A really fun toy") without having to interact with or understand the embeddings.
2. Automatic synchronization
BigQuery should manage embedding generation on behalf of the user and keep generated embeddings in sync with the source data.
3. Tight integration with vector indexes
BigQuery’s VECTOR_SEARCH has many users, and we wanted to ensure that managed embeddings integrate with it directly.
The solution: autonomously generated embedding columns
We solved this by treating embeddings as a managed part of your table. Using familiar SQL syntax, you can now define an autonomous embedding column that BigQuery manages for you.






