We recently announced an AI-first Colab Enterprise notebook experience in BigQuery and Vertex AI to help you simplify and transform your data science and analytics workflows. Colab Enterprise notebooks come with a built-in Data Science Agent that accelerates data science development with agentic capabilities for data exploration, transformation, and machine learning modeling. From a simple prompt, the agent generates a detailed plan for your workflow, from data loading and cleaning to model training and evaluation.
Today, we’re introducing powerful new features in the Data Science Agent to further simplify and scale your analytical journeys, especially with large and open-format datasets.
Generate BigQuery ML, BigQuery DataFrames, and Spark code
You can now harness the power of BigQuery Machine Learning (ML), BigQuery DataFrames (BigFrames), and Spark for large-scale data processing directly within the Data Science Agent. BigQuery ML and BigQuery DataFrames allow you to scale up data transformation, model training, and inference by running them directly on BigQuery. And with Serverless for Apache Spark, you can perform distributed data processing on large datasets, allowing you to work with data that is too large to fit into memory on a single machine.
To invoke one of these tools, include any of its keywords in your prompt:
- For BigQuery ML: use “BigQuery ML”, “BQML”, or “SQL”
- For BigQuery DataFrames: specify “BigQuery DataFrames” or “BigFrames”
- For PySpark: include “Spark” or “PySpark”
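To make the keyword list concrete, here is a minimal Python sketch of how a prompt could be checked against those keywords. It is an illustration only, not the Data Science Agent's actual routing logic: the `KEYWORDS` table and the `detect_tools` function are hypothetical names, and case-insensitive, whole-word matching is an assumption rather than documented agent behavior.

```python
import re

# Keywords copied from the list above. The mapping and the matching rules
# are illustrative assumptions, not the agent's real implementation.
KEYWORDS = {
    "BigQuery ML": ["bigquery ml", "bqml", "sql"],
    "BigQuery DataFrames": ["bigquery dataframes", "bigframes"],
    "PySpark": ["spark", "pyspark"],
}


def detect_tools(prompt: str) -> list[str]:
    """Return the tools whose keywords appear in the prompt."""
    text = prompt.lower()  # assumed: matching ignores case
    matched = []
    for tool, keywords in KEYWORDS.items():
        # Word boundaries keep "sql" from matching inside a word like "mysql".
        if any(re.search(rf"\b{re.escape(k)}\b", text) for k in keywords):
            matched.append(tool)
    return matched


print(detect_tools("Use BigFrames to forecast weekly sales"))
# → ['BigQuery DataFrames']
```

A prompt that names none of the keywords, such as "Plot a histogram of sales", returns an empty list in this sketch, which corresponds to the agent not being steered toward any of the three tools.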