Building and Scaling Data Agents with Google Cloud

The Shift to Agentic Data Workflows

Google Cloud is positioning its data stack to move beyond simple SQL assistance toward autonomous, proactive agents. The strategy categorizes investments into three tiers: assistive tools (BigQuery Assistant), persona-specific agents (Data Engineering, Data Science, Conversational Analytics), and developer primitives (SDKs, MCP APIs) for custom agent building.

Persona-Specific Agent Capabilities

Google has moved several task-specific agents into General Availability (GA) to address distinct user needs:

Data Engineering Agent: Automates pipeline construction from natural language, allowing engineers to review and modify generated DAGs and specifications.
Data Science Agent: Integrated into Colab notebooks, this agent handles data exploration, feature engineering, and model validation. The speakers describe model development work that once took weeks or months now finishing in minutes.
Conversational Analytics: Available across Looker, BigQuery, and dashboards, it lets business users run complex tasks such as forecasting, key driver analysis, and anomaly detection without understanding the underlying schemas.
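Part of the review step for agent-generated DAGs can be automated before a human looks at the logic. The sketch below assumes a generated pipeline is exported as a mapping from each task to its upstream tasks. That format and the name review_pipeline are hypothetical, since the talk does not describe the agent's actual output. It only checks for cycles and for dependencies that were never defined.

```python
from graphlib import CycleError, TopologicalSorter


def review_pipeline(dag: dict[str, set[str]]) -> dict:
    """Run basic structural checks on a task -> upstream-tasks mapping.

    The mapping format is an assumption for illustration, not the
    Data Engineering Agent's real output format.
    """
    # Upstream tasks that are referenced but never defined as tasks.
    referenced = {dep for deps in dag.values() for dep in deps}
    undefined = sorted(referenced - set(dag))

    try:
        # static_order() raises CycleError if the graph is not a DAG.
        order = list(TopologicalSorter(dag).static_order())
        cycle = None
    except CycleError as err:
        # The second element of args holds the nodes forming the cycle.
        order, cycle = [], err.args[1]

    return {
        "ok": cycle is None and not undefined,
        "order": order,
        "undefined": undefined,
        "cycle": cycle,
    }


if __name__ == "__main__":
    generated = {
        "load": {"transform"},
        "transform": {"extract"},
        "extract": set(),
    }
    print(review_pipeline(generated))
```

A check like this catches structural mistakes only. Whether the generated transformations are correct still needs a human reviewer.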

Developer-Facing Infrastructure

To support developers building custom agents, Google introduced the Data Agent Kit, which provides a VS Code extension and plugins for Gemini CLI and Codex. Key technical components include:

Model Context Protocol (MCP) Integration: A fully managed remote MCP server eliminates the need for custom glue code to connect external frameworks to BigQuery.
Agent Analytics Plugin: Enables real-time observability into agent performance, including latency, token usage, and sentiment, with data persisted directly into BigQuery for further analysis.
Deep Data Research (Deep Dive): A new feature in Gemini Enterprise that synthesizes structured BigQuery data with unstructured and real-time web information to generate comprehensive, long-form reports rather than simple point-in-time answers.
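For context on what the managed MCP server standardizes: MCP clients and servers exchange JSON-RPC 2.0 messages, and invoking a tool is a tools/call request. The sketch below only builds that request body. The tool name execute_sql and its query argument are assumptions for illustration, not confirmed names from the BigQuery MCP server. Transport, authentication, and session initialization are left out.

```python
import json
from itertools import count

# Monotonic request ids, so responses can be matched to requests.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    payload = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(payload)


if __name__ == "__main__":
    # "execute_sql" and "query" are hypothetical; list the server's
    # tools first to discover the real names and argument schemas.
    print(build_tool_call("execute_sql", {"query": "SELECT 1"}))
```

In practice an agent framework's MCP client produces these messages for you, which is the glue code the managed server is meant to make unnecessary.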

Operationalizing Agents

For enterprise teams, the challenge remains moving from prototype to production. The speakers emphasized that agents must be "proactive and autonomous," capable of triggering workflows based on events or schedules. By exposing these agents via APIs, organizations can embed conversational analytics directly into custom applications, ensuring that governance and data access controls remain consistent across the enterprise.
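The difference between schedule-based and event-based triggering can be sketched in a few lines. Everything here is hypothetical (the class names, the table_updated event type, and the in-process dispatch); a production setup would rely on a managed scheduler and a message bus rather than this toy loop.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Callable, Optional


@dataclass
class ScheduledTrigger:
    """Fires an agent action when at least `interval` has elapsed."""
    interval: timedelta
    action: Callable[[], None]
    last_run: Optional[datetime] = None

    def maybe_fire(self, now: datetime) -> bool:
        if self.last_run is None or now - self.last_run >= self.interval:
            self.action()
            self.last_run = now
            return True
        return False


@dataclass
class EventTriggers:
    """Routes named events to the agent actions registered for them."""
    handlers: dict = field(default_factory=dict)

    def on(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self.handlers.setdefault(event_type, []).append(handler)

    def emit(self, event_type: str, payload: dict) -> int:
        """Invoke every handler for the event; return how many ran."""
        registered = self.handlers.get(event_type, [])
        for handler in registered:
            handler(payload)
        return len(registered)


if __name__ == "__main__":
    bus = EventTriggers()
    bus.on("table_updated", lambda p: print("re-run anomaly check on", p["table"]))
    bus.emit("table_updated", {"table": "orders"})

    nightly = ScheduledTrigger(timedelta(days=1), lambda: print("refresh forecast"))
    nightly.maybe_fire(datetime.now())
```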

Key Takeaways

Standardize on MCP: Use the BigQuery MCP server to reduce the engineering overhead of connecting LLM frameworks to your data warehouse.
Instrument Early: Use the agent analytics plugin to track token usage and latency from day one; treat agent performance as a first-class metric in your MLOps pipeline.
Leverage Existing Surfaces: Instead of building custom UIs, publish agents to Gemini Enterprise or Data Studio to reach business users where they already work.
Human-in-the-Loop: For data engineering, always treat agent-generated pipelines as suggestions; use the provided DAG visualizations to review and refine logic before deployment.
Synthesize Sources: Use 'Deep Dive' capabilities to combine structured warehouse data with unstructured context for higher-quality, report-style outputs.
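To make the "Instrument Early" takeaway concrete, here is a minimal sketch of the kind of per-agent rollup you might compute over logged events. The record fields (agent, latency_ms, input_tokens, output_tokens) are assumed for illustration and are not the plugin's actual schema. In a real deployment the same aggregation would more likely be a SQL query over the BigQuery table where the events are persisted.

```python
from collections import defaultdict
from statistics import mean


def summarize(events: list[dict]) -> dict[str, dict]:
    """Aggregate call count, token usage, and latency per agent.

    Field names are hypothetical; adapt them to the real event schema.
    """
    by_agent = defaultdict(list)
    for event in events:
        by_agent[event["agent"]].append(event)

    summary = {}
    for agent, rows in by_agent.items():
        latencies = [row["latency_ms"] for row in rows]
        summary[agent] = {
            "calls": len(rows),
            "total_tokens": sum(
                row["input_tokens"] + row["output_tokens"] for row in rows
            ),
            "mean_latency_ms": mean(latencies),
            "max_latency_ms": max(latencies),
        }
    return summary


if __name__ == "__main__":
    sample = [
        {"agent": "sales_qa", "latency_ms": 100, "input_tokens": 50, "output_tokens": 20},
        {"agent": "sales_qa", "latency_ms": 300, "input_tokens": 70, "output_tokens": 40},
    ]
    print(summarize(sample))
```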

Notable Quotes

"By 2027, more than 50% of business decisions are going to be augmented or automated by the agent AI." — Ganesh Kumar Gella (citing Gartner, emphasizing the urgency of the agentic shift).
"Tasks that used to take weeks and months now can be accomplished with a matter of minutes." — Manish Srinivasan (on the impact of the Data Science Agent).
"Customers want to be able to take conversational analytics and embed it in their own applications, whether it's their chat application or a custom app." — Manish Srinivasan (explaining the rationale behind the new Conversational Analytics API).
