Unlocking Native Computer Use: Setting up CUA with Gemini 3.5 Flash
In the rapidly evolving landscape of agentic workflows, desktop automation represents the next frontier. Historically, building agents that interact with desktop interfaces meant wrapping complex UI libraries or building fragile, pixel-coordinate-based models.
Google’s early-access release of the native Computer Use API, powered by Gemini 3.5 Flash, offers a native alternative. On trycua's Cua-Bench KiCad EDA suite, Gemini 3.5 Flash posted the highest mean reward (0.267) of any frontier model tested. It edged out GPT-5.5, which had more outright full solves (6 vs 5) but earned no partial credit. It also showed precise visual grounding on zoomed-in CAD targets and genuine analog-design reasoning.
This article walks through the end-to-end setup to hook up Gemini 3.5 Flash to the CUA (Computer-Use Agent) infrastructure layer using the Google Antigravity developer environment.
High-Level Architecture
The integration uses the Model Context Protocol (MCP) so that Antigravity agents (including CLI tools and IDE plugins) can drive your local desktop environment through a standard tool interface.
[Antigravity Agent (CLI/IDE)]
        │
        ▼ (MCP over stdio)
[cua-driver (symlinked binary)]
        │
        ▼ (Unix Domain Socket / UDS)
[CuaDriver.app (Daemon)] ───► [macOS Screen & Input APIs]
- CuaDriver.app: A daemon running locally that interacts directly with macOS accessibility and screen recording APIs.
- cua-driver CLI: A lightweight command-line interface that acts as the entry point and communicates with the daemon over a Unix Domain Socket (UDS).
- Antigravity: An AI-first agent that loads the MCP server configuration and leverages its tool interface to execute actions.
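To make the first hop in the diagram concrete, here is a minimal sketch of the kind of message an MCP client writes to the server's stdin. It follows the MCP stdio convention of one JSON-RPC 2.0 message per line. The client name, client version, and protocol revision string are illustrative assumptions, and Antigravity normally performs this handshake for you.

```shell
#!/usr/bin/env bash
# Sketch: the JSON-RPC 2.0 "initialize" request an MCP client sends first.
# The clientInfo values and protocolVersion are illustrative assumptions.
mcp_initialize_request() {
  printf '%s\n' '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"example-client","version":"0.0.1"}}}'
}

# Print the message so its shape is visible.
mcp_initialize_request
```

In principle, piping this output into cua-driver mcp would exercise the same stdio channel that Antigravity uses, though the exact response depends on the installed driver version.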
Step-by-Step Integration Guide
1. Installing CUA Driver
The CUA Driver provides the infrastructure interface. Run the official macOS pre-release installation script to download and install the Rust-based daemon:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/\
cua/main/libs/cua-driver/scripts/install.sh)" --backend=rust
This installation script:
- Downloads the latest cua-driver-rs universal release for macOS.
- Places CuaDriver.app in /Applications.
- Symlinks the executable to ~/.local/bin/cua-driver.
Verify that the binary resolves correctly on your PATH:
cua-driver --version
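If the version check fails with "command not found", a likely cause is that ~/.local/bin is not on your PATH. The following is a small sketch of that check; path_contains_dir is a hypothetical helper name, not part of cua-driver.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if directory $1 is an entry in the
# colon-separated list $2 (defaults to the current PATH).
path_contains_dir() {
  case ":${2-$PATH}:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

bin_dir="$HOME/.local/bin"   # symlink location used by the installer
if path_contains_dir "$bin_dir"; then
  echo "$bin_dir is on PATH"
else
  echo "$bin_dir is missing from PATH; add: export PATH=\"$bin_dir:\$PATH\""
fi
```

Wrapping the list in colons before matching avoids false positives on partial directory names such as /usr matching /usr/bin.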
2. Starting the Daemon & Granting macOS Permissions
For CUA to capture screenshots and dispatch mouse/keyboard events, you must grant macOS Accessibility and Screen Recording permissions.
First, launch the driver daemon:
open -n -g -a CuaDriver --args serve
Next, trigger the permission grant workflow:
cua-driver permissions grant
This command launches the app through LaunchServices. macOS will prompt you to enable the required permissions. Follow the settings dialogs to allow Cua Driver.
Once granted, verify status using the read-only check:
cua-driver permissions status
Ensure both Accessibility and Screen Recording return ✅ granted.
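For scripted setups, the status check can gate later steps. The sketch below assumes the status output puts each permission name on the same line as its ✅ mark, as the expected result above suggests. The exact output format is an assumption, and permissions_all_granted is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads status text on stdin and succeeds only if both
# permissions appear on a line carrying the ✅ mark.
# ASSUMPTION: name and mark share a line; adjust the patterns if they do not.
permissions_all_granted() {
  local status_text
  status_text="$(cat)"
  printf '%s\n' "$status_text" | grep -q 'Accessibility.*✅' &&
    printf '%s\n' "$status_text" | grep -q 'Screen Recording.*✅'
}

if command -v cua-driver >/dev/null 2>&1; then
  if cua-driver permissions status | permissions_all_granted; then
    echo "Both permissions granted"
  else
    echo "Permissions missing; run: cua-driver permissions grant"
  fi
else
  echo "cua-driver not found on PATH"
fi
```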
3. Configuring Antigravity MCP Workspace
To make CUA tools visible to the Antigravity agent, you must define a workspace-level MCP configuration. Under the root of your working directory, create a .agents/mcp_config.json file:
{
"mcpServers": {
"cua-driver": {
"command": "/Users/adnanahmad/.local/bin/cua-driver",
"args": ["mcp"]
}
}
}
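The command value shown here is the author's path; replace /Users/adnanahmad with your own home directory so that it points to your cua-driver symlink. Antigravity automatically discovers the .agents/mcp_config.json file and starts the cua-driver server as a background process, communicating with it over standard I/O (stdio).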
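Because the command path is specific to each machine, one option is to generate the file from $HOME rather than typing it by hand. Here is a minimal sketch, where write_mcp_config is a hypothetical helper and the JSON mirrors the configuration above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: writes .agents/mcp_config.json under workspace dir $1
# (default: current directory), pointing at the cua-driver symlink in the
# current user's home directory. Overwrites any existing file.
# Assumes $HOME contains no characters that need JSON escaping.
write_mcp_config() {
  local workspace="${1:-.}"
  mkdir -p "$workspace/.agents" || return 1
  cat > "$workspace/.agents/mcp_config.json" <<EOF
{
  "mcpServers": {
    "cua-driver": {
      "command": "$HOME/.local/bin/cua-driver",
      "args": ["mcp"]
    }
  }
}
EOF
}
```

Calling write_mcp_config "$PWD" from the workspace root creates the file in place, which keeps the configuration portable across user accounts.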
Verification & Usage
To verify the setup, launch the Antigravity CLI (agy) or open your project in the Antigravity IDE within the configured workspace directory.
Run /mcp in the CLI session to confirm that the cua-driver server is loaded and lists its tools, such as list_apps, get_accessibility_tree, click, type_text, and drag.
You can now ask the agent to drive your desktop directly:
- "Open the Calculator app, calculate 42 * 9, and show the result."
- "Bring Chrome to the foreground and list the visible tabs."
Key Takeaways
- Scope Configs: Workspace-level configs (.agents/mcp_config.json) keep your global settings clean and load tools only when needed.
- Daemon Integrity: If tools fail to capture the screen, check cua-driver status to ensure the daemon didn't exit, and restart it with cua-driver stop && open -n -g -a CuaDriver --args serve.
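The restart sequence can be wrapped in a small helper. This sketch assumes cua-driver status exits with a non-zero code when the daemon is not running, which this article does not confirm; ensure_cua_daemon is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: restart the daemon only when the status check fails.
# ASSUMPTION: `cua-driver status` returns a non-zero exit code if the daemon
# has exited; verify this against your installed version.
ensure_cua_daemon() {
  if ! command -v cua-driver >/dev/null 2>&1; then
    echo "cua-driver not found on PATH" >&2
    return 127
  fi
  if cua-driver status >/dev/null 2>&1; then
    echo "daemon is running"
    return 0
  fi
  echo "daemon not responding; restarting"
  cua-driver stop >/dev/null 2>&1   # ignore failure if nothing is running
  open -n -g -a CuaDriver --args serve
}
```

Running ensure_cua_daemon before a session is a cheap way to rule out a dead daemon when screen capture fails.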