Computer Tool

Overview

The computer tool provides full control over the desktop — taking screenshots, clicking, typing, scrolling, and keyboard shortcuts. It’s the primary tool available on every cloud desktop.

Actions

screenshot

Capture the current screen state.

{"name": "computer", "arguments": {"action": "screenshot"}}

Returns a base64-encoded PNG in the screenshot field.

left_click

Click at specific screen coordinates.

{"name": "computer", "arguments": {"action": "left_click", "coordinate": [500, 300]}}

right_click

Right-click at specific screen coordinates.

{"name": "computer", "arguments": {"action": "right_click", "coordinate": [500, 300]}}

double_click

Double-click at specific screen coordinates.

{"name": "computer", "arguments": {"action": "double_click", "coordinate": [500, 300]}}

mouse_move

Move the cursor without clicking.

{"name": "computer", "arguments": {"action": "mouse_move", "coordinate": [500, 300]}}

type

Type a string of text at the current cursor position.

{"name": "computer", "arguments": {"action": "type", "text": "Hello, world!"}}

key

Press a key or key combination.

{"name": "computer", "arguments": {"action": "key", "text": "Return"}}

Common key names: Return, Tab, Escape, Backspace, Delete, space Key combinations use +:

{"name": "computer", "arguments": {"action": "key", "text": "ctrl+a"}}
{"name": "computer", "arguments": {"action": "key", "text": "ctrl+c"}}
{"name": "computer", "arguments": {"action": "key", "text": "alt+F4"}}

scroll

Scroll in a direction.

{"name": "computer", "arguments": {"action": "scroll", "coordinate": [500, 300], "scroll_direction": "down", "scroll_amount": 3}}

Parameter	Type	Description
`coordinate`	`[x, y]`	Where to scroll
`scroll_direction`	`string`	`up`, `down`, `left`, `right`
`scroll_amount`	`integer`	Number of scroll increments

wait

Wait for a specified duration (useful for loading screens).

{"name": "computer", "arguments": {"action": "wait"}}

cursor_position

Get the current cursor position.

{"name": "computer", "arguments": {"action": "cursor_position"}}

Response Format

Every action returns:

{
  "success": true,
  "output": "clicked at (500, 300)",
  "screenshot": "iVBORw0KGgo..."
}

Field	Type	Description
`success`	`boolean`	Whether the action succeeded
`output`	`string`	Human-readable description of what happened
`screenshot`	`string`	Base64-encoded PNG of the screen after the action
`error`	`string`	Error message (only present on failure)

Getting Started

Computer-Use Desktops

Managed Workflows

Help

Changelog

Computer Tool

Overview

Actions

screenshot

left_click

right_click

double_click

mouse_move

type

key

scroll

wait

cursor_position

Response Format

Next Steps

Quickstart

Examples

Getting Started

Computer-Use Desktops

Managed Workflows

Help

Changelog

​Overview

​Actions

​screenshot

​left_click

​right_click

​double_click

​mouse_move

​type

​key

​scroll

​wait

​cursor_position

​Response Format

​Next Steps

Quickstart

Examples

Overview

Actions

screenshot

left_click

right_click

double_click

mouse_move

type

key

scroll

wait

cursor_position

Response Format

Next Steps