Skip to main content

Overview

Every Nen desktop exposes two HTTP endpoints that give you full control over the desktop:
EndpointMethodDescription
/desktops/{id}/toolsGETDiscover available tools as JSON Schema
/desktops/{id}/executePOSTExecute a tool action and get a screenshot back
Base URL: https://desktop.api.getnen.ai
Your agent runs anywhere — your laptop, your cloud, your customer’s infra. Nen is just the execution backend.

Authentication

  1. Get your API key from the Nen Dashboard
  2. Pass it as the Authorization: Bearer header
export NEN_API_KEY="sk_nen_..."
That’s it — no session management needed.

Tool Discovery

GET /desktops/{id}/tools returns an array of tool definitions in JSON Schema format:
curl https://desktop.api.getnen.ai/desktops/{desktop_id}/tools \
  -H "Authorization: Bearer $NEN_API_KEY"
[
  {
    "name": "computer",
    "description": "Control the computer's mouse, keyboard, and screen.",
    "parameters": {
      "type": "object",
      "properties": {
        "action": {
          "type": "string",
          "enum": ["screenshot", "key", "type", "mouse_move", "left_click", "right_click", "double_click", "scroll", "wait"]
        },
        "text": { "type": "string", "description": "Text to type or key combo (e.g. 'Return', 'ctrl+a')" },
        "coordinate": { "type": "array", "items": { "type": "integer" }, "description": "[x, y] screen coordinates" },
        "scroll_direction": { "type": "string", "enum": ["up", "down", "left", "right"] },
        "scroll_amount": { "type": "integer" }
      },
      "required": ["action"]
    }
  }
]
Pass these directly to your LLM’s tool/function calling API.

Tool Execution

POST /desktops/{id}/execute runs a tool and returns the result plus a post-action screenshot. Request:
curl -X POST https://desktop.api.getnen.ai/desktops/{desktop_id}/execute \
  -H "Authorization: Bearer $NEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "action": {
      "tool": "computer",
      "action": "left_click",
      "params": { "coordinate": [500, 300] }
    }
  }'
Headers:
HeaderValue
AuthorizationBearer sk_nen_...
Content-Typeapplication/json
Response:
{
  "status": "ok",
  "output": "clicked at (500, 300)",
  "base64_image": "iVBORw0KGgo...",
  "coordinate": [500, 300]
}
FieldTypeDescription
statusstringAlways "ok" on a successful execution.
outputstringOptional. Short text description of what happened. Only present when the underlying action produced text output.
base64_imagestringOptional. PNG screenshot, base64-encoded. Produced by actions that inherently return a screenshot (e.g. screenshot); plain actions like left_click may not include one.
coordinate[int, int]Optional. Final cursor position, if the action moved or reported the cursor.
status is the only guaranteed field; the rest depend on the action. A successful left_click with no reported output can legitimately return just {"status": "ok"}.
The screenshot after each action is how your agent observes the desktop. Feed it back to your VLM to decide the next step.

Error Handling

When a tool execution fails, the response is a single-field error object with an HTTP 4xx/5xx status:
{
  "error": "Coordinate out of bounds: [5000, 3000]"
}
To recover and observe current state, issue a follow-up screenshot call.

Next Steps

Supported Tools

All tools and supported models

Computer Tool

Full reference for all computer actions