Agent

VLM-powered controller for desktop automation.

Import

from nen import Agent

Constructor

Agent(model: str | None = None)
Parameter  Type        Default  Description
model      str | None  None     Default VLM model for all calls

Methods

execute()

agent.execute(instruction: str, max_iterations: int | None = None, model: str | None = None) -> dict
Perform an action on the virtual desktop via natural language.
Parameters:
Name            Type        Default     Description
instruction     str         (required)  What to do
max_iterations  int | None  None        Max screenshot → think → act loops. The server defaults to 10 when not set
model           str | None  None        Override the default model
Returns: dict, execution result metadata
Raises: WorkflowError
agent.execute("Click the Submit button", max_iterations=5)
agent.execute("Fill the signup form with all required fields", max_iterations=20)
agent.execute("Navigate to Settings", model="claude-haiku-4-5-20251001")
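Calls like the ones above can be chained into a small runner that stops at the first failure. This is a minimal sketch, not part of the nen API: `run_steps`, `SupportsExecute`, and `FakeAgent` are hypothetical names introduced for illustration. This page does not show where `WorkflowError` is imported from, so the sketch catches a generic `Exception`; in real code you would likely narrow it to `WorkflowError`.

```python
from __future__ import annotations

from typing import Any, Protocol


class SupportsExecute(Protocol):
    """Structural type for anything with an Agent-like execute() method."""

    def execute(self, instruction: str, max_iterations: int | None = None,
                model: str | None = None) -> dict: ...


def run_steps(agent: SupportsExecute, steps: list[str],
              max_iterations: int | None = None) -> list[dict[str, Any]]:
    """Run instructions in order and stop at the first one that raises."""
    results: list[dict[str, Any]] = []
    for step in steps:
        try:
            result = agent.execute(step, max_iterations=max_iterations)
        except Exception as exc:  # assumption: narrow to WorkflowError in real code
            results.append({"instruction": step, "ok": False, "error": str(exc)})
            break
        results.append({"instruction": step, "ok": True, "result": result})
    return results


class FakeAgent:
    """Hypothetical stand-in so the sketch runs without a desktop session."""

    def execute(self, instruction, max_iterations=None, model=None):
        if "Settings" in instruction:
            raise RuntimeError("element not found")
        return {"instruction": instruction}


report = run_steps(
    FakeAgent(),
    ["Click the Submit button", "Navigate to Settings", "Fill the signup form"],
    max_iterations=5,
)
```

With a real `Agent` in place of `FakeAgent`, `report` holds one entry per attempted instruction, and the last entry shows whether the run finished or where it stopped.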

verify()

agent.verify(condition: str, timeout: int = 10, model: str | None = None) -> bool
Check whether a condition is visible on screen.
Parameters:
Name       Type        Default     Description
condition  str         (required)  Expected screen state
timeout    int         10          Seconds to wait
model      str | None  None        Override the default model
Returns: bool, True if the condition is met within the timeout
agent.verify("Is the user logged in?")
agent.verify("Has the page loaded?", timeout=30)
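Because `verify()` returns a plain `bool`, it pairs naturally with `execute()` as a post-condition check. The following is a rough sketch of that pattern under stated assumptions: `execute_and_verify` and `FakeAgent` are hypothetical names, not part of nen, and retrying the same instruction is only appropriate when repeating it is harmless.

```python
from __future__ import annotations


def execute_and_verify(agent, instruction: str, condition: str,
                       attempts: int = 2, timeout: int = 10) -> bool:
    """Run an instruction, then confirm the expected screen state.

    Repeats the instruction up to `attempts` times while verify() returns False.
    """
    for _ in range(attempts):
        agent.execute(instruction)
        if agent.verify(condition, timeout=timeout):
            return True
    return False


class FakeAgent:
    """Hypothetical stand-in: verify() succeeds only after the second execute()."""

    def __init__(self):
        self.calls = 0

    def execute(self, instruction, max_iterations=None, model=None):
        self.calls += 1
        return {}

    def verify(self, condition, timeout=10, model=None):
        return self.calls >= 2


agent = FakeAgent()
logged_in = execute_and_verify(
    agent, "Click the Log in button", "Is the user logged in?", attempts=3
)
```

Swapping `FakeAgent()` for a real `Agent` instance leaves the helper unchanged, since it relies only on the `execute()` and `verify()` signatures documented on this page.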

extract()

agent.extract(query: str, schema: dict, model: str | None = None) -> dict | list
Extract structured data from the current screen.
Parameters:
Name    Type        Default     Description
query   str         (required)  What data to extract
schema  dict        (required)  JSON Schema for the output format
model   str | None  None        Override the default model
Returns: dict | list, structured data matching the schema
Raises: WorkflowError, ValueError (if schema is empty)
data = agent.extract("Extract the order total", {"type": "object", "properties": {"total": {"type": "number"}}, "required": ["total"]})
Use YourModel.model_json_schema() (Pydantic v2) to generate the schema from a Pydantic model instead of writing it by hand.
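For flat results where Pydantic is not in use, the schema dict can also be built with a few lines of standard-library Python. This is only a sketch: `object_schema` is a hypothetical helper, not something nen provides, and it handles only top-level fields with simple JSON Schema type names.

```python
from __future__ import annotations


def object_schema(**fields: str) -> dict:
    """Build a flat JSON Schema object in which every field is required.

    Each keyword maps a property name to a JSON Schema type name,
    such as "string", "number", "integer", or "boolean".
    """
    if not fields:
        # extract() raises ValueError on an empty schema, so fail early here too.
        raise ValueError("at least one field is required")
    return {
        "type": "object",
        "properties": {name: {"type": type_name} for name, type_name in fields.items()},
        "required": list(fields),
    }


# Produces the same schema as the hand-written order-total example.
order_schema = object_schema(total="number")
```

The result can be passed directly as the second argument, for example `agent.extract("Extract the order total", order_schema)`.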