Computer Use

Overview

Computer Use enables agents to interact with graphical applications via screenshot analysis and mouse/keyboard input.

Requirements

Computer Use has two requirements:

Vision model: must support image input (e.g., claude-opus-4-5, gpt-5.2, gemini-3-pro)
Desktop template: must use a template that runs a display server (X11/Wayland)

Desktop Templates

Use one of our desktop-enabled templates:

Template	Description
`ubuntu-desktop`	Ubuntu 24.04 with XFCE, Chrome, Firefox
`desktop-dev`	Ubuntu + VS Code, Node.js, Python
`desktop-browser`	Minimal desktop with Chrome only

Or create a custom template with supportsDesktop: true.

Enabling Computer Use

const runtime = await rt.runtimes.create({
  slug: 'browser-agent',
  model: 'claude-opus-4-5',  // Must support vision
  tools: [
    'bash',
    'read_file',
    'edit_file',
    'computer_use',  // Enable computer use
  ],
});

// Deploy with a desktop template
const deployment = await rt.deployments.create({
  runtimeSlug: 'browser-agent',
  templateSlug: 'ubuntu-desktop',  // Must be desktop-enabled
  apiSlug: 'my-browser-bot',
});

If you try to deploy a runtime that includes the computer_use tool to a non-desktop template, the deployment fails with a validation error.
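
The rule above can also be checked locally before calling the API. The following is a minimal sketch, not part of the SDK: the `RuntimeSpec` and `TemplateSpec` shapes, the `checkDesktopCompatibility` helper, and the `headless-example` template slug are all hypothetical and exist only for illustration.

```typescript
// Hypothetical shapes for illustration; the real SDK types may differ.
interface RuntimeSpec {
  slug: string;
  tools: string[];
}

interface TemplateSpec {
  slug: string;
  supportsDesktop: boolean;
}

// Returns an error message if the pairing would be rejected, or null if it looks valid.
function checkDesktopCompatibility(runtime: RuntimeSpec, template: TemplateSpec): string | null {
  const needsDesktop = runtime.tools.includes('computer_use');
  if (needsDesktop && !template.supportsDesktop) {
    return `Runtime "${runtime.slug}" uses computer_use, but template "${template.slug}" is not desktop-enabled`;
  }
  return null;
}

const exampleRuntime: RuntimeSpec = { slug: 'browser-agent', tools: ['bash', 'computer_use'] };
const desktopTemplate: TemplateSpec = { slug: 'ubuntu-desktop', supportsDesktop: true };
// 'headless-example' is a made-up slug standing in for any non-desktop template.
const headlessTemplate: TemplateSpec = { slug: 'headless-example', supportsDesktop: false };

console.log(checkDesktopCompatibility(exampleRuntime, desktopTemplate));  // null
console.log(checkDesktopCompatibility(exampleRuntime, headlessTemplate)); // error message string
```

Running a check like this first surfaces the mismatch before a deployment request is made; the server-side validation remains the source of truth.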

Computer Use Tools

When enabled, agents have access to:

Tool	Description
`screenshot`	Capture the screen for analysis
`click`	Click at coordinates
`type`	Type text
`scroll`	Scroll up/down
`key`	Press keyboard keys
`mouse_move`	Move mouse cursor
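
When logging or auditing agent activity, it can help to model these tools as a typed union. The sketch below is an assumption-heavy illustration: only the tool names come from the table above, while the argument names (`x`, `y`, `text`, `direction`, `keys`) and the `describeAction` helper are hypothetical and may not match the actual tool schemas.

```typescript
// Hypothetical argument shapes; only the tool names come from the table above.
type ComputerAction =
  | { tool: 'screenshot' }
  | { tool: 'click'; x: number; y: number }
  | { tool: 'type'; text: string }
  | { tool: 'scroll'; direction: 'up' | 'down' }
  | { tool: 'key'; keys: string }
  | { tool: 'mouse_move'; x: number; y: number };

// Produces a one-line, human-readable summary of an action for logging.
function describeAction(action: ComputerAction): string {
  switch (action.tool) {
    case 'screenshot':
      return 'capture the screen';
    case 'click':
      return `click at (${action.x}, ${action.y})`;
    case 'type':
      return `type "${action.text}"`;
    case 'scroll':
      return `scroll ${action.direction}`;
    case 'key':
      return `press ${action.keys}`;
    case 'mouse_move':
      return `move cursor to (${action.x}, ${action.y})`;
  }
}

console.log(describeAction({ tool: 'click', x: 640, y: 360 })); // click at (640, 360)
```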

Example: Browser Automation

const run = await rt.agents.run('browser-agent', {
  message: 'Go to google.com, search for "RunTools", and screenshot the results',
});

for await (const event of run) {
  if (event.type === 'tool_call' && event.tool === 'screenshot') {
    console.log('Agent took a screenshot');
  }
}
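
The same event loop can be generalized to tally every tool the agent invokes. This is a rough sketch that assumes events carry the `type` and `tool` fields used in the example above; the `AgentEvent` interface, `countToolCalls`, and the `fakeRun` stand-in stream are hypothetical names introduced here so the snippet runs without the SDK.

```typescript
// Assumed event shape, based on the fields used in the example above.
interface AgentEvent {
  type: string;
  tool?: string;
}

// Counts tool calls per tool name from any async stream of events.
async function countToolCalls(events: AsyncIterable<AgentEvent>): Promise<Map<string, number>> {
  const counts = new Map<string, number>();
  for await (const event of events) {
    if (event.type === 'tool_call' && event.tool !== undefined) {
      counts.set(event.tool, (counts.get(event.tool) ?? 0) + 1);
    }
  }
  return counts;
}

// Stand-in stream for local experimentation; a real run would be passed instead.
async function* fakeRun(): AsyncGenerator<AgentEvent> {
  yield { type: 'tool_call', tool: 'screenshot' };
  yield { type: 'tool_call', tool: 'click' };
  yield { type: 'message' };
  yield { type: 'tool_call', tool: 'screenshot' };
}

countToolCalls(fakeRun()).then((counts) => {
  console.log(counts.get('screenshot')); // 2
  console.log(counts.get('click'));      // 1
});
```

A tally like this gives a quick view of how screenshot-heavy a task is, which matters because each screenshot adds to token usage.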

VNC Access

Access the sandbox desktop via VNC:

// `sandbox` is a desktop sandbox, e.g. one returned by rt.sandboxes.create()
const vnc = await sandbox.getVNC();
console.log(vnc.url);      // wss://sandbox-abc123.vnc.runtools.ai
console.log(vnc.password); // Connection password

Use any VNC client or our web viewer in the dashboard.

Display Configuration

Set the virtual display's resolution and color depth (bits per pixel) when creating a sandbox:

const sandbox = await rt.sandboxes.create({
  template: 'ubuntu-desktop',  // Any desktop-enabled template
  display: {
    width: 1920,
    height: 1080,
    depth: 24,
  },
});

Use Cases

Browser Automation

Navigate websites, fill forms, extract data

Desktop Apps

Use applications without APIs

Testing

UI testing and screenshot comparisons

Legacy Systems

Automate systems without modern APIs

Supported Applications

Computer Use works with any GUI application:

Web browsers (Chrome, Firefox)
Office applications
IDEs (VS Code, JetBrains)
Design tools
Any X11 application

Best Practices

Use appropriate resolution

Higher resolutions produce larger screenshots, which consume more tokens. Choose the lowest resolution at which the UI remains legible.
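
To get a feel for the trade-off, compare pixel counts between candidate resolutions. The sketch below is only a rough proxy: the actual token cost per screenshot depends on the model and is not stated here. The `DisplayConfig` interface mirrors the `display` fields shown under Display Configuration; the helper names are hypothetical.

```typescript
// Mirrors the display fields from the Display Configuration example.
interface DisplayConfig {
  width: number;
  height: number;
  depth: number; // bits per pixel
}

// Total number of pixels in one frame.
function pixelCount(display: DisplayConfig): number {
  return display.width * display.height;
}

// Size of one uncompressed frame in bytes (depth is given in bits).
function rawFrameBytes(display: DisplayConfig): number {
  return pixelCount(display) * (display.depth / 8);
}

const fullHd: DisplayConfig = { width: 1920, height: 1080, depth: 24 };
const hd720: DisplayConfig = { width: 1280, height: 720, depth: 24 };

console.log(pixelCount(fullHd));                     // 2073600
console.log(pixelCount(hd720));                      // 921600
console.log(pixelCount(fullHd) / pixelCount(hd720)); // 2.25
console.log(rawFrameBytes(fullHd));                  // 6220800
```

Under these numbers, a 1920x1080 frame carries 2.25 times as many pixels as a 1280x720 frame, so dropping the resolution is worth testing whenever the target UI stays readable.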

Be specific about UI elements

Tell the agent exactly what to look for, for example: “Click the blue ‘Submit’ button.”

Handle loading states

Instruct the agent to wait for pages and applications to finish loading before interacting with them.
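
If you drive the desktop from your own code, one common way to detect that loading has finished is to poll until the screen stops changing. This is a minimal sketch of that idea, not a platform feature: the `Capture` type, `waitUntilStable`, and the fake capture function are hypothetical, and a real capture function would need to return some comparable fingerprint of the screen, such as a hash of the screenshot bytes.

```typescript
// Hypothetical capture function: returns a comparable fingerprint of the screen.
type Capture = () => Promise<string>;

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Polls until two consecutive captures match, or gives up after maxAttempts polls.
async function waitUntilStable(
  capture: Capture,
  intervalMs = 500,
  maxAttempts = 20,
): Promise<boolean> {
  let previous = await capture();
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    await sleep(intervalMs);
    const current = await capture();
    if (current === previous) return true; // screen unchanged since the last poll
    previous = current;
  }
  return false; // screen never settled within the attempt budget
}

// Fake screen that changes twice and then settles, for local experimentation.
const fakeScreens = ['loading-1', 'loading-2', 'loaded', 'loaded'];
let fakeScreenIndex = 0;
const fakeCapture: Capture = async () =>
  fakeScreens[Math.min(fakeScreenIndex++, fakeScreens.length - 1)];

waitUntilStable(fakeCapture, 10).then((stable) => console.log(stable)); // true
```

Pages with animations or blinking cursors may never produce two identical frames, so a real implementation would likely need a tolerance threshold rather than strict equality.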

Prefer APIs when available

Computer Use is slower and less reliable than direct API calls. Use it only when no suitable API exists.
