Browser-Use

Browser-Use is an open-source framework for fast, efficient browser automation. It lets an AI agent interact with websites the way a person would: clicking, typing, scrolling, and navigating. It is well suited to automating repetitive web tasks, extracting data from complex sites, and testing web applications at scale. Hyperbrowser hosts the browser-use framework so you can run agent tasks with a single API call. You can view your Browser-Use tasks in the dashboard.

Browser-Use agents run asynchronously by default. Start a task, then poll for results. Our SDKs include a startAndWait() helper that handles polling automatically and returns when the task completes.

How It Works

You can use Browser-Use in two ways:

Start and Wait: SDKs provide a startAndWait() method that blocks until the task completes and returns the result
Async Pattern: Start a task, get a job ID, then poll for status and results—useful for long-running tasks or when you want more control

Installation

npm install @hyperbrowser/sdk dotenv

Quick Start

The simplest way to run a Browser-Use task is with startAndWait(), which handles the entire lifecycle for you:

import { Hyperbrowser } from "@hyperbrowser/sdk";
import { config } from "dotenv";

config();

const client = new Hyperbrowser({
  apiKey: process.env.HYPERBROWSER_API_KEY,
});

async function main() {
  const result = await client.agents.browserUse.startAndWait({
    task: "Go to Hacker News and tell me the title of the top post",
    llm: "gemini-2.0-flash",
    maxSteps: 20,
  });

  console.log(`Output:\n${result.data?.finalResult}`);
}

main().catch((err) => {
  console.error(`Error: ${err.message}`);
});

Async Pattern

When you need more control, use the async pattern to start a task and poll for results:

import { Hyperbrowser } from "@hyperbrowser/sdk";
import { config } from "dotenv";

config();

const client = new Hyperbrowser({
  apiKey: process.env.HYPERBROWSER_API_KEY,
});

async function main() {
  try {
    // Start the task
    const task = await client.agents.browserUse.start({
      task: "What is the title of the first post on Hacker News today?",
      llm: "gemini-2.0-flash",
      maxSteps: 20,
    });

    console.log(`Task started: ${task.jobId}`);
    console.log(`Watch live: ${task.liveUrl}`);

    // Poll for completion
    let result;
    while (true) {
      result = await client.agents.browserUse.getStatus(task.jobId);
      console.log(`Status: ${result.status}`);

      if (result.status === "completed" || result.status === "failed") {
        break;
      }

      await new Promise((resolve) => setTimeout(resolve, 5000)); // Wait 5s
    }

    const fullResult = await client.agents.browserUse.get(task.jobId);

    if (fullResult.status === "completed") {
      console.log("Result:", fullResult.data?.finalResult);
      console.log("Steps taken:", fullResult.data?.steps?.length);
    } else {
      console.error("Task failed:", fullResult.error);
    }
  } catch (err) {
    console.error(`Error: ${err.message}`);
  }
}

main();
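
The polling loop above runs until the task reaches a terminal status, however long that takes. As a rough sketch of one way to bound it, the helper below stops waiting once a timeout has elapsed. The name pollUntilTerminal and its intervalMs and timeoutMs parameters are hypothetical and not part of the SDK; the only assumptions about the API are the ones the example above already makes (a status field that becomes "completed" or "failed").

```typescript
// Minimal shape of a status response, based on the example above.
interface StatusResponse {
  status: string;
}

// Hypothetical helper (not part of the SDK): calls `getStatus` repeatedly
// until the status is "completed" or "failed", and throws if the next wait
// would run past `timeoutMs`.
async function pollUntilTerminal(
  getStatus: () => Promise<StatusResponse>,
  intervalMs = 5000,
  timeoutMs = 10 * 60 * 1000,
): Promise<string> {
  const deadline = Date.now() + timeoutMs;

  while (true) {
    const { status } = await getStatus();
    if (status === "completed" || status === "failed") {
      return status;
    }
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Task still "${status}" after ${timeoutMs} ms`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

With the client from the example above, this could be called as pollUntilTerminal(() => client.agents.browserUse.getStatus(task.jobId)) in place of the while loop.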

Stop a Running Task

Stop a task before it completes:

await client.agents.browserUse.stop("job-id");
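
One situation where stop() may be useful is enforcing a time budget on a task. The sketch below polls a job and stops it if it has not finished by a deadline. The BrowserUseJobs interface, the runWithDeadline function, and the "stopped-by-caller" return value are all hypothetical names introduced for illustration; the interface only describes the getStatus and stop calls already shown on this page.

```typescript
// Structural view of the two calls this sketch relies on. The shape is
// inferred from the examples on this page, not copied from the SDK types.
interface BrowserUseJobs {
  getStatus(jobId: string): Promise<{ status: string }>;
  stop(jobId: string): Promise<unknown>;
}

// Hypothetical helper: waits for a job to finish, and stops it if it is
// still running once `timeoutMs` has elapsed.
async function runWithDeadline(
  jobs: BrowserUseJobs,
  jobId: string,
  timeoutMs: number,
  intervalMs = 5000,
): Promise<string> {
  const deadline = Date.now() + timeoutMs;

  while (true) {
    const { status } = await jobs.getStatus(jobId);
    if (status === "completed" || status === "failed") {
      return status;
    }
    if (Date.now() >= deadline) {
      await jobs.stop(jobId);
      // Label chosen by this sketch, not a status returned by the API.
      return "stopped-by-caller";
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Assuming the client from the earlier examples, a call might look like runWithDeadline(client.agents.browserUse, task.jobId, 120000). The status is checked once more right before stopping, so a task that finished during the last wait is not stopped needlessly.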

Parameters

task

string

required

Natural language description of what the agent should accomplish.

llm

string

default:"gemini-2.0-flash"

Language model to use. Options: gpt-4o, gpt-4o-mini, gpt-4.1, gpt-4.1-mini, claude-sonnet-4-5, claude-sonnet-4-20250514, claude-3-7-sonnet-20250219, claude-3-5-sonnet-20241022, claude-3-5-haiku-20241022, gemini-2.0-flash, gemini-2.5-flash

maxSteps

number

default:"20"

Maximum number of steps the agent can take. Increase if tasks aren’t able to complete within the given number of steps.

sessionId

string

ID of an existing browser session to reuse. Useful for multi-step workflows that need to maintain the same browser session.

useVision

boolean

default:"true"

Enable screenshot analysis for better context understanding.

validateOutput

boolean

default:"false"

Validate agent output against a schema.

useVisionForPlanner

boolean

default:"false"

Provide screenshots to the planning component.

maxActionsPerStep

number

default:"10"

Maximum actions per step before reassessing.

maxInputTokens

number

default:"128000"

Maximum tokens for LLM input.

plannerLlm

string

default:"gemini-2.0-flash"

Separate language model for planning (can be different from main LLM).

pageExtractionLlm

string

default:"gemini-2.0-flash"

Separate language model for extracting structured data from pages.

plannerInterval

number

default:"10"

How often (in steps) the planner reassesses strategy.

maxFailures

number

default:"3"

Maximum consecutive failures before aborting the task.

initialActions

array

List of actions to execute before starting the main task.

sensitiveData

object

Key-value pairs used to mask sensitive data sent to the LLM. The LLM only sees the placeholder keys (for example, x_user and x_pass); browser-use filters the real values out of the input text and injects them directly into form fields after the LLM call.

outputModelSchema

object

Valid JSON schema for structured output.

keepBrowserOpen

boolean

default:"false"

Keep session alive after task completes.

sessionOptions

object

Session configuration (proxy, stealth, captcha solving, etc.). Only applies when creating a new session. If you provide an existing sessionId, these options are ignored.

useCustomApiKeys

boolean

default:"false"

Use your own LLM API keys instead of Hyperbrowser’s. You will only be charged for browser usage.

apiKeys

object

API keys for openai, anthropic, and google. Required when useCustomApiKeys is true. Must provide keys based on the LLMs you are using.

{
  openai: "...",
  anthropic: "...",
  google: "..."
}

The agent may not complete the task within the specified maxSteps. If that happens, try increasing the maxSteps parameter. Additionally, the browser session used by the agent will time out based on your team’s default Session Timeout settings or the session’s timeoutMinutes parameter if provided. You can adjust the default Session Timeout in the Settings page.
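
To make the sensitiveData parameter described above more concrete, here is a conceptual sketch of placeholder masking. It is not browser-use's actual implementation; the functions maskSensitiveData and injectSensitiveData are hypothetical and only illustrate the idea that the model sees placeholder names while the real values are substituted back in later.

```typescript
// Placeholder name -> real value, in the same shape as `sensitiveData`.
type SensitiveData = Record<string, string>;

// Replace every occurrence of a real value with its placeholder name,
// so the text can be shown to the LLM without exposing the value.
function maskSensitiveData(text: string, secrets: SensitiveData): string {
  let masked = text;
  for (const [placeholder, realValue] of Object.entries(secrets)) {
    if (realValue === "") continue; // splitting on "" would split every character
    masked = masked.split(realValue).join(placeholder);
  }
  return masked;
}

// Swap placeholders back to real values, as would happen just before
// the text is typed into a form field.
function injectSensitiveData(text: string, secrets: SensitiveData): string {
  let restored = text;
  for (const [placeholder, realValue] of Object.entries(secrets)) {
    restored = restored.split(placeholder).join(realValue);
  }
  return restored;
}

// Example values only.
const sensitiveData: SensitiveData = {
  x_user: "alice@example.com",
  x_pass: "correct-horse-battery",
};

const seenByLlm = maskSensitiveData("Signed in as alice@example.com", sensitiveData);
// seenByLlm === "Signed in as x_user"

const typedIntoForm = injectSensitiveData("x_pass", sensitiveData);
// typedIntoForm === "correct-horse-battery"
```

When calling the API, the same object would be passed as the sensitiveData parameter, presumably with a task that refers to the values by their placeholder names (for example, "log in with x_user and x_pass").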

Reuse Browser Sessions

You can pass an existing sessionId to a Browser-Use task so that it runs on that session. To keep the session open after the task finishes, also set the keepBrowserOpen parameter.

import { Hyperbrowser } from "@hyperbrowser/sdk";
import { config } from "dotenv";

config();

const client = new Hyperbrowser({
  apiKey: process.env.HYPERBROWSER_API_KEY,
});

const main = async () => {
  const session = await client.sessions.create();

  try {
    const result = await client.agents.browserUse.startAndWait({
      task: "What is the title of the first post on Hacker News today?",
      sessionId: session.id,
      keepBrowserOpen: true,
    });

    console.log(`Output:\n${result.data?.finalResult}`);

    const result2 = await client.agents.browserUse.startAndWait({
      task: "Tell me how many upvotes the first post has.",
      sessionId: session.id,
    });

    console.log(`\nOutput:\n${result2.data?.finalResult}`);
  } catch (err) {
    console.error(`Error: ${err}`);
  } finally {
    await client.sessions.stop(session.id);
  }
};

main().catch((err) => {
  console.error(`Error: ${err.message}`);
});

Set keepBrowserOpen: true on every task whose session you want to reuse afterward. Otherwise, the session is closed automatically when the task completes.
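
The create / try / finally pattern in the example above can be factored into a small wrapper so that the session is always stopped, even when a task throws. This is only a sketch: SessionsApi and withSession are hypothetical names, and the interface describes just the sessions.create() and sessions.stop() calls used in the example.

```typescript
// Structural view of the two session calls used in the example above.
interface SessionsApi {
  create(): Promise<{ id: string }>;
  stop(id: string): Promise<unknown>;
}

// Hypothetical helper: creates a session, hands its ID to `run`, and stops
// the session afterwards whether `run` succeeds or throws.
async function withSession<T>(
  sessions: SessionsApi,
  run: (sessionId: string) => Promise<T>,
): Promise<T> {
  const session = await sessions.create();
  try {
    return await run(session.id);
  } finally {
    await sessions.stop(session.id);
  }
}
```

With the client from the example, this might be used as withSession(client.sessions, async (sessionId) => { ... }), passing sessionId to each startAndWait call inside the callback and keepBrowserOpen: true to every call that is followed by another task on the same session.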

Use Your Own API Keys

You can provide your own LLM API keys to a Browser-Use task so that the steps it takes during execution are not charged as credits to your Hyperbrowser account; only browser usage is charged. When useCustomApiKeys is set to true, you must supply keys for the providers of whichever models you select for the llm, plannerLlm, and pageExtractionLlm parameters.

import { Hyperbrowser } from "@hyperbrowser/sdk";
import { config } from "dotenv";

config();

const client = new Hyperbrowser({
  apiKey: process.env.HYPERBROWSER_API_KEY,
});

const main = async () => {
  const result = await client.agents.browserUse.startAndWait({
    task: "What is the title of the first post on Hacker News today?",
    llm: "gpt-4o",
    plannerLlm: "gpt-4o",
    pageExtractionLlm: "gpt-4o",
    useCustomApiKeys: true,
    apiKeys: {
      openai: "<OPENAI_API_KEY>",
      // Below are needed if Claude or Gemini models are used
      // anthropic: "<ANTHROPIC_API_KEY>",
      // google: "<GOOGLE_API_KEY>",
    },
  });

  console.log(`Output:\n\n${result.data?.finalResult}`);
};

main().catch((err) => {
  console.error(`Error: ${err.message}`);
});

You can provide keys for multiple providers:

{
  "apiKeys": {
    "openai": "sk-...",
    "anthropic": "sk-ant-...",
    "google": "..."
  }
}
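
Before starting a task with useCustomApiKeys, it can help to check locally that a key is present for every provider the chosen models need. The sketch below is one way to do that. The functions providerFor and missingApiKeys are hypothetical, and the prefix-to-provider mapping is an assumption inferred from the model names listed under Parameters (gpt-, claude-, gemini-), not something taken from the SDK.

```typescript
type Provider = "openai" | "anthropic" | "google";

// Assumed mapping from model-name prefix to provider, based on the
// model names listed in the Parameters section.
function providerFor(model: string): Provider {
  if (model.startsWith("gpt-")) return "openai";
  if (model.startsWith("claude-")) return "anthropic";
  if (model.startsWith("gemini-")) return "google";
  throw new Error(`Unrecognized model: ${model}`);
}

// Returns the providers that the given models need but `apiKeys` lacks.
function missingApiKeys(
  models: string[],
  apiKeys: Partial<Record<Provider, string>>,
): Provider[] {
  const needed = new Set<Provider>(models.map(providerFor));
  return Array.from(needed).filter((provider) => !apiKeys[provider]);
}

// Example: llm and plannerLlm use OpenAI, pageExtractionLlm uses Claude,
// but only an OpenAI key is supplied.
const missing = missingApiKeys(
  ["gpt-4o", "gpt-4o-mini", "claude-sonnet-4-5"],
  { openai: "sk-..." },
);
// missing is ["anthropic"]
```

Since plannerLlm and pageExtractionLlm have their own defaults, it is probably safest to pass the effective value of all three model parameters to the check, including any you left at the default.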

Session Configuration

Configure the browser environment with proxies, stealth mode, CAPTCHA solving, and more:

import { Hyperbrowser } from "@hyperbrowser/sdk";
import { config } from "dotenv";

config();

const client = new Hyperbrowser({
  apiKey: process.env.HYPERBROWSER_API_KEY,
});

const main = async () => {
  const result = await client.agents.browserUse.startAndWait({
    task: "go to Hacker News and summarize the top 5 posts of the day",
    sessionOptions: {
      acceptCookies: true,
    }
  });

  console.log(`Output:\n\n${result.data?.finalResult}`);
};

main().catch((err) => {
  console.error(`Error: ${err.message}`);
});

sessionOptions only apply when creating a new session. If you provide an existing sessionId, these options are ignored.

Proxies and CAPTCHA solving add latency. Only enable them when necessary for your use case.
