Browser-Use is an open-source framework for fast, efficient browser automation. It lets an AI agent interact with websites the way a person would: clicking, typing, scrolling, and navigating. It suits automating repetitive web tasks, extracting data from complex sites, and testing web applications at scale. Hyperbrowser hosts the Browser-Use framework so you can run agent tasks with a single API call. You can view your Browser-Use tasks in the dashboard.
Browser-Use agents run asynchronously by default. Start a task, then poll for results. Our SDKs include a startAndWait() helper that handles polling automatically and returns when the task completes.

How It Works

You can use Browser-Use in two ways:
  1. Start and Wait: SDKs provide a startAndWait() method that blocks until the task completes and returns the result.
  2. Async Pattern: Start a task, get a job ID, then poll for status and results. This is useful for long-running tasks or when you want more control.

Installation

npm install @hyperbrowser/sdk dotenv

Quick Start

The simplest way to run a Browser-Use task is with startAndWait(), which handles the entire lifecycle for you:
import { Hyperbrowser } from "@hyperbrowser/sdk";
import { config } from "dotenv";

config();

const client = new Hyperbrowser({
  apiKey: process.env.HYPERBROWSER_API_KEY,
});

async function main() {
  const result = await client.agents.browserUse.startAndWait({
    task: "Go to Hacker News and tell me the title of the top post",
    llm: "gemini-2.0-flash",
    maxSteps: 20,
  });

  console.log(`Output:\n${result.data?.finalResult}`);
}

main().catch((err) => {
  console.error(`Error: ${err.message}`);
});

Async Pattern

When you need more control, use the async pattern to start a task and poll for results:
import { Hyperbrowser } from "@hyperbrowser/sdk";
import { config } from "dotenv";

config();

const client = new Hyperbrowser({
  apiKey: process.env.HYPERBROWSER_API_KEY,
});

async function main() {
  try {
    // Start the task
    const task = await client.agents.browserUse.start({
      task: "What is the title of the first post on Hacker News today?",
      llm: "gemini-2.0-flash",
      maxSteps: 20,
    });

    console.log(`Task started: ${task.jobId}`);
    console.log(`Watch live: ${task.liveUrl}`);

    // Poll for completion
    let result;
    while (true) {
      result = await client.agents.browserUse.getStatus(task.jobId);
      console.log(`Status: ${result.status}`);

      if (result.status === "completed" || result.status === "failed") {
        break;
      }

      await new Promise((resolve) => setTimeout(resolve, 5000)); // Wait 5s
    }

    const fullResult = await client.agents.browserUse.get(task.jobId);

    if (fullResult.status === "completed") {
      console.log("Result:", fullResult.data?.finalResult);
      console.log("Steps taken:", fullResult.data?.steps?.length);
    } else {
      console.error("Task failed:", fullResult.error);
    }
  } catch (err) {
    console.error(`Error: ${err.message}`);
  }
}

main();
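The loop above polls at a fixed interval with no upper bound on total time. As a rough sketch, the same idea can be wrapped in a helper with an overall time limit. `pollUntilDone` and its option names are hypothetical and not part of the SDK; the terminal statuses default to the two shown on this page, "completed" and "failed".

```typescript
// Hypothetical helper, not part of @hyperbrowser/sdk.
type StatusResponse = { status: string };

interface PollOptions {
  intervalMs?: number; // delay between status checks
  timeoutMs?: number; // overall time budget
  terminal?: string[]; // statuses that end the loop
}

async function pollUntilDone(
  getStatus: () => Promise<StatusResponse>,
  options: PollOptions = {},
): Promise<string> {
  const {
    intervalMs = 5000,
    timeoutMs = 10 * 60 * 1000,
    terminal = ["completed", "failed"],
  } = options;
  const deadline = Date.now() + timeoutMs;

  while (true) {
    const { status } = await getStatus();
    if (terminal.includes(status)) {
      return status;
    }
    // Give up rather than sleep past the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Polling timed out after ${timeoutMs} ms (last status: ${status})`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

With the client from the example above, this could be called as `await pollUntilDone(() => client.agents.browserUse.getStatus(task.jobId))`, followed by `client.agents.browserUse.get(task.jobId)` to fetch the full result. If a task can end in a status other than the two defaults, pass it through the `terminal` option.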

Stop a Running Task

Stop a task before it completes:
await client.agents.browserUse.stop("job-id");
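One possible use of stop() is enforcing a time budget: if a task has not finished by a deadline, stop it instead of letting it keep consuming steps. A minimal sketch, where `runWithDeadline` is a hypothetical helper and the `wait` and `stop` arguments stand in for SDK calls:

```typescript
// Hypothetical helper, not part of @hyperbrowser/sdk.
function runWithDeadline<T>(
  wait: Promise<T>, // e.g. a polling loop for the task
  stop: () => Promise<unknown>, // e.g. () => client.agents.browserUse.stop(jobId)
  deadlineMs: number,
): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    // If the deadline passes first, ask the service to stop the task, then reject.
    const timer = setTimeout(() => {
      stop().then(
        () => reject(new Error(`Task stopped after exceeding ${deadlineMs} ms`)),
        reject,
      );
    }, deadlineMs);

    // If the task settles first, cancel the timer and pass the outcome through.
    wait.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}
```

The helper only calls `stop` when the deadline fires; a task that fails on its own is reported through the rejected promise without an extra stop request.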

Parameters

task
string
required
Natural language description of what the agent should accomplish.
llm
string
default:"gemini-2.0-flash"
Language model to use. Options: gpt-4o, gpt-4o-mini, gpt-4.1, gpt-4.1-mini, claude-sonnet-4-5, claude-sonnet-4-20250514, claude-3-7-sonnet-20250219, claude-3-5-sonnet-20241022, claude-3-5-haiku-20241022, gemini-2.0-flash, gemini-2.5-flash
maxSteps
number
default:"20"
Maximum number of steps the agent can take. Increase this if tasks cannot finish within the limit.
sessionId
string
ID of an existing browser session to reuse. Useful for multi-step workflows that need to maintain the same browser session.
useVision
boolean
default:"true"
Enable screenshot analysis for better context understanding.
validateOutput
boolean
default:"false"
Validate agent output against a schema.
useVisionForPlanner
boolean
default:"false"
Provide screenshots to the planning component.
maxActionsPerStep
number
default:"10"
Maximum actions per step before reassessing.
maxInputTokens
number
default:"128000"
Maximum tokens for LLM input.
plannerLlm
string
default:"gemini-2.0-flash"
Separate language model for planning (can be different from main LLM).
pageExtractionLlm
string
default:"gemini-2.0-flash"
Separate language model for extracting structured data from pages.
plannerInterval
number
default:"10"
How often (in steps) the planner reassesses strategy.
maxFailures
number
default:"3"
Maximum consecutive failures before aborting the task.
initialActions
array
List of actions to execute before starting the main task.
sensitiveData
object
Key-value pairs used to mask data sent to the LLM. The LLM sees only the placeholder keys (for example x_user, x_pass), because browser-use filters the sensitive values out of the input text. Real values are injected directly into form fields after the LLM call.
outputModelSchema
object
Valid JSON schema for structured output.
keepBrowserOpen
boolean
default:"false"
Keep session alive after task completes.
sessionOptions
object
Session configuration (proxy, stealth, captcha solving, etc.). Only applies when creating a new session. If you provide an existing sessionId, these options are ignored.
useCustomApiKeys
boolean
default:"false"
Use your own LLM API keys instead of Hyperbrowser’s. You will only be charged for browser usage.
apiKeys
object
API keys for openai, anthropic, and google. Required when useCustomApiKeys is true. Provide a key for each provider whose models you use.
{
  openai: "...",
  anthropic: "...",
  google: "..."
}
The agent may not complete the task within the specified maxSteps. If that happens, try increasing the maxSteps parameter.
Additionally, the browser session used by the AI agent will time out based on your team’s default Session Timeout setting, or on the session’s timeoutMinutes parameter if provided. You can adjust the default Session Timeout on the Settings page.
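To make the sensitiveData and outputModelSchema parameters more concrete, here is a rough sketch of a parameter object that uses both. The placeholder names x_user and x_pass come from the parameter description; the login URL, the schema fields, and the `buildLoginTaskParams` and `findLeakedSecrets` helpers are hypothetical.

```typescript
// Hypothetical helpers for illustration; not part of @hyperbrowser/sdk.
interface LoginTaskParams {
  task: string;
  sensitiveData: Record<string, string>;
  outputModelSchema: Record<string, unknown>;
}

function buildLoginTaskParams(username: string, password: string): LoginTaskParams {
  return {
    // The task text names the placeholders only, never the real values.
    task: "Go to https://example.com/login, sign in with x_user and x_pass, then report the account plan",
    // Placeholder key -> real value.
    sensitiveData: { x_user: username, x_pass: password },
    // JSON Schema describing the structured output we would like back.
    outputModelSchema: {
      type: "object",
      properties: { plan: { type: "string" } },
      required: ["plan"],
    },
  };
}

// Returns the placeholder keys whose real values appear verbatim in the task text.
function findLeakedSecrets(params: LoginTaskParams): string[] {
  return Object.entries(params.sensitiveData)
    .filter(([, value]) => value.length > 0 && params.task.includes(value))
    .map(([key]) => key);
}
```

The resulting object could be passed to client.agents.browserUse.startAndWait(). Checking that `findLeakedSecrets(params)` returns an empty array before sending is a cheap guard against pasting a real credential into the task text.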

Reuse Browser Sessions

You can pass an existing sessionId to a Browser-Use task so that it runs on that session. To keep the session open after the task finishes, set the keepBrowserOpen parameter.
import { Hyperbrowser } from "@hyperbrowser/sdk";
import { config } from "dotenv";

config();

const client = new Hyperbrowser({
  apiKey: process.env.HYPERBROWSER_API_KEY,
});

const main = async () => {
  const session = await client.sessions.create();

  try {
    const result = await client.agents.browserUse.startAndWait({
      task: "What is the title of the first post on Hacker News today?",
      sessionId: session.id,
      keepBrowserOpen: true,
    });

    console.log(`Output:\n${result.data?.finalResult}`);

    const result2 = await client.agents.browserUse.startAndWait({
      task: "Tell me how many upvotes the first post has.",
      sessionId: session.id,
    });

    console.log(`\nOutput:\n${result2.data?.finalResult}`);
  } catch (err) {
    console.error(`Error: ${err}`);
  } finally {
    await client.sessions.stop(session.id);
  }
};

main().catch((err) => {
  console.error(`Error: ${err.message}`);
});
Set keepBrowserOpen: true on every task whose session you want to reuse afterward. Otherwise, the session is closed automatically when the task completes.
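This rule generalizes to any number of tasks: every task except the last keeps the browser open. A minimal sketch, where `runInOneSession` is a hypothetical helper and `run` stands in for a call such as client.agents.browserUse.startAndWait:

```typescript
// Hypothetical helper, not part of @hyperbrowser/sdk.
interface TaskParams {
  task: string;
  sessionId: string;
  keepBrowserOpen: boolean;
}

async function runInOneSession<R>(
  tasks: string[],
  sessionId: string,
  run: (params: TaskParams) => Promise<R>,
): Promise<R[]> {
  const results: R[] = [];
  let index = 0;
  for (const task of tasks) {
    index += 1;
    // Keep the browser open for every task except the final one.
    const keepBrowserOpen = index < tasks.length;
    // Tasks run one at a time because they share a single browser session.
    results.push(await run({ task, sessionId, keepBrowserOpen }));
  }
  return results;
}
```

A call might look like `runInOneSession(["first task", "second task"], session.id, (params) => client.agents.browserUse.startAndWait(params))`, still wrapped in the try/finally from the example above so the session is stopped if a task throws.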

Use Your Own API Keys

You can provide your own LLM API keys to a Browser-Use task so that the steps it takes are not charged as credits to your Hyperbrowser account. Only browser usage is charged. When useCustomApiKeys is true, you must supply keys for the providers of the models you select for llm, plannerLlm, and pageExtractionLlm.
import { Hyperbrowser } from "@hyperbrowser/sdk";
import { config } from "dotenv";

config();

const client = new Hyperbrowser({
  apiKey: process.env.HYPERBROWSER_API_KEY,
});

const main = async () => {
  const result = await client.agents.browserUse.startAndWait({
    task: "What is the title of the first post on Hacker News today?",
    llm: "gpt-4o",
    plannerLlm: "gpt-4o",
    pageExtractionLlm: "gpt-4o",
    useCustomApiKeys: true,
    apiKeys: {
      openai: "<OPENAI_API_KEY>",
      // Below are needed if Claude or Gemini models are used
      // anthropic: "<ANTHROPIC_API_KEY>",
      // google: "<GOOGLE_API_KEY>",
    },
  });

  console.log(`Output:\n\n${result.data?.finalResult}`);
};

main().catch((err) => {
  console.error(`Error: ${err.message}`);
});
You can provide keys for multiple providers:
{
  "apiKeys": {
    "openai": "sk-...",
    "anthropic": "sk-ant-...",
    "google": "..."
  }
}
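Since the required keys depend on which models are selected, it can help to work that out before starting a task. The sketch below infers the provider from the model name prefix (gpt-, claude-, gemini-). That matches the model list in the Parameters section, but it is an assumption about naming rather than an SDK feature, and `providerFor`, `requiredProviders`, and `missingApiKeys` are hypothetical helpers.

```typescript
type Provider = "openai" | "anthropic" | "google";

// Assumption: the provider can be read off the model name prefix.
function providerFor(model: string): Provider {
  if (model.startsWith("gpt-")) return "openai";
  if (model.startsWith("claude-")) return "anthropic";
  if (model.startsWith("gemini-")) return "google";
  throw new Error(`Unrecognized model name: ${model}`);
}

// Distinct providers needed for the given llm, plannerLlm, and pageExtractionLlm values.
function requiredProviders(models: string[]): Provider[] {
  return Array.from(new Set(models.map(providerFor)));
}

// Providers that are required but have no non-empty key in apiKeys.
function missingApiKeys(
  models: string[],
  apiKeys: Partial<Record<Provider, string>>,
): Provider[] {
  return requiredProviders(models).filter((provider) => !apiKeys[provider]);
}
```

Because plannerLlm and pageExtractionLlm default to gemini-2.0-flash, it is probably safest to pass all three model names to the check explicitly, including any left at their defaults.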

Session Configuration

Configure the browser environment with proxies, stealth mode, CAPTCHA solving, and more:
import { Hyperbrowser } from "@hyperbrowser/sdk";
import { config } from "dotenv";

config();

const client = new Hyperbrowser({
  apiKey: process.env.HYPERBROWSER_API_KEY,
});

const main = async () => {
  const result = await client.agents.browserUse.startAndWait({
    task: "Go to Hacker News and summarize the top 5 posts of the day",
    sessionOptions: {
      acceptCookies: true,
    }
  });

  console.log(`Output:\n\n${result.data?.finalResult}`);
};

main().catch((err) => {
  console.error(`Error: ${err.message}`);
});
sessionOptions only apply when creating a new session. If you provide an existing sessionId, these options are ignored.
Proxies and CAPTCHA solving add latency. Only enable them when necessary for your use case.
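One way to follow this advice is to build sessionOptions from explicit per-task needs, so that the costlier features stay off unless requested. In the sketch below, acceptCookies comes from the example above, while the field names useProxy, useStealth, and solveCaptchas are assumptions about the session API and should be checked against the session reference. `buildSessionOptions` is a hypothetical helper.

```typescript
interface TaskNeeds {
  proxy?: boolean;
  stealth?: boolean;
  captcha?: boolean;
}

// Field names other than acceptCookies are assumptions; verify them against the session docs.
interface SessionOptionsSketch {
  acceptCookies: boolean;
  useProxy?: boolean;
  useStealth?: boolean;
  solveCaptchas?: boolean;
}

function buildSessionOptions(needs: TaskNeeds = {}): SessionOptionsSketch {
  // Start from the cheap baseline and add features only on request.
  const options: SessionOptionsSketch = { acceptCookies: true };
  if (needs.proxy) options.useProxy = true;
  if (needs.stealth) options.useStealth = true;
  if (needs.captcha) options.solveCaptchas = true;
  return options;
}
```

The result could be passed as the sessionOptions value of a task that creates a new session, for example `sessionOptions: buildSessionOptions({ captcha: true })`.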