Introduction

HyperAgent is an open-source browser automation framework that extends Playwright with AI capabilities. Write natural language commands instead of complex selectors, and let HyperAgent handle the tedious parts of web automation.

GitHub Repository

View source code

npm Package

Install the Node.js SDK

Templates

View a curated list of templates

The Challenge with Browser Automation

Browser automation tools like Puppeteer and Playwright offer powerful functionality for scripting clicks, typing, scrolling, and more. But they require you to understand the DOM structure and locate elements through HTML attributes, CSS selectors, or complex XPath queries. This gets harder fast:

Selectors break when websites update their markup
Iframes isolate content, requiring nested queries to reach elements inside them
Shadow DOM encapsulation makes elements even harder to access
Dynamic content means selectors that worked yesterday might fail today

You end up spending more time maintaining selectors than building features.
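To make the first failure mode concrete, here is a minimal sketch using a toy tree rather than a real browser DOM. The `ToyNode` type and the `byPosition` and `byLabel` helpers are hypothetical names invented for this illustration; they are not part of Playwright or HyperAgent, and they do not describe how either tool resolves elements. The sketch only shows why a position-based path stops matching after a small markup change while a lookup by label still finds the field.

```typescript
// Toy DOM node: a hypothetical type used only for this illustration.
interface ToyNode {
  tag: string;
  label?: string;
  children: ToyNode[];
}

// Resolve a position-based path such as [0, 1]: take child 0, then child 1.
function byPosition(root: ToyNode, path: number[]): ToyNode | undefined {
  let node: ToyNode | undefined = root;
  for (const index of path) {
    node = node?.children[index];
  }
  return node;
}

// Depth-first search for a node by its label, ignoring where it sits in the tree.
function byLabel(root: ToyNode, label: string): ToyNode | undefined {
  if (root.label === label) return root;
  for (const child of root.children) {
    const hit = byLabel(child, label);
    if (hit) return hit;
  }
  return undefined;
}

const input: ToyNode = { tag: "input", label: "Departure city", children: [] };

// Original markup: body > div > (span, input)
const before: ToyNode = {
  tag: "body",
  children: [{ tag: "div", children: [{ tag: "span", children: [] }, input] }],
};

// After a redesign the same content is wrapped in one extra <section>.
const after: ToyNode = {
  tag: "body",
  children: [{ tag: "section", children: before.children }],
};

const positionalPath = [0, 1];

console.log(byPosition(before, positionalPath)?.tag); // "input"
console.log(byPosition(after, positionalPath)?.tag); // undefined: the path no longer matches
console.log(byLabel(after, "Departure city")?.tag); // "input"
```

The long XPath-style selectors common in real scripts are the same idea as `positionalPath`, only deeper, so every added wrapper element is another chance for them to break.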

What HyperAgent Does

HyperAgent lets you describe what you want in plain English. The AI figures out how to interact with the page—no matter how the DOM is structured.

// Instead of this:
await page.locator('xpath=/html[1]/body[1]/c-wiz[2]/div[1]/div[2]/c-wiz[1]/div[1]/c-wiz[1]/div[2]/div[1]/div[1]/div[1]/div[1]/div[2]/div[1]/div[6]/div[2]/div[2]/div[1]/div[1]/input[1]').fill('Miami');

// Write this:
await page.perform("type Miami into the departure city field");

HyperAgent handles iframes, shadow DOM, and dynamic content automatically. Because each step describes intent rather than a selector, your automation keeps working when a site changes its markup.

Core Methods

page.ai()

Execute complex multi-step tasks with natural language

page.perform()

Fast, single-action execution

page.extract()

Pull structured data with Zod schemas

Playwright Compatible

Use standard Playwright when you need deterministic control

import { HyperAgent } from "@hyperbrowser/agent";
import { z } from "zod";

const agent = new HyperAgent();
const page = await agent.newPage();

await page.goto("https://flights.google.com");

// AI handles the complexity
await page.ai("search for flights from Miami to LAX on Dec 15");

// Single actions when you know what you need
await page.perform("click the first result");

// Extract structured data
const flight = await page.extract(
  "get the price and duration of the selected flight",
  z.object({
    price: z.number(),
    duration: z.string(),
  })
);

// Fall back to standard Playwright when you need deterministic control
await page.locator('css=button').click();

await agent.closeAgent();
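The schema passed to page.extract() above describes the shape the extracted data is expected to have: a numeric price and a string duration. As a dependency-free illustration of what that shape means, here is a hand-written type guard for the same two fields. This is a sketch, not Zod and not HyperAgent's implementation; `Flight` and `isFlight` are hypothetical names, and the sample values are made up.

```typescript
// The shape described by z.object({ price: z.number(), duration: z.string() }).
interface Flight {
  price: number;
  duration: string;
}

// Hypothetical hand-written guard: checks the same two fields the schema names.
function isFlight(value: unknown): value is Flight {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate["price"] === "number" &&
    !Number.isNaN(candidate["price"]) &&
    typeof candidate["duration"] === "string"
  );
}

// Made-up sample values, for illustration only.
console.log(isFlight({ price: 289, duration: "5 hr 40 min" })); // true
console.log(isFlight({ price: "$289", duration: "5 hr 40 min" })); // false: price is a string
console.log(isFlight({ price: 289 })); // false: duration is missing
```

Declaring price as a number means a formatted string such as "$289" does not satisfy the shape, which is worth keeping in mind when choosing field types for a schema.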

Key Features

Automatic Element Location

Describe the element in natural language. HyperAgent finds it regardless of DOM structure, iframes, or shadow DOM.

Action Caching

Record your automation once, then replay it without LLM calls. Deterministic execution at a fraction of the cost.

Multiple LLM Providers

Use OpenAI, Anthropic, or Google Gemini. Switch providers with one line of code.

Cloud Ready

Run locally for development, then scale to hundreds of sessions with Hyperbrowser in production.

CDP-First Architecture

Native Chrome DevTools Protocol integration for precise coordinates, deep iframe tracking, and automatic ad filtering.
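The record-once, replay-without-LLM idea behind Action Caching can be pictured with a small in-memory sketch. Everything here is an assumption made for illustration: `RecordedStep`, `Planner`, and `ActionCache` are hypothetical names, the planner is a stub standing in for an LLM call, and HyperAgent's real cache format and API may look quite different.

```typescript
// A recorded step: hypothetical shape, not HyperAgent's actual cache format.
interface RecordedStep {
  action: "click" | "fill";
  selector: string;
  value?: string;
}

// Stand-in for the expensive LLM call that turns an instruction into a step.
type Planner = (instruction: string) => Promise<RecordedStep>;

class ActionCache {
  private readonly steps = new Map<string, RecordedStep>();
  private readonly planner: Planner;
  plannerCalls = 0;

  constructor(planner: Planner) {
    this.planner = planner;
  }

  // First call for an instruction asks the planner; later calls replay the stored step.
  async resolve(instruction: string): Promise<RecordedStep> {
    const cached = this.steps.get(instruction);
    if (cached) return cached;
    this.plannerCalls += 1;
    const step = await this.planner(instruction);
    this.steps.set(instruction, step);
    return step;
  }
}

// Stub planner: returns a fixed step instead of calling a model.
const fakePlanner: Planner = async () => ({
  action: "fill",
  selector: "#departure",
  value: "Miami",
});

async function runDemo(): Promise<{ sameStep: boolean; plannerCalls: number }> {
  const cache = new ActionCache(fakePlanner);
  const instruction = "type Miami into the departure city field";
  const first = await cache.resolve(instruction); // recorded via the planner
  const second = await cache.resolve(instruction); // replayed from the cache
  return { sameStep: first === second, plannerCalls: cache.plannerCalls };
}

runDemo().then((result) => console.log(result)); // { sameStep: true, plannerCalls: 1 }
```

The replayed step is deterministic because it is the stored object itself, and the cost saving comes from the planner being called once no matter how many times the instruction runs.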

Get Started

npm install @hyperbrowser/agent

import { HyperAgent } from "@hyperbrowser/agent";

const agent = new HyperAgent();
const page = await agent.newPage();

await page.goto("https://news.ycombinator.com");
await page.ai("find the top story and summarize it");

await agent.closeAgent();
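In the example above, closeAgent() is skipped if an earlier step throws. One way to guard against that is a small try/finally wrapper. The sketch below is an assumption-laden helper, not part of the HyperAgent API: `withAgent` and `ClosableAgent` are hypothetical names, and the only thing taken from the examples on this page is that the agent has an awaitable closeAgent() method.

```typescript
// Minimal structural view of an agent: only the method this helper relies on.
interface ClosableAgent {
  closeAgent(): Promise<void>;
}

// Hypothetical helper: run a task with an agent and always close the agent afterwards.
async function withAgent<A extends ClosableAgent, T>(
  createAgent: () => A,
  task: (agent: A) => Promise<T>,
): Promise<T> {
  const agent = createAgent();
  try {
    return await task(agent);
  } finally {
    // Runs whether the task resolved or threw.
    await agent.closeAgent();
  }
}

// Intended usage with the real package (not executed here):
//
//   import { HyperAgent } from "@hyperbrowser/agent";
//
//   await withAgent(() => new HyperAgent(), async (agent) => {
//     const page = await agent.newPage();
//     await page.goto("https://news.ycombinator.com");
//     await page.ai("find the top story and summarize it");
//   });
```

Because the helper depends only on closeAgent(), it can be exercised in tests with a plain object that implements that one method.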

Quickstart

Build your first automation
