You can also see dedicated pages for Scrape, Crawl, and Extract, and try them in the Playground.
For session configuration details, see Configuration Parameters.
For full schemas, see the API Reference.
Scraping a web page
With just a URL, you can extract page contents in your chosen formats using the /scrape endpoint.
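For instance, a minimal request might look like this sketch, which assumes the Node SDK (@hyperbrowser/sdk) and its scrape.startAndWait helper; check the API Reference for exact method names and response shapes:

```typescript
import { Hyperbrowser } from "@hyperbrowser/sdk";

// Assumes HYPERBROWSER_API_KEY is set in your environment.
const client = new Hyperbrowser({ apiKey: process.env.HYPERBROWSER_API_KEY });

// Start a scrape job and block until it completes.
const result = await client.scrape.startAndWait({
  url: "https://example.com",
  scrapeOptions: { formats: ["markdown"] },
});

console.log(result.data?.markdown);
```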
Session Options
All Scraping APIs (scrape, crawl, extract) support session parameters. See Session Parameters for all options.
Scrape Options
- Output formats to include in the response. One or more of: "html", "links", "markdown", "screenshot".
- CSS selectors (tags, classes, IDs) to explicitly include. Only matching elements are returned.
- CSS selectors (tags, classes, IDs) to exclude from the scraped content.
- When true, attempts to extract only the main content (omits headers, navigation, and footers).
- Milliseconds to wait after the initial load before scraping (useful for dynamic content, and for CAPTCHA detection when sessionOptions.solveCaptchas is enabled).
- Maximum time (ms) to wait for navigation to complete. Equivalent to page.goto(url, { waitUntil: "load", timeout }).
- Load condition: "load", "domcontentloaded", or "networkidle".
- Screenshot settings (effective only when formats includes "screenshot"). Properties:
  - fullPage (boolean, default false): capture the full page beyond the viewport
  - format ("webp" | "jpeg" | "png", default "webp")
- Storage state to set on the page before scraping. Properties:
  - localStorage (object, optional): local storage data (key-value pairs; keys and values must both be strings)
  - sessionStorage (object, optional): session storage data (key-value pairs; keys and values must both be strings)
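Taken together, the screenshot and storage-state settings above might be combined as follows; the option names screenshotOptions and storageState are illustrative here, so confirm them against the API Reference:

```typescript
const scrapeOptions = {
  formats: ["html", "screenshot"],
  // Screenshot settings take effect only because "screenshot" is in formats.
  screenshotOptions: { fullPage: true, format: "png" }, // name assumed
  // Storage to seed on the page before scraping; keys and values must be strings.
  storageState: { // name assumed
    localStorage: { theme: "dark" },
    sessionStorage: { sessionId: "abc123" },
  },
};
```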
Example with options
By configuring these options when making a scrape request, you can control the format and content of the scraped data, as well as the behavior of the scraper itself. For example, to scrape a page with the following (see the sketch after this list):
- In stealth mode
- Automatically accept cookies
- Return only the main content as HTML
- Exclude any <span> elements
- Wait 2 seconds after the page loads before scraping
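A request covering those five points might look like the sketch below; the sessionOptions flags (useStealth, acceptCookies) and scrape option names (onlyMainContent, excludeTags, waitFor) are assumed from the Session and Scrape Options pages, so verify them against the API Reference:

```typescript
import { Hyperbrowser } from "@hyperbrowser/sdk";

const client = new Hyperbrowser({ apiKey: process.env.HYPERBROWSER_API_KEY });

const result = await client.scrape.startAndWait({
  url: "https://example.com",
  sessionOptions: {
    useStealth: true,    // run the session in stealth mode (assumed flag)
    acceptCookies: true, // auto-dismiss cookie banners (assumed flag)
  },
  scrapeOptions: {
    formats: ["html"],     // return HTML only
    onlyMainContent: true, // strip headers/nav/footers (assumed name)
    excludeTags: ["span"], // drop <span> elements (assumed name)
    waitFor: 2000,         // wait 2s after load before scraping (assumed name)
  },
});

console.log(result.data?.html);
```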
Crawl a site
Instead of scraping a single page, you can collect content across multiple pages using the /crawl endpoint. You can use the same sessionOptions and scrapeOptions as with /scrape, along with the additional crawl-specific options below.
Crawl Options
- The URL of the page to crawl.
- Maximum number of pages to crawl before stopping (minimum: 1).
- When true, follow links discovered on pages to expand the crawl.
- When true, skip pre-generating URLs from sitemaps at the target origin.
- Regex or wildcard patterns for URL paths to exclude from the crawl.
- Regex or wildcard patterns for URL paths to include (only matching pages will be crawled).
- Session configuration used during the crawl. See Session Parameters.
- Scrape options used during the crawl. See Scrape Options.
Example with options
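As a sketch, the following crawls up to five pages from a starting URL, following discovered links while skipping blog paths; the option names (maxPages, followLinks, excludePatterns) and the result shape are assumed, so check them against the API Reference:

```typescript
import { Hyperbrowser } from "@hyperbrowser/sdk";

const client = new Hyperbrowser({ apiKey: process.env.HYPERBROWSER_API_KEY });

const crawl = await client.crawl.startAndWait({
  url: "https://example.com/docs",
  maxPages: 5,                  // stop after 5 pages (assumed name)
  followLinks: true,            // expand via links found on each page (assumed name)
  excludePatterns: ["/blog/*"], // skip blog paths (assumed name)
  scrapeOptions: { formats: ["markdown"] },
});

// Each crawled page carries its own scraped content (shape assumed).
for (const page of crawl.data ?? []) {
  console.log(page.url, page.markdown?.slice(0, 80));
}
```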
Structured extraction
The Extract API fetches data in a well-defined structure from any set of pages. Provide a list of URLs, and Hyperbrowser will collect the relevant content (including optional crawling) and return data that fits your schema or prompt.
Extract Options
- List of page URLs. To crawl an origin for a URL, append /* (e.g., https://example.com/*) to follow relevant links up to maxLinks.
- JSON Schema for the desired output.
- Instructional prompt describing how to structure the extracted data. If no schema is provided, a schema will be generated from the prompt.
- Additional instructions to guide extraction behavior.
- For any given /* URL, the maximum number of links to follow while crawling.
- Milliseconds to wait after page load before extraction (useful for dynamic content, and for CAPTCHA detection when sessionOptions.solveCaptchas is enabled).
- Session configuration used during extraction. See Session Parameters.
You can provide a schema, a prompt, or both; for best results, provide both. The schema should define exactly how you want the extracted data formatted, and the prompt should include any context that helps guide the extraction. If no schema is provided, Hyperbrowser will try to generate one automatically from the prompt.
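For example, extracting product data with both a schema and a prompt might look like this sketch, assuming the SDK's extract.startAndWait helper and a plain JSON Schema object; the urls, schema, prompt, and maxLinks fields follow the Extract Options above:

```typescript
import { Hyperbrowser } from "@hyperbrowser/sdk";

const client = new Hyperbrowser({ apiKey: process.env.HYPERBROWSER_API_KEY });

const result = await client.extract.startAndWait({
  // The trailing /* asks Hyperbrowser to crawl the origin for relevant links.
  urls: ["https://example.com/products/*"],
  prompt: "Extract the name, price, and availability of every product.",
  // JSON Schema describing exactly how the output should be structured.
  schema: {
    type: "object",
    properties: {
      products: {
        type: "array",
        items: {
          type: "object",
          properties: {
            name: { type: "string" },
            price: { type: "number" },
            inStock: { type: "boolean" },
          },
          required: ["name", "price"],
        },
      },
    },
    required: ["products"],
  },
  maxLinks: 10, // follow at most 10 links for the /* URL
});

console.log(JSON.stringify(result.data, null, 2));
```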