Action Caching records every action taken during a page.ai() call. You can replay these recordings later for deterministic, LLM-free automation, which removes the per-run LLM cost and shortens execution time.
Why Use Action Caching?
| Without Caching | With Caching |
|---|---|
| LLM call every run | LLM call once, replay free |
| Variable behavior | Deterministic execution |
| Higher latency | Near-instant replay |
| Higher cost | Pay once |
Recording Actions
Every page.ai() call automatically returns an actionCache:
import { HyperAgent } from "@hyperbrowser/agent";
import fs from "fs";
const agent = new HyperAgent();
const page = await agent.newPage();
await page.goto("https://flights.google.com");
// Execute task - actionCache is automatically generated
const { output, actionCache } = await page.ai(
"Search for flights from Miami to LAX on December 15"
);
console.log(`Recorded ${actionCache.steps.length} steps`);
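To replay a recording in a later process, the cache has to be written to disk first. The following is a minimal sketch using only Node's fs module. The helper names saveActionCache and loadActionCache, the SavedActionCache shape, and the file name are assumptions made for illustration and are not part of HyperAgent.

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

// Hypothetical minimal shape: only `steps` is taken from the examples in this guide.
interface SavedActionCache {
  steps: unknown[];
  [key: string]: unknown;
}

// Write the cache to disk as pretty-printed JSON.
function saveActionCache(filePath: string, cache: SavedActionCache): void {
  fs.writeFileSync(filePath, JSON.stringify(cache, null, 2), "utf-8");
}

// Read a cache back and check that it at least contains a steps array.
function loadActionCache(filePath: string): SavedActionCache {
  const parsed = JSON.parse(fs.readFileSync(filePath, "utf-8"));
  if (!parsed || !Array.isArray(parsed.steps)) {
    throw new Error(`No steps array found in ${filePath}`);
  }
  return parsed as SavedActionCache;
}

// Demo with a stand-in object; in practice, pass the actionCache returned by page.ai().
const demoPath = path.join(os.tmpdir(), "action-cache-demo.json");
saveActionCache(demoPath, { taskId: "abc-123", steps: [{ stepIndex: 0 }] });
const restored = loadActionCache(demoPath);
console.log(`Restored ${restored.steps.length} step(s)`);
```

Keeping the recording in a plain JSON file also makes it easy to version alongside the automation code and to diff after a re-recording.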
Replaying Actions
Use agent.runFromActionCache() to replay recorded actions:
import { HyperAgent, ActionCacheOutput } from "@hyperbrowser/agent";
import fs from "fs";
const agent = new HyperAgent();
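// Load a previously saved cache (the file name is an assumption; use your own path)
const actionCache: ActionCacheOutput = JSON.parse(
fs.readFileSync("action-cache.json", "utf-8")
);
// Replay without LLM calls
const result = await agent.runFromActionCache(actionCache.steps, {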
maxXPathRetries: 3,
});
console.log("Replay status:", result.status);
await agent.closeAgent();
Generate Script from Action Cache
Instead of replaying actions programmatically, you can generate a standalone TypeScript script from recorded actions using agent.createScriptFromActionCache():
import { HyperAgent } from "@hyperbrowser/agent";
const agent = new HyperAgent({
llm: { provider: "anthropic", model: "claude-sonnet-4-0" },
});
const page = await agent.newPage();
// Record the automation
const { actionCache } = await page.ai(
"Go to https://demo.automationtesting.in/Frames.html, " +
"select the iframe within iframe tab, " +
"and fill in the text box in the nested iframe"
);
// Generate a reusable script
const script = agent.createScriptFromActionCache(actionCache.steps);
console.log(script);
await agent.closeAgent();
This outputs a standalone script you can save and run directly—no LLM calls needed:
// Generated script
import { HyperAgent } from "@hyperbrowser/agent";
async function main() {
const agent = new HyperAgent();
const page = await agent.newPage();
await page.goto("https://demo.automationtesting.in/Frames.html");
await page.performClick("/html/body/section/div/div/div/ul/li[2]/a", {
performInstruction: "Click the iframe within iframe tab"
});
await page.performFill("/html/body/section/div/div/div/input", "Hello", {
performInstruction: "Fill in the text box",
frameIndex: 2
});
await agent.closeAgent();
}
main();
The generated script calls the perform helpers (performClick, performFill, and so on) with the cached XPaths, so it runs without LLM calls as long as those XPaths still match the page.
How Replay Works
- XPath First: Attempts to find elements using cached XPaths
- Retry on Failure: Retries up to maxXPathRetries times
- LLM Fallback: If XPath fails, falls back to AI using the cached instruction
- Continue or Stop: Stops on first failure by default
const result = await page.runFromActionCache(cache, {
maxXPathRetries: 3, // Retry XPath 3 times before LLM fallback
debug: true, // Log execution details
});
// Check what happened
for (const step of result.steps) {
console.log(`Step ${step.stepIndex}:`, {
usedXPath: step.usedXPath,
fallbackUsed: step.fallbackUsed,
success: step.success,
});
}
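The replay order described above (cached XPath first, retries, LLM fallback, stop on first failure) can be sketched as a small simulation. This is not HyperAgent's implementation: the callbacks tryXPath and llmFallback are hypothetical stand-ins for the real browser lookup and LLM call, they are synchronous only to keep the sketch short, and treating maxXPathRetries as the total number of XPath attempts is an assumption.

```typescript
// Hypothetical stand-ins: the real lookup and LLM call are asynchronous browser operations.
type TryXPath = (xpath: string) => boolean;
type LlmFallback = (instruction: string) => boolean;

interface CachedStep {
  stepIndex: number;
  xpath: string;
  instruction: string;
}

interface StepOutcome {
  stepIndex: number;
  usedXPath: boolean;
  fallbackUsed: boolean;
  success: boolean;
}

function replayStep(
  step: CachedStep,
  tryXPath: TryXPath,
  llmFallback: LlmFallback,
  maxXPathRetries: number
): StepOutcome {
  // Try the cached XPath, up to maxXPathRetries attempts (assumed to count attempts).
  for (let attempt = 0; attempt < maxXPathRetries; attempt++) {
    if (tryXPath(step.xpath)) {
      return { stepIndex: step.stepIndex, usedXPath: true, fallbackUsed: false, success: true };
    }
  }
  // The XPath never matched: hand the cached instruction to the fallback.
  const success = llmFallback(step.instruction);
  return { stepIndex: step.stepIndex, usedXPath: false, fallbackUsed: true, success };
}

function replaySteps(
  steps: CachedStep[],
  tryXPath: TryXPath,
  llmFallback: LlmFallback,
  maxXPathRetries = 3
): StepOutcome[] {
  const outcomes: StepOutcome[] = [];
  for (const step of steps) {
    const outcome = replayStep(step, tryXPath, llmFallback, maxXPathRetries);
    outcomes.push(outcome);
    if (!outcome.success) break; // Stop on the first failed step.
  }
  return outcomes;
}

// Demo: the first XPath matches; the second is stale and needs the fallback.
const demoSteps: CachedStep[] = [
  { stepIndex: 0, xpath: "/html/body/button[1]", instruction: "Click the submit button" },
  { stepIndex: 1, xpath: "/html/body/input[9]", instruction: "Fill the email field" },
];
let xpathAttempts = 0;
const demoOutcomes = replaySteps(
  demoSteps,
  (xpath) => {
    xpathAttempts++;
    return xpath === "/html/body/button[1]";
  },
  () => true
);
console.log(demoOutcomes, `XPath attempts: ${xpathAttempts}`);
```

In this simulation the stale step costs three failed XPath attempts before the fallback runs, which is why a high retry count on a page that changes often adds latency without saving an LLM call.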
The cache is a JSON structure containing all recorded steps:
{
"taskId": "abc-123",
"createdAt": "2025-01-15T10:30:00Z",
"status": "completed",
"steps": [
{
"stepIndex": 0,
"actionType": "actElement",
"instruction": "Click the departure city input",
"method": "click",
"arguments": [],
"frameIndex": 0,
"xpath": "/html/body/div[2]/div[4]/input[1]",
"success": true
},
{
"stepIndex": 1,
"actionType": "actElement",
"instruction": "Type 'Miami' into the input",
"method": "fill",
"arguments": ["Miami"],
"frameIndex": 0,
"xpath": "/html/body/div[2]/div[4]/input[1]",
"success": true
}
]
}
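When a cache is loaded from a file, it can help to check its shape before replaying. The sketch below mirrors only the fields visible in the sample above. The names CachedActionStep, ActionCacheFile, and parseActionCache are hypothetical, and the library's own ActionCacheOutput type may contain more fields or allow optional ones.

```typescript
// Field names are copied from the sample JSON; which of them are optional is unknown.
interface CachedActionStep {
  stepIndex: number;
  actionType: string;
  instruction: string;
  method: string;
  arguments: string[];
  frameIndex: number;
  xpath: string;
  success: boolean;
}

interface ActionCacheFile {
  taskId: string;
  createdAt: string;
  status: string;
  steps: CachedActionStep[];
}

// Type guard that checks every field shown in the sample step.
function isCachedActionStep(value: unknown): value is CachedActionStep {
  if (typeof value !== "object" || value === null) return false;
  const step = value as Record<string, unknown>;
  return (
    typeof step.stepIndex === "number" &&
    typeof step.actionType === "string" &&
    typeof step.instruction === "string" &&
    typeof step.method === "string" &&
    Array.isArray(step.arguments) &&
    typeof step.frameIndex === "number" &&
    typeof step.xpath === "string" &&
    typeof step.success === "boolean"
  );
}

// Parse JSON text and validate the steps; top-level metadata is not checked here.
function parseActionCache(json: string): ActionCacheFile {
  const data: unknown = JSON.parse(json);
  if (typeof data !== "object" || data === null) {
    throw new Error("Cache must be a JSON object");
  }
  const cache = data as Record<string, unknown>;
  if (!Array.isArray(cache.steps) || !cache.steps.every(isCachedActionStep)) {
    throw new Error("Cache has missing or malformed steps");
  }
  return data as ActionCacheFile;
}

// Demo using one step from the sample above.
const sampleJson = JSON.stringify({
  taskId: "abc-123",
  createdAt: "2025-01-15T10:30:00Z",
  status: "completed",
  steps: [
    {
      stepIndex: 0,
      actionType: "actElement",
      instruction: "Click the departure city input",
      method: "click",
      arguments: [],
      frameIndex: 0,
      xpath: "/html/body/div[2]/div[4]/input[1]",
      success: true,
    },
  ],
});
const parsedCache = parseActionCache(sampleJson);
console.log(parsedCache.steps.map((s) => `${s.stepIndex}: ${s.method} ${s.xpath}`));
```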
Direct XPath Execution
For maximum control, use the perform helpers to execute actions directly:
const page = await agent.newPage();
await page.goto("https://example.com");
// Execute by XPath with LLM fallback
await page.performClick(
"/html/body/button[1]",
{ performInstruction: "Click the submit button" }
);
await page.performFill(
"/html/body/input[1]",
"user@example.com",
{ performInstruction: "Fill the email field" }
);
| Helper | Description |
|---|---|
| performClick(xpath) | Click an element |
| performFill(xpath, text) | Clear and fill an input |
| performType(xpath, text) | Type into an element |
| performPress(xpath, key) | Press a keyboard key |
| performSelectOption(xpath, option) | Select from dropdown |
| performCheck(xpath) | Check a checkbox |
| performUncheck(xpath) | Uncheck a checkbox |
| performHover(xpath) | Hover over an element |
| performScrollToElement(xpath) | Scroll element into view |
Each helper accepts an options object:
await page.performClick(xpath, {
performInstruction: "Click the login button", // Fallback instruction
frameIndex: 0, // Target iframe (0 = main frame)
maxSteps: 3, // Retries before fallback
});
When to Use Action Caching
| Scenario | Recommendation |
|---|---|
| Repetitive tasks (daily scraping, scheduled jobs) | ✅ Record once, replay indefinitely |
| E2E testing | ✅ Fast, deterministic test runs |
| High-volume automation | ✅ Eliminate per-run LLM costs |
| Stable page structures | ✅ XPaths remain valid longer |
| Dynamic pages with frequent layout changes | ⚠️ May require frequent re-recording |
| One-time tasks | ❌ Just use page.ai() directly |
Monitoring Fallback Rates
When a cached XPath no longer matches the page, HyperAgent falls back to the LLM to find the element, provided a performInstruction is available. You'll see logs like this:
⚠️ [runCachedStep] Cached action failed. Falling back to LLM...
Instruction: "Select the LATAM/Delta flight with the lowest carbon emissions"
❌ Cached XPath Failed: "/html[1]/body[1]/c-wiz[2]/div[1]/.../li[5]/div[1]/div[1]"
✅ LLM Resolved New XPath: "/html[1]/body[1]/c-wiz[2]/div[1]/.../li[4]/div[1]/div[1]"
What this means:
- The cached XPath pointed to li[5] but the element moved to li[4]
- The LLM successfully found the correct element using the instruction
- The action completed, but with added latency and cost
When to re-record:
- If you see fallback warnings frequently, the page structure has changed
- Re-run the original page.ai() task to capture fresh XPaths
- Save the new actionCache to replace your stale recording
Enable debug: true on your agent to see more detailed logging.
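One way to turn "fallback warnings frequently" into a concrete signal is to compute a fallback rate from the per-step replay results. The sketch below assumes each step result carries the usedXPath, fallbackUsed, and success fields shown earlier in this guide. The helper names and the 20% threshold are hypothetical choices, not library defaults.

```typescript
// Per-step fields as they appear in the replay result example in this guide.
interface ReplayStepResult {
  stepIndex: number;
  usedXPath: boolean;
  fallbackUsed: boolean;
  success: boolean;
}

interface ReplaySummary {
  total: number;
  fallbacks: number;
  failures: number;
  fallbackRate: number; // fallbacks / total, or 0 for an empty run
}

// Count fallbacks and failures across one replay run.
function summarizeReplay(steps: ReplayStepResult[]): ReplaySummary {
  const total = steps.length;
  const fallbacks = steps.filter((s) => s.fallbackUsed).length;
  const failures = steps.filter((s) => !s.success).length;
  return { total, fallbacks, failures, fallbackRate: total === 0 ? 0 : fallbacks / total };
}

// Suggest re-recording when a step failed or too many steps needed the LLM.
// The default threshold of 0.2 is an arbitrary example value.
function shouldReRecord(summary: ReplaySummary, threshold = 0.2): boolean {
  return summary.failures > 0 || summary.fallbackRate > threshold;
}

// Demo: one of four steps needed the LLM fallback.
const demoRun: ReplayStepResult[] = [
  { stepIndex: 0, usedXPath: true, fallbackUsed: false, success: true },
  { stepIndex: 1, usedXPath: true, fallbackUsed: false, success: true },
  { stepIndex: 2, usedXPath: false, fallbackUsed: true, success: true },
  { stepIndex: 3, usedXPath: true, fallbackUsed: false, success: true },
];
const demoSummary = summarizeReplay(demoRun);
console.log(demoSummary, "re-record:", shouldReRecord(demoSummary));
```

Tracking this number across scheduled runs shows whether a page is drifting gradually or changed all at once, which helps decide how urgently to re-record.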