page.ai.extract()
Extract structured, typed data from the current page without running a full autonomous flow.
Returns the extracted data as a typed object that matches the schema you pass in. Use this when you want to read information from the page rather than interact with it.
Signature
page.ai.extract<Schema extends z.ZodObject>(
  schema: Schema,
  options?: {
    instruction?: string;
    gptClient?: GptClient | LanguageModel;
    timeout?: number;
  }
): Promise<z.infer<Schema>>
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| schema | z.ZodObject | — | Zod schema describing the shape of the data to extract |
| options.instruction | string | — | Optional hint that narrows what the AI focuses on (e.g. "Only read the hero section at the top") |
| options.gptClient | GptClient \| LanguageModel | Project default | Override the AI provider for this call |
| options.timeout | number | 60000 (60 s) | Maximum milliseconds to wait for the AI response before aborting |
Basic usage
import { test, expect } from 'donobu';
import { z } from 'zod';

test('read pricing plans', async ({ page }) => {
  await page.goto('https://app.example.com/pricing');
  const pricing = await page.ai.extract(
    z.object({
      plans: z.array(
        z.object({
          name: z.string(),
          monthlyPrice: z.string(),
          features: z.array(z.string()),
        }),
      ),
    }),
  );
  expect(pricing.plans).toHaveLength(3);
  expect(pricing.plans[0].name).toBe('Starter');
});
With an instruction hint
const hero = await page.ai.extract(
  z.object({
    heading: z.string(),
    subheading: z.string().optional(),
    ctaLabel: z.string(),
  }),
  {
    instruction: 'Only read the hero section at the very top of the page, not any other sections',
  },
);
page.ai.extract vs. page.ai with a schema
| | page.ai.extract | page.ai with schema |
|---|---|---|
| Interaction | Read-only — takes a screenshot and reads text | Full flow — the AI may click, scroll, navigate, etc. |
| Cache | Not cached | Cached |
| Speed | Fast (single LLM call) | Slower (multiple tool calls) |
| Use when | The data you need is already visible on the page | The data requires navigation or interaction to reach |
Throws
Schema mismatch: throws a ZodError if the AI's response does not conform to the provided schema. This can happen when the page does not contain the expected data, or when the AI misinterprets the page content. Adding an instruction hint often resolves ambiguous cases.
Timeout: throws an AbortError if the AI response takes longer than options.timeout milliseconds (default: 60 seconds). Increase the timeout for complex pages or slow network conditions.
Neither error type is a PageAiException — page.ai.extract does not run a full autonomous flow and therefore does not produce flow-level failure states.
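The two failure modes above can be told apart in a catch block. The following is a minimal sketch, not part of the Donobu API: `classifyExtractError` and `ExtractFailure` are hypothetical names, and the check assumes both errors expose the conventional `name` values ('ZodError' and 'AbortError'). When zod is already imported, `err instanceof z.ZodError` is a more direct test for the schema case.

```typescript
// Failure modes described in the "Throws" section.
type ExtractFailure = 'schema-mismatch' | 'timeout' | 'other';

// Hypothetical helper (not part of Donobu).
// Assumption: the thrown errors carry a `name` of 'ZodError' or 'AbortError'.
// Checking `name` keeps this sketch free of any zod import.
function classifyExtractError(err: unknown): ExtractFailure {
  if (typeof err !== 'object' || err === null) return 'other';
  const name = (err as { name?: unknown }).name;
  if (name === 'ZodError') return 'schema-mismatch';
  if (name === 'AbortError') return 'timeout';
  return 'other';
}

// Stand-in error for demonstration only; a real one would be thrown by page.ai.extract.
const fakeTimeout = Object.assign(new Error('aborted'), { name: 'AbortError' });
console.log(classifyExtractError(fakeTimeout)); // prints "timeout"
```

In a test, a call such as page.ai.extract(schema, { timeout: 120000 }) could be wrapped in try/catch and the caught value passed to this helper. A 'schema-mismatch' result might then prompt adding an instruction hint, and a 'timeout' result might prompt raising options.timeout.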