Extract structured fields from text
2 min · 6 steps
The 'Extract from text' node reads text of any size and pulls typed fields or records from it according to your criteria. For files (PDF, image, Word), use 'File Extract' instead.
Use Extract from text (palette: AI → Extract from text; node kind `text.extract`) when the input is text already in the run: an email body, a fetched web page, a previous step's text output. It splits the text into chunks that fit the active model's context window, runs extraction on each chunk with constrained decoding, and reconciles the partial results into one output, so it works for text of any size. Every field is nullable, so a value that is not present comes back as `null` rather than pushing the model to invent one.
For a file on disk (PDF, image, Word, scanned doc), reach for File Extract instead — it adds layout-aware OCR.
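The chunk, extract, reconcile flow described above can be pictured with a small Python sketch. This is an illustration only, not the node's actual implementation: `chunk_text`, `toy_extract`, `reconcile`, and `extract` are hypothetical names, a character budget stands in for the model's context window, a pair of regexes stands in for the model call, and "first non-null value wins" is an assumed reconciliation rule.

```python
import re
from typing import Callable

Record = dict[str, object]


def chunk_text(text: str, max_chars: int) -> list[str]:
    """Group whole lines into chunks of roughly max_chars each.

    A character budget is a stand-in for a token budget. A single line
    longer than max_chars still becomes its own (oversized) chunk.
    """
    chunks: list[str] = []
    current = ""
    for line in text.splitlines(keepends=True):
        if current and len(current) + len(line) > max_chars:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks


def toy_extract(chunk: str) -> Record:
    """Stand-in for the per-chunk model call.

    Every field is always present in the result; a field that does not
    occur in the chunk is None (the `null` in the node's output).
    """
    email = re.search(r"[\w.+-]+@[\w-]+\.[\w.]+", chunk)
    total = re.search(r"Total:\s*([\d.]+)", chunk)
    return {
        "customer_email": email.group(0) if email else None,
        "invoice_total": float(total.group(1)) if total else None,
    }


def reconcile(parts: list[Record]) -> Record:
    """Merge per-chunk results; the first non-null value per field wins (assumed rule)."""
    merged: Record = {}
    for part in parts:
        for name, value in part.items():
            if merged.get(name) is None:
                merged[name] = value
    return merged


def extract(text: str, max_chars: int,
            extract_chunk: Callable[[str], Record] = toy_extract) -> Record:
    """Chunk the text, extract from each chunk, and reconcile into one object."""
    return reconcile([extract_chunk(c) for c in chunk_text(text, max_chars)])


sample = "Hello,\nplease bill jane@example.com\nfor the March order.\nTotal: 42.50\n"
result = extract(sample, max_chars=40)
print(result)
```

Here the email and the total land in different chunks, and the reconciled object still carries both. Text that contains neither field yields `None` for each rather than a guessed value.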
Steps
- Drop an 'Extract from text' node onto the canvas.
From the Builder palette: AI → Extract from text.
- Point it at the text (optional).
Reference an upstream step with `{{nodes.<id>.output.text}}` (use the 'Insert upstream output…' dropdown), paste a literal, or leave it blank. When it is blank, the node automatically uses the previous step's text output, so a File Extract → Extract from text chain wires up with no manual binding.
- Add the fields you want, each with a type.
Click '+ Add field' for each piece of information. Give it a name (snake_case works best: `customer_email`, `invoice_total`, `due_date`) and pick a type: Text, Number, True/false, Date, or List of text. The type drives the model's output schema, so numbers arrive as real numbers and dates arrive in ISO 8601 format. An optional description helps the model tell similar fields apart.
- Add extraction criteria (optional).
Steer the extraction in plain English, e.g. 'only line items with a quantity and unit price' or 'prefer the billing block over the shipping block'. The field list is always sent to the model; the criteria are sent alongside it.
- Many records? Turn on 'Multiple records'.
When the text holds many rows of the same shape (statement transactions, table rows, a list of people), turn on 'Multiple records'. The node then returns `output.records[]` (one object per row) instead of a single object, and the Runs tab renders the result as a table. Use the 'Max records' cap for very long inputs (default 1000, max 10000). The chunk size is set automatically from the active model's context window; override it under Advanced if you run a local model with a larger window.
- Run the agent.
The Runs tab streams progress; the node's output panel shows the structured object once it lands (or the records table when 'Multiple records' is on). Downstream nodes bind via `{{nodes.<id>.output.<field>}}` or `{{nodes.<id>.output.records}}`.
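To make the field types and bindings in the steps above concrete, here is a rough Python sketch of how a typed, nullable field list could map to a JSON Schema, and how a `{{nodes.<id>.output.<field>}}` reference could be resolved against a node's output. Everything here is an assumption for illustration: the `TYPE_TO_SCHEMA` mapping, the `build_schema` and `resolve` helpers, the node id `extract_1`, and rendering values with `str()` are hypothetical and do not describe the app's internals.

```python
import re

# Hypothetical mapping from the type names in the field editor to JSON Schema fragments.
TYPE_TO_SCHEMA = {
    "Text": {"type": "string"},
    "Number": {"type": "number"},
    "True/false": {"type": "boolean"},
    "Date": {"type": "string", "format": "date"},  # ISO 8601 calendar date
    "List of text": {"type": "array", "items": {"type": "string"}},
}


def nullable(fragment: dict) -> dict:
    """Allow null alongside the declared type, so 'not present' is representable."""
    return {"anyOf": [fragment, {"type": "null"}]}


def build_schema(fields: list[tuple[str, str]], multiple: bool = False,
                 max_records: int = 1000) -> dict:
    """Build a schema for one object, or for {"records": [...]} when multiple is True."""
    record = {
        "type": "object",
        "properties": {name: nullable(TYPE_TO_SCHEMA[kind]) for name, kind in fields},
        "required": [name for name, _ in fields],
        "additionalProperties": False,
    }
    if not multiple:
        return record
    return {
        "type": "object",
        "properties": {
            "records": {"type": "array", "items": record, "maxItems": max_records}
        },
        "required": ["records"],
    }


# Matches references such as {{nodes.extract_1.output.customer_email}}.
BINDING = re.compile(r"\{\{\s*nodes\.([\w-]+)\.output\.([\w.]+)\s*\}\}")


def resolve(template: str, outputs: dict[str, dict]) -> str:
    """Replace each binding with the referenced value from the outputs map."""
    def lookup(match: re.Match) -> str:
        value = outputs[match.group(1)]
        for key in match.group(2).split("."):
            value = value[key]
        return str(value)  # assumed rendering; the real app may format values differently
    return BINDING.sub(lookup, template)


fields = [("customer_email", "Text"), ("invoice_total", "Number"), ("due_date", "Date")]
schema = build_schema(fields)
outputs = {
    "extract_1": {  # hypothetical node id
        "customer_email": "jane@example.com",
        "invoice_total": 42.5,
        "due_date": None,
    }
}
message = resolve("Send receipt to {{nodes.extract_1.output.customer_email}}", outputs)
print(message)
```

In this sketch every field is listed as required but may be null, which mirrors the "every field is nullable" behavior: the object always has the same shape, and absent values appear as `null`.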
Live recipes need the desktop
This article is a static preview. The in-app Help sidecar inside Avery NXR can fire each step against your live project — install the desktop to use it interactively.