
llms.txt — what it is, who reads it, and why it matters.

Some AI engines read it. Some ignore it. Either way it's a structural signal, and the entity graph it implies is worth more than the file itself.

FIG. 01 — llms.txt structure

What llms.txt actually is

llms.txt is a markdown file at the root of your site (e.g. /llms.txt) that tells AI engines what your site is, who you serve, and what your key resources are. Think of it as a robots.txt for AI assistants — but instead of telling crawlers what to avoid, it tells them what’s worth attending to.

It was proposed by Jeremy Howard in late 2024 and has gained traction through 2025 and 2026. Adoption is uneven across AI engines, but the structural value extends beyond which engines read it.

The proposed standard

The format is straightforward markdown:

  • H1 — your site or brand name
  • Blockquote — a one-line description
  • Sections — H2 headings grouping linked resources, each with optional descriptions

The entire file should fit in a single LLM context window — a hard ceiling on length and a soft requirement to be deliberate about what you include.
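Putting those three elements together, a minimal llms.txt might look like this (the brand, links, and descriptions are invented for illustration):

```markdown
# Acme Analytics

> Self-serve product analytics for B2B SaaS teams, priced per tracked user.

## Services

- [Event tracking](https://example.com/tracking): automatic capture with a five-minute install
- [Dashboards](https://example.com/dashboards): prebuilt funnels and retention views

## Docs

- [Quickstart](https://example.com/docs/quickstart): install, identify users, send your first event

## Contact

- [Sales](https://example.com/contact): demos and pricing
```

H1 for the name, one blockquote for positioning, H2 sections for grouped links with short descriptions: that's the whole format.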

Who actually reads it (2026)

Honest answer: as of mid-2026, support is uneven. Some AI search engines explicitly check for /llms.txt and incorporate it into context retrieval. Others ignore it entirely. The trajectory is towards more support, not less.

What’s already universal is that structured, machine-readable summaries of your business help all AI systems understand you better — whether they read the file or not. The work to draft a good llms.txt forces clarity that benefits other AEO signals downstream.

What to put inside

For most sites, the right content is:

  • A clear one-line description of what you do (the blockquote)
  • Key services + entry points with prices/scopes
  • Vertical specialisations or audiences
  • Strategic stats AI engines might quote
  • Differentiation in your category
  • Contact + canonical brand URLs

What not to include: filler “team values” copy, marketing-speak that doesn’t help an AI engine answer a query, or anything that requires a context window beyond ~8k tokens.
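A quick way to sanity-check the size ceiling is a rough token estimate. This sketch uses the common ~4-characters-per-token rule of thumb for English prose; real tokenizer counts vary by model, so treat it as a smoke test, not a measurement:

```python
def rough_token_count(text: str) -> int:
    """Rough estimate: ~4 characters per token for English prose.

    Real tokenizers will differ, but this is close enough to flag
    a draft that blows past an ~8k token budget.
    """
    return len(text) // 4


def fits_budget(text: str, budget_tokens: int = 8_000) -> bool:
    """True if an llms.txt draft is within the token budget."""
    return rough_token_count(text) <= budget_tokens


draft = "# Acme Analytics\n\n> Self-serve product analytics.\n"
print(rough_token_count(draft), fits_budget(draft))
```

If the estimate is anywhere near the ceiling, cut before you ship: the budget is a forcing function, not a target to fill.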

How to structure it

A useful pattern: narrow → broad → narrow. Start with a precise positioning line. Expand into services + verticals + stats (the broad context). End with contact info + canonical URLs (the narrow next step).

Drafting from your real site content beats drafting from boilerplate. We typically build llms.txt from a sample of about 20 pages — homepage, top services, a few case studies, and a few high-traffic blog posts — distilled to the essentials.

Why it matters even if engines ignore it

Three reasons llms.txt matters beyond direct engine support:

  • Forces clarity. Drafting a good llms.txt is the same work as drafting a good elevator pitch — you find out what you actually offer.
  • Implies entity graph. The structure of llms.txt — services, audiences, key entities — is the same structure your schema graph + Wikipedia entries should reflect. Consistency across all of them is the AEO compounder.
  • Hedges against future support. If a major engine adopts llms.txt parsing in 2027, you don’t want to be retrofitting; you want to already have a clean version live.

How to test yours

Three quick checks:

  • Visit /llms.txt on your site — does it return a clean markdown file?
  • Paste it into ChatGPT or Claude and ask “summarise this brand based on the file.” Does the summary match your positioning?
  • Quarterly: refresh against your current services + pricing. llms.txt drift is real — stale = useless.
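The first check can be scripted. This stdlib-only sketch fetches /llms.txt from a site and lints it against the basic structural rules above (H1 first, blockquote near the top, H2 sections, rough token budget); the thresholds are assumptions, not part of any spec:

```python
import urllib.request


def lint_llms_txt(body: str) -> list[str]:
    """Return a list of structural problems with an llms.txt body."""
    problems = []
    lines = [ln for ln in body.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith("# "):
        problems.append("should open with an H1 (site or brand name)")
    if not any(ln.startswith("> ") for ln in lines[:5]):
        problems.append("no blockquote description near the top")
    if not any(ln.startswith("## ") for ln in lines):
        problems.append("no H2 sections grouping linked resources")
    if len(body) // 4 > 8_000:  # rough ~4 chars/token estimate
        problems.append("likely exceeds an ~8k token budget")
    return problems


def check_site(base_url: str) -> list[str]:
    """Fetch /llms.txt from a site and lint it."""
    url = base_url.rstrip("/") + "/llms.txt"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return lint_llms_txt(resp.read().decode("utf-8", errors="replace"))
    except OSError as exc:
        return [f"fetch failed: {exc}"]
```

An empty list from `check_site("https://yoursite.com")` means the file exists and passes the structural checks; the ChatGPT/Claude summary test still needs a human to judge the result.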

Ours lives at /llms.txt if you want to see a working example.
