Add llms.txt to your API for AI discoverability
You publish API docs in HTML. A developer reads them, copies a curl command, and integrates your API. That workflow still works. But an increasing share of your "readers" are LLMs, and they process your HTML docs at a cost: a typical documentation page burns 12,000 tokens before the model extracts a single endpoint URL. The HTML tags, navigation chrome, and JavaScript your site ships add zero value for an LLM.
llms.txt solves this. It's a plain Markdown file at your domain root that describes
what your API does, lists your endpoints, and links to detailed docs. LLMs parse it in 6x fewer
tokens than the equivalent HTML. Over 849 sites have adopted it, including Anthropic, Cloudflare,
Stripe, and Mintlify.
What llms.txt is (and what it's not)
The llms.txt specification, proposed by Jeremy Howard, defines a standard file at
/llms.txt on any website. It uses Markdown headings, blockquotes, and link lists to
describe a product to language models.
Think of it this way: robots.txt tells search crawlers which pages to index.
llms.txt tells language models what your product does and where to find the details.
# robots.txt - tells crawlers what to index
User-agent: *
Allow: /
# llms.txt - tells LLMs what your product does
# Served at /llms.txt as structured Markdown

llms.txt is not a replacement for your documentation site. It's a concise index,
formatted for machine consumption. Humans can read it too, but the primary audience is any LLM
that needs to understand your API before generating code or answering questions about it.
The spec format
The format is intentional about every line. Here's the structure:
# Product Name
> One-line product description with key capabilities.
## Section Name
- [Link Title](https://example.com/page): Brief description
- [Another Link](https://example.com/other): Brief description
## Optional: Additional sections
More structured content as needed.

- H1 heading: Your product or project name. One per file.
- Blockquote: A single-line description of what the product does.
- H2 sections: Group your links by category (Documentation, Endpoints, Tools).
- Link lists: Each line has a Markdown link followed by a colon and description.
No HTML. No frontmatter. No custom syntax. Any Markdown parser can read it, and any LLM can extract structured information from it without special tooling.
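Because the format is so constrained, a few lines of stdlib Python are enough to pull it apart. A minimal sketch of a parser for the structure above (the field names in the returned dict are my own, not part of the spec):

```python
import re

def parse_llms_txt(text: str) -> dict:
    """Parse the llms.txt structure: one H1 title, a blockquote summary,
    and H2 sections containing Markdown link lists."""
    title = None
    summary_lines = []
    sections: dict[str, list] = {}
    current = None
    # Matches "- [Title](url): description" with an optional description.
    link_re = re.compile(r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<desc>.*))?")
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# ") and title is None:
            title = line[2:].strip()
        elif line.startswith(">"):
            summary_lines.append(line[1:].strip())
        elif line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current and (m := link_re.match(line)):
            sections[current].append(
                {"title": m["title"], "url": m["url"], "desc": m["desc"] or ""}
            )
    return {"title": title, "summary": " ".join(summary_lines), "sections": sections}
```

That a parser this small covers the whole format is the point: an LLM needs no special tooling either.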
Why tokens matter: Markdown vs. HTML
Every LLM has a context window. Tokens spent parsing <div class="nav-wrapper">
and <script src="analytics.js"> are tokens the model can't spend understanding
your API schema. Here's the math:
HTML documentation page: ~12,000 tokens
Same content as Markdown: ~2,000 tokens
Savings: ~83% fewer tokens

An HTML documentation page carries navigation bars, footers, sidebars, meta tags, and embedded scripts. Strip all of that away and convert to Markdown, and you get the same information in roughly one-sixth the tokens. For LLMs operating near their context limit, that difference determines whether your full API reference fits in a single prompt.
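You can see the overhead for yourself with Python's stdlib HTML parser. This sketch strips tags, scripts, and styles from a toy page and compares character counts as a rough proxy for token counts (the HTML snippet is illustrative, and the exact ratio depends on the page):

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect only text content, discarding tags and script/style bodies."""
    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0  # depth inside <script>/<style>
    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1
    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)

page = """<html><head><script src="analytics.js"></script></head>
<body><div class="nav-wrapper"><a href="/">Home</a></div>
<h1>Create charge</h1><p>POST https://api.acme.com/v1/charges</p></body></html>"""

extractor = TextExtractor()
extractor.feed(page)
text = " ".join(p.strip() for p in extractor.parts if p.strip())
print(f"HTML: {len(page)} chars, text only: {len(text)} chars")
```

On a real documentation page, the wrapper markup dwarfs the content, which is where the ~6x figure comes from.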
The two-tier approach: llms.txt + llms-full.txt
The specification defines two files with distinct roles:
- llms.txt is the summary. Product name, one-line description, and a list of links with brief descriptions. Target size: under 10KB. An LLM reads this to understand what your API offers and decide which links to follow for details.
- llms-full.txt is the complete reference. It embeds the content those links point to into a single file. Request schemas, response examples, authentication flows, error codes. Target size: under 100KB. An LLM reads this when it needs to generate working code against your API.
Start with llms.txt. Add llms-full.txt once you want LLMs to generate
integration code without following external links.
Step-by-step: write your own llms.txt
1. Start with product identity
Open with an H1 containing your product name, followed by a blockquote that describes your product in one sentence. Be specific. Include the number of endpoints, key capabilities, and any differentiators.
# Acme API
> REST API for payment processing. 12 endpoints covering charges, refunds, subscriptions, and webhooks.
## Documentation
- [API Reference](https://docs.acme.com/api): Full endpoint reference with request/response schemas
- [Authentication](https://docs.acme.com/auth): API key setup, OAuth 2.0 flows, and webhook signing
- [Quickstart](https://docs.acme.com/quickstart): Send your first charge in 5 minutes
## Endpoints
- Create charge: POST https://api.acme.com/v1/charges
- Get charge: GET https://api.acme.com/v1/charges/:id
- Create refund: POST https://api.acme.com/v1/refunds
- List subscriptions: GET https://api.acme.com/v1/subscriptions
- Create webhook: POST https://api.acme.com/v1/webhooks

That file takes under 800 tokens to parse. An LLM now knows Acme API handles payments, has 12 endpoints, and can follow three links for deeper information.
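A file like this is easy to generate from your route table, which keeps it in sync with the code. One possible sketch (the data and function name are illustrative, not part of the spec):

```python
def render_llms_txt(name, description, docs, endpoints):
    """Render an llms.txt body: H1 name, blockquote description,
    a Documentation link list, and an Endpoints list."""
    lines = [f"# {name}", "", f"> {description}", "", "## Documentation"]
    lines += [f"- [{title}]({url}): {desc}" for title, url, desc in docs]
    lines += ["", "## Endpoints"]
    lines += [f"- {label}: {method} {url}" for label, method, url in endpoints]
    return "\n".join(lines) + "\n"

acme = render_llms_txt(
    "Acme API",
    "REST API for payment processing. 12 endpoints covering charges, "
    "refunds, subscriptions, and webhooks.",
    docs=[
        ("API Reference", "https://docs.acme.com/api",
         "Full endpoint reference with request/response schemas"),
        ("Quickstart", "https://docs.acme.com/quickstart",
         "Send your first charge in 5 minutes"),
    ],
    endpoints=[
        ("Create charge", "POST", "https://api.acme.com/v1/charges"),
        ("Create refund", "POST", "https://api.acme.com/v1/refunds"),
    ],
)
```

Wiring this into your build also makes the 10KB budget trivial to enforce with an assertion.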
2. List your endpoints with methods and URLs
For API products, list every endpoint with its HTTP method and full URL. This is the most valuable
section for code generation. An LLM that knows POST https://api.acme.com/v1/charges
can generate a working curl command or SDK call without reading your full docs.
3. Link to machine-readable specs
If you publish an OpenAPI spec, link it in your llms.txt. LLMs can parse OpenAPI JSON
and extract parameter schemas, required fields, and response types. Same for MCP tool manifests or
GraphQL introspection endpoints.
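If you already maintain an OpenAPI document, the endpoint list can be derived from it rather than written by hand. A rough sketch that flattens an already-parsed OpenAPI dict into the "METHOD URL" lines used above (assumes a single `servers` entry; the sample spec is illustrative):

```python
def endpoints_from_openapi(spec: dict) -> list[str]:
    """Flatten an OpenAPI document's paths into 'METHOD base_url/path' lines."""
    base = spec.get("servers", [{}])[0].get("url", "").rstrip("/")
    http_methods = {"get", "post", "put", "patch", "delete"}
    lines = []
    for path, operations in spec.get("paths", {}).items():
        for method in operations:
            if method in http_methods:  # skip keys like "parameters"
                lines.append(f"{method.upper()} {base}{path}")
    return sorted(lines)
```

Feeding the result into your llms.txt generator means the file can never drift from the spec.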
4. Build llms-full.txt for deep context
Take each endpoint from your summary and expand it with request/response examples:
# Botoi API - Full Documentation
> Complete endpoint reference for all 150+ developer utility endpoints.
## IP Geolocation
POST https://api.botoi.com/v1/ip/lookup
Returns city, region, country, ISP, coordinates, and timezone for any IP address.
**Request:**
```json
{ "ip": "8.8.8.8" }
```
**Response:**
```json
{
"success": true,
"data": {
"ip": "8.8.8.8",
"city": "Mountain View",
"region": "California",
"country": "US",
"isp": "Google LLC",
"lat": 37.386,
"lon": -122.0838
}
}
```
## Email Validation
POST https://api.botoi.com/v1/email/validate
...full documentation for every endpoint

Include realistic payloads, not placeholder data. An LLM generating integration code needs to see the exact field names, types, and nesting your API returns.
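Since llms-full.txt repeats the same section shape for every endpoint, it is a natural candidate for templating. A minimal sketch of one section renderer (function name and parameters are my own):

```python
import json

def render_endpoint_section(title, method, url, summary, request, response):
    """Render one endpoint as an llms-full.txt section with fenced JSON examples."""
    return "\n".join([
        f"## {title}",
        f"{method} {url}",
        summary,
        "**Request:**",
        "```json",
        json.dumps(request, indent=2),
        "```",
        "**Response:**",
        "```json",
        json.dumps(response, indent=2),
        "```",
    ])

section = render_endpoint_section(
    "IP Geolocation",
    "POST", "https://api.botoi.com/v1/ip/lookup",
    "Returns city, region, country, ISP, coordinates, and timezone for any IP address.",
    request={"ip": "8.8.8.8"},
    response={"success": True, "data": {"ip": "8.8.8.8", "city": "Mountain View"}},
)
```

Rendering examples through `json.dumps` also guarantees the payloads in your docs are valid JSON.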
5. Serve with the right headers
Serve both files with text/markdown or text/plain Content-Type. If you
use Nginx:
# Serve llms.txt with correct Content-Type
location = /llms.txt {
default_type text/markdown;
add_header X-Robots-Tag "noindex";
}
location = /llms-full.txt {
default_type text/markdown;
add_header X-Robots-Tag "noindex";
}
On Cloudflare Pages, Vercel, or Netlify, drop the files in your public/ directory.
The hosting platform serves them with the correct MIME type from the file extension.
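For a quick local check before deploying, Python's stdlib server can be taught the right MIME type. This is a testing convenience only, not a production setup:

```python
import http.server

class MarkdownHandler(http.server.SimpleHTTPRequestHandler):
    """Serve llms.txt and llms-full.txt as text/markdown for local testing."""
    def guess_type(self, path):
        if path.endswith(("llms.txt", "llms-full.txt")):
            return "text/markdown"
        return super().guess_type(path)

# To run from your site's public/ directory:
#   http.server.HTTPServer(("", 8000), MarkdownHandler).serve_forever()
# then: curl -sI http://localhost:8000/llms.txt | grep -i content-type
```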
Real example: botoi's llms.txt
Botoi serves llms.txt at botoi.com/llms.txt
and llms-full.txt at botoi.com/llms-full.txt.
Here's a condensed view of the summary file:
# Botoi - Developer Utility API & MCP Server
> One API key, 150+ developer utility endpoints, and a 49-tool
> MCP server for AI agents. IP geolocation, email validation,
> DNS, hashing, JWT, QR codes, PDF generation, and more.
> Sub-50ms from Cloudflare's edge. Free tier included.
## Free Online Tools
- [JSON Formatter](https://botoi.com/tools/json-formatter): Format, beautify, minify, and validate JSON data
- [Base64 Encoder/Decoder](https://botoi.com/tools/base64-encoder-decoder): Encode and decode Base64 strings
- [Hash Generator](https://botoi.com/tools/hash-generator): Generate SHA-1, SHA-256, SHA-384, SHA-512 hashes
## Botoi API
- [API Documentation](https://api.botoi.com/docs): Full API reference with interactive playground
- [OpenAPI Spec](https://api.botoi.com/openapi.json): Machine-readable OpenAPI 3.1 specification
- [MCP Tool Manifest](https://api.botoi.com/v1/mcp/tools.json): MCP tool definitions for AI agents
## API Endpoints
- IP geolocation: POST https://api.botoi.com/v1/ip/lookup
- Email validation: POST https://api.botoi.com/v1/email/validate
- DNS lookup: POST https://api.botoi.com/v1/dns/lookup
- Hash generation: POST https://api.botoi.com/v1/hash
...150+ more endpoints listed with method and URL

The full file lists every tool, all 150+ API endpoints with methods and URLs, the MCP server configuration for five AI editors, and the TypeScript SDK install command. An LLM reading this file can answer "How do I validate an email with botoi?" without visiting a single web page.
Fetch it yourself:
curl -s https://botoi.com/llms.txt | head -20

Generative engine optimization: beyond traditional SEO
Traditional SEO optimizes for Google's crawler. Generative engine optimization (GEO) optimizes for AI models that synthesize answers from multiple sources. When a developer asks ChatGPT "What API can I use for email validation?", the model draws from sources it has parsed or can fetch.
llms.txt is one part of a GEO strategy. The full picture includes:
- llms.txt and llms-full.txt for direct LLM consumption.
- OpenAPI spec at a public URL so LLMs can parse your endpoint schemas.
- MCP server discovery via /.well-known/mcp/server-card.json so AI assistants can find and connect to your tools.
- Structured data (JSON-LD) on your pages for richer extraction.
- robots.txt configured to allow AI crawlers (GPTBot, ClaudeBot, PerplexityBot).
Each layer targets a different way LLMs discover and consume your API. llms.txt is
the lowest-effort, highest-impact starting point.
Deployment checklist
1. Create /llms.txt with product name, description, and key links
2. Create /llms-full.txt with full endpoint documentation
3. Add both files to your robots.txt sitemap (optional)
4. Set Content-Type to text/markdown or text/plain
5. Keep llms.txt under 10KB and llms-full.txt under 100KB
6. Update both files whenever you ship new endpoints
7. Test: curl -s https://yourdomain.com/llms.txt | wc -c
Keep your llms.txt in version control alongside your API source code. Update it
in the same pull request where you add a new endpoint. Stale documentation, whether for humans
or machines, erodes trust.
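The size and structure rules in the checklist are simple enough to enforce in CI. A sketch of such a check, using the spec's 10KB/100KB targets (the function name is my own):

```python
def validate_llms_files(llms_txt: str, llms_full_txt: str) -> list[str]:
    """Return a list of checklist violations; empty means both files pass."""
    problems = []
    if len(llms_txt.encode("utf-8")) > 10 * 1024:
        problems.append("llms.txt exceeds the 10KB target")
    if len(llms_full_txt.encode("utf-8")) > 100 * 1024:
        problems.append("llms-full.txt exceeds the 100KB target")
    if not llms_txt.lstrip().startswith("# "):
        problems.append("llms.txt should open with an H1 heading")
    if ">" not in llms_txt.splitlines()[1:3][0] if len(llms_txt.splitlines()) > 1 else True:
        pass  # blockquote check is looser: spec recommends but does not require it
    return problems
```

Run it in the same pipeline that builds the files, and a too-large or malformed file fails the build instead of reaching production.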
Key points
- LLMs read your docs before humans do. AI agents, coding assistants, and chat interfaces parse API documentation to generate code and answer questions. Give them a clean format.
- Markdown costs 6x fewer tokens than HTML. Strip navigation, scripts, and styling. Serve the content LLMs need in the format they process most efficiently.
- Two files cover both use cases. llms.txt is the summary index; llms-full.txt is the full reference. Start with the summary; add the full version when you want LLMs to generate working integration code.
- 849+ sites have adopted the format. Anthropic, Cloudflare, Stripe, and Mintlify all serve llms.txt. The format is gaining traction as a GEO standard.
- See it in action. Fetch botoi.com/llms.txt to see a 150+ endpoint API described in a single Markdown file.
Frequently asked questions
- What is llms.txt and how does it work?
- llms.txt is a Markdown file served at /llms.txt on your domain. It describes your product, API endpoints, and documentation links in a structured format that LLMs can parse with minimal tokens. Think of it as robots.txt for AI: robots.txt tells crawlers what to index, llms.txt tells language models what your site offers.
- How many tokens does llms.txt save compared to HTML docs?
- Markdown uses roughly 6x fewer tokens than equivalent HTML content. A documentation page that costs 12,000 tokens in HTML can compress to under 2,000 tokens in Markdown. This matters because LLMs have finite context windows, and every token spent on formatting is a token not spent on understanding your API.
- Do I need both llms.txt and llms-full.txt?
- llms.txt is the summary: product name, description, and a list of links. llms-full.txt contains the full documentation content that the links point to, embedded in a single file. Start with llms.txt. Add llms-full.txt when you want LLMs to have deep context without following multiple links.
- Which AI tools read llms.txt today?
- Claude, ChatGPT, Perplexity, and Cursor all read llms.txt when browsing or referencing documentation. MCP-based agents and coding assistants also fetch it for tool discovery. Over 849 sites have adopted the format as of early 2026, including Anthropic, Cloudflare, Mintlify, and Stripe.
- Where should I host the llms.txt file?
- Serve it at your domain root: https://yourdomain.com/llms.txt. Use text/markdown or text/plain as the Content-Type. If you have an API subdomain, consider serving it at both the main domain and the API subdomain. Keep the file under 10KB for the summary version.