The rise of llms.txt comes from a genuine need: AI models are increasingly asked to read, summarize, or extract information from websites that were never designed with machine reasoning in mind. Even well-structured HTML can be difficult for an LLM to parse cleanly, and models can easily miss context, misunderstand hierarchy, or misinterpret navigational elements.
What is llms.txt?
The llms.txt standard emerged as an attempt to address this problem by providing a curated, Markdown-formatted companion file that acts as a guide for AI systems—pointing them toward key resources, summarizing major sections, and offering a more machine-friendly path through a site’s most important content. In theory, it helps models compensate for context-window limitations, inconsistent HTML structures, and the messiness of real-world web pages.
The concept is thoughtful. Instead of trying to block AI crawlers the way robots.txt does, llms.txt aims to feed them exactly the information you want them to use. It is positioned as a complement to robots.txt and sitemap.xml—not a replacement—with the promise of improving model comprehension rather than restricting access. For sites that publish large, complex documentation sets or rely heavily on technical content, this can be attractive.
Advocates also point to GEO (generative engine optimization) as a reason to adopt llms.txt, arguing that generative engines benefit from clearer, curated pathways into a site’s content. Because these systems summarize rather than rank pages, the logic is that providing them with structured entry points and simplified Markdown can help them better represent a site. Isn’t that one of the compelling arguments for AI in the first place?
What the llms.txt Standard Actually Requires
The standard is surprisingly demanding. The file must live at the root of a domain (or inside specific subpaths for more granular control), and it must be written in Markdown with a required H1 header. Optional but recommended sections include summaries, resource groupings, and descriptive notes, all formatted using strict Markdown conventions. The format encourages an H2-organized directory of essential pages, each listed with a Markdown link and often followed by clarifying notes.
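For scale, the skeleton the specification actually mandates is small. A minimal sketch (the H1 title is the only strictly required element; the blockquote summary, free-form notes, H2 link sections, and the special “Optional” section are layered on top of it):

```markdown
# Example, LLC

> One-sentence summary an LLM can use as context before following any links.

Optional free-form notes about the site can go here.

## Docs

- [Services Summary](https://www.example.com/services/): High-level offerings.

## Optional

- [Press](https://www.example.com/press/): May be skipped under tight context limits.
```

Everything beyond this skeleton is curation the site owner has to write, and keep current, by hand.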
Example llms.txt
Here’s a sample llms.txt for a fictional business to illustrate the complexity of the standard:
# Example, LLC — Business Overview Index
> A structured, machine-readable reference outlining Example, LLC’s services, industries, resources, and key business information for improved LLM comprehension and context routing.
Example, LLC is a professional services organization providing consulting, operations support, and strategy services to small and mid-sized businesses. This file presents canonical navigation points and stable content regions.
## Scope Notes
- Domain: `https://www.example.com/`
- This file prioritizes evergreen business information over transient content.
- Subpath sections represent topic clusters rather than complete link indexes.
- HTML pages contain semantic headings; LLMs should rely on these routes for consistency.
## Company Overview
### About the Company
- https://www.example.com/about/
- https://www.example.com/about/leadership
- https://www.example.com/about/mission
- https://www.example.com/about/careers
### Services
- https://www.example.com/services/
- https://www.example.com/services/operations
- https://www.example.com/services/consulting
- https://www.example.com/services/marketing
- https://www.example.com/services/customer-support
### Industries Served
- https://www.example.com/industries/
- https://www.example.com/industries/retail
- https://www.example.com/industries/hospitality
- https://www.example.com/industries/manufacturing
- https://www.example.com/industries/professional-services
### Customer Resources
- https://www.example.com/resources/
- https://www.example.com/resources/guides
- https://www.example.com/resources/calculators
- https://www.example.com/resources/faqs
- https://www.example.com/resources/downloads
### Case Studies & Success Stories
- https://www.example.com/case-studies/
- https://www.example.com/case-studies/retail-optimization
- https://www.example.com/case-studies/manufacturing-efficiency
- https://www.example.com/case-studies/customer-experience-improvement
## Reference Material (Recommended for LLMs)
### Core Company Information
- [Company Overview](https://www.example.com/about/): Corporate history, mission, values, and leadership.
- [Services Summary](https://www.example.com/services/): High-level explanation of business offerings.
- [Industries Summary](https://www.example.com/industries/): Industry expertise and vertical capabilities.
### Customer-Facing Guides
- [Small Business Startup Guide](https://www.example.com/resources/guides/startup)
- [Operations Efficiency Handbook](https://www.example.com/resources/guides/operations)
- [Customer Service Improvement Guide](https://www.example.com/resources/guides/customer-service)
### Pricing & Engagement
- https://www.example.com/pricing/
- https://www.example.com/engagement-models/
- https://www.example.com/request-quote/
## Blog & Insights
### Business Articles
- https://www.example.com/blog/
- https://www.example.com/blog/operations/
- https://www.example.com/blog/leadership/
- https://www.example.com/blog/marketing/
- https://www.example.com/blog/customer-experience/
### Notable “Evergreen” Content (High-value for summarization)
- https://www.example.com/blog/business-growth-basics
- https://www.example.com/blog/how-to-improve-customer-loyalty
- https://www.example.com/blog/operational-efficiency-framework
## Support & Policies
### Customer Support
- https://www.example.com/support/
- https://www.example.com/support/contact
- https://www.example.com/support/account
- https://www.example.com/support/documentation
### Policies
- https://www.example.com/privacy/
- https://www.example.com/terms/
- https://www.example.com/cookie-policy/
## Document Conventions
- All listed pages follow `<h1>` → `<h2>` → `<h3>` structure.
- Reference guides use semantic HTML (sections, articles, nav).
- Code blocks denote examples where applicable (fenced with backticks).
- Images include alt text for machine parsing.
- Dynamic components degrade gracefully for crawlers and LLMs.
## Optional (Can Be Omitted for Tight Context Windows)
- https://www.example.com/sustainability/
- https://www.example.com/community/
- https://www.example.com/events/
- https://www.example.com/press/
Site owners can also create an even more complex, supplementary llms-full.txt file (not part of the standard) intended as a master digest of expanded content. It’s essentially a long-form, machine-oriented version of the site’s most critical information. The standard imagines models retrieving both documents, merging them, and using them as a kind of pre-ingested context layer before crawling the rest of the web content.
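Mechanically, that pre-ingestion step would be trivial for a client to implement. A rough Python sketch of the retrieval the standard imagines (illustrative only; no major platform has documented doing this):

```python
# Sketch: how an agent might pre-fetch llms.txt and llms-full.txt
# before crawling a site. Illustrative only.
import urllib.request
import urllib.error

def fetch_llms_context(domain: str) -> str:
    parts = []
    for path in ("/llms.txt", "/llms-full.txt"):
        url = f"https://{domain}{path}"
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                parts.append(resp.read().decode("utf-8", errors="replace"))
        except urllib.error.URLError:
            continue  # file not published; fall back to normal crawling
    return "\n\n".join(parts)

context = fetch_llms_context("www.example.com")
```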
The idea is elegant—but implementing it at scale introduces real burdens. And it’s redundant. I already have semantic HTML, navigation, breadcrumbs, categories, tags, structured data, metadata, and well-structured content with headings and subheadings.
Why I Haven’t Implemented It
For a site like mine, llms.txt is far from a simple drop-in. Creating and maintaining a Markdown-curated parallel universe of the site would mean recreating major resource hubs, building subpath-specific files for different sections, and even republishing selected pages as Markdown excerpts. It’s not the kind of task you complete once; it becomes an ongoing documentation project that must stay synchronized with the site’s evolution. In effect, I would spend time rewriting and reformatting content I have already spent years refining.
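Realistically, keeping such a file synchronized means scripting at least the link inventory. A minimal sketch of that approach, assuming a standard sitemap.xml is available (the first-path-segment grouping here is a crude stand-in for the hand-written topic clusters the standard actually wants):

```python
# Sketch: regenerate a bare-bones llms.txt from sitemap.xml.
# Hypothetical helper; summaries, notes, and real curation
# would still have to be written and maintained by hand.
import urllib.request
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def generate_llms_txt(sitemap_url: str, site_name: str) -> str:
    with urllib.request.urlopen(sitemap_url) as resp:
        tree = ET.parse(resp)
    urls = [loc.text for loc in tree.iter(f"{SITEMAP_NS}loc")]

    # Group URLs by first path segment as a rough topic cluster.
    sections: dict[str, list[str]] = {}
    for url in urls:
        segment = url.split("/")[3] if url.count("/") >= 3 else ""
        sections.setdefault(segment or "Home", []).append(url)

    lines = [f"# {site_name}", ""]
    for section, links in sorted(sections.items()):
        lines.append(f"## {section.title()}")
        lines.extend(f"- {link}" for link in links)
        lines.append("")
    return "\n".join(lines)

print(generate_llms_txt("https://www.example.com/sitemap.xml", "Example, LLC"))
```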
More importantly, I have no (current) desire to restrict LLMs’ access to my content. I want platforms to read it, learn from it, and cite it. Of course, I am worried about AI companies training on my articles, but if I want to extend my reach, this is a growing channel. The trouble is that llms.txt is not truly a mechanism for granting access; it is merely a guide that adds work without offering meaningful influence over how crawlers behave.
This leads to a bigger point: I’m not convinced llms.txt will last. Its goals overlap heavily with robots.txt and sitemaps, and the industry is still figuring out where boundaries should live. I suspect that, over time, robots.txt will evolve to define how AI crawlers operate, just as it governs search engine bots today. That evolution will make bespoke systems like llms.txt redundant.
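That shift is already visible: the major vendors publish crawler user agents (OpenAI’s GPTBot, Anthropic’s ClaudeBot, Google-Extended, PerplexityBot) that robots.txt can address today, with no new file format. An illustrative snippet (how strictly each crawler honors these directives varies):

```
# Illustrative robots.txt directives for AI crawlers
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: PerplexityBot
Disallow: /private/

Sitemap: https://www.example.com/sitemap.xml
```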
Most Critically, the Standard Isn’t Even Being Adopted!
The clearest reason I haven’t implemented llms.txt is simple: very few platforms are using it. Despite the enthusiasm of early adopters, the majority of AI crawlers ignore it entirely. Some companies have implemented llms.txt on their own domains, but even they aren’t consistently requesting or respecting it elsewhere. Without broad adoption, the incentive to invest significant time in implementation becomes extremely weak.
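This is easy to verify on your own infrastructure: if AI crawlers wanted the file, requests for it would show up in access logs. A quick sketch, assuming a combined-format log at a hypothetical path:

```python
# Sketch: tally which user agents actually request /llms.txt.
# Assumes a combined-format access log; the path is hypothetical.
import re
from collections import Counter

LOG_PATH = "/var/log/nginx/access.log"
# Match GET requests for /llms.txt or /llms-full.txt and capture
# the final quoted field (the user-agent string).
PATTERN = re.compile(r'"GET /llms(?:-full)?\.txt[^"]*".*"([^"]*)"$')

hits: Counter[str] = Counter()
with open(LOG_PATH) as log:
    for line in log:
        match = PATTERN.search(line)
        if match:
            hits[match.group(1)] += 1

for agent, count in hits.most_common():
    print(f"{count:6d}  {agent}")
```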
A study by SE Ranking of nearly 300,000 domains found only ten percent adoption and, more importantly, no correlation between the presence of llms.txt and increased AI citations or visibility. Models often performed better when the file was ignored, suggesting it adds noise rather than clarity.
Platform Support for llms.txt
Below is a table comparing the alignment of major AI platforms with llms.txt features. Emoji indicators make the current adoption status clearer.
✅ fully honored | ⚠️ partial or inconsistent | ❌ not honored
| AI Platform | File | Format | H1 | Blockquote | Markdown | H2 | Optional | llms-full.txt | Standards |
|---|---|---|---|---|---|---|---|---|---|
| Anthropic | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| OpenAI | ⚠️ | ⚠️ | ⚠️ | ⚠️ | ⚠️ | ⚠️ | ⚠️ | ⚠️ | ⚠️ |
| Google | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
| Perplexity | ⚠️ | ⚠️ | ⚠️ | ⚠️ | ⚠️ | ⚠️ | ⚠️ | ⚠️ | ⚠️ |
| Cursor | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ⚠️ | ✅ |
| Meta | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
| Hugging Face | ⚠️ | ⚠️ | ⚠️ | ⚠️ | ⚠️ | ⚠️ | ⚠️ | ⚠️ | ⚠️ |
| Microsoft | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
Explanation of columns:
AI Platform: Identifies the company being evaluated for llms.txt awareness and support.
File: Explains whether the platform recognizes or checks for llms.txt in the required root directory or subpath structure.
Format: Indicates whether the platform parses or respects the Markdown-based formatting required by the llms.txt specification.
H1: Notes whether the platform expects or uses the required top-level header that identifies the project or site at the start of the file.
Blockquote: Shows whether the platform pays attention to the optional summary designed to give models a quick contextual understanding.
Markdown: Reflects whether a platform can use additional paragraphs, lists, or descriptive text within llms.txt beyond the required structural elements.
H2: Describes whether the platform consumes organized, H2-delimited sections that group important links or resources.
Optional: Indicates whether the platform recognizes the special section meant for non-essential links that can be omitted to fit context limits.
llms-full.txt: Shows whether the platform uses or expects the optional full-document resource that aggregates expanded content for deeper model consumption.
Standards: Explains whether the platform treats llms.txt as a complementary standard alongside robots.txt and sitemap.xml, rather than a replacement for them.
The table tells the story plainly: this is a scattered, inconsistent ecosystem with no real standardization. Supporting an immature standard before the industry commits to it feels premature. It also seems bizarre to me that, instead of training models to contextualize sites and structured HTML, a standard would require an entirely new format with additional structured data.
SEO Audits Aren’t Helping
Given the evidence and the limited adoption of llms.txt across major AI platforms, I find it troubling that some online SEO audits are now labeling the absence of this file as a problem or a missed opportunity. There is no research, no search engine guidance, and no empirical ranking correlation to support this claim; in fact, current data suggests that llms.txt has no measurable impact on visibility, crawling, indexing, or citations.
When fewer than ten percent of domains use the standard and most AI crawlers ignore it entirely, treating its absence as an SEO issue crosses into misinformation. It pressures site owners to invest time and resources into a format with no proven benefit, effectively turning an experimental, niche proposal into a supposed requirement. SEO audits should highlight genuine ranking factors—page experience, structured data quality, Core Web Vitals, authority signals—not unproven standards that add overhead without delivering any measurable value.
Is It Worth the Effort?
Right now, I don’t believe it is. Adding llms.txt would require ongoing work, redundant content creation, and structural maintenance—all for a standard that isn’t being followed and may ultimately fade away. I would rather spend that energy improving site speed, strengthening Core Web Vitals (CWV), addressing issues surfaced in Google Search Console (GSC), and continuing to produce high-quality content that naturally earns citations from humans and machines alike.
Every organization has to decide where its time is best spent. Until llms.txt achieves significantly broader adoption or demonstrates measurable benefits, I’ll be waiting on the sidelines.