Google adds llms.txt check to Chrome Lighthouse

Google’s new Lighthouse “Agentic Browsing” audits now check for the presence of an llms.txt file. The new experimental Lighthouse documentation frames llms.txt as a discoverability and efficiency signal for AI agents, not a traditional crawling directive.

The audits are part of Chrome’s emerging “Agentic Browsing” category, which evaluates whether sites are structured for machine interaction.
This document comes less than a week after Google published new guidance on optimizing for AI search features like AI Overviews and AI Mode. In a mythbusting section of that guide, Google said you don’t need llms.txt files.

What Lighthouse now checks. Lighthouse’s Agentic Browsing category evaluates “how well your website is constructed for machine interaction” using deterministic audits, according to Google’s documentation. Among the checks:

WebMCP integration.
Accessibility tree integrity.
Layout stability through CLS.
Presence of an llms.txt file.

Lighthouse checks for “the presence of a machine-readable summary at the domain root.” Google also explained why the file matters for agents:

“Without llms.txt, agents may spend more time crawling the site to understand its high-level structure and primary content.”
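
A rough sketch of what a root-level presence check could look like, in Python. This is not Lighthouse’s implementation. The names `llms_txt_url`, `http_status`, and `has_llms_txt` and the injectable `fetch_status` parameter are hypothetical, and the sketch assumes the file lives at `/llms.txt` on the site’s origin and that an HTTP 200 response counts as “present.”

```python
from typing import Callable, Optional
from urllib.error import HTTPError
from urllib.parse import urlsplit, urlunsplit
from urllib.request import Request, urlopen


def llms_txt_url(site_url: str) -> str:
    """Build the root-level llms.txt URL for the origin of site_url."""
    parts = urlsplit(site_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"expected an absolute URL, got {site_url!r}")
    # Drop any path, query, or fragment: the file is assumed to sit at the root.
    return urlunsplit((parts.scheme, parts.netloc, "/llms.txt", "", ""))


def http_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status of a GET request, or None if the host is unreachable."""
    request = Request(url, headers={"User-Agent": "llms-txt-check-sketch"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        return err.code  # 404, 403, ... still carry a status code
    except OSError:
        return None  # DNS failure, refused connection, timeout


def has_llms_txt(
    site_url: str,
    fetch_status: Callable[[str], Optional[int]] = http_status,
) -> bool:
    """Assumption: a 200 response at /llms.txt means the file is present."""
    return fetch_status(llms_txt_url(site_url)) == 200
```

Calling `has_llms_txt("https://example.com/products/shoes")` would request `https://example.com/llms.txt`. Passing a custom `fetch_status` lets the check run against a stub instead of the network.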

The audit category doesn’t produce a traditional Lighthouse score (0-100). Instead, Google surfaces a fractional pass ratio along with pass/fail checks tied to agentic readiness signals.
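
To illustrate the difference from a weighted 0-100 score, here is a minimal Python sketch of a fractional pass ratio. How Lighthouse actually counts audits is not described here, so the equal weighting, the `pass_ratio` function, and the audit names in `example_results` are all assumptions made for illustration.

```python
from typing import Mapping, Tuple


def pass_ratio(results: Mapping[str, bool]) -> Tuple[int, int]:
    """Return (passed, total) for a set of pass/fail audits, weighted equally."""
    if not results:
        raise ValueError("no audit results to summarize")
    passed = sum(1 for ok in results.values() if ok)
    return passed, len(results)


# Hypothetical pass/fail outcomes keyed by made-up audit names.
example_results = {
    "webmcp-integration": False,
    "accessibility-tree": True,
    "layout-stability": True,
    "llms-txt": False,
}

passed, total = pass_ratio(example_results)
print(f"{passed}/{total} audits passed")
```

A ratio like this reports how many checks passed without implying that any one check matters more than another.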

The tension. The new Lighthouse documentation doesn’t directly conflict with Google’s advice on optimizing your website for generative AI features because these audits focus on AI agents and browser tools, not Google Search rankings. Still, seeing llms.txt mentioned in Chrome’s own readiness checks may cause some SEOs to rethink earlier doubts about the file.

Agentic engine optimization. The Lighthouse audits also align with ideas Google Cloud AI engineering director Addy Osmani outlined in April around Agentic Engine Optimization. Osmani said AI agents with limited context windows may cut off long pages or miss important information buried too deep in content. Among his recommendations:

Cleaner semantic structure.
Token-efficient content.
Markdown delivery.
llms.txt discovery layers.
Capability signaling files like AGENTS.md.
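
For readers who have not seen one, the sketch below generates a small llms.txt file in Python. The layout (an H1 title, a blockquote summary, then H2 sections of Markdown links) follows the community llms.txt proposal as commonly described, not anything specified by Google or Osmani. The `build_llms_txt` function, the site name, and the URLs are hypothetical.

```python
from typing import Mapping, Sequence, Tuple

Link = Tuple[str, str, str]  # (title, url, short description)


def build_llms_txt(
    site_name: str,
    summary: str,
    sections: Mapping[str, Sequence[Link]],
) -> str:
    """Render a Markdown llms.txt body: title, summary, then link sections."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, description in links:
            lines.append(f"- [{title}]({url}): {description}")
        lines.append("")
    # Collapse trailing blank lines into a single final newline.
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    print(
        build_llms_txt(
            "Example Store",
            "Sample shop used only for illustration.",
            {
                "Docs": [
                    ("Returns", "https://example.com/returns.md", "Return policy"),
                    ("Shipping", "https://example.com/shipping.md", "Shipping options"),
                ]
            },
        )
    )
```

The output would be saved as `llms.txt` at the domain root, which is where the Lighthouse audit is said to look.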

SEO vs. llms.txt. Here’s exactly what Google says in “Mythbusting generative AI search: what you don’t need to do”:

LLMS.txt files and other “special” markup: You don’t need to create new machine readable files, AI text files, markup, or Markdown to appear in generative AI search. Note that Google may discover, crawl, and index many kinds of files in addition to HTML on a website: this doesn’t mean that the file is treated in a special way.

Here’s what Google’s John Mueller said about Google using llms.txt, in response to Lily Ray asking him on Bluesky “Hey @johnmu.com – if you can answer, many of us are pointing out the irony that Google uses LLMs.txt files, plus markdown pages, despite also saying these things are not needed for performance in search. Could you share why Google might publish these files, if not to make crawling these pages/sites easier for agents? (I’m sure I’ll be getting this question a ton soon!)”:

The short answer is that it’s not done for search. There’s more to websites than just SEO :-).
The longer & nuanced version is that it’s worth separating “discovery” (finding the website or pages with a global search engine) vs “performance” (there’s probably a more accurate term for this, but basically: once someone has found the page, helping them to best do the task they want to do).
Perhaps that’s similar to CTA’s on traditional pages? You don’t “do them” for SEO (to be found), but if you’re responsible for the website overall, ensuring a high “discovery rate” (SEO) in addition to a high conversion rate is useful to justify your work.
To get back to the developers.google.com site, AI coding has gotten very popular, and these coding systems can be (I think) efficient and accurate with the code they produce if they can easily read / parse reference material, such as developer documentation.
In those cases, it can help to give them a way to understand the context of the documentation they’re looking at, as well as a simplified version of the reference page (eg, in markdown). OF COURSE they can read HTML just fine, so this is imo more of a temporary crutch, perhaps to save tokens.
For non-developer sites, I don’t think this makes much sense, even with more agentic traffic in the future (and if you check your logs, you’re not getting a lot of that at the moment). Making a markdown version of a shoe’s specs is not going to get you more sales (competitors appreciate it tho).
And (I know, nobody reads this far), if you think this is important to prepare for when agents are everywhere: your site (all sites) have much more important things to do for SEO than to prepare for a potential future situation that may or may not come. Prioritize wants before goals.

What Google says agents rely on. Beyond llms.txt, Google’s new Lighthouse category strongly emphasizes accessibility and interface stability. The documentation says agents rely on the accessibility tree as their “primary data model.” Lighthouse specifically evaluates:

Programmatic labels for interactive elements.
Valid accessibility tree structure.
Whether interactive content is hidden from assistive systems.
Layout stability through CLS.
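
As a loose illustration of the first item, the Python sketch below flags form controls that have no obvious programmatic label. It is a simplified heuristic, not the Lighthouse audit. The class and function names are hypothetical, it only inspects static HTML, and it assumes a control is labeled if it has `aria-label`, `aria-labelledby`, `title`, or a matching `<label for="...">`. Controls wrapped inside a `<label>` element are not recognized.

```python
from html.parser import HTMLParser
from typing import Dict, List, Set, Tuple

FORM_CONTROLS = {"input", "select", "textarea"}
NAMING_ATTRS = ("aria-label", "aria-labelledby", "title")


class UnlabeledControlFinder(HTMLParser):
    """Collects form controls that lack an obvious programmatic label."""

    def __init__(self) -> None:
        super().__init__()
        self.label_targets: Set[str] = set()        # ids named by <label for="...">
        self.candidates: List[Tuple[str, str]] = []  # (tag, id) without naming attrs

    def handle_starttag(self, tag, attrs):
        # Valueless attributes arrive as None, so normalize them to "".
        attr: Dict[str, str] = {name: (value or "") for name, value in attrs}
        if tag == "label" and attr.get("for"):
            self.label_targets.add(attr["for"])
        elif tag in FORM_CONTROLS and attr.get("type") != "hidden":
            if not any(attr.get(name, "").strip() for name in NAMING_ATTRS):
                self.candidates.append((tag, attr.get("id", "")))

    def unlabeled(self) -> List[Tuple[str, str]]:
        # Resolve labels last, since a <label> may appear after its control.
        return [c for c in self.candidates if c[1] not in self.label_targets]


def find_unlabeled_controls(html: str) -> List[Tuple[str, str]]:
    """Return (tag, id) pairs for controls with no detected label."""
    finder = UnlabeledControlFinder()
    finder.feed(html)
    finder.close()
    return finder.unlabeled()
```

A real audit works on the browser’s computed accessibility tree, which also accounts for scripts, ARIA roles, and hidden content that a static parse like this cannot see.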

Google also warns that dynamically registered WebMCP tools and large DOM changes can affect audit results.

Why we care. Google says you don’t need llms.txt for Search, but Chrome is now checking whether the file exists. At the same time, Google’s agentic tools appear to favor sites that are easier for machines to read and use, especially sites with strong accessibility, stable layouts, and clear agent access.

Google’s help doc. Lighthouse agentic browsing scoring


Search Engine Land is owned by Semrush. We remain committed to providing high-quality coverage of marketing topics. Unless otherwise noted, this page’s content was written by either an employee or a paid contractor of Semrush Inc.

Danny Goodwin is Editorial Director of Search Engine Land & Search Marketing Expo – SMX. He joined Search Engine Land in 2022 as Senior Editor. In addition to reporting on the latest search marketing news, he manages Search Engine Land’s SME (Subject Matter Expert) program. He also helps program U.S. SMX events.

Goodwin has been editing and writing about the latest developments and trends in search and digital marketing since 2007. He previously was Executive Editor of Search Engine Journal (from 2017 to 2022), managing editor of Momentology (from 2014 to 2016) and editor of Search Engine Watch (from 2007 to 2014). He has spoken at many major search conferences and virtual events, and has been sourced for his expertise by a variety of publications and podcasts.
