For roughly 20 years, the SEO discipline operated on a quiet assumption that turned out to be one of its most valuable features. Guidance from one search engine traveled. If Google said sitemaps mattered, Bing said sitemaps mattered. If Bing said structured data deserved real effort, Google said the same. Practitioners optimized for Google with reasonable confidence that the work would carry across the other engines, and most of the time it did. That portability was not luck. It was the product of a structurally large overlap layer that the major search engines had collectively built, brick by brick, over twenty years.
That world doesn’t exist in LLM-land. The major providers train on different corpora, run different crawlers under different policies, route different queries through different retrieval systems, and apply different alignment processes that shape the final response in ways the upstream signals can’t predict. Guidance from any one provider, including Google’s guidance about its own Gemini products, is one data point. Practitioners carrying the SEO habit forward, the habit of treating one engine’s guidance as roughly the whole map, will optimize confidently for one platform and miss the others.
Sidebar: As I was finalizing this piece, Google published new guidance on optimizing for their generative AI features. Their framing is explicit: from Google Search’s perspective, optimizing for AI search is still SEO. That framing is correct for Google Search. It doesn’t extend to ChatGPT, Claude, Perplexity, or any other LLM, and that’s exactly the trap this article is about.
The Shared Standards That Made SEO Guidance Portable
The era of portable guidance was built on actual collaboration, not coincidence. The Sitemaps protocol became the joint property of Google, Yahoo, and Microsoft in November 2006, when the three engines formally agreed to support a common protocol at version 0.90, building on Google’s earlier Sitemaps 0.84 from June 2005. Five years later, on June 2, 2011, the same three engines launched Schema.org, with Yandex joining shortly after, to create a common vocabulary for structured data markup. That announcement was made on stage at SMX Advanced. I was on the Bing team at the time, and what struck me then is what still matters now. The engines were competitors, but they had decided that a shared vocabulary served all of them. Webmasters got one set of rules. The web got cleaner data. The engines got better signals. Everybody won.
The pattern repeated with robots.txt, the 1994 convention that became RFC 9309 at the IETF in 2022, formalizing what every serious crawler already honored. And it repeated again, more recently, with IndexNow, the protocol Microsoft Bing and Yandex launched in October 2021. IndexNow is now supported by Bing, Yandex, Naver, Seznam, and Yep. Google has tested the protocol since 2021, but has not adopted it.
That overlap layer is exactly why Google’s guidance felt safe to follow, even if you cared about Bing traffic. The signals the engines used weren’t identical, but the inputs they accepted, the protocols they honored, and the standards they promoted were. Optimization had a shared substrate.
Where The LLM Stacks Actually Diverge
The LLM environment doesn’t have a shared substrate of comparable size. The differences are not cosmetic, and they are not temporary. They’re baked into how the systems are built.
Start with training data. OpenAI has signed disclosed licensing deals with News Corp worth up to $250 million over five years, Axel Springer at roughly $13 million per year, Reddit at an estimated $70 million per year, plus the Financial Times, Condé Nast, Hearst, Vox Media, The Atlantic, the Associated Press, Le Monde, and others. Google has its own Reddit deal, estimated at $60 million per year, granting real-time data API access. Anthropic has not publicly disclosed equivalent publisher licensing deals, and that undisclosed status is itself the practitioner-facing point. The corpora that fed these models, and that continue to refresh them, are not the same documents. Practitioners cannot know what any given provider has paid for and what it hasn’t.
The crawler infrastructure diverges next. OpenAI runs three separate bots: GPTBot for training, OAI-SearchBot for search indexing, and ChatGPT-User for user-initiated retrieval. Anthropic runs three of its own: ClaudeBot for training, Claude-SearchBot for search, and Claude-User for user-initiated retrieval. Perplexity runs PerplexityBot and Perplexity-User. Google launched Google-Extended in September 2023 as the user-agent that controls whether Google can use a site’s content to train Gemini, separate entirely from the Googlebot that handles traditional search indexing. There is no single AI user-agent. Every provider requires a separate rule, and the rules don’t translate cleanly across providers because the bots don’t do equivalent jobs in equivalent ways.
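Because each bot needs its own rule, it can help to audit a robots.txt file one user-agent at a time instead of assuming a single rule covers "AI." The sketch below uses Python's standard urllib.robotparser to do that. The robots.txt policy shown (block training crawlers, allow everything else) and the function name audit_robots are illustrative assumptions, not a recommendation from any provider. Python's parser also applies its own simple matching rules, so a real crawler may interpret an edge case differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt used only for illustration: it blocks the
# training-oriented crawlers and leaves every other agent allowed.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""

# User-agent tokens named in the article, plus Googlebot for comparison.
AI_USER_AGENTS = [
    "GPTBot", "OAI-SearchBot", "ChatGPT-User",
    "ClaudeBot", "Claude-SearchBot", "Claude-User",
    "PerplexityBot", "Perplexity-User",
    "Google-Extended", "Googlebot",
]


def audit_robots(robots_txt, url, agents=AI_USER_AGENTS):
    """Return {user_agent: True/False} for whether each agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}
```

Calling audit_robots(ROBOTS_TXT, "https://example.com/guide") returns one allow/deny answer per bot, which makes it easy to spot a rule that blocks a search or user-initiated fetcher when the intent was only to block training.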
The retrieval architectures diverge structurally. ChatGPT has historically used Bing’s index as its primary web search source, and that connection appears to still be primary, though OpenAI continues to build out more infrastructure alongside it. Perplexity built its retrieval system on a Vespa-based pipeline that treats documents and sub-document chunks as first-class retrievable units. Google’s Gemini uses Google’s own index plus Knowledge Graph grounding. Claude uses Brave Search as a retrieval partner. Same query, four different retrieval systems, four different views of which sources exist and which sources are worth surfacing.
Then comes the alignment layer, which is where SEO had no equivalent at all. After a model is trained on its corpus, providers run post-training to shape how the model actually behaves: tone, refusal patterns, format, safety posture, what counts as a good answer. OpenAI’s primary approach has been RLHF, or Reinforcement Learning from Human Feedback, where human raters score model outputs and the model learns to produce highly rated responses. Anthropic developed Constitutional AI, which trains models to critique and revise their own outputs against a written set of principles. These methodologies produce demonstrably different behavior in the final products. The same retrieved content, fed into two models aligned by two methodologies, can yield two materially different responses about the same brand.
When One Provider’s Guidance Demonstrably Fails To Port
The clearest single example of guidance that doesn’t port is llms.txt. Jeremy Howard of Answer.AI proposed the file in September 2024 as a markdown manifest, placed at a site’s root, that would guide LLMs to the most important content. The proposal got picked up across the SEO community. Yoast built a generator. Agencies added llms.txt creation to their service catalogs. Conference speakers declared it essential.
As of mid-2026, no major LLM provider has confirmed they consume the file. Not OpenAI. Not Anthropic. Not Google. Server-log analyses across hundreds of thousands of domains show major AI crawlers don’t routinely request /llms.txt at all. Google’s John Mueller publicly compared it to the deprecated meta keywords tag. Gary Illyes confirmed at Search Central Live in July 2025 that Google doesn’t support llms.txt and isn’t planning to.
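You can run a small version of that server-log check on your own site. The sketch below counts how many requests each AI crawler made overall and how many of them were for /llms.txt. It assumes access logs in the common "combined" format and identifies bots by a case-insensitive substring match on the user-agent header. Both are simplifying assumptions, and the function name count_bot_requests is hypothetical. User-agent strings can also be spoofed, so treat the counts as indicative only.

```python
import re
from collections import Counter

# Bot tokens named in the article; substring matching is an assumption.
AI_BOTS = [
    "GPTBot", "OAI-SearchBot", "ChatGPT-User",
    "ClaudeBot", "Claude-SearchBot", "Claude-User",
    "PerplexityBot", "Perplexity-User",
]

# Assumes combined log format:
# ... "METHOD /path HTTP/x" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def count_bot_requests(lines, path="/llms.txt"):
    """Return (all requests per bot, requests for `path` per bot)."""
    totals, hits = Counter(), Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that are not in the expected format
        user_agent = match["ua"].lower()
        bot = next((b for b in AI_BOTS if b.lower() in user_agent), None)
        if bot is None:
            continue  # not one of the AI crawlers being tracked
        totals[bot] += 1
        # Ignore any query string when comparing the requested path.
        if match["path"].split("?")[0] == path:
            hits[bot] += 1
    return totals, hits
```

Comparing the two counters per bot shows whether a crawler that visits the site regularly ever asks for the file at all.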
I’ve written about this elsewhere, so I won’t repeat the technicalities here. What matters for this argument is the structural lesson. Schema.org succeeded because three engines built it together and then enforced it together. Llms.txt was proposed by one researcher, picked up by tooling vendors, and ignored by the platforms it was supposed to serve. The shared-standards model that gave SEO its portable guidance is not available to LLM practitioners at the same scale, because the platforms are not building the standards together. They’re building their own pipelines.
The Gemini Inversion
The cleanest illustration of how far guidance portability has degraded sits inside one company. Google publishes its own SEO documentation at Search Central, the canonical guidance the industry has followed for 20 years. Those documents emphasize traditional ranking signals, E-E-A-T, content quality, technical accessibility, and structured data. That guidance is still useful for Google Search itself.
Google also makes Gemini, the model that powers AI Overviews and Google’s separate AI Mode surface. And the citation behavior of those surfaces doesn’t appear to track the guidance the same company publishes for its own search results.
In late 2024, roughly three-quarters of pages cited in AI Overviews also ranked in Google’s top 12 for the same query. By early 2026, after Google upgraded AI Overviews to Gemini 3 in January, Ahrefs analyzed 4 million AI Overview URLs and found that only 38% of cited pages also appeared in the top 10 for the same query. A separate BrightEdge analysis put the overlap closer to 17%. SE Ranking’s post-upgrade work found that Gemini 3 replaced roughly 42% of the domains previously cited under earlier model versions and generates 32% more sources per response.
The gap widens further when you look at Google’s AI Mode, which is a separate conversational surface that runs on the same Gemini family. Semrush data shows AI Mode and AI Overviews reach semantically similar conclusions 86% of the time, but cite the same URLs only 13.7% of the time. Only 14% of AI Mode citations rank in Google’s traditional top 10.
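The overlap figures above all reduce to the same arithmetic: of the URLs an AI surface cited for a query, what fraction also appear in the top N organic results for that query. Here is a minimal sketch of that calculation for your own tracked queries. The function names and the URL normalization (drop the scheme, a leading "www.", the query string, and any trailing slash) are assumptions for illustration, and the studies cited may define a match differently.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to host + path so trivial variants compare equal."""
    parts = urlsplit(url)
    host = parts.netloc.lower().removeprefix("www.")
    return host + (parts.path.rstrip("/") or "/")


def citation_overlap(cited_urls, ranked_urls, top_n=10):
    """Fraction of cited URLs that also appear in the first top_n ranked URLs."""
    cited = {normalize(u) for u in cited_urls}
    if not cited:
        return 0.0
    top = {normalize(u) for u in ranked_urls[:top_n]}
    return len(cited & top) / len(cited)
```

Passing one surface's citations as cited_urls and another surface's citations as ranked_urls, with top_n set to the length of that list, gives the share of the first surface's citations that the second one also cites.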
It appears, so far, that the canonical relationship has shifted. Google’s published SEO guidance is still the cleanest path to ranking in Google Search. But that ranking is no longer a reliable proxy for being cited by Google’s own AI surfaces. The same guidance, the same content, the same domain, can produce three meaningfully different outcomes across Google Search, AI Overviews, and AI Mode, even though all three live inside the same company. The old playbook of following the search engine’s guidance and trusting that the engine’s other surfaces would behave consistently doesn’t appear to be delivering the same returns it used to.
What Still Ports, And Why It’s Smaller Than It Looks
A universal layer does survive. Crawler accessibility still matters across every provider. Primary-source factual content still wins more citations than aggregator restatement. Clear retrievable structure still helps every system understand what a page is about. Presence on the high-authority sources that all major LLMs disproportionately cite, Wikipedia, YouTube, Reddit, major news outlets, still functions as a force multiplier across platforms. Earning visibility on those sources gives content a chance to surface in any LLM that draws on them.
But the universal layer is much smaller than it was in the SEO era. Qwairy’s analysis of 118,000 AI responses across ChatGPT, Perplexity, Google AI Mode, and Claude found that only 11% of cited domains appeared across multiple platforms. The other 89% were platform-specific. A brand that wins citations on Perplexity may be largely invisible on Claude. A brand that’s a regular reference on ChatGPT may not show up in AI Overviews at all. The same content can be the right answer for one system and the wrong answer for the system next to it.
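The same kind of cross-platform measurement can be sketched for a citation sample you collect yourself. The code below takes cited URLs grouped by platform and reports the share of distinct domains that show up on more than one platform. The function names are hypothetical, the grouping by hostname (minus a leading "www.") is an assumption, and this is not a reconstruction of Qwairy's methodology.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def domain_of(url):
    """Hostname of a URL, lowercased and without a leading 'www.'."""
    return urlsplit(url).netloc.lower().removeprefix("www.")


def cross_platform_share(citations_by_platform):
    """Share of cited domains that appear on two or more platforms.

    citations_by_platform maps a platform name to an iterable of cited URLs.
    Returns (share, {domain: set of platforms that cited it}).
    """
    platforms_by_domain = defaultdict(set)
    for platform, urls in citations_by_platform.items():
        for url in urls:
            platforms_by_domain[domain_of(url)].add(platform)
    if not platforms_by_domain:
        return 0.0, {}
    shared = sum(1 for seen in platforms_by_domain.values() if len(seen) > 1)
    return shared / len(platforms_by_domain), dict(platforms_by_domain)
```

The second return value is often the more useful one in practice, since it shows which platforms cite each domain and which ones never do.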
What This Means For The Work
The practical implication is not abandoning all hope. It’s that practitioners need to stop treating any single LLM provider’s guidance as the universal map and start treating it as one input among several. Read what each major provider publishes about their own systems. Test your visibility across platforms, not just on the platform you happen to use most. Treat divergence as the default and overlap as the exception, not the other way around.
This isn’t how SEO worked, and the difference matters. The old reflex was to optimize for Google and trust the portability. The new reality is that following one LLM’s guidance, even Google’s guidance about Gemini, will leave you optimized for a slice of the landscape and potentially blind to the rest. The discipline is being rebuilt on platform-specific work that didn’t exist in the SEO era, and the practitioners who recognize that first are going to spend the next two years setting the standards everyone else follows.
The overlap has shrunk. You now have more work than ever to do.
If you have thoughts on where the divergence between providers is sharpest in your own work, reach out directly. I’d genuinely like to hear what’s showing up in the data.
This post was originally published on Duane Forrester Decodes.