How AI Agents See Your Website (And How To Build For Them)

Each main AI platform can now browse web sites autonomously. Chrome’s auto browse scrolls and clicks. ChatGPT Atlas fills types and completes purchases. Perplexity Comet researches throughout tabs. However none of those brokers sees your web site the best way a human does.

That is Half 4 in a five-part sequence on optimizing web sites for the agentic internet. Part 1 lined the evolution from search engine optimization to AAIO. Half 2 defined tips on how to get your content material cited in AI responses. Half 3 mapped the protocols forming the infrastructure layer. This text will get technical: how AI brokers truly understand your web site, and what to construct for them.

The core perception is one which retains developing in my analysis: Essentially the most impactful factor you are able to do for AI agent compatibility is identical work internet accessibility advocates have been pushing for many years. The accessibility tree, initially constructed for display readers, is changing into the first interface between AI brokers and your web site.

In accordance with the 2025 Imperva Bad Bot Report (Imperva is a cybersecurity firm), automated visitors surpassed human visitors for the primary time in 2024, constituting 51% of all internet interactions. Not all of that’s agentic looking, however the course is obvious: the non-human viewers to your web site is already bigger than the human one, and it’s rising. All through this text, we draw solely from official documentation, peer-reviewed analysis, and bulletins from the businesses constructing this infrastructure.

Three Methods Brokers See Your Web site

When a human visits your web site, they see colours, structure, photos, and typography. When an AI agent visits, it sees one thing solely totally different. Understanding what brokers truly understand is the inspiration for constructing web sites that work for them.

The key AI platforms use three distinct approaches, and the variations have direct implications for a way you must construction your web site.

Imaginative and prescient: Studying Screenshots

Anthropic’s Computer Use takes essentially the most literal strategy. Claude captures screenshots of the browser, analyzes the visible content material, and decides what to click on or kind primarily based on what it “sees.” It’s a steady suggestions loop: screenshot, cause, act, screenshot. The agent operates on the pixel stage, figuring out buttons by their visible look and studying textual content from the rendered picture.

Google’s Project Mariner follows an analogous sample with what Google describes as an “observe-plan-act” loop: observe captures visible parts and underlying code constructions, plan formulates motion sequences, and act simulates person interactions. Mariner achieved an 83.5% success fee on the WebVoyager benchmark.

The imaginative and prescient strategy works, however it’s computationally costly, delicate to structure adjustments, and restricted by what’s visually rendered on display.

Accessibility Tree: Studying Construction

OpenAI took a distinct path with ChatGPT Atlas. Their Publishers and Developers FAQ is express:

ChatGPT Atlas makes use of ARIA tags, the identical labels and roles that help display readers, to interpret web page construction and interactive parts.

Atlas is constructed on Chromium, however fairly than analyzing rendered pixels, it queries the accessibility tree for parts with particular roles (“button”, “hyperlink”) and accessible names. This is identical knowledge construction that display readers like VoiceOver and NVDA use to assist folks with visible disabilities navigate the net.

Microsoft’s Playwright MCP, the official MCP server for browser automation, takes the identical strategy. It supplies accessibility snapshots fairly than screenshots, giving AI fashions a structured illustration of the web page. Microsoft intentionally selected accessibility knowledge over visible rendering for his or her browser automation normal.

Hybrid: Each At As soon as

In apply, essentially the most succesful brokers mix approaches. OpenAI’s Computer-Using Agent (CUA), which powers each Operator and Atlas, layers screenshot evaluation with DOM processing and accessibility tree parsing. It prioritizes ARIA labels and roles, falling again to textual content content material and structural selectors when accessibility knowledge isn’t obtainable.

Perplexity’s analysis confirms the identical sample. Their BrowseSafe paper, which particulars the security infrastructure behind Comet’s browser agent, describes utilizing “hybrid context administration combining accessibility tree snapshots with selective imaginative and prescient.”

Platform	Major Strategy	Particulars
Anthropic Laptop Use	Imaginative and prescient (screenshots)	Screenshot, cause, act suggestions loop
Google Challenge Mariner	Imaginative and prescient + code construction	Observe-plan-act with visible and structural knowledge
OpenAI Atlas	Accessibility tree	Explicitly makes use of ARIA tags and roles
OpenAI CUA	Hybrid	Screenshots + DOM + accessibility tree
Microsoft Playwright MCP	Accessibility tree	Accessibility snapshots, no screenshots
Perplexity Comet	Hybrid	Accessibility tree + selective imaginative and prescient

The sample is obvious. Even platforms that began with vision-first approaches are incorporating accessibility knowledge. And the platforms optimizing for reliability and effectivity (Atlas, Playwright MCP) lead with the accessibility tree.

Your web site’s accessibility tree isn’t a compliance artifact. It’s more and more the first interface brokers use to grasp and work together along with your web site.

Final yr, earlier than the European Accessibility Act took impact, I half-joked that it will be ironic if the factor that lastly obtained folks to care about accessibility was AI brokers, not the folks accessibility was designed for. That’s now not a joke.

The Accessibility Tree Is Your Agent Interface

The accessibility tree is a simplified illustration of your web page’s DOM that browsers generate for assistive applied sciences. The place the total DOM accommodates each div, span, model, and script, the accessibility tree strips away the noise and exposes solely what issues: interactive parts, their roles, their names, and their states.

For this reason it really works so effectively for brokers. A typical web page’s DOM would possibly comprise hundreds of nodes. The accessibility tree reduces that to the weather a person (or agent) can truly work together with: buttons, hyperlinks, kind fields, headings, landmarks. For AI fashions that course of internet pages inside a restricted context window, that discount is critical.

OpenAI’s Publishers and Developers FAQ may be very clear about this:

Observe WAI-ARIA greatest practices by including descriptive roles, labels, and states to interactive parts like buttons, menus, and types. This helps ChatGPT acknowledge what every aspect does and work together along with your web site extra precisely.

And:

Making your web site extra accessible helps ChatGPT Agent in Atlas perceive it higher.

Analysis knowledge backs this up. Essentially the most rigorous knowledge on this comes from a UC Berkeley and College of Michigan study printed for CHI 2026, the premier tutorial convention on human-computer interplay. The researchers examined Claude Sonnet 4.5 on 60 real-world internet duties below totally different accessibility circumstances, amassing 40.4 hours of interplay knowledge throughout 158,325 occasions. The outcomes have been putting:

Situation	Process Success Charge	Avg. Completion Time
Normal (default)	78.33%	324.87 seconds
Keyboard-only	41.67%	650.91 seconds
Magnified viewport	28.33%	1,072.20 seconds

Beneath normal circumstances, the agent succeeded practically 80% of the time. Limit it to keyboard-only interplay (simulating how display reader customers navigate) and success drops to 42%, taking twice as lengthy. Limit the viewport (simulating magnification instruments), and success drops to twenty-eight%, taking up thrice as lengthy.

The paper identifies three classes of gaps:

Notion gaps: brokers can’t reliably entry display reader bulletins or ARIA state adjustments that may inform them what occurred after an motion.
Cognitive gaps: brokers battle to trace activity state throughout a number of steps.
Motion gaps: brokers underutilize keyboard shortcuts and fail at interactions like drag-and-drop.

The implication is direct. Web sites that current a wealthy, well-labeled accessibility tree give brokers the knowledge they should succeed. Web sites that depend on visible cues, hover states, or complicated JavaScript interactions with out accessible options create the circumstances for agent failure.

Perplexity’s search API architecture paper from September 2025 reinforces this from the content material aspect. Their indexing system prioritizes content material that’s “prime quality in each substance and kind, with info captured in a way that preserves the unique content material construction and structure.” Web sites “heavy on well-structured knowledge in listing or desk kind” profit from “extra formulaic parsing and extraction guidelines.” Construction isn’t simply useful. It’s what makes dependable parsing attainable.

Semantic HTML: The Agent Basis

The accessibility tree is constructed out of your HTML. Use semantic parts, and the browser generates a helpful accessibility tree robotically. Skip them, and the tree is sparse or deceptive.

This isn’t new recommendation. Net requirements advocates have been screaming “use semantic HTML” for 20 years. Not everybody listened. What’s new is that the viewers has expanded. It was about display readers and a comparatively small proportion of customers. Now it’s about each AI agent that visits your web site.

Use native parts. A

Search flights

Label your types. Each enter wants an related label. Brokers learn labels to grasp what knowledge a area expects.


Electronic mail handle


The autocomplete attribute deserves consideration. It tells brokers (and browsers) precisely what kind of information a area expects, utilizing standardized values like identify, e-mail, tel, street-address, and group. When an agent fills a kind on somebody’s behalf, autocomplete attributes make the distinction between assured area mapping and guessing.
Set up heading hierarchy. Use h1 via h6 in logical order. Brokers use headings to grasp web page construction and find particular content material sections. Skip ranges (leaping from h1 to h4) create confusion about content material relationships.
Use landmark areas. HTML5 landmark parts (
        
        

            

        

        
        
, , 
, 
, 
) inform brokers the place they're on the web page. A 
 aspect is unambiguously navigation. A 
 requires interpretation. Readability for the win, all the time.

  



  
    
    
  


Microsoft’s Playwright take a look at brokers, introduced in October 2025, generate take a look at code that makes use of accessible selectors by default. When the AI generates a Playwright take a look at, it writes:
const todoInput = web page.getByRole('textbox', { identify: 'What must be finished?' });

Not CSS selectors. Not XPath. Accessible roles and names. Microsoft constructed its AI testing instruments to search out parts the identical method display readers do, as a result of it’s extra dependable.
The ultimate slide of my Conversion Hotel keynote about optimizing web sites for AI brokers. (Picture Credit score: Slobodan Manic)
ARIA: Helpful, Not Magic
OpenAI recommends ARIA (Accessible Wealthy Web Purposes), the W3C normal for making dynamic internet content material accessible. However ARIA is a complement, not a substitute. Like protein shakes: helpful on high of an actual weight-reduction plan, counterproductive as a alternative for precise meals.
The first rule of ARIA, as outlined by the W3C:
        
        

            

        
        
        

If you need to use a local HTML aspect or attribute with the semantics and conduct you require already in-built, as an alternative of re-purposing a component and including an ARIA position, state or property to make it accessible, then accomplish that.

The truth that the W3C needed to make “don’t use ARIA” the primary rule of ARIA tells you every thing about how usually it will get misused.
Adrian Roselli, a acknowledged internet accessibility skilled, raised an necessary concern in his October 2025 analysis of OpenAI’s steering. He argues that recommending ARIA with out ample context dangers encouraging misuse. Web sites that use ARIA are generally less accessible in accordance with WebAIM’s annual survey of the highest million web sites, as a result of ARIA is commonly utilized incorrectly as a band-aid over poor HTML construction. Roselli warns that OpenAI’s steering may incentivize practices like keyword-stuffing in aria-label attributes, the identical type of gaming that plagued meta key phrases in early search engine optimization.
The suitable strategy is layered:

Begin with semantic HTML. Use 

ARIA is a complement, not a substitute. Use it for dynamic states and customized elements. However begin with semantic HTML and add ARIA solely the place native parts fall brief. Misused ARIA makes web sites much less accessible, no more.
Server-side rendering is an agent visibility requirement. AI crawlers that don’t execute JavaScript can’t see content material in blank-shell SPAs. In case your content material isn’t within the preliminary HTML, it doesn’t exist within the AI ecosystem.
Display screen reader testing is the most effective proxy for agent compatibility. If VoiceOver or NVDA can navigate your web site, brokers most likely can too. For direct inspection, Playwright MCP accessibility snapshots present precisely what brokers see.

The primary three components of this sequence lined why the shift issues, tips on how to get cited, and what protocols are being constructed. This text lined the implementation layer. The encouraging information is that these aren’t separate workstreams. Accessible, well-structured web sites carry out higher for people, rank higher in search, get cited extra usually by AI, and work higher for brokers. It’s the identical work serving 4 audiences.
And the work builds on itself. The semantic HTML and structured knowledge lined listed below are precisely what WebMCP builds on for its declarative kind strategy. The accessibility tree your web site exposes in the present day turns into the inspiration for the structured instrument interfaces of tomorrow.
Up subsequent in Half 5: the commerce layer. How Stripe, Shopify, and OpenAI are constructing the infrastructure for AI brokers to finish purchases, and what it means to your checkout movement.
Extra Assets:

This submit was initially printed on No Hacks.

Featured Picture: Collagery/Shutterstock

Slobodan Manic
Host of the No Hacks Podcast and machine-first internet optimization marketing consultant at No Hacks

Slobodan “Sani” Manić is a web site optimisation marketing consultant with over 15 years of expertise serving to companies make their websites quicker, ...

How AI Agents See Your Website (And How To Build For Them)

Three Methods Brokers See Your Web site

Imaginative and prescient: Studying Screenshots

Accessibility Tree: Studying Construction

Hybrid: Each At As soon as

The Accessibility Tree Is Your Agent Interface

Semantic HTML: The Agent Basis

[email protected]

Leave a Reply Cancel reply

Prism Rain – HTML5 Game Template

Bill Cassidy: Unaccountable & Tone Deaf On RFK Jr.

CSS Syntax Blaster | Html5 Coding Game

Press ESC to close

Three Methods Brokers See Your Web site

Imaginative and prescient: Studying Screenshots

Accessibility Tree: Studying Construction

Hybrid: Each At As soon as

The Accessibility Tree Is Your Agent Interface

Semantic HTML: The Agent Basis

Share Article:

Ecommerce Back in Stock – Product Availability Notifications for Botble E-commerce

Yoga Workout Challenge – Lose weight with yoga template | Flutter | Android + ios app template

Leave a Reply Cancel reply