Complete Crawler List For AI User-Agents [Dec 2025]

AI visibility plays a vital role for SEOs, and it begins with controlling AI crawlers. If AI crawlers can’t access your pages, you’re invisible to AI discovery engines.

On the flip side, unmonitored AI crawlers can overwhelm servers with excessive requests, causing crashes and unexpected hosting bills.

User-agent strings are essential for controlling which AI crawlers can access your website, but official documentation is often outdated, incomplete, or missing entirely. So, we curated a verified list of AI crawlers from our actual server logs as a useful reference.

Each user-agent is validated against official IP lists when available, ensuring accuracy. We’ll maintain and update this list to catch new crawlers and changes to existing ones.

The Complete Verified AI Crawler List (December 2025)

Name	Purpose	Crawl Rate of SEJ (pages/hour)	Verified IP List	Robots.txt disallow	Full User Agent
GPTBot	AI training data collection for GPT models (ChatGPT, GPT-4o)	100	Official IP List	User-agent: GPTBot Allow: / Disallow: /private-folder	Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.3; +https://openai.com/gptbot)
ChatGPT-User	AI agent for real-time web browsing when users interact with ChatGPT	2400	Official IP List	User-agent: ChatGPT-User Allow: / Disallow: /private-folder	Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; ChatGPT-User/1.0; +https://openai.com/bot
OAI-SearchBot	AI search indexing for ChatGPT search features (not for training)	150	Official IP List	User-agent: OAI-SearchBot Allow: / Disallow: /private-folder	Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36; compatible; OAI-SearchBot/1.3; +https://openai.com/searchbot
ClaudeBot	AI training data collection for Claude models	500	Official IP List	User-agent: ClaudeBot Allow: / Disallow: /private-folder	Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ClaudeBot/1.0; [email protected])
Claude-User	AI agent for real-time web access when Claude users browse		Not available	User-agent: Claude-User Disallow: /sample-folder	Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Claude-User/1.0; [email protected])
Claude-SearchBot	AI search indexing for Claude search capabilities		Not available	User-agent: Claude-SearchBot Allow: / Disallow: /private-folder	Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Claude-SearchBot/1.0; +https://www.anthropic.com)
Google-CloudVertexBot	AI agent for Vertex AI Agent Builder (site owners’ request only)		Official IP List	User-agent: Google-CloudVertexBot Allow: / Disallow: /private-folder	Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/141.0.7390.122 Mobile Safari/537.36 (compatible; Google-CloudVertexBot; +https://cloud.google.com/enterprise-search)
Google-Extended	Token controlling AI training usage of Googlebot-crawled content.			User-agent: Google-Extended Allow: / Disallow: /private-folder
Gemini-Deep-Research	AI research agent for Google Gemini’s Deep Research feature		Official IP List	User-agent: Gemini-Deep-Research Allow: / Disallow: /private-folder	Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Gemini-Deep-Research; +https://gemini.google/overview/deep-research/) Chrome/135.0.0.0 Safari/537.36
Google	Gemini’s chat when a user asks to open a webpage				Google
Bingbot	Powers Bing Search and Bing Chat (Copilot) AI answers	1300	Official IP List	User-agent: BingBot Allow: / Disallow: /private-folder	Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm) Chrome/116.0.1938.76 Safari/537.36
Applebot-Extended	Doesn’t crawl but controls how Apple uses Applebot data.		Official IP List	User-agent: Applebot-Extended Allow: / Disallow: /private-folder	Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.4 Safari/605.1.15 (Applebot/0.1; +http://www.apple.com/go/applebot)
PerplexityBot	AI search indexing for Perplexity’s answer engine	150	Official IP List	User-agent: PerplexityBot Allow: / Disallow: /private-folder	Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; PerplexityBot/1.0; +https://perplexity.ai/perplexitybot)
Perplexity-User	AI agent for real-time browsing when Perplexity users request information		Official IP List	User-agent: Perplexity-User Allow: / Disallow: /private-folder	Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Perplexity-User/1.0; +https://perplexity.ai/perplexity-user)
Meta-ExternalAgent	AI training data collection for Meta’s LLMs (Llama, etc.)	1100	Not available	User-agent: meta-externalagent Allow: / Disallow: /private-folder	meta-externalagent/1.1 (+https://developers.facebook.com/docs/sharing/webmasters/crawler)
Meta-WebIndexer	Used to improve Meta AI search.		Not available	User-agent: Meta-WebIndexer Allow: / Disallow: /private-folder	meta-webindexer/1.1 (+https://developers.facebook.com/docs/sharing/webmasters/crawler)
Bytespider	AI training data for ByteDance’s LLMs for products like TikTok		Not available	User-agent: Bytespider Allow: / Disallow: /private-folder	Mozilla/5.0 (Linux; Android 5.0) AppleWebKit/537.36 (KHTML, like Gecko) Mobile Safari/537.36 (compatible; Bytespider; https://zhanzhang.toutiao.com/)
Amazonbot	AI training for Alexa and other Amazon AI services	1050	Not available	User-agent: Amazonbot Allow: / Disallow: /private-folder	Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Amazonbot/0.1; +https://developer.amazon.com/support/amazonbot) Chrome/119.0.6045.214 Safari/537.36
DuckAssistBot	AI search indexing for DuckDuckGo search engine	20	Official IP List	User-agent: DuckAssistBot Allow: / Disallow: /private-folder	DuckAssistBot/1.2; (+http://duckduckgo.com/duckassistbot.html)
MistralAI-User	Mistral’s real-time citation fetcher for “Le Chat” assistant		Not available	User-agent: MistralAI-User Allow: / Disallow: /private-folder	Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; MistralAI-User/1.0; +https://docs.mistral.ai/robots)
Webz.io	Data extraction and web scraping used by other AI training companies. Formerly known as Omgili.		Not available	User-agent: webzio Allow: / Disallow: /private-folder	webzio (+https://webz.io/bot.html)
Diffbot	Data extraction and web scraping used by companies all over the world.		Not available	User-agent: Diffbot Allow: / Disallow: /private-folder	Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 (.NET CLR 3.5.30729; Diffbot/0.1; +http://www.diffbot.com)
ICC-Crawler	AI and machine learning data collection		Not available	User-agent: ICC-Crawler Allow: / Disallow: /private-folder	ICC-Crawler/3.0 (Mozilla-compatible; ; https://ucri.nict.go.jp/en/icccrawler.html)
CCBot	Open-source web archive used as training data by multiple AI companies		Official IP List	User-agent: CCBot Allow: / Disallow: /private-folder	CCBot/2.0 (https://commoncrawl.org/faq/)

The user-agent strings above have all been verified against Search Engine Journal server logs.
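As a minimal sketch of how rules like those in the table behave, the snippet below feeds a small robots.txt into Python’s standard-library `urllib.robotparser` and asks whether a given crawler may fetch a URL. The robots.txt content, the `example.com` URLs, and the choice to block Bytespider entirely are illustrative assumptions, not recommendations from the list. One caveat: Python’s parser applies the first matching rule in file order, so this sketch puts `Disallow` before `Allow: /`; crawlers that follow the longest-match rule of RFC 9309 should treat either order the same way.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only.
# Disallow comes first because urllib.robotparser uses first-match ordering.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /private-folder
Allow: /

User-agent: Bytespider
Disallow: /
"""


def build_parser(text):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Pass the product token (e.g. "GPTBot"), not the full user-agent string:
# the parser only looks at the part before the first "/".
for agent, url in [
    ("GPTBot", "https://example.com/blog/post"),
    ("GPTBot", "https://example.com/private-folder/report"),
    ("Bytespider", "https://example.com/blog/post"),
    ("ClaudeBot", "https://example.com/private-folder/report"),
]:
    print(agent, url, parser.can_fetch(agent, url))
```

Crawlers with no matching group and no `User-agent: *` group, such as ClaudeBot in this sketch, are treated as allowed everywhere, so each bot you want to restrict needs its own group or a wildcard fallback.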

Popular AI Agent Crawlers With Unidentifiable User Agent

We’ve found that the following didn’t identify themselves:

you.com.
ChatGPT’s agent Operator.
Bing’s Copilot chat.
Grok.
DeepSeek.

There is no way to track these crawlers’ visits to your webpages other than by identifying the explicit IP address.

We set up a trap page (e.g., /specific-page-for-you-com/) and used the on-page chat to prompt you.com to visit it, allowing us to locate the corresponding visit record and IP address in our server logs. Below is the screenshot:

What About Agentic AI Browsers?

Unfortunately, AI browsers such as Comet or ChatGPT’s Atlas don’t differentiate themselves in the user agent string, so you can’t identify them in server logs; their visits blend in with normal users’ visits.

ChatGPT’s Atlas browser user agent string from server log records (Screenshot by author, December 2025)

This is disappointing for SEOs because tracking agentic browser visits to a website is important from a reporting POV.

How To Check What’s Crawling Your Server

Some hosting companies offer a user interface (UI) that makes it easy to access and view server logs, depending on what hosting service you’re using.

If your hosting doesn’t offer this, you can get the server log files (usually located at /var/log/apache2/access.log on Linux-based servers) via FTP, or ask your server support to send them to you.

Once you have the log file, you can view and analyze it in Google Sheets (if the file is in CSV format) or Screaming Frog’s log analyzer. If your log file is less than 100 MB, you can also try analyzing it with Gemini AI.
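If you prefer to script the check, the following is a rough sketch that counts hits per AI crawler in an access log. It assumes the log uses the common Apache "combined" format (IP, timestamp, request, status, size, referrer, user agent); the regular expression, the `count_ai_crawler_hits` helper, and the sample lines with documentation-reserved IPs are hypothetical and would need adjusting to your server’s actual log layout.

```python
import re
from collections import Counter

# Product tokens taken from the crawler table; extend as needed.
AI_CRAWLER_TOKENS = [
    "GPTBot", "ChatGPT-User", "OAI-SearchBot",
    "ClaudeBot", "Claude-User", "Claude-SearchBot",
    "PerplexityBot", "Perplexity-User",
    "meta-externalagent", "Bytespider", "Amazonbot", "bingbot", "CCBot",
]

# Assumed layout: Apache "combined" log format.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def count_ai_crawler_hits(lines):
    """Count requests per crawler token, matching case-insensitively."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        agent = match.group("agent").lower()
        for token in AI_CRAWLER_TOKENS:
            if token.lower() in agent:
                hits[token] += 1
                break
    return hits


# Made-up sample lines for illustration.
SAMPLE_LOG = [
    '203.0.113.7 - - [10/Dec/2025:10:15:32 +0000] "GET /blog/post HTTP/1.1" 200 5123 "-" '
    '"Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.3; +https://openai.com/gptbot)"',
    '198.51.100.4 - - [10/Dec/2025:10:15:40 +0000] "GET / HTTP/1.1" 200 9120 "-" '
    '"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36"',
    '203.0.113.7 - - [10/Dec/2025:10:16:01 +0000] "GET /about HTTP/1.1" 200 2048 "-" '
    '"Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.3; +https://openai.com/gptbot)"',
    '192.0.2.55 - - [10/Dec/2025:10:17:12 +0000] "GET /blog/post HTTP/1.1" 200 5123 "-" '
    '"Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; PerplexityBot/1.0; +https://perplexity.ai/perplexitybot)"',
]

hits = count_ai_crawler_hits(SAMPLE_LOG)
for token, count in hits.most_common():
    print(token, count)
```

To run it against a real file, pass an open file object instead of `SAMPLE_LOG`, since the function only needs an iterable of lines. Keep in mind that this counts what the user agent claims to be, which is not the same as a verified crawler.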

How To Verify Legitimate Vs. Fake Bots

Fake crawlers can spoof legitimate user agents to bypass restrictions and scrape content aggressively. For example, anyone can impersonate ClaudeBot from their laptop and initiate a crawl request from the terminal. In your server log, it will look as if ClaudeBot is crawling your site:

curl -A 'Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ClaudeBot/1.0; [email protected])' https://example.com

Verification can help save server bandwidth and prevent content from being harvested illegally. The most reliable verification method you can apply is checking the request IP.

Check each request IP to see whether it matches one of the officially declared IPs listed above. If it does, you can allow the request; otherwise, block it.

Various types of firewalls can help you with this by allowlisting verified IPs, which lets legitimate bot requests pass through while all other requests impersonating AI crawlers in their user agent strings are blocked.

For example, in WordPress, you can use the free Wordfence plugin to allowlist legitimate IPs from the official lists (as above) and add custom blocking rules as below:

Block user agent setting in Wordfence

The allowlist rule takes precedence, so it lets legitimate crawlers pass through while any impersonation request that comes from a different IP is blocked.

However, please note that it’s possible to spoof an IP address, and when both the bot user agent and the IP are spoofed, you won’t be able to block the request.
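The allow-or-block decision described in this section can be sketched with Python’s standard `ipaddress` module. This is only an illustration under stated assumptions: the `ALLOWLIST` ranges below are documentation-reserved placeholder networks, not real crawler IPs, and the `classify_request` helper is hypothetical. In practice, the ranges would come from each vendor’s official IP list.

```python
import ipaddress

# Placeholder CIDR ranges (reserved for documentation), NOT real crawler IPs.
# Replace them with the ranges published in each vendor's official IP list.
ALLOWLIST = {
    "GPTBot": ["192.0.2.0/24"],
    "ClaudeBot": ["198.51.100.0/24", "2001:db8::/32"],
}


def classify_request(user_agent, remote_ip, allowlist=ALLOWLIST):
    """Return 'allow', 'block', or 'not-a-listed-bot' for one request."""
    ip = ipaddress.ip_address(remote_ip)
    agent = user_agent.lower()
    for token, cidrs in allowlist.items():
        if token.lower() not in agent:
            continue
        # The user agent claims to be this bot: the IP must be in its ranges.
        if any(ip in ipaddress.ip_network(cidr) for cidr in cidrs):
            return "allow"
        return "block"
    return "not-a-listed-bot"


claimed_agent = "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ClaudeBot/1.0)"
print(classify_request(claimed_agent, "198.51.100.23"))
print(classify_request(claimed_agent, "203.0.113.9"))
print(classify_request("Mozilla/5.0 (Windows NT 10.0) Chrome/131.0.0.0", "203.0.113.9"))
```

Requests whose user agent does not name a listed bot are returned as `not-a-listed-bot` so that normal visitors are left to whatever other rules your firewall applies.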

Conclusion: Stay In Control Of AI Crawlers For Reliable AI Visibility

AI crawlers are now part of our web ecosystem, and the bots listed here represent the major AI platforms currently indexing the web, although this list is likely to grow.

Check your server logs regularly to see what’s actually hitting your site, and make sure you don’t inadvertently block AI crawlers if visibility in AI search engines is important for your business. If you don’t want AI crawlers to access your content, block them via robots.txt using the user-agent name.

We’ll keep this list updated as new crawlers emerge and existing ones change, so we recommend you bookmark this URL or revisit this article regularly to keep your AI crawler list up to date.

Featured Image: BestForBest/Shutterstock
