What is Pay-Per-Crawl and the New Bot Web?

AI Web Crawlers: GPTBot, ClaudeBot, PerplexityBot, and Pay‑Per‑Crawl

As of 2026, AI‑driven web crawlers such as GPTBot, ClaudeBot, and PerplexityBot account for a significant share of bot traffic. Developers and AI‑agent operators need to understand how these crawlers behave, how they differ by purpose, and how emerging monetization models such as HTTP 402 and pay‑per‑crawl reshape access control.

What GPTBot, ClaudeBot, and PerplexityBot Do

GPTBot (OpenAI) is a training‑focused crawler that harvests content to improve foundation models. It generates little referral traffic but can be controlled via robots.txt and User‑Agent rules.
ClaudeBot (Anthropic) is likewise a training crawler. Cloudflare data shows tens of thousands of pages crawled per referral, which indicates heavy training‑oriented activity.
PerplexityBot (Perplexity AI) is an indexing/retrieval crawler that builds the corpus for Perplexity’s answer engine, which cites its sources and so sends some referral traffic.

These crawlers are distinct from user‑triggered fetchers such as ChatGPT‑User, Claude‑User, and Perplexity‑User, which fetch pages on‑demand when a human queries an AI assistant.

Training vs. Search vs. User‑Triggered Crawlers

Modern AI vendors have split their crawlers into functional tiers:

Training crawlers (GPTBot, ClaudeBot, CCBot, Google‑Extended) harvest data for model training and rarely send back clicks.
Search/retrieval crawlers (OAI‑SearchBot, Claude‑SearchBot, PerplexityBot) build indexes used by AI search and citation features.
User‑triggered fetchers (e.g., ChatGPT‑User, Claude‑User, Perplexity‑User) represent real user intent and drive qualified traffic.
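The three tiers above can be sketched as a simple User‑Agent classifier. This is a minimal illustration, not a vendor‑supplied mapping: the token lists are copied from this article, while the names TIERS, POLICY, classify_user_agent, and policy_for are hypothetical. User‑Agent strings can be spoofed, so a production setup would pair this with some form of bot verification.

```python
# Hypothetical tier table built from the crawler names listed in this article.
# Note: Google-Extended is a robots.txt control token rather than a string that
# appears in request headers; it is kept here only to mirror the list above.
TIERS = {
    "training": ("GPTBot", "ClaudeBot", "CCBot", "Google-Extended"),
    "search": ("OAI-SearchBot", "Claude-SearchBot", "PerplexityBot"),
    "user": ("ChatGPT-User", "Claude-User", "Perplexity-User"),
}

# Assumed example policy: block training bots, allow the other two tiers.
POLICY = {"training": "block", "search": "allow", "user": "allow"}


def classify_user_agent(user_agent: str) -> str:
    """Return the tier whose token appears in the User-Agent, or 'unknown'."""
    ua = user_agent.lower()
    for tier, tokens in TIERS.items():
        if any(token.lower() in ua for token in tokens):
            return tier
    return "unknown"


def policy_for(user_agent: str, default: str = "allow") -> str:
    """Map a User-Agent string to an action under the assumed POLICY."""
    return POLICY.get(classify_user_agent(user_agent), default)
```

Calling policy_for with a GPTBot User‑Agent yields "block", while a ChatGPT‑User or unrecognized browser string falls through to "allow".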

Blocking training crawlers while allowing search and user‑triggered bots is now a common strategy to preserve visibility and reduce bandwidth costs.
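One way to express this strategy is a generated robots.txt that is then checked with Python's standard urllib.robotparser. The sketch below assumes a blanket site‑wide policy and uses the bot names from this article; build_robots_txt and load_policy are hypothetical helper names.

```python
from urllib.robotparser import RobotFileParser

# Bot names taken from the tiers described in this article.
TRAINING_BOTS = ["GPTBot", "ClaudeBot", "CCBot", "Google-Extended"]
ALLOWED_BOTS = [
    "OAI-SearchBot", "Claude-SearchBot", "PerplexityBot",
    "ChatGPT-User", "Claude-User", "Perplexity-User",
]


def build_robots_txt(blocked: list[str], allowed: list[str]) -> str:
    """Emit one group that disallows everything for blocked bots,
    one that allows everything for allowed bots, and a permissive default."""
    lines = [f"User-agent: {name}" for name in blocked]
    lines += ["Disallow: /", ""]
    lines += [f"User-agent: {name}" for name in allowed]
    lines += ["Allow: /", ""]
    lines += ["User-agent: *", "Allow: /"]
    return "\n".join(lines) + "\n"


def load_policy(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text so the policy can be queried locally."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


policy = load_policy(build_robots_txt(TRAINING_BOTS, ALLOWED_BOTS))
print(policy.can_fetch("GPTBot", "https://example.com/article"))
print(policy.can_fetch("ChatGPT-User", "https://example.com/article"))
```

robots.txt is advisory, so this only affects crawlers that choose to honor it; enforcement against non‑compliant bots needs server‑side or CDN rules.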

Pay‑Per‑Crawl, HTTP 402, and AI Agents

Cloudflare’s pay‑per‑crawl model lets publishers respond to training crawlers with HTTP 402 Payment Required, signaling licensing terms instead of a flat 403 or 404. AI agents that respect this pattern can:

Check for 402 responses and follow licensing instructions.
Integrate payment or authorization flows before fetching training‑relevant pages.
Honor publisher preference signals (e.g., the Content‑Signal directive in robots.txt, or TDM reservations) to align with publisher preferences on training, citation, or research use.

Cloudflare reports serving over a billion 402 responses per day, and early adopters are setting norms for fair compensation of training data.
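On the agent side, respecting a 402 can be sketched as a small decision function that runs before any payment or retry logic. Everything below is illustrative: the header name in PRICE_HEADER and its "USD 0.02" value format are assumptions that should be checked against the publisher's or CDN's documentation, and FetchDecision, parse_price, and decide are hypothetical names. The function works on a status code and a header mapping, so it can sit behind any HTTP client.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumed header carrying the quoted price on a 402 response.
PRICE_HEADER = "crawler-price"


@dataclass
class FetchDecision:
    action: str                      # "use", "pay", or "skip"
    price_usd: Optional[float] = None


def parse_price(value: str) -> Optional[float]:
    """Parse an assumed 'USD <amount>' value; return None if malformed."""
    parts = value.strip().split()
    if len(parts) != 2 or parts[0].upper() != "USD":
        return None
    try:
        price = float(parts[1])
    except ValueError:
        return None
    return price if price >= 0 else None


def decide(status: int, headers: Mapping[str, str], max_price_usd: float) -> FetchDecision:
    """Decide what an agent should do with a response, given its budget."""
    if status == 200:
        return FetchDecision("use")
    if status != 402:
        return FetchDecision("skip")          # 403, 404, etc.: no offer was made
    normalized = {k.lower(): v for k, v in headers.items()}
    price = parse_price(normalized.get(PRICE_HEADER, ""))
    if price is None:
        return FetchDecision("skip")          # 402 without usable terms
    if price <= max_price_usd:
        return FetchDecision("pay", price)    # hand off to a payment flow
    return FetchDecision("skip", price)       # quoted price exceeds budget
```

For example, decide(402, {"Crawler-Price": "USD 0.02"}, max_price_usd=0.05) returns a "pay" decision with price 0.02, while the same response under a 0.01 budget returns "skip". The actual payment or authorization step is deliberately left out, since it depends on the scheme the publisher uses.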

Key takeaways

GPTBot, ClaudeBot, and PerplexityBot are specialized crawlers: GPTBot and ClaudeBot are training‑focused, while PerplexityBot indexes content for an answer engine.
Treat training, search, and user‑triggered crawlers separately; block or monetize training bots while allowing user‑facing ones.
HTTP 402 and pay‑per‑crawl let publishers charge for training access; AI agents should respect 402 and licensing metadata.
Use robots.txt and User‑Agent rules to implement a tiered policy that balances visibility, cost, and fair compensation.

Synthesized by the AISA LLM layer with live web sources (AISA Perplexity + Tavily APIs). 2026-06-23.