AI Crawler Allowlist

AI crawler User-Agent allowlist + robots.txt + ai.txt + Cloudflare WAF / nginx policy +…

A complete User-Agent allowlist and edge-enforcement system covering 24+ AI crawlers across a three-tier reputation matrix. It ships ready-to-use robots.txt, ai.txt, Cloudflare Worker and nginx configs with reverse-DNS verification, so you give the right AI bots access while rate-limiting or blocking the bandwidth-hungry ones. Critically, it treats the User-Agent header as an unverified claim and forces reverse-DNS plus official IP-range checks before granting trust.
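To make "the User-Agent header is an unverified claim" concrete, here is a minimal Python sketch of a forward-confirmed reverse-DNS check combined with an IP-range check. It is an illustration, not the code that ships in the skill: the bot name `ExampleBot`, the hostname suffix, the CIDR range and the function names are all hypothetical placeholders. A real deployment would load each vendor's published hostname suffixes and IP-range JSON instead.

```python
import ipaddress
import socket
from typing import Callable

# Hypothetical placeholder data (documentation-only domain and IP range).
TRUSTED_SUFFIXES = {"ExampleBot": (".crawl.example.com",)}
TRUSTED_RANGES = {"ExampleBot": ("192.0.2.0/24",)}


def reverse_lookup(ip: str) -> str:
    """PTR lookup: IP address -> hostname."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(host: str) -> list[str]:
    """A/AAAA lookup: hostname -> list of IP address strings."""
    return [info[4][0] for info in socket.getaddrinfo(host, None)]


def verify_claim(
    bot: str,
    ip: str,
    reverse: Callable[[str], str] = reverse_lookup,
    forward: Callable[[str], list[str]] = forward_lookup,
) -> bool:
    """Return True only if the IP passes the range check AND the DNS round trip."""
    try:
        addr = ipaddress.ip_address(ip)
        # Step 1: the source IP must sit inside a published range for this bot.
        ranges = TRUSTED_RANGES.get(bot, ())
        if not any(addr in ipaddress.ip_network(cidr) for cidr in ranges):
            return False
        # Step 2: the PTR hostname must end with a suffix the vendor controls.
        host = reverse(ip).lower().rstrip(".")
        if not host.endswith(TRUSTED_SUFFIXES.get(bot, ())):
            return False
        # Step 3: forward-confirm, so a forged PTR record alone is not enough.
        return addr in {ipaddress.ip_address(a) for a in forward(host)}
    except (OSError, ValueError):
        # DNS failure or a malformed address: treat the claim as unverified.
        return False
```

The resolvers are passed in as parameters so the logic can be exercised without live DNS. With the defaults, the function performs real lookups through the standard `socket` module.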

$15 one-time

Prices include 20% VAT. · Forged on real agency work · one-time, no lock-in

  • Type Skill
  • Category Search & AEO
  • Delivery Email · instant
  • License One-time

Inside the run · no black box

See the actual work before you buy it.

Not every AI bot deserves the same door. The skill sorts 24+ crawlers into reputation tiers, then keeps robots.txt, ai.txt and the edge rules all telling the same story:

  1. Sorts every known AI bot into the 3-tier reputation matrix: TIER 1 high value (GPTBot, ClaudeBot, PerplexityBot, Google-Extended and 7 more), TIER 2 conditional (CCBot, Amazonbot, Applebot-Extended), TIER 3 low value or suspect (Bytespider, Diffbot); 24+ bots in total.
  2. Writes the four enforcement layers in sync: robots.txt allow/disallow blocks, ai.txt training opt-in/opt-out, the edge worker or nginx tier rules, and rate-limit zones (100, 30 and 10 requests per minute by tier). One layer contradicting another is the classic failure it prevents.
  3. Verifies every UA claim with a reverse-DNS lookup plus the vendor's official IP range JSON; a failed check downgrades the bot to TIER 3 instead of trusting the header.
  4. Exempts real-time user-query bots (ChatGPT-User, Claude-User, Perplexity-User) from rate limits entirely, because a 429 there means the user waiting for an answer never sees your brand cited.
  5. Blocks or throttles bandwidth-heavy, zero-citation crawlers like Bytespider, and keeps sensitive paths (/admin, /staging, /customer) closed to every bot.
  6. Runs the live verification (a GPTBot UA curl must return 200, a Bytespider curl must return 403) and a monthly DarkVisitors registry diff so new bots get a tier assignment before they hit your origin.
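The steps above can be sketched as one edge decision function. This is a hedged Python illustration, not the Cloudflare Worker or nginx config the skill ships. The bot table is a subset of the matrix, and three things are assumptions: the mapping of the 100/30/10 limits onto tiers 1/2/3, the choice to hard-block Bytespider rather than throttle it, and the names `decide` and `match_bot`.

```python
from typing import Optional

# Subset of the tier matrix, for illustration only.
TIER_OF = {
    "GPTBot": 1, "ClaudeBot": 1, "PerplexityBot": 1,
    "CCBot": 2, "Amazonbot": 2, "Applebot-Extended": 2,
    "Bytespider": 3, "Diffbot": 3,
}
USER_QUERY_BOTS = ("ChatGPT-User", "Claude-User", "Perplexity-User")
HARD_BLOCKED = {"Bytespider"}                  # assumption: block, not throttle
RATE_LIMIT_PER_MIN = {1: 100, 2: 30, 3: 10}    # assumption: limits map to tiers 1/2/3
SENSITIVE_PREFIXES = ("/admin", "/staging", "/customer")


def match_bot(user_agent: str) -> Optional[str]:
    """Return the first known bot token found in the User-Agent, if any."""
    ua = user_agent.lower()
    for name in USER_QUERY_BOTS + tuple(TIER_OF):
        if name.lower() in ua:
            return name
    return None


def decide(user_agent: str, path: str, verified: bool, requests_last_minute: int) -> int:
    """Return the HTTP status the edge would answer with: 200, 403 or 429."""
    bot = match_bot(user_agent)
    if bot is None:
        return 200  # not a known AI bot; outside the scope of this sketch
    if path.startswith(SENSITIVE_PREFIXES):
        return 403  # sensitive paths stay closed to every bot
    if bot in USER_QUERY_BOTS and verified:
        return 200  # real-time user queries are never rate-limited
    # A failed verification downgrades the claim to tier 3.
    tier = TIER_OF.get(bot, 3) if verified else 3
    if bot in HARD_BLOCKED:
        return 403
    if requests_last_minute >= RATE_LIMIT_PER_MIN[tier]:
        return 429
    return 200
```

Here `verified` stands for the result of the reverse-DNS and IP-range check, and `requests_last_minute` for whatever counter the edge keeps per bot. Both are supplied by the caller, so the function stays a pure policy lookup.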
Use cases · what happens when you plug it in

One power source. 6 lines out.

ai-crawler-allowlist · core

core active · 6 lines

  1. Open robots.txt and ai.txt so ChatGPT, Claude and Perplexity can cite you
  2. Speed up AI recrawl after a major content relaunch
  3. Block aggressive crawlers that cost bandwidth but deliver no citations
  4. Stop real-time user-query bots from ever hitting a rate limit
  5. Diagnose why an AI engine isn't citing you (leftover Disallow rules)
  6. Enforce GDPR/KVKK AI-training opt-out on paid or user-generated content
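For the "leftover Disallow rules" case, a quick diagnosis can be sketched with Python's standard `urllib.robotparser`. This is an assumption about how one might check it by hand, not the skill's own tooling. The function name `find_blocked` and the sample robots.txt are hypothetical, and the bot list is the Tier 1 subset named on this page.

```python
from urllib.robotparser import RobotFileParser

TIER1_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def find_blocked(robots_txt: str, bots: list[str], path: str = "/") -> list[str]:
    """Return the bots that this robots.txt forbids from fetching `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in bots if not parser.can_fetch(bot, path)]


# Hypothetical file with a forgotten blanket block on GPTBot.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin
"""

if __name__ == "__main__":
    print(find_blocked(SAMPLE, TIER1_BOTS, "/blog/post"))
```

Note that `urllib.robotparser` applies the first matching rule in a group, which can differ from crawlers that use longest-match precedence. Treat the result as a first pass rather than a final verdict.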
Benefits · what you walk away with

Yours to keep.


Forever

That's what owning means.

The rented stack

ai writing tool: subscription

expired · access lost

analytics suite: subscription

expired · access lost

design platform: subscription

expired · access lost

(nothing left)

Your forge

  1. Become citable in AI search by giving verified bots clean, fast access

    license: perpetual
  2. Defend against User-Agent spoofing with mandatory reverse-DNS verification

    license: perpetual
  3. Keep four signals (robots.txt, ai.txt, WAF, edge) in sync so policy never contradicts itself

    license: perpetual
  4. Trim recurring bandwidth cost by tiering bots instead of one-size-fits-all limits

    license: perpetual
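One way to keep the signals from contradicting each other is to derive them from a single tier table. The Python sketch below renders robots.txt from that table and then re-parses it to flag any disagreement. It is illustrative only: the table is a subset, `render_robots` and `contradictions` are hypothetical names, and treating Tier 2 as "allowed in robots.txt, throttled at the edge" is an assumption.

```python
from urllib.robotparser import RobotFileParser

# Subset of the tier matrix, for illustration only.
TIER_OF = {"GPTBot": 1, "ClaudeBot": 1, "CCBot": 2, "Bytespider": 3}
SENSITIVE = ("/admin", "/staging", "/customer")


def render_robots(tier_of: dict[str, int], sensitive: tuple[str, ...]) -> str:
    """Build robots.txt text from the tier table (one group per bot)."""
    groups = []
    for bot, tier in tier_of.items():
        lines = [f"User-agent: {bot}"]
        if tier == 3:
            lines.append("Disallow: /")
        else:
            # Disallow lines come first so first-match parsers honour them too.
            lines += [f"Disallow: {p}" for p in sensitive]
            lines.append("Allow: /")
        groups.append("\n".join(lines))
    # Everyone else: only the sensitive paths are closed.
    groups.append("\n".join(["User-agent: *"] + [f"Disallow: {p}" for p in sensitive]))
    return "\n\n".join(groups) + "\n"


def contradictions(robots_txt: str, tier_of: dict[str, int]) -> list[str]:
    """Return bots whose robots.txt verdict for "/" disagrees with their tier."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [
        bot for bot, tier in tier_of.items()
        if parser.can_fetch(bot, "/") != (tier != 3)
    ]


if __name__ == "__main__":
    text = render_robots(TIER_OF, SENSITIVE)
    print(text)
    print(contradictions(text, TIER_OF))
```

Running `contradictions` against a hand-edited robots.txt shows which bots have drifted from the table, for example a Tier 1 bot still carrying an old `Disallow: /`.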

subscriptions expire · deeds don't

What's included · the full manifest

Everything in the box.


24+ bot master list across Tier 1 (high value), Tier 2 (conditional), Tier 3 (block)

part 01 of 06 · in the box

6 parts · one working system · ships instantly by email

Who it's for

This wasn't forged for everyone.

  • Not for you if you'd rather rent a tool than own one.
  • Not for you if you want someone else to run your stack.
  • Not for you if you're happy guessing.
Still here? Good.

Site owners and SEO teams who want AI search visibility without leaking bandwidth to spoofers or worthless crawlers.

then this was forged for you.

Works with

Universal by design: these run in any AI. Delivered in the open Agent Skills + MCP format (native in Claude); ChatGPT, Gemini, Cursor and Copilot adapt the same files their own way.

  • Claude Native format
  • ChatGPT Adapts via open standards
  • Gemini Adapts via open standards
  • Cursor Adapts via open standards
  • Copilot Adapts via open standards
Questions · still in the air

Catch what's on your mind.


  1. Do I just drop the robots.txt and ai.txt in, or is there more to it?

    The files are ready to use, so opening access to ChatGPT, Claude and Perplexity is a drop-in. The edge-enforcement layer is the part that actually acts on aggressive crawlers, and it is the part you deploy in front of your site.

  2. robots.txt is only a request a crawler can ignore, so how does this stop the bad ones?

    That is the gap the edge-enforcement layer closes: it acts on requests rather than politely asking. The robots.txt and ai.txt files signal intent; the edge rules enforce it.

  3. Does the allowlist tell a real ChatGPT crawler from one spoofing its name?

    The allowlist works from the three-tier reputation matrix and User-Agent identity. Proving a claimed crawler is genuinely that bot, against a spoofer, is what a dedicated classifier handles.

  4. How is it delivered?

    By email right after purchase: ready to run, downloaded instantly, no setup wait.

  5. One-time or subscription?

    A one-time purchase; no subscription or hidden fees. VAT (20%) is included.

  6. Can I get a refund?

    As a digital product, it can’t be refunded once downloaded. That’s why we show exactly what’s inside and who it’s for, right here.