
Ultimate Block List to Stop AI Bots

More than you might think, AI (Artificial Intelligence) and ML (Machine Learning) bots are crawling your site and scraping your content. They collect your data and use it to train software like OpenAI’s ChatGPT, DeepSeek, and thousands of other AI creations. Whether you or anyone else approves of all this is not my concern for this post. This post is for website owners who want to stop AI bots from crawling their web pages, as much as possible. To help with this, I’ve been collecting data and researching AI bots for many months now, and have put together a “Mega Block List” to help stop AI bots from devouring your content.

The ultimate block list for stopping AI bots from crawling your site.

If you can edit a file, you can block a ton of AI bots.

Thanks: Special Thanks to Kristina Ponting for help with researching AI bots and sharing with the community. Find Kristina at Teskedsgumman and on Github.

Block AI Bots via robots.txt

The easiest way for most website owners to block AI bots is to append the following list to their site’s robots.txt file. There are many resources explaining the robots.txt file, and I encourage anyone not familiar with it to take a few moments to learn more.

In a nutshell, robots.txt is a plain-text file that contains rules for bots to obey. You can add rules that limit where bots can crawl, whether individual pages or the entire site. Once you have added some rules, simply upload the robots file to the public root directory of your website. For example, here is my robots.txt for Perishable Press.

Using WordPress? Block bad bots automatically with my free plugin, Blackhole for Bad Bots. Trap bad bots in a virtual black hole :)

To block AI bots via your site’s robots.txt file, append the following rules. Understand that bots are not required to obey robots.txt rules; the rules are merely suggestions. Good bots will follow the rules, while bad bots will ignore them and do whatever they want. To force compliance, you can add blocking rules via Apache/.htaccess. With that in mind, here are the robots rules to block AI bots.

Blocks 400+ AI bots and user agents.

Block list for robots.txt

Before using, read the Notes and Disclaimer.

# Ultimate AI Block List v1.3 20250310
# https://perishablepress.com/ultimate-ai-block-list/

User-agent: .ai 
User-agent: Agentic
User-agent: AI Article Writer
User-agent: AI Content Detector
User-agent: AI Dungeon
User-agent: AI Search Engine
User-agent: AI SEO Crawler
User-agent: AI Writer
User-agent: AI21 Labs
User-agent: AI2Bot
User-agent: AIBot
User-agent: AIMatrix
User-agent: AISearchBot
User-agent: AI Training
User-agent: AITraining
User-agent: Alexa
User-agent: Alpha AI
User-agent: AlphaAI
User-agent: Amazon Bedrock
User-agent: Amazon-Kendra
User-agent: Amazon Lex
User-agent: Amazon Comprehend
User-agent: Amazon Sagemaker
User-agent: Amazon Silk
User-agent: Amazon Textract
User-agent: AmazonBot
User-agent: Amelia
User-agent: AndersPinkBot
User-agent: Anthropic
User-agent: AnyPicker
User-agent: Anyword
User-agent: Applebot
User-agent: Aria Browse
User-agent: Articoolo
User-agent: Automated Writer
User-agent: AwarioRssBot
User-agent: AwarioSmartBot
User-agent: BardBot
User-agent: BingAI
User-agent: Bingbot-chat
User-agent: Brave Leo
User-agent: ByteDance
User-agent: Bytespider
User-agent: CatBoost
User-agent: CC-Crawler
User-agent: CCBot
User-agent: ChatGLM
User-agent: Chinchilla
User-agent: Claude
User-agent: ClearScope
User-agent: Cohere
User-agent: Common Crawl
User-agent: CommonCrawl
User-agent: Content Harmony
User-agent: Content King
User-agent: Content Optimizer
User-agent: Content Samurai
User-agent: ContentAtScale
User-agent: ContentBot
User-agent: Contentedge
User-agent: Conversion AI
User-agent: Copilot
User-agent: CopyAI
User-agent: Copymatic
User-agent: Copyscape
User-agent: Cotoyogi
User-agent: CrawlQ AI
User-agent: Crawlspace
User-agent: Crew AI
User-agent: CrewAI
User-agent: DALL-E
User-agent: DataForSeoBot
User-agent: DataProvider
User-agent: DeepAI
User-agent: DeepL
User-agent: DeepMind
User-agent: DeepSeek
User-agent: Diffbot
User-agent: Doubao AI
User-agent: DuckAssistBot
User-agent: FacebookBot
User-agent: FacebookExternalHit
User-agent: Firecrawl
User-agent: Flyriver
User-agent: Frase AI
User-agent: FriendlyCrawler
User-agent: Gemini
User-agent: Gemma
User-agent: GenAI
User-agent: Google Bard AI
User-agent: Google-CloudVertexBot
User-agent: Google-Extended
User-agent: GoogleOther
User-agent: Goose
User-agent: GPT
User-agent: Grammarly
User-agent: Grendizer
User-agent: Grok
User-agent: GT Bot
User-agent: GTBot
User-agent: Hemingway Editor
User-agent: Hugging Face
User-agent: Hypotenuse AI
User-agent: iaskspider
User-agent: ICC-Crawler
User-agent: ImagesiftBot
User-agent: img2dataset
User-agent: INK Editor
User-agent: INKforall
User-agent: IntelliSeek
User-agent: Inferkit
User-agent: ISSCyberRiskCrawler
User-agent: JasperAI
User-agent: Kafkai
User-agent: Kangaroo
User-agent: Keyword Density AI
User-agent: KomoBot
User-agent: LLaMA
User-agent: magpie-crawler
User-agent: MarketMuse
User-agent: Meltwater
User-agent: Meta AI
User-agent: Meta-AI
User-agent: Meta-External
User-agent: MetaAI
User-agent: MetaTagBot
User-agent: Mistral
User-agent: Narrative
User-agent: NeevaBot
User-agent: Neural Text
User-agent: NeuralSEO
User-agent: OAI-SearchBot
User-agent: Omgili
User-agent: Open AI
User-agent: OpenAI
User-agent: OpenBot
User-agent: OpenText AI
User-agent: Outwrite
User-agent: Page Analyzer AI
User-agent: PanguBot
User-agent: Paperlibot
User-agent: Paraphraser.io
User-agent: PerplexityBot
User-agent: PetalBot
User-agent: Phindbot
User-agent: PiplBot
User-agent: ProWritingAid
User-agent: QuillBot
User-agent: RobotSpider
User-agent: Rytr
User-agent: SaplingAI
User-agent: Scalenut
User-agent: Scraper
User-agent: Scrapy
User-agent: ScriptBook
User-agent: SEO Content Machine
User-agent: SEO Robot
User-agent: Sentibot
User-agent: Sidetrade
User-agent: Simplified AI
User-agent: Skydancer
User-agent: SlickWrite
User-agent: Spin Rewriter
User-agent: Spinbot
User-agent: Stability
User-agent: StableDiffusionBot
User-agent: Sudowrite
User-agent: Surfer AI
User-agent: Text Blaze
User-agent: TextCortex
User-agent: The Knowledge AI
User-agent: Timpibot
User-agent: Vidnami AI
User-agent: Webzio
User-agent: Whisper
User-agent: WordAI
User-agent: Wordtune
User-agent: WormsGTP
User-agent: WPBot
User-agent: Writecream
User-agent: WriterZen
User-agent: Writescope
User-agent: Writesonic
User-agent: xAI
User-agent: xBot
User-agent: YouBot
User-agent: Zero GTP
User-agent: Zerochat
User-agent: Zimm
Disallow: /

Important: Whenever making changes to your robots.txt file, take a few moments to validate the rules using a free online robots checker.
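If you prefer to check rules locally, here is a minimal sketch using Python's standard urllib.robotparser module. It parses a short excerpt of the block list and asks whether a few user-agent names may fetch a page. The example.com URL and the is_allowed helper are placeholders of mine, not part of the block list. Also keep in mind that this only shows how Python's parser reads the rules (it matches the listed name anywhere inside the requesting agent's name); real crawlers use their own matching logic and may behave differently.

```python
from urllib.robotparser import RobotFileParser

# A short excerpt of the block list above (the full list is checked the same way).
ROBOTS_TXT = """\
User-agent: GPT
User-agent: Claude
User-agent: CCBot
Disallow: /
"""


def is_allowed(user_agent: str, url: str = "https://example.com/some-page/") -> bool:
    """Return True if the rules in ROBOTS_TXT let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# Print a quick report for a few agent names.
for agent in ("GPTBot", "ClaudeBot", "CCBot", "Googlebot"):
    print(f"{agent}: {'allowed' if is_allowed(agent) else 'blocked'}")
```

With this excerpt, the three AI agents should come back as blocked and Googlebot as allowed. To check your real file, replace ROBOTS_TXT with the contents of your own robots.txt.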

Block AI Bots via Apache/.htaccess

To actually enforce the “Ultimate AI Block List”, you can add the following rules to your Apache configuration or main .htaccess file. Like many others, I’ve written extensively on Apache and .htaccess. So if you’re unfamiliar, there are plenty of great resources, including my book .htaccess made easy.

In a nutshell, you can add rules via Apache/.htaccess to customize the functionality of your website. For example, you can add directives that help control traffic, optimize caching, improve performance, and even block bad bots. And these rules operate at the server level. So while bots may ignore rules added via robots.txt, they can’t ignore rules added via Apache/.htaccess (unless they falsify their user agent).

Using Apache? Check out my free, open-source 8G Firewall. 8G is lightweight, fast, and protects your site against a wide range of threats.

To block AI bots via Apache/.htaccess, add the following rules to either your server configuration file or the main (public root) .htaccess file. Before making any changes, be on the safe side and make a backup of your files, so you can easily roll back if something unexpected happens. With that in mind, here are the Apache rules to block AI bots.

Blocks 400+ AI bots and user agents.

Block list for Apache/.htaccess

Before using, read the Notes and Disclaimer.

# Ultimate AI Block List v1.3 20250310
# https://perishablepress.com/ultimate-ai-block-list/

<IfModule mod_rewrite.c>

	RewriteEngine On

	RewriteCond %{HTTP_USER_AGENT} (\.ai\ |Agentic|AI\ Article\ Writer|AI\ Content\ Detector|AI\ Dungeon|AI\ Search\ Engine|AI\ SEO\ Crawler|AI\ Writer|AI21\ Labs|AI2Bot|AIBot|AIMatrix|AISearchBot|AI\ Training|AITraining|Alexa|Alpha\ AI|AlphaAI|Amazon\ Bedrock|Amazon-Kendra) [NC,OR]
	RewriteCond %{HTTP_USER_AGENT} (Amazon\ Lex|Amazon\ Comprehend|Amazon\ Sagemaker|Amazon\ Silk|Amazon\ Textract|AmazonBot|Amelia|AndersPinkBot|Anthropic|AnyPicker|Anyword|Applebot|Aria\ Browse|Articoolo|Automated\ Writer|AwarioRssBot|AwarioSmartBot|BardBot|BingAI|Bingbot-chat) [NC,OR]
	RewriteCond %{HTTP_USER_AGENT} (Brave\ Leo|ByteDance|Bytespider|CatBoost|CC-Crawler|CCBot|ChatGLM|Chinchilla|Claude|ClearScope|Cohere|Common\ Crawl|CommonCrawl|Content\ Harmony|Content\ King|Content\ Optimizer|Content\ Samurai|ContentAtScale|ContentBot|Contentedge) [NC,OR]
	RewriteCond %{HTTP_USER_AGENT} (Conversion\ AI|Copilot|CopyAI|Copymatic|Copyscape|Cotoyogi|CrawlQ\ AI|Crawlspace|Crew\ AI|CrewAI|DALL-E|DataForSeoBot|DataProvider|DeepAI|DeepL|DeepMind|DeepSeek|Diffbot|Doubao\ AI|DuckAssistBot) [NC,OR]
	RewriteCond %{HTTP_USER_AGENT} (FacebookBot|FacebookExternalHit|Firecrawl|Flyriver|Frase\ AI|FriendlyCrawler|Gemini|Gemma|GenAI|Google\ Bard\ AI|Google-CloudVertexBot|Google-Extended|GoogleOther|Goose|GPT|Grammarly|Grendizer|Grok|GT\ Bot|GTBot) [NC,OR]
	RewriteCond %{HTTP_USER_AGENT} (Hemingway\ Editor|Hugging\ Face|Hypotenuse\ AI|iaskspider|ICC-Crawler|ImagesiftBot|img2dataset|INK\ Editor|INKforall|IntelliSeek|Inferkit|ISSCyberRiskCrawler|JasperAI|Kafkai|Kangaroo|Keyword\ Density\ AI|KomoBot|LLaMA|magpie-crawler|MarketMuse) [NC,OR]
	RewriteCond %{HTTP_USER_AGENT} (Meltwater|Meta\ AI|Meta-AI|Meta-External|MetaAI|MetaTagBot|Mistral|Narrative|NeevaBot|Neural\ Text|NeuralSEO|OAI-SearchBot|Omgili|Open\ AI|OpenAI|OpenBot|OpenText\ AI|Outwrite|Page\ Analyzer\ AI|PanguBot) [NC,OR]
	RewriteCond %{HTTP_USER_AGENT} (Paperlibot|Paraphraser\.io|PerplexityBot|PetalBot|Phindbot|PiplBot|ProWritingAid|QuillBot|RobotSpider|Rytr|SaplingAI|Scalenut|Scraper|Scrapy|ScriptBook|SEO\ Content\ Machine|SEO\ Robot|Sentibot|Sidetrade|Simplified\ AI) [NC,OR]
	RewriteCond %{HTTP_USER_AGENT} (Skydancer|SlickWrite|Spin\ Rewriter|Spinbot|Stability|StableDiffusionBot|Sudowrite|Surfer\ AI|Text\ Blaze|TextCortex|The\ Knowledge\ AI|Timpibot|Vidnami\ AI|Webzio|Whisper|WordAI|Wordtune|WormsGTP|WPBot|Writecream) [NC,OR]
	RewriteCond %{HTTP_USER_AGENT} (WriterZen|Writescope|Writesonic|xAI|xBot|YouBot|Zero\ GTP|Zerochat|Zimm) [NC]

	RewriteRule (.*) - [F,L]

</IfModule>

Important: Remember to test well before going live. You can use a free user-agent request tool to make requests posing as various AI bots.
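As an alternative to an online tool, here is a rough sketch of the same idea in Python, using only the standard library. It sends a request with a user agent of your choosing and reports the HTTP status code. The function names build_request and probe are my own for illustration, and example.com is a placeholder, so swap in your own site. It assumes the block rules answer with 403 Forbidden, which is what the [F] flag does.

```python
import urllib.error
import urllib.request


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Build a GET request that identifies itself with the given user agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def probe(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status code."""
    try:
        with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Error statuses such as 403 Forbidden are raised as HTTPError by urllib.
        return err.code


# Example usage (uncomment and point at your own site):
# print(probe("https://example.com/", "GPTBot/1.0"))   # 403 would mean a block rule matched
# print(probe("https://example.com/", "Mozilla/5.0"))  # should not be blocked by these rules
```

Run it once posing as a listed bot and once with an ordinary browser-style user agent. If both come back 403, something in the rules is matching more than intended.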

Notes

Note: The two block lists above (robots.txt and Apache/.htaccess) are synchronized and include/block the same AI bots.

Note: Numerous user agents are omitted from the block lists because the names are matched in wild-card fashion. Here is a list showing wild-card blocked AI bots.
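To illustrate what wild-card matching means for the Apache rules, here is a small Python sketch. It takes a handful of entries from the block list and reports which entry, if any, appears inside a given bot name, ignoring case. The bot names in the loop are examples I picked for illustration; they are not taken from the linked list. The covering_entry helper is hypothetical.

```python
import re
from typing import Optional

# A few entries copied from the block list above.
ENTRIES = ["GPT", "Claude", "Anthropic", "Cohere", "Meta-External"]


def covering_entry(user_agent: str) -> Optional[str]:
    """Return the first list entry found anywhere in user_agent, ignoring case."""
    for entry in ENTRIES:
        if re.search(re.escape(entry), user_agent, re.IGNORECASE):
            return entry
    return None


# Illustrative bot names (assumed examples, not from the author's list).
for name in ("GPTBot", "ChatGPT-User", "ClaudeBot", "anthropic-ai",
             "cohere-ai", "Meta-ExternalAgent", "Googlebot"):
    print(f"{name}: {covering_entry(name) or 'not covered'}")
```

So a single short entry like GPT can cover several related user agents. The flip side is that a short entry also matches any unrelated user agent that happens to contain the same letters, which is worth keeping in mind when reviewing the list.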

Note: Both block lists focus on AI-related bots. Some of those bots are used by major search companies like Google and Bing, and others by giant corporations like Apple and Amazon. Every precaution has been taken to ensure that only AI-related bots are blocked (and not regular web crawlers), but search engines like Google and Bing increasingly use AI-collected data in their search results. So keep this in mind and feel free to remove any bots that you think should be allowed access. Make sure to check the wild-card blocked AI bots.

Note: It has reportedly been observed that blocking Google-Extended blocks Google in general. So consider removing it if you want to be extra careful.
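If you do remove entries, it helps to remove them from both lists at once so they stay in sync. Here is a minimal Python sketch that builds both rule sets from one list of names, minus the ones you want to allow. The helper names, the short BOTS excerpt, and the choice to allow Google-Extended are assumptions for illustration only. The escaping follows what the Apache list above does (dots and spaces are escaped).

```python
def escape(name: str) -> str:
    """Escape the characters the Apache list escapes: dots and spaces."""
    return name.replace(".", r"\.").replace(" ", r"\ ")


def robots_rules(names):
    """Build one robots.txt group that disallows every listed user agent."""
    lines = [f"User-agent: {name}" for name in names]
    lines.append("Disallow: /")
    return "\n".join(lines)


def apache_conditions(names, per_line=20):
    """Build the RewriteCond lines, chained with [OR] except for the last one."""
    chunks = [names[i:i + per_line] for i in range(0, len(names), per_line)]
    lines = []
    for index, chunk in enumerate(chunks):
        flags = "[NC]" if index == len(chunks) - 1 else "[NC,OR]"
        pattern = "|".join(escape(name) for name in chunk)
        lines.append(f"RewriteCond %{{HTTP_USER_AGENT}} ({pattern}) {flags}")
    return "\n".join(lines)


# Short excerpt of the block list, plus a hypothetical allow list.
BOTS = ["Google Bard AI", "Google-CloudVertexBot", "Google-Extended",
        "GoogleOther", "Paraphraser.io"]
ALLOWED = {"Google-Extended"}

kept = [name for name in BOTS if name not in ALLOWED]
print(robots_rules(kept))
print(apache_conditions(kept, per_line=2))
```

The generated RewriteCond lines would go between RewriteEngine On and the RewriteRule line in the Apache rules, and the generated robots group replaces the one in robots.txt.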

Note: Both block lists are case-insensitive. The user-agent names in robots.txt are case-insensitive by default, and the Apache rules are case-insensitive due to the inclusion of the [NC] flag. So don’t worry about mixed-case bot names; their user agents will be blocked, whether uppercase, lowercase, or mIxeD cAsE.

Learn more: According to Google documentation, the value of the user-agent line (in robots.txt) is case-insensitive.

Changelog

Robots.txt

  • Version 1.3 – 2025/03/10 – Adds more AI bots, refines list to make better use of wild-card pattern matching of user-agent names.
  • Version 1.2 – 2025/02/12 – Adds 73 AI bots (Thanks to Robert DeVore)
  • Version 1.0 – 2025/02/11 – Initial release.

Apache/.htaccess

  • Version 1.3 – 2025/03/10 – Adds more AI bots, refines list to make better use of wild-card pattern matching of user-agent names.
  • Version 1.2 – 2025/02/12 – Adds 73 AI bots (Thanks to Robert DeVore)
  • Version 1.1 – 2025/02/11 – Replaces REQUEST_URI with HTTP_USER_AGENT
  • Version 1.0 – 2025/02/11 – Initial release.

Disclaimer

The information shared on this page is provided “as-is”, with the intention of helping people protect their sites against AI bots. The two block lists (robots.txt and Apache/.htaccess) are open-source and free to use and modify without condition. By using either block list, you assume all risk and responsibility for anything that happens. So use wisely, test thoroughly, and enjoy the benefits of my work :)

Support my work

I spend countless hours digging through server logs, researching user agents, and compiling block lists to stop AI and other unwanted bots. I share my work freely with the hope that it will help make the Web a more secure place for everyone.

If you benefit from my work and want to show support, please make a donation or buy one of my books, such as .htaccess made easy. You’ll get a complete guide to .htaccess and a ton of awesome techniques for optimizing and securing your site.

Of course, tweets, likes, links, and shares also are super helpful and very much appreciated. Your generous support enables me to continue developing AI block lists and other awesome resources for the community. Thank you kindly :)

Feedback

Got more? Leave a comment below with your favorite AI bots to block. Or send privately via my contact form. Cheers! :)

About the Author
Jeff Starr = Web Developer. Security Specialist. WordPress Buff.

9 responses to “Ultimate Block List to Stop AI Bots”

  1. Super cool! Thank you so much for your tireless work with the nG firewall and now this very comprehensive AI user agent blocking list. I tried to recognize and block the bots one by one, but there were obviously still many missing.

  2. This block list is solid for keeping AI bots out, but with how fast AI is evolving, do you think blocking them entirely is the best long-term move?

    Some bots might be scraping content, but others could be indexing sites in ways that drive traffic. Is there a case for selectively allowing certain AI bots while blocking the rest?

    • Jeff Starr 2025/02/14 6:56 am

      “..do you think blocking them entirely is the best long-term move?”

      Good question. Trying to plan “long-term” is folly given “how fast AI is evolving”. Literally anything can happen. As mentioned in the intro, this post is not about arguing pros and cons; it’s for people who want to block AI bots like now. Long-term yes, resistance (probably) is futile. Short-term you can block a lot of them with just a few clicks.

      “Is there a case for selectively allowing certain AI bots while blocking the rest?”

      Technically speaking, you can do that with either robots.txt or Apache/.htaccess (and other languages I’m sure). As for whether or not it makes sense for any given site, depends on myriad factors. So not a one-size-fits-all strategy imo.

      Note: I edited my original reply after giving it more thought.

  3. Thanks Jeff, Longtime fan here. I appreciate your ongoing contributions to a safer internet. I appreciate the extensive notes here too. I’ll be implementing these minus Google-Extended and see what happens.

  4. Jeff, thanks a lot for your great work. Appreciate very much your efforts.

    I want to ask you if we can apply both, robots.txt and .htaccess lists, to strengthen security. Or only one of the two needs to be installed.

    Again, thanx a lot Jeff!

    • Jeff Starr 2025/02/28 6:25 am

      Hi Prince, yes you can install both although it’s kind of unnecessary. The robots file isn’t going to stop anything that’s not blocked via Apache/.htaccess, but the converse isn’t true: the .htaccess rules will stop any listed bots that choose to ignore robots rules.

  5. Thanks a lot Jeff for your patience

  6. Rick Beckman 2025/03/03 6:52 pm

    I’ve been using a small block list that I’ve put together to achieve this, so I’m glad to see this larger list.

    But I’ve also seen people using a tool that they can direct AI bots to which catches them in a link loop, loading page after page of nonsense text that is close to seeming like real human text but is just generated gobbledygook. I love the idea of this, but have no idea how to build it.

    It’s similar to the honey pots that Project Honeypot facilitates for spam bots — giving them thousands of fake emails, making their harvested lists bloated. I want to poison the well of AI. (Although, in doing so, it may force the devs of said bots to improve the whole thing, causing AI bot evolution rather than being a true hurdle. Huh… Well, maybe I just talked myself out of this idea. Ah well.)

    Thanks for your list, Jeff!

    • Jeff Starr 2025/03/04 6:35 am

      Hey Rick, yes you’re referring to “tarpits”, a fun way to confuse and trap bots in an endless maze of nonsense. I posted about it on Mastodon here for anyone else who may be interested. Tarpits take some time to set up/configure and chew up a LOT of server resources, but certainly are a fun way to pass some time :)
