Perishable Press

Ultimate htaccess Blacklist

For those of us running Apache, htaccess rewrite rules provide an excellent way to block spammers, scrapers, and other scumbags easily and effectively. While there are many htaccess tricks for blocking domains, preventing access, and redirecting traffic, Apache’s mod_rewrite module lets us target bad agents by testing the user-agent string against a predefined blacklist of unwanted visitors. Any matching request is immediately and quietly denied access.
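
As a minimal sketch of the idea, the following tests the user-agent header against a single pattern and answers 403 Forbidden on a match. The agent name `BadBot` is a made-up placeholder rather than an entry from any real list, and the `<IfModule>` wrapper is an optional safeguard assumed here, not part of the blacklist itself.

```apache
<IfModule mod_rewrite.c>
    RewriteEngine On
    # "BadBot" is a hypothetical agent name used only for illustration.
    # ^ anchors the pattern to the start of the User-Agent string; [NC] ignores case.
    RewriteCond %{HTTP_USER_AGENT} ^BadBot [NC]
    # "-" leaves the URL unchanged; [F] answers 403 Forbidden.
    RewriteRule .* - [F]
</IfModule>
```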

Update: Check out the new and improved 6G Blacklist/Firewall »

Building the Blacklist

There are many ways to obtain an effective htaccess blacklist. Several excellent forums around the web provide a wealth of htaccess advice and are highly recommended. After copying and pasting your favorite forum blacklist examples into your domain’s root htaccess file, you will want to keep developing the list by tracking bandwidth thieves, comment spammers, and site scrapers and adding them to it.
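
When adding agents you have tracked down yourself, the detail that tends to trip people up is the [OR] flag: every condition except the last needs it, because conditions are otherwise combined with AND. A sketch, with `ExampleScraper` and `ExampleHarvester` as invented names:

```apache
RewriteEngine On
# Hypothetical agent names; replace them with entries taken from your own logs.
RewriteCond %{HTTP_USER_AGENT} ^ExampleScraper [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExampleHarvester [NC,OR]
# The final condition carries no [OR]; it closes the chain.
RewriteCond %{HTTP_USER_AGENT} ^Wget
RewriteRule .* - [F,L]
```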

Or, you may wish to skip the tedious grunt work and simply grab a copy of the Ultimate htaccess Blacklist!

The Ultimate htaccess Blacklist began as a short list of only the most heinous offenders. Blocking scum was such an enjoyable activity that we soon added every nasty agent we could find. The result has been a very low-stress, spam-free site with virtually zero stolen bandwidth. The list is fairly comprehensive and attempts to blacklist as many site rippers, grabbers, spammers, and bad bots as possible. While no blacklist could ever block them all (nor would you want one to, using this method) [1], an elaborate htaccess blacklist can do wonders to improve overall performance, decrease site maintenance, and reduce server expense. Overall, we consider this blacklist a great foundation on which to build and customize your own ultimate htaccess blacklist! [2]

The Ultimate .htaccess Blacklist

So without further ado, here is our version of the ultimate htaccess blacklist, as promised. Simply copy and paste the following code into the root htaccess file of your site to enjoy a serious reduction in wasted bandwidth, stolen resources, and comment spam. Don’t forget to back up your data and test everything first. After that, you’re good to go!

# Ultimate htaccess Blacklist from Perishable Press
# Deny domain access to spammers and other scumbags
RewriteEngine on
RewriteBase /
RewriteCond %{HTTP_USER_AGENT} almaden [OR]
RewriteCond %{HTTP_USER_AGENT} ^Anarchie [OR]
RewriteCond %{HTTP_USER_AGENT} ^ASPSeek [OR]
RewriteCond %{HTTP_USER_AGENT} ^attach [OR]
RewriteCond %{HTTP_USER_AGENT} ^autoemailspider [OR]
RewriteCond %{HTTP_USER_AGENT} ^BackWeb [OR]
RewriteCond %{HTTP_USER_AGENT} ^Bandit [OR]
RewriteCond %{HTTP_USER_AGENT} ^BatchFTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot@yahoo.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^Buddy [OR]
RewriteCond %{HTTP_USER_AGENT} ^bumblebee [OR]
RewriteCond %{HTTP_USER_AGENT} ^CherryPicker [OR]
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
RewriteCond %{HTTP_USER_AGENT} ^CICC [OR]
RewriteCond %{HTTP_USER_AGENT} ^Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^Copier [OR]
RewriteCond %{HTTP_USER_AGENT} ^Crescent [OR]
RewriteCond %{HTTP_USER_AGENT} ^Custo [OR]
RewriteCond %{HTTP_USER_AGENT} ^DA [OR]
RewriteCond %{HTTP_USER_AGENT} ^DIIbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo\ Pump [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Wonder [OR]
RewriteCond %{HTTP_USER_AGENT} ^Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^Drip [OR]
RewriteCond %{HTTP_USER_AGENT} ^DSurf15a [OR]
RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
RewriteCond %{HTTP_USER_AGENT} ^EasyDL/2.99 [OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} email [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailCollector [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
RewriteCond %{HTTP_USER_AGENT} ^FileHound [OR]
RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
RewriteCond %{HTTP_USER_AGENT} FrontPage [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetSmart [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [OR]
RewriteCond %{HTTP_USER_AGENT} ^gigabaz [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go\!Zilla [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
RewriteCond %{HTTP_USER_AGENT} ^gotit [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
RewriteCond %{HTTP_USER_AGENT} ^grub-client [OR]
RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
RewriteCond %{HTTP_USER_AGENT} ^HTTrack [OR]
RewriteCond %{HTTP_USER_AGENT} ^httpdown [OR]
RewriteCond %{HTTP_USER_AGENT} .*httrack.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^ia_archiver [OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^Indy.*Library [OR]
RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
RewriteCond %{HTTP_USER_AGENT} ^InternetLinkagent [OR]
RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^InternetSeer.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^Iria [OR]
RewriteCond %{HTTP_USER_AGENT} ^JBH.*agent [OR]
RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR]
RewriteCond %{HTTP_USER_AGENT} ^JustView [OR]
RewriteCond %{HTTP_USER_AGENT} ^larbin [OR]
RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^LexiBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^lftp [OR]
RewriteCond %{HTTP_USER_AGENT} ^Link.*Sleuth [OR]
RewriteCond %{HTTP_USER_AGENT} ^likse [OR]
RewriteCond %{HTTP_USER_AGENT} ^Link [OR]
RewriteCond %{HTTP_USER_AGENT} ^LinkWalker [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mag-Net [OR]
RewriteCond %{HTTP_USER_AGENT} ^Magnet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^Memo [OR]
RewriteCond %{HTTP_USER_AGENT} ^Microsoft.URL [OR]
RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mirror [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*Indy [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*NEWT [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*MSIECrawler [OR]
RewriteCond %{HTTP_USER_AGENT} ^MS\ FrontPage [OR]
RewriteCond %{HTTP_USER_AGENT} ^MSFrontPage [OR]
RewriteCond %{HTTP_USER_AGENT} ^MSIECrawler [OR]
RewriteCond %{HTTP_USER_AGENT} ^MSProxy [OR]
RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetMechanic [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^NICErsPRO [OR]
RewriteCond %{HTTP_USER_AGENT} ^Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR]
RewriteCond %{HTTP_USER_AGENT} ^Openfind [OR]
RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
RewriteCond %{HTTP_USER_AGENT} ^pavuk [OR]
RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
RewriteCond %{HTTP_USER_AGENT} ^Ping [OR]
RewriteCond %{HTTP_USER_AGENT} ^PingALink [OR]
RewriteCond %{HTTP_USER_AGENT} ^Pockey [OR]
RewriteCond %{HTTP_USER_AGENT} ^psbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^Pump [OR]
RewriteCond %{HTTP_USER_AGENT} ^QRVA [OR]
RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^Reaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Recorder [OR]
RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Scooter [OR]
RewriteCond %{HTTP_USER_AGENT} ^Seeker [OR]
RewriteCond %{HTTP_USER_AGENT} ^Siphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^sitecheck.internetseer.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
RewriteCond %{HTTP_USER_AGENT} ^SlySearch [OR]
RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^Snake [OR]
RewriteCond %{HTTP_USER_AGENT} ^SpaceBison [OR]
RewriteCond %{HTTP_USER_AGENT} ^sproose [OR]
RewriteCond %{HTTP_USER_AGENT} ^Stripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^Szukacz [OR]
RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR]
RewriteCond %{HTTP_USER_AGENT} ^URLSpiderPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^Vacuum [OR]
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
RewriteCond %{HTTP_USER_AGENT} ^[Ww]eb[Bb]andit [OR]
RewriteCond %{HTTP_USER_AGENT} ^webcollage [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebEMailExtrac.* [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebHook [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebMiner [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebMirror [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [OR]
RewriteCond %{HTTP_USER_AGENT} ^Webster [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
RewriteCond %{HTTP_USER_AGENT} WebWhacker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^Whacker [OR]
RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [OR]
RewriteCond %{HTTP_USER_AGENT} ^x-Tractor [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xenu [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus.*Webster [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule ^.* - [F,L]

Update

To reduce confusion and consolidate htaccess rules, the last two lines have been removed from the blacklist. These two lines are not required for the blacklist to work as intended:

RewriteCond %{HTTP_REFERER} ^http://www.iaea.org$
RewriteRule !^http://[^/.]\.perishablepress.com.* - [F,L]

Footnotes

  • 1 Although this blacklist is highly effective at eliminating unwanted scum, its immense length requires extra processing and may affect the performance of your server. In our experience, employing this list (along with several other htaccess directives) for over two years has resulted in zero noticeable performance issues. Nonetheless, this may not be an ideal solution for sites with extreme levels of visitor traffic.
  • 2 To begin building your own customized blacklist, you may want to check out the excellent list offered at joemaller.com. Thanks, Joe!

Jeff Starr
About the Author Jeff Starr = Fullstack Developer. Book Author. Teacher. Human Being.
72 responses
  1. Jeff Starr

    @Milan: I have not tested this list of directives specifically in terms of performance, but I have seen much bigger lists (and htaccess files) in play that don’t seem to have much of a negative impact. Then again, I’m not going to sit here and tell you that it has no effect: the server has to process all of those matches for every page request, so it is probably not advisable for high-traffic sites or slow servers. One thing you can do to improve performance in general is to turn off htaccess processing for directories that don’t need it. AllowOverride is not permitted inside an htaccess file, so add the following line to the relevant <Directory> section of your server configuration:

    AllowOverride None

    For more info on this method, see my article Stupid htaccess Tricks.
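
    On placement: AllowOverride is only valid inside a <Directory> section of the server configuration, and setting it to None makes Apache skip htaccess files for that directory tree. A sketch, assuming a hypothetical directory /var/www/example/static that needs no htaccess rules:

    ```apache
    # Server configuration (e.g. httpd.conf), not an htaccess file.
    # The path below is hypothetical.
    <Directory "/var/www/example/static">
        # Apache no longer looks for or parses htaccess files under this directory.
        AllowOverride None
    </Directory>
    ```

    Any directory that still relies on htaccess rules, such as the one holding this blacklist, needs to keep overrides enabled or have its rules moved into the server configuration.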

  2. I am interested to know whether there is a reason your Ultimate htaccess Blacklist won’t work on certain hosts when other basic .htaccess rules do.
    I can add the redirect-to-www rule, yet I can’t use any version of your Ultimate htaccess Blacklists.

    What do I need to look for to make one of the Ultimate htaccess Blacklists work?

    Regards.

  3. @Jeff Starr
    Good point, thanks!

  4. Jeff Starr

    Hi Json, here is the easiest way I have found for troubleshooting problematic code.

    Basically just crop out half of it, see if it works, then remove another chunk, and so on until you isolate the source of the issue.

  5. Jeff DiLette May 8, 2010 @ 5:00 pm

    To address the speed issue, put the list into your httpd.conf so it loads at Apache restart instead of on every page load. I apply it globally to all sites on my cPanel server:

    #Statements here

    I added some bandwidth killers to your list:

    RewriteEngine on
    RewriteBase /
    RewriteCond %{HTTP_USER_AGENT} almaden [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Anarchie [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ASPSeek [OR]
    RewriteCond %{HTTP_USER_AGENT} ^attach [OR]
    RewriteCond %{HTTP_USER_AGENT} ^autoemailspider [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BackWeb [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Bandit [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BatchFTP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot@yahoo.com [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Buddy [OR]
    RewriteCond %{HTTP_USER_AGENT} ^bumblebee [OR]
    RewriteCond %{HTTP_USER_AGENT} ^CherryPicker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
    RewriteCond %{HTTP_USER_AGENT} ^CICC [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Collector [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Copier [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Crescent [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Custo [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Accelerator [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Alligator [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ASPSeek [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BackStreet [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Charon [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DataFountains [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Sakura [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DDD [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BilderSauge [OR]
    RewriteCond %{HTTP_USER_AGENT} ^dlman [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Download\ Druid [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Download\ Express [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Download\ Ninja [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Download\ Master [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Download-Tip [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Download\ Wonder [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Download.exe [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DownloadDirect [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Extreme\ Picture\ Finder [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EyeCatcher [OR]
    RewriteCond %{HTTP_USER_AGENT} ^FDM [OR]
    RewriteCond %{HTTP_USER_AGENT} ^libfetch [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Fielhound [OR]
    RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
    RewriteCond %{HTTP_USER_AGENT} ^download-link [OR]
    RewriteCond %{HTTP_USER_AGENT} ^FreshDownload [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Atomic_Email_Hunter [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BPImageWalker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^CoBITSProbe [OR]
    RewriteCond %{HTTP_USER_AGENT} ^CydralSpider [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Doubanbot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Exabot-Images [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DA\ 1 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DA\ 2 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DA\ 3 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DA\ 4 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DA\ 5 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DA\ 6 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DA\ 7 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DA\ 8 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DA\ 9 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DA\ 10 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DA\ 11 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DA\ 12 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DAP\ 1 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DAP\ 2 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DAP\ 3 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DAP\ 4 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DAP\ 5 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DAP\ 6 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DAP\ 7 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DAP\ 8 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DAP\ 9 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DAP\ 10 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DAP\ 11 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DAP\ 12 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DIIbot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DISCo\ Pump [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Download\ Wonder [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Downloader [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Drip [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DSurf15a [OR]
    RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EasyDL/2.99 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
    RewriteCond %{HTTP_USER_AGENT} email [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^EmailCollector [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
    RewriteCond %{HTTP_USER_AGENT} ^FileHound [OR]
    RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
    RewriteCond %{HTTP_USER_AGENT} FrontPage [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
    RewriteCond %{HTTP_USER_AGENT} ^GetSmart [OR]
    RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [OR]
    RewriteCond %{HTTP_USER_AGENT} ^gigabaz [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
    RewriteCond %{HTTP_USER_AGENT} ^gotit [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Grabber [OR]
    RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
    RewriteCond %{HTTP_USER_AGENT} ^grub-client [OR]
    RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
    RewriteCond %{HTTP_USER_AGENT} ^HTTrack [OR]
    RewriteCond %{HTTP_USER_AGENT} ^httpdown [OR]
    RewriteCond %{HTTP_USER_AGENT} .*httrack.* [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^ia_archiver [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Indy.*Library [OR]
    RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
    RewriteCond %{HTTP_USER_AGENT} ^InternetLinkagent [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
    RewriteCond %{HTTP_USER_AGENT} ^InternetSeer.com [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Iria [OR]
    RewriteCond %{HTTP_USER_AGENT} ^JBH.*agent [OR]
    RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
    RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR]
    RewriteCond %{HTTP_USER_AGENT} ^JustView [OR]
    RewriteCond %{HTTP_USER_AGENT} ^larbin [OR]
    RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^LexiBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^lftp [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Link.*Sleuth [OR]
    RewriteCond %{HTTP_USER_AGENT} ^likse [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Link [OR]
    RewriteCond %{HTTP_USER_AGENT} ^LinkWalker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mag-Net [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Magnet [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Memo [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Microsoft.URL [OR]
    RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mirror [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*Indy [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*NEWT [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*MSIECrawler [OR]
    RewriteCond %{HTTP_USER_AGENT} ^MS\ FrontPage [OR]
    RewriteCond %{HTTP_USER_AGENT} ^MSFrontPage [OR]
    RewriteCond %{HTTP_USER_AGENT} ^MSIECrawler [OR]
    RewriteCond %{HTTP_USER_AGENT} ^MSProxy [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetMechanic [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NICErsPRO [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Ninja [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Openfind [OR]
    RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
    RewriteCond %{HTTP_USER_AGENT} ^pavuk [OR]
    RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Ping [OR]
    RewriteCond %{HTTP_USER_AGENT} ^PingALink [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Pockey [OR]
    RewriteCond %{HTTP_USER_AGENT} ^psbot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Pump [OR]
    RewriteCond %{HTTP_USER_AGENT} ^QRVA [OR]
    RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Reaper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Recorder [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Scooter [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Seeker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Siphon [OR]
    RewriteCond %{HTTP_USER_AGENT} ^sitecheck.internetseer.com [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SlySearch [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Snake [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SpaceBison [OR]
    RewriteCond %{HTTP_USER_AGENT} ^sproose [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Stripper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Sucker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Szukacz [OR]
    RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR]
    RewriteCond %{HTTP_USER_AGENT} ^URLSpiderPro [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Vacuum [OR]
    RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
    RewriteCond %{HTTP_USER_AGENT} ^[Ww]eb[Bb]andit [OR]
    RewriteCond %{HTTP_USER_AGENT} ^webcollage [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Downloader [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebEMailExtrac.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebHook [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebMiner [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebMirror [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Website [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Webster [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
    RewriteCond %{HTTP_USER_AGENT} WebWhacker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Whacker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [OR]
    RewriteCond %{HTTP_USER_AGENT} ^x-Tractor [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Xenu [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Zeus.*Webster [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Zeus
    RewriteRule ^.* - [F,L]
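
    As a sketch of what that placement might look like, the conditions can sit inside a <Directory> block in httpd.conf, where they are parsed once at startup. The directory pattern is an assumption about a typical cPanel-style layout, and only two agents from the list are shown:

    ```apache
    # httpd.conf; the path pattern is an assumed layout, adjust it to your server.
    <Directory "/home/*/public_html">
        # mod_rewrite in directory context needs symlink following enabled.
        Options +FollowSymLinks
        RewriteEngine On
        RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
        RewriteCond %{HTTP_USER_AGENT} ^Wget
        RewriteRule .* - [F,L]
    </Directory>
    ```

    One caveat worth verifying on your own setup: by default mod_rewrite does not inherit per-directory rules, so a site whose own htaccess file defines rewrite rules may bypass this block unless inheritance is enabled (see RewriteOptions).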

  6. Benjamin "balupton" Lupton July 18, 2010 @ 8:07 pm

    You could make it more efficient by merging the regexes:

    RewriteCond %{HTTP_USER_AGENT} ^(a(narchie|lmaden|spseek|ttach|utoemailspider)|b(a(ckweb|ndit|tchftp)|lackwidow|ot\ mailto:craftbot@yahoo.com|u(ddy|mblebee))|c(h(errypicker|inaclaw)|icc|o(llector|pier)|rescent|usto)|(d.....)) [NC]

    I will do a script to compact them all sometime.
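
    A simpler form of the same idea, without factoring out shared prefixes, is a single alternation. This sketch covers only five agents from the list above and is not the compaction script mentioned here:

    ```apache
    RewriteEngine On
    # One condition replaces five; [NC] makes the match case-insensitive,
    # which is slightly broader than the case-sensitive originals.
    RewriteCond %{HTTP_USER_AGENT} ^(BlackWidow|ChinaClaw|EmailSiphon|WebZIP|Wget) [NC]
    RewriteRule .* - [F,L]
    ```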

  7. Jeff Starr

    Sounds awesome, Benjamin – let me know what you come up with :)

  8. Hi Jeff

    Thanks very much for your posts and especially this one….

    I was wondering whether, instead of blocking those people (or sending them off to a spam-poison site), it wouldn’t be a good idea to redirect them to one of my pages (the lightest one, to avoid bandwidth consumption) telling them that I’ve banned them and giving them the opportunity to leave me a message in case they are legitimate users? The trouble is that if you ban someone erroneously, there’s no way for him or her to let you know.
    I think I know how to do that, but do you think it would be a good idea?

    Thanks in advance for your insight.

  9. Jeff Starr

    Hi Francois,

    That’s actually a good idea if you don’t mind the extra bandwidth, server resources, etc. I may look into adding this option in the next update.

    Thanks for the idea!
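
    One way this might look, as a sketch only: matching agents are redirected to a small static page instead of receiving a 403. The page /banned.html is hypothetical, and it has to be excluded from the rule or the redirect would loop.

    ```apache
    RewriteEngine On
    # /banned.html is a hypothetical lightweight page explaining the block.
    # Skip the notice page itself so banned agents can actually load it.
    RewriteCond %{REQUEST_URI} !^/banned\.html$
    RewriteCond %{HTTP_USER_AGENT} ^(BlackWidow|Wget) [NC]
    # Temporary external redirect to the notice page.
    RewriteRule .* /banned.html [R=302,L]
    ```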

  10. I have a pesky internet site with a hyperlink pointing to my site from their site.

    I politely asked them to remove that hyperlink, and they refuse to respect the wishes of others.

    Is there something I could do? I definitely tried what I have read in other posts on your site like, blacklist by IP address. Unfortunately, for whatever reason, when I test their hyperlink to my site it still comes through to my site instead of redirecting them back to their own site.

    Please advise.

  11. Hi Michael,

    Try this:

    RewriteCond %{HTTP_REFERER} ^.*(the_bad_site_goes_here.tld).* [NC]
    RewriteRule ^(.*)$ %{HTTP_REFERER} [R=302,L]

    Just replace “the_bad_site_goes_here.tld” with the site’s domain and its correct extension (TLD).
    Then each request coming from his site to yours will instantly bounce back to his own site.
    In other words, he will hit himself each time.

  12. @ Jeff and François

    The feature you’re thinking about is almost what a honeypot system does:
    a conditional landing page with a notification for the banned visitors.
    I already tried that, but it is not really accurate; too many legitimate users were blocked, so I stopped.

    For bad user agents, I prefer to look at my logs and block only what is needed.
    The lists we can find on the web are often obsolete. Fine-tuning your own list is the best way.
    Look at your logs first…
    IMUO
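
    For reference, the user agent only shows up in the access log if the log format records it. Apache's conventional "combined" format does; the sketch below assumes server-level configuration and a hypothetical log path:

    ```apache
    # Server configuration; the log path is an assumption.
    # %{Referer}i and %{User-Agent}i record the request headers used for blacklisting.
    LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
    CustomLog "logs/access_log" combined
    ```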
