Ultimate htaccess Blacklist

Posted on June 28, 2007 in Function by

For those of us running Apache, htaccess rewrite rules provide an excellent way to block spammers, scrapers, and other scumbags easily and effectively. While there are many htaccess tricks involving blocking domains, preventing access, and redirecting traffic, Apache’s mod_rewrite module enables us to target bad agents by testing the user-agent string against a predefined blacklist of unwanted visitors. Any matches are immediately and quietly denied access.
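
To make the mechanism concrete, here is a minimal sketch of the pattern the full list follows, cut down to two agents that also appear in the list further down. It assumes mod_rewrite is enabled and that the rules sit in the site's root htaccess file; it is an illustration of the structure, not a usable blacklist.

```apache
# Minimal sketch of a user-agent blacklist (two entries only)
RewriteEngine On

# ^ anchors the pattern to the start of the user-agent string;
# [OR] chains this condition to the next one
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]

# No anchor: match anywhere in the string; [NC] ignores case.
# The last condition in the chain carries no [OR].
RewriteCond %{HTTP_USER_AGENT} HTTrack [NC]

# If either condition matched, refuse the request with 403 Forbidden
RewriteRule .* - [F]
```

Each pattern is a regular expression, not a shell wildcard, so "any characters" is written `.*` rather than `*`.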

There are many ways to obtain an effective htaccess blacklist. Several excellent forums around the web offer a wealth of htaccess advice and are highly recommended. After copying and pasting your favorite forum blacklist examples into your domain’s root htaccess file, you will want to keep developing the list by tracking bandwidth thieves, comment spammers, and site scrapers and adding them as you find them. Or, you may wish to skip the tedious grunt work and simply grab a copy of the Ultimate htaccess Blacklist!

The Ultimate htaccess Blacklist began as a short list of only the most heinous offenders. Blocking scum was such an enjoyable activity that we soon added to the list the identity of every nasty agent we could find. The result has been a very low-stress, spam-free site with virtually zero stolen bandwidth. The list is fairly comprehensive and attempts to blacklist as many site rippers, grabbers, spammers and bad bots as possible. While no blacklist could ever block them all (nor would you want one to, using this method) [1], an elaborate htaccess blacklist can do wonders to improve overall performance, decrease site maintenance, and reduce server expense. Overall, we consider this blacklist a great foundation on which to build and customize your own ultimate htaccess blacklist! [2]

So without further ado, here is our version of the ultimate htaccess blacklist, as promised. Simply copy and paste the following code into the root htaccess file of your site to enjoy a serious reduction in wasted bandwidth, stolen resources, and comment spam. Don’t forget to back up your data and test everything first. After that, you’re good to go!

The Ultimate htaccess Blacklist

# Ultimate htaccess Blacklist from Perishable Press
# Deny domain access to spammers and other scumbags
RewriteEngine on
RewriteBase /
RewriteCond %{HTTP_USER_AGENT} almaden [OR]
RewriteCond %{HTTP_USER_AGENT} ^Anarchie [OR]
RewriteCond %{HTTP_USER_AGENT} ^ASPSeek [OR]
RewriteCond %{HTTP_USER_AGENT} ^attach [OR]
RewriteCond %{HTTP_USER_AGENT} ^autoemailspider [OR]
RewriteCond %{HTTP_USER_AGENT} ^BackWeb [OR]
RewriteCond %{HTTP_USER_AGENT} ^Bandit [OR]
RewriteCond %{HTTP_USER_AGENT} ^BatchFTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot@yahoo.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^Buddy [OR]
RewriteCond %{HTTP_USER_AGENT} ^bumblebee [OR]
RewriteCond %{HTTP_USER_AGENT} ^CherryPicker [OR]
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
RewriteCond %{HTTP_USER_AGENT} ^CICC [OR]
RewriteCond %{HTTP_USER_AGENT} ^Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^Copier [OR]
RewriteCond %{HTTP_USER_AGENT} ^Crescent [OR]
RewriteCond %{HTTP_USER_AGENT} ^Custo [OR]
RewriteCond %{HTTP_USER_AGENT} ^DA [OR]
RewriteCond %{HTTP_USER_AGENT} ^DIIbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo\ Pump [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Wonder [OR]
RewriteCond %{HTTP_USER_AGENT} ^Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^Drip [OR]
RewriteCond %{HTTP_USER_AGENT} ^DSurf15a [OR]
RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
RewriteCond %{HTTP_USER_AGENT} ^EasyDL/2.99 [OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} email [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailCollector [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
RewriteCond %{HTTP_USER_AGENT} ^FileHound [OR]
RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
RewriteCond %{HTTP_USER_AGENT} FrontPage [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetSmart [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [OR]
RewriteCond %{HTTP_USER_AGENT} ^gigabaz [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go\!Zilla [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
RewriteCond %{HTTP_USER_AGENT} ^gotit [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
RewriteCond %{HTTP_USER_AGENT} ^grub-client [OR]
RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
RewriteCond %{HTTP_USER_AGENT} ^HTTrack [OR]
RewriteCond %{HTTP_USER_AGENT} ^httpdown [OR]
RewriteCond %{HTTP_USER_AGENT} .*httrack.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^ia_archiver [OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^Indy.*Library [OR]
RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
RewriteCond %{HTTP_USER_AGENT} ^InternetLinkagent [OR]
RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^InternetSeer.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^Iria [OR]
RewriteCond %{HTTP_USER_AGENT} ^JBH.*agent [OR]
RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR]
RewriteCond %{HTTP_USER_AGENT} ^JustView [OR]
RewriteCond %{HTTP_USER_AGENT} ^larbin [OR]
RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^LexiBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^lftp [OR]
RewriteCond %{HTTP_USER_AGENT} ^Link.*Sleuth [OR]
RewriteCond %{HTTP_USER_AGENT} ^likse [OR]
RewriteCond %{HTTP_USER_AGENT} ^Link [OR]
RewriteCond %{HTTP_USER_AGENT} ^LinkWalker [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mag-Net [OR]
RewriteCond %{HTTP_USER_AGENT} ^Magnet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^Memo [OR]
RewriteCond %{HTTP_USER_AGENT} ^Microsoft.URL [OR]
RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mirror [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*Indy [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*NEWT [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*MSIECrawler [OR]
RewriteCond %{HTTP_USER_AGENT} ^MS\ FrontPage [OR]
RewriteCond %{HTTP_USER_AGENT} ^MSFrontPage [OR]
RewriteCond %{HTTP_USER_AGENT} ^MSIECrawler [OR]
RewriteCond %{HTTP_USER_AGENT} ^MSProxy [OR]
RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetMechanic [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^NICErsPRO [OR]
RewriteCond %{HTTP_USER_AGENT} ^Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR]
RewriteCond %{HTTP_USER_AGENT} ^Openfind [OR]
RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
RewriteCond %{HTTP_USER_AGENT} ^pavuk [OR]
RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
RewriteCond %{HTTP_USER_AGENT} ^Ping [OR]
RewriteCond %{HTTP_USER_AGENT} ^PingALink [OR]
RewriteCond %{HTTP_USER_AGENT} ^Pockey [OR]
RewriteCond %{HTTP_USER_AGENT} ^psbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^Pump [OR]
RewriteCond %{HTTP_USER_AGENT} ^QRVA [OR]
RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^Reaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Recorder [OR]
RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Scooter [OR]
RewriteCond %{HTTP_USER_AGENT} ^Seeker [OR]
RewriteCond %{HTTP_USER_AGENT} ^Siphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^sitecheck.internetseer.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
RewriteCond %{HTTP_USER_AGENT} ^SlySearch [OR]
RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^Snake [OR]
RewriteCond %{HTTP_USER_AGENT} ^SpaceBison [OR]
RewriteCond %{HTTP_USER_AGENT} ^sproose [OR]
RewriteCond %{HTTP_USER_AGENT} ^Stripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^Szukacz [OR]
RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR]
RewriteCond %{HTTP_USER_AGENT} ^URLSpiderPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^Vacuum [OR]
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
RewriteCond %{HTTP_USER_AGENT} ^[Ww]eb[Bb]andit [OR]
RewriteCond %{HTTP_USER_AGENT} ^webcollage [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebEMailExtrac.* [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebHook [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebMiner [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebMirror [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [OR]
RewriteCond %{HTTP_USER_AGENT} ^Webster [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
RewriteCond %{HTTP_USER_AGENT} WebWhacker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^Whacker [OR]
RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [OR]
RewriteCond %{HTTP_USER_AGENT} ^x-Tractor [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xenu [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus.*Webster [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule ^.* - [F,L]
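
The list above spends one condition per agent, which is where most of its length comes from. One way to shorten such a list is to fold several names into a single condition with regex alternation. The fragment below is only a sketch of that idea using a handful of agents taken from the list; it is not a complete replacement and assumes the same root htaccess placement.

```apache
RewriteEngine On

# Agents that must appear at the start of the user-agent string
RewriteCond %{HTTP_USER_AGENT} ^(Anarchie|ASPSeek|BackWeb|BlackWidow|WebZIP|Wget) [OR]

# Agents matched anywhere in the string, ignoring case
RewriteCond %{HTTP_USER_AGENT} (email|FrontPage|httrack) [NC]

# Deny any request that matched one of the conditions above
RewriteRule .* - [F]
```

Grouping anchored and unanchored patterns separately keeps the matching behavior of each entry the same as in the one-per-line form.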

Footnotes

  • 1 Note: although this blacklist is highly effective at eliminating unwanted scum, its immense length requires extra processing and may affect the performance of your server. In our experience, employing this list (along with several other htaccess directives) for over two years has resulted in zero noticeable performance issues. Nonetheless, this may not be an ideal solution for sites with extreme levels of visitor traffic.
  • 2 To begin building your own customized blacklist, you may want to check out the excellent list offered at joemaller.com. Thanks, Joe!
  • 3 Update (October 14, 2007): To reduce confusion and consolidate htaccess rules, the last two lines have been removed from the blacklist. These two lines are not required for the blacklist to work as intended:
    RewriteCond %{HTTP_REFERER} ^http://www.iaea.org$
    RewriteRule !^http://[^/.]\.perishablepress.com.* - [F,L]

67 Responses

  1. Phil says:

    Probably just being dim, but I don’t get the final two lines (where the condition is that the referrer is http://www.iaea.org). Is there some spambot that pretends to be an incoming link from the IAEA?

  2. Perishable says:

    Yes, apparently there is a nasty bot from somewhere in Southeast Asia that uses iaea.org as the referrer. Not sure if it is still crawling the web — you may be safe removing it from the list..

  3. Phil says:

    OK, thanks.

    I ran into a problem with the blacklist provided, however, and think there might be a typo.

    The following line causes an internal error on my server. If I add a space after the DISCo slash, then it works fine again:
    RewriteCond %{HTTP_USER_AGENT} ^DISCo\Pump [OR]

    Thanks for the great blacklist. I’m now using it in place of the less comprehensive version I obtained from here:
    http://www.javascriptkit.com/howto/htaccess13.shtml

    Phil.

  4. Perishable says:

    Phil,

    Yes, good catch. The DISCo condition should definitely have a space before the Pump:

    RewriteCond %{HTTP_USER_AGENT} ^DISCo\ Pump [OR]

    I have corrected the blacklist so others should not experience any problems.
    Thank you for your help!

    Regards,
    Jeff

  5. WillMacc says:

    Excellent blacklist for the htaccess file!
    It even had a few on there I haven’t seen before! :)
    I rant on this subject on my blog all the time:
    http://www.a-daily-rant.com/

  6. Perishable says:

    WillMacc,

    We are honored by your presence! Your work at http://www.a-daily-rant.com/ is excellent and much appreciated. I highly recommend your site for anyone serious about winning the war against spam, scrapers, and other online scum!

    Many thanks!

  7. Greg says:

    Thanks so much. Without any doubt, the best list ever. Your site is fantastic!!
    I’m wondering if you could provide the same list but in a “compressed version”?
    Like this, for example: RewriteCond %{HTTP_USER_AGENT} ADSARobot|Anarchie|ASPSeek|Atomz|BackWeb|Bandit|…and so much more… with the [OR] and/or the [NC,OR] at the ends of lines.
    Thank you again!

  8. Perishable says:

    Greg,

    By all means, that is a great idea. I will rewrite the blacklist and post a complete, compressed version featuring even more agents within the next week or so. Stay tuned..

  9. Robs says:

    Thanks - just a heads up that this line causes the whole list to fail on my server (it ignores everything and lets every bot through). Comment it out and there is no problem

    RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot@yahoo.com [OR]

  10. Perishable says:

    Robs,

    Thanks for the heads up. May I ask what type of server/software you are using?

    Thanks,
    Jeff

  11. Perishable says:

    Greg,
    Thanks again for the suggestion to release a compressed version of the ultimate htaccess blacklist. I finally managed to find the time to rewrite the list, and decided to add another fifty or so agents. Even with the new entries, the compressed blacklist is almost half the size of the original. Here is a link to the new article.
    Cheers,
    Jeff

  12. Greg says:

    Well done!

    You’re welcome :-)

  13. Sanstenarios says:

    Hi,

    Glad to find up-to-date info about referer spam. If I get the point, referer spam is created using forged referer info in the HTTP request, but can’t the user agent be forged also?

  14. Perishable says:

    Sanstenarios,
    Thanks for the comments. Unfortunately, any blacklist will stop only those agents that admit to being on the list. Of course, forging identifying data such as referrer info and user agent is indeed possible, rendering virtually all such lists at least partially ineffective. Nonetheless, extensive blacklists such as the one provided in this article remain quite effective in denying access to spam whores and other scumbags. So, until that perfect solution manifests, we continue to employ any and all tools at our disposal.

  15. Sanstenarios says:

    I set up the compressed list and am going to check my logs ;)

    Thanx

  16. khalifa says:

    great list thnx

  17. Lisa says:

    Wow! This is one big list :) By default, my .htaccess file already has the lines “RewriteEngine on” and “RewriteBase /”. I don’t need to repeat them, right?

    I’m a little confused with how .htaccess works.

  18. Perishable says:

    Yes, Lisa, that is correct — you only need to specify each of those directives once per directory. If your htaccess file is in the root of your site, then you only need to declare them in that file, even if you place additional rewrite rules in subdirectories further downstream. The RewriteEngine on is simply telling the Apache server software to enable the rewrite module so that it can process rewrite rules, while the RewriteBase / declaration explicitly sets the base URL for per-directory rewrites.

  19. Proximuz says:

    Hi.

    Thanks for your list :) Is your list up to date?

  20. Perishable says:

    Hi Proximuz,

    I last updated this list on October 15, 2007. Blacklists such as this are difficult to keep updated because the bad bots are constantly changing — there are new bots popping up every day, and old ones that simply disappear. Currently, I am constructing a fresh blacklist that I will post later this year.. until then, you may want to check out the Ultimate Blacklist 2 — last updated on November 5th, 2007, and also compressed for easier handling :)

  21. Proximuz says:

    Hi Perishable :)

    Thanks for your answers:)
    I’m gonna take a look at your link.

    Thanks

  22. Perishable says:

    My pleasure, Proximuz :)
    Best of luck!! ;)

  23. Rasheed says:

    Hello Jeff,

    I suffer a lot from site scrapers.

    I have this list on my site.

    For test purposes I tried to scrape my site using Offline Explorer, but it was not blocked.

    I also tried the same thing with your site (just for 30 sec) and Offline Explorer was not blocked there either.

    What should I do to block it?

    How can I determine which software the scraper is using? Does the raw log file show that?

    Thanks,

    Rasheed.

  24. Perishable says:

    Hi Rasheed,

    I feel your pain!
    Perhaps this will work:

    # block Offline Explorer
    SetEnvIfNoCase User-Agent "Offline Explorer" keep_out
    <Limit GET POST>
       order allow,deny
       allow from all
       deny from env=keep_out
    </Limit>

    It takes a different approach by using SetEnvIfNoCase, but I am not certain that “Offline Explorer” is the actual user agent for that client. If it doesn’t work, you may also want to try alternate names for the user agent.

    I hope it helps!
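
    On Apache 2.4 and later, the Order/Allow/Deny directives used above are available only through the compatibility module mod_access_compat. A rough equivalent in the newer Require syntax might look like the sketch below. It still assumes that “Offline Explorer” really appears in the client's user-agent string, and unlike the snippet above it applies to every request method rather than just GET and POST.

    ```apache
    # block Offline Explorer (Apache 2.4 authorization syntax)
    SetEnvIfNoCase User-Agent "Offline Explorer" keep_out
    <RequireAll>
        # allow everyone by default...
        Require all granted
        # ...except requests flagged by SetEnvIfNoCase above
        Require not env keep_out
    </RequireAll>
    ```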

  25. Rasheed says:

    Hi Jeff,

    I found they changed the agent name to IE.

    If I block it, Internet Explorer will also be blocked (403).

    Frustrating !

  26. Perishable says:

    Ouch — That sucks! I guess I can remove that line from the blacklist now.. ;)

  27. Jeff Mez says:

    If you have RSS feeds, something in this list blocks Google and Yahoo from being able to read your RSS XML. Feedburner was still able to update. I put this on my site on Feb-16 and just today checked my feeds on My-Google and Yahoo. Both stopped updating on Feb-16. I commented out just this blacklist from my .htaccess file, waited about 30 minutes and sure enough, Google and Yahoo both updated to today. :\

  28. Perishable says:

    Thanks for the input, Jeff. I will be looking into this. In the meantime, you may want to check out a lighter, “friendlier” type of blacklist. :)

  29. Peter says:

    Hi
    Does this even have to go into a .htaccess file? Can’t it be used more globally by putting it in an Apache config file?

  30. Perishable says:

    Yes, the blacklist may definitely be placed directly into Apache’s httpd.conf file.

  31. Peter says:

    Thanks for the follow up. With several domains it’s much easier to maintain doing this globally.

    I log all activity to a database and the number of 403s I’m getting between this and the 2G Blacklist is impressive.

    :)

  32. Perishable says:

    Absolutely! I love to watch the idiot bots bounce off the walls ;)

  33. HR Blog says:

    Thanks for this list and the updated list. I love watching all the denials in the logs.. Is that creepy?

  34. TechJammer says:

    I have noticed that comment spammers are bypassing this security step by simply forging my site’s own referer ID. Has anyone discovered a way to detect a forged referer ID?

  35. Mark says:

    My ISP just upgraded to PHP5, and I discovered that an “old” version of this list was causing 500 server errors. I replaced it with the new list, and it’s smiles all around again. I’m just leaving this note as a thank you and as a “heads up” to anyone else running into a PHP5 panic attack.
    Thanks again !!!

  36. Json says:

    Hi Mark

    I get the same with both lists.
    Which list did you use?

    Cheers

  37. Json says:

    Fixed, it was a syntax issue!

    “Order allow,deny” not “order allow,deny” Case sensitive.

    Cheers

  38. Mark says:

    Hello !

    I can’t remember now, but it seems you located the error already anyway,
    so it’s all good now, I suppose :)

    Thanks again,

    Mark

  39. Jeff Starr says:

    @Json: It might be because I just woke up, but I am not seeing where there are any instances of order allow,deny in the blacklist.. what am I missing here?

  40. Json says:

    @Jeff Starr: Sorry I didn’t post any code. My 500 error was from the wrong case in “order”; it should be “Order”:

    Order allow,deny
    allow from all
    deny from eudora.com
    deny from bravenet.com
    deny from tripod.com
    deny from lethadinan.pib.ir
    deny from xanga.com
    deny from iblogme.com

  41. Jeff Starr says:

    @Json: Okay thanks, now I understand. But keep in mind that by posting that comment on this thread, you insinuate that the faulty code was obtained from, or otherwise related to, this article. As far as I can tell, no such directives are provided on either this post or in the related (compressed version) blacklist. In any case, I am glad that you have resolved the issue with your code. :)

  42. Json says:

    Cool will do for next time. I still can’t figure out how to test it though.
    I have searched for a .htaccess testing system but couldn’t find one.

    Cheers

  43. John says:

    Silly question: couldn’t the spammers just set a user-agent of Mozilla? Or are you assuming they are not that smart?

  44. Jeff Starr says:

    @John: Not silly at all, really. As with any group of people, there are those with intelligence and those lacking in it. In my experience, there are many spammers who declare mozilla and other common user-agents to bypass filters, but the vast majority do not. Unfortunately, user-agent blacklists are based on assumptions, but they continue to prove useful nonetheless.

  45. John says:

    Hi Jeff,

    Good deal, makes sense.

    Thanks!
    -John

  46. Milan says:

    This seems very useful, thanks.

    Unfortunately, my hosting provider (GoDaddy) doesn’t provide detailed logs, so it is hard to know how much difference this is making.

    Are there any disadvantages to having a long .htaccess file, in terms of site performance?

  47. Jeff Starr says:

    @Milan: I have not tested this list of directives specifically (in terms of performance), but have seen much bigger lists (and htaccess files) in play that don’t seem to have much of a negative impact on performance. But then again, I’m not going to sit here and tell you that it doesn’t have an effect: the server has to process all of those matches for every valid page request, so it is probably not advisable for high-volume traffic sites and/or slow servers. One thing you can do to improve performance in general, if you have access to the main server configuration, is to move your rules there and add the following line to the site’s <Directory> block (it is not accepted inside an htaccess file, and it turns off htaccess processing for that directory):

    AllowOverride None

    For more info on this method, see my article Stupid htaccess Tricks.
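
    Apache accepts AllowOverride only inside a <Directory> section of the server configuration, and setting it to None stops htaccess files under that directory from being read at all, so any blacklist rules need to live in the server configuration too. A minimal sketch of that arrangement follows; the document root /var/www/example is a hypothetical placeholder, and only two agents from the list are shown.

    ```apache
    # httpd.conf or a virtual-host file (the path is a placeholder)
    <Directory "/var/www/example">
        # Stop Apache from looking for htaccess files under this directory
        AllowOverride None

        # Per-directory rewrites need FollowSymLinks or SymLinksIfOwnerMatch
        Options +FollowSymLinks

        RewriteEngine On
        RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
        RewriteCond %{HTTP_USER_AGENT} ^WebZIP
        RewriteRule .* - [F]
    </Directory>
    ```

    Rules placed here are parsed once when Apache starts or reloads, so changes take effect only after a restart or graceful reload.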

  48. Json says:

    I am interested to know whether there is a reason your Ultimate htaccess Blacklist won’t work on certain hosts when other basic .htaccess rules do.
    I can add the redirect-to-www rule, yet I can’t use any version of your Ultimate htaccess Blacklists.

    What do I need to look for to make one of the Ultimate htaccess Blacklists work?

    Regards.

  49. Jeff Starr says:

    Hi Json, here is the easiest way I have found for troubleshooting problematic code:

    http://perishablepress.com/press/2009/01/11/the-halving-method-of-identifying-problematic-code/

    Basically just crop out half of it, see if it works, then remove another chunk, and so on until you isolate the source of the issue.

  50. Json says:

    @Jeff Starr
    Good point, thanks!

  51. Jeff DiLette says:

    To address the speed issue, put the list into your httpd.conf so it loads at Apache restart instead of on every page load. I apply it globally to all sites on my cPanel server:

    #Statements here

    I added some bandwidth killers to your list:

    RewriteEngine on
    RewriteBase /
    RewriteCond %{HTTP_USER_AGENT} almaden [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Anarchie [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ASPSeek [OR]
    RewriteCond %{HTTP_USER_AGENT} ^attach [OR]
    RewriteCond %{HTTP_USER_AGENT} ^autoemailspider [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BackWeb [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Bandit [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BatchFTP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot@yahoo.com [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Buddy [OR]
    RewriteCond %{HTTP_USER_AGENT} ^bumblebee [OR]
    RewriteCond %{HTTP_USER_AGENT} ^CherryPicker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
    RewriteCond %{HTTP_USER_AGENT} ^CICC [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Collector [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Copier [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Crescent [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Custo [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Accelerator [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Alligator [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ASPSeek [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BackStreet [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Charon [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DataFountains [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Sakura [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DDD [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BilderSauge [OR]
    RewriteCond %{HTTP_USER_AGENT} ^dlman [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Download\ Druid [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Download\ Express [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Download\ Ninja [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Download\ Master [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Download-Tip [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Download\ Wonder [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Download.exe [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DownloadDirect [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Extreme\ Picture\ Finder [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EyeCatcher [OR]
    RewriteCond %{HTTP_USER_AGENT} ^FDM [OR]
    RewriteCond %{HTTP_USER_AGENT} ^libfetch [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Fielhound [OR]
    RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
    RewriteCond %{HTTP_USER_AGENT} ^download-link [OR]
    RewriteCond %{HTTP_USER_AGENT} ^FreshDownload [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Atomic_Email_Hunter [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BPImageWalker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^CoBITSProbe [OR]
    RewriteCond %{HTTP_USER_AGENT} ^CydralSpider [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Doubanbot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Exabot-Images [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DA\ 1 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DA\ 2 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DA\ 3 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DA\ 4 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DA\ 5 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DA\ 6 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DA\ 7 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DA\ 8 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DA\ 9 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DA\ 10 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DA\ 11 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DA\ 12 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DAP\ 1 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DAP\ 2 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DAP\ 3 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DAP\ 4 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DAP\ 5 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DAP\ 6 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DAP\ 7 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DAP\ 8 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DAP\ 9 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DAP\ 10 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DAP\ 11 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DAP\ 12 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DIIbot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DISCo\ Pump [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Download\ Wonder [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Downloader [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Drip [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DSurf15a [OR]
    RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EasyDL/2.99 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
    RewriteCond %{HTTP_USER_AGENT} email [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^EmailCollector [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
    RewriteCond %{HTTP_USER_AGENT} ^FileHound [OR]
    RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
    RewriteCond %{HTTP_USER_AGENT} FrontPage [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
    RewriteCond %{HTTP_USER_AGENT} ^GetSmart [OR]
    RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [OR]
    RewriteCond %{HTTP_USER_AGENT} ^gigabaz [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Go\!Zilla [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
    RewriteCond %{HTTP_USER_AGENT} ^gotit [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Grabber [OR]
    RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
    RewriteCond %{HTTP_USER_AGENT} ^grub-client [OR]
    RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
    RewriteCond %{HTTP_USER_AGENT} ^HTTrack [OR]
    RewriteCond %{HTTP_USER_AGENT} ^httpdown [OR]
    RewriteCond %{HTTP_USER_AGENT} .*httrack.* [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^ia_archiver [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Indy*Library [OR]
    RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
    RewriteCond %{HTTP_USER_AGENT} ^InternetLinkagent [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
    RewriteCond %{HTTP_USER_AGENT} ^InternetSeer.com [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Iria [OR]
    RewriteCond %{HTTP_USER_AGENT} ^JBH.*agent [OR]
    RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
    RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR]
    RewriteCond %{HTTP_USER_AGENT} ^JustView [OR]
    RewriteCond %{HTTP_USER_AGENT} ^larbin [OR]
    RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^LexiBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^lftp [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Link.*Sleuth [OR]
    RewriteCond %{HTTP_USER_AGENT} ^likse [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Link [OR]
    RewriteCond %{HTTP_USER_AGENT} ^LinkWalker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mag-Net [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Magnet [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Memo [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Microsoft.URL [OR]
    RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mirror [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*Indy [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*NEWT [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*MSIECrawler [OR]
    RewriteCond %{HTTP_USER_AGENT} ^MS\ FrontPage [OR]
    RewriteCond %{HTTP_USER_AGENT} ^MSFrontPage [OR]
    RewriteCond %{HTTP_USER_AGENT} ^MSIECrawler [OR]
    RewriteCond %{HTTP_USER_AGENT} ^MSProxy [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetMechanic [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NICErsPRO [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Ninja [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Openfind [OR]
    RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
    RewriteCond %{HTTP_USER_AGENT} ^pavuk [OR]
    RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Ping [OR]
    RewriteCond %{HTTP_USER_AGENT} ^PingALink [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Pockey [OR]
    RewriteCond %{HTTP_USER_AGENT} ^psbot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Pump [OR]
    RewriteCond %{HTTP_USER_AGENT} ^QRVA [OR]
    RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Reaper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Recorder [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Scooter [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Seeker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Siphon [OR]
    RewriteCond %{HTTP_USER_AGENT} ^sitecheck\.internetseer\.com [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SlySearch [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Snake [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SpaceBison [OR]
    RewriteCond %{HTTP_USER_AGENT} ^sproose [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Stripper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Sucker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Szukacz [OR]
    RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR]
    RewriteCond %{HTTP_USER_AGENT} ^URLSpiderPro [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Vacuum [OR]
    RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
    RewriteCond %{HTTP_USER_AGENT} ^[Ww]eb[Bb]andit [OR]
    RewriteCond %{HTTP_USER_AGENT} ^webcollage [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Downloader [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebEMailExtrac.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebHook [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebMiner [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebMirror [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Website [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Webster [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
    RewriteCond %{HTTP_USER_AGENT} WebWhacker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Whacker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [OR]
    RewriteCond %{HTTP_USER_AGENT} ^x-Tractor [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Xenu [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Zeus.*Webster [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Zeus
    RewriteRule ^.* - [F,L]

  52. You could make it more efficient by merging the regexes:

    RewriteCond %{HTTP_USER_AGENT} ^(a(narchie|lmaden|spseek|ttach|utoemailspider)|b(a(ckweb|ndit|tchftp)|lackwidow|ot\ mailto:craftbot@yahoo\.com|u(ddy|mblebee))|c(h(errypicker|inaclaw)|icc|o(llector|opier)|rescent|usto)|(d.....)) [NC]

    I will write a script to compact them all sometime.
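
    As a rough illustration of the merging idea, here is a sketch that folds a few of the single-agent conditions from the list above into grouped alternations. The grouping is only an assumption about how a compacted list might look, not the script mentioned in this comment. The patterns stay case-sensitive because the original conditions carry no [NC] flag.

    ```apache
    # Sketch only: a handful of entries from the list above, merged by shared prefix.
    RewriteEngine On
    # Replaces ^EmailCollector, ^EmailSiphon and ^EmailWolf
    RewriteCond %{HTTP_USER_AGENT} ^Email(Collector|Siphon|Wolf) [OR]
    # Replaces ^GetRight and ^GetSmart
    RewriteCond %{HTTP_USER_AGENT} ^Get(Right|Smart) [OR]
    # Replaces ^WebCopier, ^WebFetch, ^WebLeacher, ^WebReaper, ^WebStripper and ^WebZIP
    RewriteCond %{HTTP_USER_AGENT} ^Web(Copier|Fetch|Leacher|Reaper|Stripper|ZIP)
    # Refuse matching requests with 403 Forbidden
    RewriteRule ^ - [F]
    ```

    Adding [NC] to a merged condition, as in the example above, would also match differently cased variants, which is a wider net than the original one-per-line rules cast.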

  53. [ Gravatar Icon ] Jeff Starr says:

    Sounds awesome, Benjamin – let me know what you come up with :)

  54. [ Gravatar Icon ] Francois says:

    Hi Jeff

    Thanks very much for your posts and especially this one….

    I was wondering: instead of blocking those people (or sending ’em away to a website like poison spam), wouldn’t it be a good idea to redirect them to one of my pages (the lightest one, to avoid bandwidth consumption) indicating that I’ve banned them, and giving them the opportunity to leave me a message in case they are legit users? Cuz the trouble is: if you ban someone erroneously, there’s no way for him/her to let you know.
    I think I know how to do that, but do you think it would be a good idea?

    Thanks in advance for your insight.

  55. [ Gravatar Icon ] Jeff Starr says:

    Hi Francois,

    That’s actually a good idea if you don’t mind the extra bandwidth, server resources, etc. I may look into adding this option in the next update.

    Thanks for the idea!
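
    One way this could look, as a minimal sketch: the page name /banned.html and the three sample agents are placeholders, not part of the blacklist above. The first condition exempts the notice page itself, so a blocked agent is not redirected in a loop.

    ```apache
    # Sketch only: send blacklisted agents to a lightweight notice page
    # instead of returning 403. /banned.html is a hypothetical page name.
    RewriteEngine On
    # Do not redirect requests for the notice page itself
    RewriteCond %{REQUEST_URI} !^/banned\.html$
    # Sample agents for illustration; substitute the full blacklist conditions
    RewriteCond %{HTTP_USER_AGENT} ^(EmailSiphon|WebZIP|Wget)
    RewriteRule ^ /banned.html [R=302,L]
    ```

    A temporary (302) redirect seems the safer choice here, since a visitor who was banned by mistake should not stay pinned to the notice page once the rule is removed.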

  56. [ Gravatar Icon ] Michael says:

    A pesky website has a hyperlink pointing to my site.

    I politely asked them to remove that hyperlink, but they refuse to respect the wishes of others.

    Is there something I could do? I tried what I have read in other posts on your site, such as blacklisting by IP address. Unfortunately, for whatever reason, when I test their hyperlink it still comes through to my site instead of redirecting visitors back to their own site.

    Please advise.

  57. [ Gravatar Icon ] Sirius says:

    Hi Michael,

    Try this:

    RewriteCond %{HTTP_REFERER} ^.*(the_bad_site_goes_here\.tld).* [NC]
    RewriteRule ^(.*)$ http://%{REMOTE_ADDR}/ [L]

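    Just replace “the_bad_site_goes_here\.tld” with the offending site’s domain and its actual extension (TLD).
    Each request that arrives with his site as the referrer is then redirected to the requesting client’s own IP address (REMOTE_ADDR) instead of reaching your pages.
    In other words, whenever he follows his own link, he hits himself.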

  58. [ Gravatar Icon ] Sirius says:

    @ Jeff and François

    The feature you’re thinking about is almost what the HoneyPot system does:
    a conditional landing page with a notification for the banned visitors.
    I already tried that, but it is not really accurate; too many legitimate users were blocked, so I stopped using it.

    For bad user agents, I prefer to look at my logs and block only what is needed.
    The lists that we can find on the web are often obsolete. Fine-tuning your own list is the best way.
    Look at your logs first…
    IMUO

  59. [ Gravatar Icon ] Michael says:

    Hi Sirius,

    I believe I have a conflicting statement in my htaccess file because I feel confident that your recommendation is very grounded.

    Could you please check your email inbox for an email from infoAthewellbeinghalo.com containing the htaccess statements that may be preventing your suggestion from working out for me?

    Michael

  60. [ Gravatar Icon ] Sirius says:

    ok Michael,

    but please try to re-send it, just in case, because I cleaned out my junk-mail box just before seeing your post.
    thx

  61. [ Gravatar Icon ] Michael says:

    Hi Sirius,

    I just forwarded that email from infoAthewellbeinghalo.com

    Thank you for that. You should sign up for ageLOC Vitality, thewellbeinghalo.com/catalog.html for being so nice.

    Michael

  62. This web page won’t display correctly on my Apple iPhone. You may want to try to fix that.

  63. [ Gravatar Icon ] ping says:

    Any chance of, or news on, a compact list? This is an excellent list, but I would like more agents and a lightweight script for it.

  64. [ Gravatar Icon ] Mark says:

    I’ve put this code into my apache2.conf file, but when I include the “RewriteBase /” line I get the following error

    Syntax error on line 269 of /etc/apache2/apache2.conf:
    RewriteBase: only valid in per-directory config files
    …fail!

    Do I need to have that line? When I removed it I did not get the error anymore

    Thanks

  65. [ Gravatar Icon ] Jeff Starr says:

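    It’s not required. RewriteBase is only valid in per-directory contexts (.htaccess files or <Directory> sections), which is why it fails at the top level of apache2.conf. Just comment it out with a # sign and see if it works. Here is the official info on RewriteBase: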

    http://httpd.apache.org/docs/current/mod/mod_rewrite.html#rewritebase
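
    If the rules need to live in the main server configuration and RewriteBase is wanted anyway, a sketch along these lines might work. It wraps the rules in a <Directory> section, which is a per-directory context. The path /var/www/html and the sample agents are assumptions for illustration only.

    ```apache
    # Sketch only: per-directory context inside apache2.conf.
    # "/var/www/html" is an assumed document root; adjust to your own layout.
    <Directory "/var/www/html">
        # mod_rewrite in per-directory context requires symlink following to be enabled
        Options +FollowSymLinks
        RewriteEngine On
        # Accepted here because <Directory> is a per-directory context
        RewriteBase /
        # Sample agents; substitute the full blacklist conditions
        RewriteCond %{HTTP_USER_AGENT} ^(EmailSiphon|WebZIP) [OR]
        RewriteCond %{HTTP_USER_AGENT} ^Wget
        RewriteRule ^ - [F]
    </Directory>
    ```

    Since the rule's substitution is just "-", RewriteBase has no effect on it here, so it should be equally fine to leave that line out.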
