Ultimate htaccess Blacklist

Published Thursday, June 28, 2007 @ 10:46 am • 42 Responses

[ Keywords: htaccess, rewrite, blacklist, block, deny, spam, spammers, scrapers, rippers ]

[ Image: Solar Eclipse ]

For those of us running Apache, htaccess rewrite rules provide an excellent way to block spammers, scrapers, and other scumbags easily and effectively. While there are many htaccess tricks involving blocking domains, preventing access, and redirecting traffic, Apache’s mod_rewrite module enables us to target bad agents by testing the user-agent string against a predefined blacklist of unwanted visitors. Any matches are immediately and quietly denied access.
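
Before the full list, here is a minimal sketch of the pattern it relies on. The agent names "BadBot" and "EvilScraper" are placeholders invented for illustration, not entries from the blacklist, and the `<IfModule>` wrapper is an optional safeguard that the original list does not use.

```apache
# Minimal sketch of a user-agent blacklist (placeholder agent names)
<IfModule mod_rewrite.c>
  RewriteEngine On
  # ^ anchors the match to the start of the user-agent string
  RewriteCond %{HTTP_USER_AGENT} ^BadBot [OR]
  # No anchor: match anywhere in the string; NC ignores case
  RewriteCond %{HTTP_USER_AGENT} EvilScraper [NC]
  # Leave the URL untouched (-) and answer 403 Forbidden (F)
  RewriteRule .* - [F]
</IfModule>
```

Every condition except the last carries the [OR] flag, so a match on any single line is enough to trigger the rule.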

There are many ways to obtain an effective htaccess blacklist. Several excellent forums around the web offer a wealth of htaccess advice and are highly recommended. After copying and pasting your favorite forum blacklist examples into your domain’s root htaccess file, you will want to keep developing the list by tracking bandwidth thieves, comment spammers, and site scrapers and adding them to it. Or, you may wish to skip the tedious grunt work and simply grab a copy of the Ultimate htaccess Blacklist!

The Ultimate htaccess Blacklist began as a short list of only the most heinous offenders. Blocking scum was such an enjoyable activity that we soon added every nasty agent we could find. The result has been a very low-stress, spam-free site with virtually zero stolen bandwidth. The list is fairly comprehensive and attempts to blacklist as many site rippers, grabbers, spammers, and bad bots as possible. While no blacklist could ever block them all (nor would you want to try with this method) [1], an elaborate htaccess blacklist can do wonders to improve overall performance, decrease site maintenance, and reduce server expense. Overall, we consider this blacklist a great foundation on which to build and customize your own ultimate htaccess blacklist! [2]

So without further ado, here is our version of the ultimate htaccess blacklist, as promised. Simply copy and paste the following code into the root htaccess file of your site to enjoy a serious reduction in wasted bandwidth, stolen resources, and comment spam. Don’t forget to back up your data and test everything first. After that, you’re good to go!

The Ultimate htaccess Blacklist

# Ultimate htaccess Blacklist from Perishable Press
# Deny domain access to spammers and other scumbags
RewriteEngine on
RewriteBase /
RewriteCond %{HTTP_USER_AGENT} almaden [OR]
RewriteCond %{HTTP_USER_AGENT} ^Anarchie [OR]
RewriteCond %{HTTP_USER_AGENT} ^ASPSeek [OR]
RewriteCond %{HTTP_USER_AGENT} ^attach [OR]
RewriteCond %{HTTP_USER_AGENT} ^autoemailspider [OR]
RewriteCond %{HTTP_USER_AGENT} ^BackWeb [OR]
RewriteCond %{HTTP_USER_AGENT} ^Bandit [OR]
RewriteCond %{HTTP_USER_AGENT} ^BatchFTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot@yahoo.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^Buddy [OR]
RewriteCond %{HTTP_USER_AGENT} ^bumblebee [OR]
RewriteCond %{HTTP_USER_AGENT} ^CherryPicker [OR]
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
RewriteCond %{HTTP_USER_AGENT} ^CICC [OR]
RewriteCond %{HTTP_USER_AGENT} ^Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^Copier [OR]
RewriteCond %{HTTP_USER_AGENT} ^Crescent [OR]
RewriteCond %{HTTP_USER_AGENT} ^Custo [OR]
RewriteCond %{HTTP_USER_AGENT} ^DA [OR]
RewriteCond %{HTTP_USER_AGENT} ^DIIbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo\ Pump [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Wonder [OR]
RewriteCond %{HTTP_USER_AGENT} ^Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^Drip [OR]
RewriteCond %{HTTP_USER_AGENT} ^DSurf15a [OR]
RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
RewriteCond %{HTTP_USER_AGENT} ^EasyDL/2\.99 [OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} email [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailCollector [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
RewriteCond %{HTTP_USER_AGENT} ^FileHound [OR]
RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
RewriteCond %{HTTP_USER_AGENT} FrontPage [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetSmart [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [OR]
RewriteCond %{HTTP_USER_AGENT} ^gigabaz [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
RewriteCond %{HTTP_USER_AGENT} ^gotit [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
RewriteCond %{HTTP_USER_AGENT} ^grub-client [OR]
RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
RewriteCond %{HTTP_USER_AGENT} ^HTTrack [OR]
RewriteCond %{HTTP_USER_AGENT} ^httpdown [OR]
RewriteCond %{HTTP_USER_AGENT} httrack [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^ia_archiver [OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^Indy.*Library [OR]
RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
RewriteCond %{HTTP_USER_AGENT} ^InternetLinkagent [OR]
RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^InternetSeer\.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^Iria [OR]
RewriteCond %{HTTP_USER_AGENT} ^JBH.*agent [OR]
RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR]
RewriteCond %{HTTP_USER_AGENT} ^JustView [OR]
RewriteCond %{HTTP_USER_AGENT} ^larbin [OR]
RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^LexiBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^lftp [OR]
RewriteCond %{HTTP_USER_AGENT} ^Link.*Sleuth [OR]
RewriteCond %{HTTP_USER_AGENT} ^likse [OR]
RewriteCond %{HTTP_USER_AGENT} ^Link [OR]
RewriteCond %{HTTP_USER_AGENT} ^LinkWalker [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mag-Net [OR]
RewriteCond %{HTTP_USER_AGENT} ^Magnet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^Memo [OR]
RewriteCond %{HTTP_USER_AGENT} ^Microsoft.URL [OR]
RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mirror [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*Indy [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*NEWT [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*MSIECrawler [OR]
RewriteCond %{HTTP_USER_AGENT} ^MS\ FrontPage [OR]
RewriteCond %{HTTP_USER_AGENT} ^MSFrontPage [OR]
RewriteCond %{HTTP_USER_AGENT} ^MSIECrawler [OR]
RewriteCond %{HTTP_USER_AGENT} ^MSProxy [OR]
RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetMechanic [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^NICErsPRO [OR]
RewriteCond %{HTTP_USER_AGENT} ^Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR]
RewriteCond %{HTTP_USER_AGENT} ^Openfind [OR]
RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
RewriteCond %{HTTP_USER_AGENT} ^pavuk [OR]
RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
RewriteCond %{HTTP_USER_AGENT} ^Ping [OR]
RewriteCond %{HTTP_USER_AGENT} ^PingALink [OR]
RewriteCond %{HTTP_USER_AGENT} ^Pockey [OR]
RewriteCond %{HTTP_USER_AGENT} ^psbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^Pump [OR]
RewriteCond %{HTTP_USER_AGENT} ^QRVA [OR]
RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^Reaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Recorder [OR]
RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Scooter [OR]
RewriteCond %{HTTP_USER_AGENT} ^Seeker [OR]
RewriteCond %{HTTP_USER_AGENT} ^Siphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^sitecheck\.internetseer\.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
RewriteCond %{HTTP_USER_AGENT} ^SlySearch [OR]
RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^Snake [OR]
RewriteCond %{HTTP_USER_AGENT} ^SpaceBison [OR]
RewriteCond %{HTTP_USER_AGENT} ^sproose [OR]
RewriteCond %{HTTP_USER_AGENT} ^Stripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^Szukacz [OR]
RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR]
RewriteCond %{HTTP_USER_AGENT} ^URLSpiderPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^Vacuum [OR]
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
RewriteCond %{HTTP_USER_AGENT} ^[Ww]eb[Bb]andit [OR]
RewriteCond %{HTTP_USER_AGENT} ^webcollage [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebEMailExtrac.* [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebHook [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebMiner [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebMirror [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [OR]
RewriteCond %{HTTP_USER_AGENT} ^Webster [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
RewriteCond %{HTTP_USER_AGENT} WebWhacker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^Whacker [OR]
RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [OR]
RewriteCond %{HTTP_USER_AGENT} ^x-Tractor [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xenu [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus.*Webster [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule ^.* - [F,L]

Footnotes

  • 1 Note: although this blacklist is highly effective at eliminating unwanted scum, its immense length requires extra processing and may affect the performance of your server. In our experience, employing this list (along with several other htaccess directives) for over two years has resulted in zero noticeable performance issues. Nonetheless, this may not be an ideal solution for sites with extreme levels of visitor traffic. [ ^ ]
  • 2 To begin building your own customized blacklist, you may want to check out the excellent list offered at joemaller.com. Thanks, Joe! [ ^ ]
  • 3 Update (October 14, 2007): To reduce confusion and consolidate htaccess rules, the last two lines have been removed from the blacklist. These two lines are not required for the blacklist to work as intended:
    RewriteCond %{HTTP_REFERER} ^http://www.iaea.org$
    RewriteRule !^http://[^/.]\.perishablepress.com.* - [F,L]

Dialogue

42 Responses

1Phil

July 15, 2007 at 3:32 pm

Probably just being dim, but I don’t get the final two lines (where the condition is that the referrer is http://www.iaea.org). Is there some spambot that pretends to be an incoming link from the IAEA?

2Perishable

July 15, 2007 at 4:24 pm

Yes, apparently there is a nasty bot from somewhere in Southeast Asia that uses iaea.org as the referrer. Not sure if it is still crawling the web — you may be safe removing it from the list..

3Phil

July 16, 2007 at 2:01 pm

OK, thanks.

I ran into a problem with the blacklist provided, however, and think there might be a typo.

The following line causes an internal error on my server. If I add a space after the DISCo slash, then it works fine again:
RewriteCond %{HTTP_USER_AGENT} ^DISCo\Pump [OR]

Thanks for the great blacklist. I’m now using it in place of the less comprehensive version I obtained from here:
http://www.javascriptkit.com/howto/htaccess13.shtml

Phil.

4Perishable

July 16, 2007 at 2:31 pm

Phil,

Yes, good catch. The DISCo condition should definitely have a space before the Pump:

RewriteCond %{HTTP_USER_AGENT} ^DISCo\ Pump [OR]

I have corrected the blacklist so others should not experience any problems.
Thank you for your help!

Regards,
Jeff

5WillMacc

July 22, 2007 at 3:52 pm

Excellent blacklist for the htaccess file!
It even had a few on there I haven’t seen before! :)
I rant on this subject on my blog all the time:
http://www.a-daily-rant.com/

6Perishable

July 22, 2007 at 4:16 pm

WillMacc,

We are honored by your presence! Your work at http://www.a-daily-rant.com/ is excellent and much appreciated. I highly recommend your site for anyone serious about winning the war against spam, scrapers, and other online scum!

Many thanks!

7Greg

October 6, 2007 at 2:41 am

Thanks so much, without any doubt, the best list ever. Your site is fantastic!!
I’m wondering if you could provide the same list but in a “compressed version”?
Like this, for example: RewriteCond %{HTTP_USER_AGENT} ADSARobot|Anarchie|ASPSeek|Atomz|BackWeb|Bandit|…and so much more… with the [OR] and/or the [NC,OR] at the ends of lines.
Thank you again!

8Greg

October 6, 2007 at 2:46 am

You can see what I’m talking about here: http://www.toulouse-renaissance.net/c_outils/c_htaccess_compact.htm

;-)

9Perishable

October 6, 2007 at 8:21 pm

Greg,

By all means, that is a great idea. I will rewrite the blacklist and post a complete, compressed version featuring even more agents within the next week or so. Stay tuned..
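
A compressed condition of the kind Greg describes might look like the sketch below. It uses only a handful of agents taken from the list above and is not the compressed blacklist that was later published. Anchored and unanchored entries are kept in separate groups so that each keeps its original matching behavior.

```apache
# Sketch of a compressed blacklist using a few agents from the full list
RewriteEngine On
# Agents matched at the start of the user-agent string (case-sensitive)
RewriteCond %{HTTP_USER_AGENT} ^(Anarchie|ASPSeek|BackWeb|Bandit|BlackWidow|WebZIP|Wget|Zeus) [OR]
# Agents matched anywhere in the string, ignoring case
RewriteCond %{HTTP_USER_AGENT} (email|FrontPage|httrack|Indy\ Library) [NC]
RewriteRule .* - [F]
```

Grouping names with `|` cuts the number of RewriteCond lines while denying the same agents as the equivalent one-per-line entries.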

10Robs

October 15, 2007 at 12:09 am

Thanks - just a heads up that this line causes the whole list to fail on my server (it ignores everything and lets every bot through). Comment it out and there is no problem

RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot@yahoo.com [OR]

11Perishable

October 15, 2007 at 7:58 am

Robs,

Thanks for the heads up — may I ask what type of server/software are you using?

Thanks,
Jeff

12Perishable

October 15, 2007 at 12:26 pm

Greg,
Thanks again for the suggestion to release a compressed version of the ultimate htaccess blacklist. I finally managed to find the time to rewrite the list, and decided to add another fifty or so agents. Even with the new entries, the compressed blacklist is almost half the size of the original. Here is a link to the new article.
Cheers,
Jeff

13Greg

October 15, 2007 at 1:15 pm

Well done !

You’re welcome :-)

14Sanstenarios

October 25, 2007 at 6:19 pm

Hi,

Glad to find up-to-date info about referer spam. If I get the point, referer spam is created using forged referer info in the HTTP request, but can’t the user agent be forged too?

15Perishable

October 25, 2007 at 7:37 pm

Sanstenarios,
Thanks for the comments. Unfortunately, any blacklist will stop only those agents that admit to being on the list. Of course, forging identifying data such as referrer info and user agent is indeed possible, rendering virtually all such lists at least partially ineffective. Nonetheless, extensive blacklists such as the one provided in this article remain quite effective in denying access to spam whores and other scumbags. So, until that perfect solution manifests, we continue to employ any and all tools at our disposal.

16Sanstenarios

October 27, 2007 at 3:08 am

I set up the compressed list and am going to check my logs ;)

Thanx

17khalifa

October 27, 2007 at 8:19 pm

great list thnx

18Perishable

October 28, 2007 at 3:27 pm

My pleasure!

19Lisa

December 11, 2007 at 6:04 pm

Wow! This is one big list :) By default, my .htaccess file already has the lines “RewriteEngine on” and “RewriteBase /”. I don’t need to rewrite them, right?

I’m a little confused with how .htaccess works.

20Perishable

December 11, 2007 at 10:14 pm

Yes, Lisa, that is correct — you only need to specify each of those directives once per directory. If your htaccess file is in the root of your site, then you only need to declare them in that file, even if you place additional rewrite rules in subdirectories further downstream. The RewriteEngine on is simply telling the Apache server software to enable the rewrite module so that it can process rewrite rules, while the RewriteBase / declaration explicitly sets the base URL for per-directory rewrites.
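
For the single-file case Lisa asks about, a sketch might look like this. The two blacklist entries are taken from the list above; the redirect rule at the end and its file names are hypothetical, standing in for whatever rewrite rules the file already contains.

```apache
# Declared once at the top of the root .htaccess file
RewriteEngine on
RewriteBase /

# Blacklist conditions (abbreviated to two entries from the list above)
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule ^.* - [F,L]

# A later rule set in the same file needs no second RewriteEngine line.
# The file names here are placeholders.
RewriteRule ^old-page\.html$ new-page.html [R=301,L]
```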

21Proximuz

January 21, 2008 at 4:38 am

Hi.

Thanks for your list :) Is your list up to date?

22Perishable

January 21, 2008 at 8:18 am

Hi Proximuz,

I last updated this list on October 15, 2007. Blacklists such as this are difficult to keep updated because the bad bots are constantly changing — there are new bots popping up every day, and old ones that simply disappear. Currently, I am constructing a fresh blacklist that I will post later this year.. until then, you may want to check out the Ultimate Blacklist 2 — last updated on November 5th, 2007, and also compressed for easier handling :)

23Proximuz

January 21, 2008 at 10:56 pm

Hi Perishable :)

Thanks for your answers:)
I’m gonna take a look at your link.

Thanks

24Perishable

January 22, 2008 at 9:00 am

My pleasure, Proximuz :)
Best of luck!! ;)

25Rasheed

February 24, 2008 at 11:03 pm

Hello Jeff,

I suffer a lot from site scrapers.

I have this list on my site.

For testing purposes I tried to scrape my site using Offline Explorer, but it was not blocked.

I also tried the same thing with your site (just for 30 sec) and Offline Explorer was not blocked there either.

What should I do to block it?

How can I determine which software the scraper is using? Does the raw log file show that?

Thanks,

Rasheed.

26Perishable

February 26, 2008 at 6:07 pm

Hi Rasheed,

I feel your pain!
Perhaps this will work:

# block Offline Explorer
SetEnvIfNoCase User-Agent "Offline Explorer" keep_out
<Limit GET POST>
   order allow,deny
   allow from all
   deny from env=keep_out
</Limit>

It takes a different approach by using SetEnvIfNoCase, but I am not certain that “Offline Explorer” is the actual user agent for that client. If it doesn’t work, you may also want to try alternate names for the user agent.
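
On Apache 2.4 and later, the order/allow/deny directives are available only through the compatibility module mod_access_compat. A sketch of the same idea using the newer Require syntax follows, assuming Apache 2.4 with mod_setenvif and mod_authz_core loaded. As in the reply above, "Offline Explorer" is a guess at the user-agent string, not a confirmed value.

```apache
# Apache 2.4+ sketch: deny requests whose user agent sets keep_out
SetEnvIfNoCase User-Agent "Offline Explorer" keep_out
<RequireAll>
  # Allow everyone by default...
  Require all granted
  # ...except requests flagged by the SetEnvIfNoCase line above
  Require not env keep_out
</RequireAll>
```

Unlike the `<Limit GET POST>` version, this applies to every request method.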

I hope it helps!

27Rasheed

February 26, 2008 at 7:12 pm

Hi Jeff,

I found they changed the agent name to IE.

If I block it, Internet Explorer will also be blocked (403).

Frustrating !
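
For anyone checking which user-agent string a client actually sends, the access log is the place to look. The sketch below uses Apache’s standard "combined" log format, which records both the referrer and the user agent. It assumes access to the main server configuration, since these directives cannot be used in .htaccess files, and the log path is a placeholder.

```apache
# Server config (httpd.conf), not .htaccess
# The last two quoted fields record the Referer and User-Agent headers
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
CustomLog "logs/access_log" combined
```

Many shared hosts already log in this format, in which case the user agent is the final quoted field of each line in the raw log.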

28Perishable

February 27, 2008 at 10:35 am

Ouch — That sucks! I guess I can remove that line from the blacklist now.. ;)

29Jeff Mez

February 28, 2008 at 12:24 pm

If you have RSS feeds, something in this list blocks Google and Yahoo from being able to read your RSS XML. Feedburner was still able to update. I put this on my site on Feb-16 and just today checked my feeds on My-Google and Yahoo. Both stopped updating on Feb-16. I commented out just this blacklist from my .htaccess file, waited about 30 minutes and sure enough, Google and Yahoo both updated to today. :\

30Perishable

March 2, 2008 at 9:17 am

Thanks for the input, Jeff. I will be looking into this. In the meantime, you may want to check out a lighter, “friendlier” type of blacklist. :)
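
The thread does not identify which line blocked the feed readers, so the sketch below is only one possible workaround: exempt feed URLs from the blacklist altogether. It assumes the feed is served from URLs beginning with /feed, which is hypothetical and depends on your site. A condition without the [OR] flag, placed ahead of the [OR] chain, must match in addition to one of the chained conditions.

```apache
# Hypothetical: assumes feeds live under /feed
# This condition is ANDed with the OR chain that follows it
RewriteCond %{REQUEST_URI} !^/feed [NC]
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
RewriteCond %{HTTP_USER_AGENT} email [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule .* - [F]
```

The tradeoff is that blacklisted agents can also fetch the feed, so this suits sites where feed availability matters more than blocking feed scrapers.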

31Peter

March 3, 2008 at 9:43 am

Hi
Does this even have to go into a .htaccess file? Can’t it be used more globally by having it in an Apache config file?

32Perishable

March 3, 2008 at 9:47 am

Yes, the blacklist may definitely be placed directly into Apache’s httpd.conf file.
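
A sketch of one way to do this for several domains follows. The host names, document roots, and the file name conf/ua-blacklist.conf are all placeholders. The included file would hold the RewriteCond and RewriteRule lines from the list above, minus the RewriteBase line, because RewriteBase is valid only in `<Directory>` sections and .htaccess files. Rewrite rules set in the main server context are not inherited by virtual hosts by default, which is why each host includes the file itself.

```apache
# httpd.conf sketch: share one blacklist file across virtual hosts
<VirtualHost *:80>
  ServerName example.com
  DocumentRoot "/var/www/example.com"
  RewriteEngine On
  # Path is relative to ServerRoot
  Include conf/ua-blacklist.conf
</VirtualHost>

<VirtualHost *:80>
  ServerName example.org
  DocumentRoot "/var/www/example.org"
  RewriteEngine On
  Include conf/ua-blacklist.conf
</VirtualHost>
```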

33Peter

March 3, 2008 at 10:01 am

Thanks for the follow up. With several domains, it’s much easier to maintain this globally.

I log all activity to a database and the number of 403s I’m getting is impressive between this and the 2G Blacklist.

:)

34Perishable

March 3, 2008 at 11:10 am

Absolutely! I love to watch the idiot bots bounce off the walls ;)

35HR Blog

July 9, 2008 at 2:16 pm

Thanks for this list and the updated list. I love watching all the denials in the logs.. Is that creepy?

36TechJammer

September 12, 2008 at 6:46 am

I have noticed that comment spammers are bypassing this security step and just forging my site’s own referer ID. Has anyone discovered a way to detect a forged referer ID?

About Perishable Press

Perishable Press is the virtual playground of Jeff Starr — visionary, founder and lead developer of Monzilla Media, a small web and graphic design company in the lush desert oasis of Moses Lake, Washington. Perishable Press features articles and tutorials on many aspects of digital design..
