Block Spam by Denying Access to No-Referrer Requests

Published Monday, November 20, 2006 @ 11:28 am • 28 Responses

Credit for this trick goes to shoemoney.com. What we have here is an excellent method for preventing a great deal of blog spam. With a few strategic lines placed in your .htaccess file, you can prevent spambots from dropping spam bombs by denying access to all comment submissions that do not originate from your domain.

How does it work? Well, when a legitimate user (i.e., not a robot) decides to leave a comment on your blog, they have (hopefully) read the article in question and have therefore loaded your blog’s comment template (e.g., comments.php), which is almost certainly served from the same domain as the article itself (i.e., your domain).

So, after filling out the comment form via comments.php, the user clicks the "submit" button, which then initiates the PHP file/script that actually processes the comment for the world to see. For WordPress users, the comment processing file is wp-comments-post.php.

Therefore, the HTTP referrer for legitimate (user-initiated) comments will normally be your domain (or the domain in which the comments.php file is located). Automated spam robots typically target the comment-processing script directly, bypassing your comments.php form altogether. Such requests arrive with either no HTTP referrer or one that is not from your domain.

Thus, by blocking all requests for the comments-processing script (wp-comments-post.php) that are not sent directly from your domain (comments.php), you immediately eliminate a large portion of blog spam.

Sound good? Here is the script to add to your site’s .htaccess file:

# block comment spam by denying access to no-referrer requests
RewriteEngine On
RewriteCond %{REQUEST_METHOD} POST
RewriteCond %{REQUEST_URI} /wp-comments-post\.php
RewriteCond %{HTTP_REFERER} !perishablepress\.com [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^$
RewriteRule ^(.*)$ http://the-site-where-you-want-to-send-spammers.com/ [R=301,L]

Please note that you need to edit the following lines according to your specific setup:

/wp-comments-post\.php
This is the default comment-processing script for WordPress users. If you are not running WordPress, you will need to determine the corresponding file and enter its name here.
!perishablepress\.com
Change this value to that of your own domain, keeping the backslash before each dot so that it matches a literal period.
http://the-site-where-you-want-to-send-spammers.com/
Because spambots typically ignore redirects, this may not accomplish much. But go ahead and enter the URL of your least-favorite website anyway. Note that the redirect target is a plain URL rather than a regular expression, so it must not be wrapped in ^ and $ (Apache would treat them as literal characters). Another option here is to simply bounce the spambot back to where it came from by replacing the last line with this: RewriteRule ^(.*)$ http://%{REMOTE_ADDR}/ [R=301,L]
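
Because most spambots never follow the redirect, a leaner variation is to refuse the request outright with a 403 Forbidden response. The following is only a sketch under a few assumptions: example.com is a placeholder for your own domain, and the anchored referrer pattern is one possible way to tighten the match, not part of the original rule set.

```apache
# Sketch: refuse suspicious comment POSTs instead of redirecting them
<IfModule mod_rewrite.c>
RewriteEngine On
# Only look at form submissions to the comment-processing script
RewriteCond %{REQUEST_METHOD} POST
RewriteCond %{REQUEST_URI} /wp-comments-post\.php
# example.com is a placeholder: substitute your own domain
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com(/|$) [NC,OR]
# Also refuse requests that send no user agent at all
RewriteCond %{HTTP_USER_AGENT} ^$
# "-" means no substitution; F sends 403 Forbidden
RewriteRule .* - [F,L]
</IfModule>
```

As with the redirect version, test this by submitting a comment from your own site before relying on it, since any visitor whose browser or proxy withholds the referrer will be refused as well.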

And that is all there is to it! Bye bye spambots!


Dialogue


1. Michael

December 8, 2006 at 4:57 pm

WordPress Trackback Spam!!!
I have installed plugins that prevent comment spam, but they do not block trackback spam. I have been spammed via trackbacks by many MFA websites, most probably from the same network, yet they are not linking to me on their websites. May I know how they do it, and how I can stop it without disabling trackbacks?
Thanks, and I'm using WordPress.

2. Perishable

December 12, 2006 at 10:54 am

Hmmm… good question. I will look into it..

3. Lee

January 23, 2007 at 6:25 am

Shouldn’t the last line be changed to:

RewriteRule ^(.*)$ http://the-site-where-you-want-to-send-spammers.com/ [R=301,L]

I am using it as you have it and am getting 404 errors like this:

http://shamar.org/%sitegoto.com/$

4. Perishable

January 23, 2007 at 9:10 am

Lee,
If that works for you, great. Often, there are multiple ways of writing htaccess expressions. For example, here is the last line of the same htaccess code currently presented on the WordPress Codex:

RewriteRule (.*) ^http://%{REMOTE_ADDR}/$ [R=301,L]

Further, here is the corresponding line we are currently using at Perishable Press:

RewriteRule ^(.*)$ ^http://www.google.com/$ [R=301,L]

..which has been working fine for quite a while.

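Also, an absence of errors doesn’t necessarily translate into proper functionality. You should throw down with some tough log action in your main server or virtual-host configuration (Apache does not accept the RewriteLog directives in htaccess files):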

RewriteEngine On
RewriteLog /absolute/path/to/your/wwwroot/public_html/rewrite_log.txt
RewriteLogLevel 2

..to ensure that your syntax actually produces the desired results (i.e., blocking spambots, etc.). Either way, thanks for the information concerning your specific issue — it may prove beneficial to others experiencing the same type of error.
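
On Apache 2.4 and later, the RewriteLog and RewriteLogLevel directives no longer exist; rewrite tracing is switched on through LogLevel instead, and the output goes to the error log. A minimal sketch, assuming you can edit the main server or virtual-host configuration (the trace level shown is just a reasonable starting point):

```apache
# Apache 2.4+ replacement for RewriteLog / RewriteLogLevel
# Place in the server or virtual-host config, not in .htaccess
# Keeps general logging at "alert" and raises mod_rewrite to trace level 3
LogLevel alert rewrite:trace3
```

Higher trace levels (up to trace8) are very verbose and slow the server down, so switch tracing off again once you have confirmed the rules behave as intended.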

Cheers!

5. danielle

October 10, 2007 at 7:59 am

oh nothing just wanted to feel special!!!!!!!!!!!

6. Perishable

October 10, 2007 at 10:49 am

Your specialness is obvious, danielle ;)

7. Jenny

December 10, 2007 at 9:15 pm

I’ve thought of using this method before but I was too lazy to form up a proper code. Thank you Perishable…of course not forgetting Shoemoney :)

8. Perishable

December 10, 2007 at 9:17 pm

My pleasure, Jenny — thank you for the feedback :)

9. Rick Beckman

January 24, 2008 at 5:00 am

I’m using this code too, but looking up the IPs of spammers caught by Akismet and cross-referencing those same IPs with my Apache logs, I’m seeing that the spammers are actually loading the posts and submitting via the actual form.

And by doing so, they’ve circumvented the protection you share above, as well as the one I implemented (renaming /wp-comments-post.php to something custom, editing my theme’s /comments.php file appropriately).

Spam sucks.

Oh, just curious as to why users with empty user-agents are blocked from commenting in the above rewrite?

10. a name

January 26, 2008 at 12:09 am

I put in the above code in my .htaccess and got a 500. After a few tries and changes, I decided to add this into my wp-comments-post.php. Is there any reason I shouldn’t have this (other than having to add it every time I upgrade WP)?

if (empty($_SERVER['HTTP_REFERER']) || strpos($_SERVER['HTTP_REFERER'], 'yourdomain.com') === false) exit;

Thanks.

11. Perishable

January 27, 2008 at 9:11 am

@Rick: Yes indeed, spam sucks — it’s like a perpetual cyberspace battle: spammers attack, bloggers defend themselves, spammers defeat the defenses and attack some more.. ad nauseam. As to the secret purpose of blocking empty user agents, I will never tell!

@a name: Beyond the pain of perpetual updates, I see no reason why such code would cause any issues — in fact, it seems like an excellent alternative to the htaccess method. Thanks for sharing :)

12. Jim

April 4, 2008 at 5:25 am

Hi

Thanks for your list, it’s been on my favourites for years. I’m trying to use the above script to kill spam on our contact forms, however, not being the htaccess guru you are, I’m having trouble redoing the urls to the form handlers in subdirectories….any tips?

13. (beausoleilm) Mathieu Beausoleil

May 9, 2008 at 5:05 pm

What about proxies? I know that some proxy servers erase the referrer header. Do you know whether this solution will block those visitors? Would it be better to store a referrer address in the session, or to use another approach such as an empty text input (display: none) and verify that the input is still empty before using the data?

14. Perishable

May 12, 2008 at 10:09 am

Yes, that would be one way to do it. If you are allowing visitors to comment via proxy, you may want to test the method before implementation. It is really a double-edged sword, dealing with no-referrer requests: it is nearly impossible to avoid false positives and false negatives. Frankly, I have been contemplating removing the method described in this article. If so, it will be done as a test and I will report the results in a future post.
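
One way to soften the false-positive problem for visitors whose proxies strip the referrer is to let blank referrers through and block only those requests whose referrer is present but points somewhere else. A rough sketch, assuming example.com stands in for your own domain and that a 403 response is acceptable in place of the redirect:

```apache
# Sketch: block comment POSTs only when a foreign referrer is present
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_METHOD} POST
RewriteCond %{REQUEST_URI} /wp-comments-post\.php
# Skip requests that carry no referrer at all (e.g. stripped by a proxy)
RewriteCond %{HTTP_REFERER} !^$
# example.com is a placeholder: substitute your own domain
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com(/|$) [NC]
RewriteRule .* - [F,L]
</IfModule>
```

The trade-off is the other edge of the sword: any spambot that sends no referrer at all now gets through, so this variation only makes sense alongside other defenses.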

15. Web Designer

September 25, 2008 at 8:40 pm

I was using a manual posting technique, but I am not able to post a comment on any site.

Maybe my URL “gigaturn” has been added to a block-list by wp-comments-post.

Any solution for this problem would be appreciated.

Thanks in advance!
jitu78@gmail.com

16. Jeff Starr

September 27, 2008 at 7:20 pm

Hi Web Designer, I have never heard of the wp-comments-post.php file having any inherent blacklisting capabilities, but I have not investigated the file in newer versions of WordPress, so it may be the case. Another thing you could check is whether or not your “gigaturn” URL has been flagged as spam within the Akismet database. If you go to the site or Google for something like “remove site from Akismet” (or similar), you should find all the information needed to investigate and possibly remedy the issue. Good luck!

17. Web Designer

September 29, 2008 at 3:03 am

Thanks Jeff,
But you can try it by yourself.
just try it with gigaturn.com
you will not be able to post.

still looking for right solution.

18. Web Designer

September 29, 2008 at 3:06 am

Tried again with site URL and got this,

http://perishablepress.com/press/wp-comments-post.php

I am not able to post a comment with this URL (gigaturn.com); however, I can post with other URLs.

19. Jeff Starr

October 5, 2008 at 10:27 am

Perhaps I am confused as to what you are trying to do. Are you trying to post comments at gigaturn.com? Or are you trying to post comments on other sites using gigaturn.com as the commentator link? Or something else..? I guess I need more information as to what’s going on and where..

20. balisugar

November 19, 2008 at 8:12 am

Hi, sorry to bother you, I need help.

I think I have a few pages with strange URLs, which I can see in my WassUp stats. That xxx is a porn site, and Google crawls these URLs all the time. I never linked to them in the first place. Please help. How can I remove and block them? It’s not only one page.
e.g.:
/page/92/?ref=www.xxx.com-www.xxx.com-www.xxx.com-www.xxx.com

I’m very sad, I don’t know much about this :cry:

21. Jeff Starr

November 24, 2008 at 6:03 pm

@balisugar: don’t cry! You should be able to deny requests for that specific query string by adding the following directives to your root htaccess file:

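# BLOCK XXX.COM QUERY STRINGS
<IfModule mod_rewrite.c>
   RewriteEngine On
   # no [OR] on the final condition: a dangling OR would make the rule match every request
   RewriteCond %{QUERY_STRING} xxx\.com [NC]
   RewriteRule .* - [F,L]
</IfModule>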

Once in place, this code should block any query-string requests containing the character string “xxx.com”.

For more information on this technique, check out my article, Improving Site Security by Preventing Malicious Query-String Exploits.

22. balisugar

November 25, 2008 at 5:50 am

Thank you for your help. I will link to you so I don’t forget your site.

23. Jeff Starr

November 26, 2008 at 12:57 pm

@balisugar: My pleasure — happy to help! :)

24. balisugar

November 28, 2008 at 7:58 am

Hi, Mr Jeff. How are you? :smile:

After what happened to me, I’m still sometimes worried that someone is redirecting bad content to my site. Is that possible and if so, how can I stop them? And which is the better way to block bad bots - .htaccess or robots.txt?

I feel more “comfortable” modifying robots.txt rather than .htaccess.
Thank you for all your help.

25. Jeff Starr

November 30, 2008 at 10:40 am

@balisugar: htaccess and robots.txt perform two different functions. htaccess is responsible for per-directory configuration of various Apache server directives (such as rewriting, redirecting, and many others), while robots.txt directives merely instruct robots (such as Googlebot and Slurp) on how to go about crawling your site (which URLs to ignore, sitemap location, et al). Unfortunately, there are precious few robots that actually obey the directives specified in the robots.txt files, while they really have no choice but to follow the rules specified via htaccess files.

So, to answer your question, if you need to block a specific URL, referrer, or request, it is best to handle it via htaccess — robots.txt is the wrong tool for the job.


About Perishable Press

Perishable Press is the virtual playground of Jeff Starr — visionary, founder and lead developer of Monzilla Media, a small web and graphic design company in the lush desert oasis of Moses Lake, Washington. Perishable Press features articles and tutorials on many aspects of digital design..
