Block Spam by Denying Access to No-Referrer Requests
Here is an effective method for preventing a great deal of blog spam. With a few strategic lines placed in your .htaccess file, you can prevent spambots from dropping spam bombs by denying access to all comment submissions (POST requests to the comment-processing script) whose referrer is not your own domain.
Block comment spam
Here is the script to add to your site’s root .htaccess file:
# block comment spam by denying access to no-referrer requests
RewriteEngine On
RewriteCond %{REQUEST_METHOD} POST
RewriteCond %{REQUEST_URI} wp-comments-post\.php
RewriteCond %{HTTP_REFERER} !(.*)example\.com(.*) [OR]
RewriteCond %{HTTP_USER_AGENT} ^-?$
RewriteRule .* http://the-site-where-you-want-to-send-spammers.com/ [R=301,L]
Note that you need to edit the following lines according to your specific setup:
- wp-comments-post\.php
- This is the default comment-processing script for WordPress users. If you are not running WordPress, you will need to determine the corresponding file and enter its name here.
- !(.*)example\.com(.*)
- Change this value to that of your own domain.
- http://the-site-where-you-want-to-send-spammers.com/
- Because spambots typically ignore redirects, this may not be accomplishing too much. But go ahead and enter the URL of your least-favorite website anyway. Another option here is to simply bounce the spambot back to where it came from by replacing the last line (the RewriteRule) with this:
RewriteRule .* http://%{REMOTE_ADDR}/ [R=301,L]
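Since spambots tend to ignore redirects anyway, a third option is to refuse the request outright. The following is only a sketch of that variant, not part of the original script. It assumes mod_rewrite is available, uses example.com as a placeholder for your own domain, and swaps the redirect for the F (forbidden) flag, which answers with a 403 status:

```apache
# Sketch: answer suspicious comment POSTs with 403 Forbidden instead of a redirect.
# example.com is a placeholder; replace it with your own domain.
<IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteCond %{REQUEST_METHOD} POST
    RewriteCond %{REQUEST_URI} wp-comments-post\.php
    # Unanchored pattern: behaves the same as !(.*)example\.com(.*)
    RewriteCond %{HTTP_REFERER} !example\.com [OR]
    RewriteCond %{HTTP_USER_AGENT} ^-?$
    # "-" means no substitution; [F] returns 403 and stops rule processing
    RewriteRule .* - [F]
</IfModule>
```

The conditions are the same as in the script above; only the final rule differs.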
For more awesome anti-spam techniques, check out How to Block Bad Bots and Stupid .htaccess Tricks.
How does it work?
When a legitimate user (i.e., not a robot, etc.) decides to leave a comment on your blog, they have (hopefully) read the article for which they wish to leave a comment, and have subsequently loaded your blog’s comment template (e.g., comments.php), which is most likely located within the same domain as the article, blog, etc. (i.e., your domain).
So, after filling out the comment form via comments.php, the user clicks the “submit” button, which then initiates the PHP file/script that actually processes the comment for the world to see. For WordPress users, the comment processing file is wp-comments-post.php.
Therefore, the HTTP referrer for all legitimate (user-initiated) comments will be your domain (or the domain in which the comments.php file is located). Automated spam robots typically target the comment-processing script directly, bypassing your comments.php form altogether. Such activity results in HTTP referrers that are not from your domain.
Thus, by blocking all requests for the comments-processing script (wp-comments-post.php) that are not sent directly from your domain (comments.php), you immediately eliminate a large portion of blog spam.
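Because the whole technique hinges on the referrer test, it is worth noting that a pattern like example\.com matches anywhere in the referrer string, so a referrer such as http://spam.test/?example.com would pass. A stricter sketch, assuming your site is served from example.com with or without the www prefix (both placeholders), anchors the match to the start of the URL:

```apache
# Sketch: same rule set, but the referrer must begin with your own host name.
# example.com is a placeholder; replace it with your own domain.
RewriteEngine On
RewriteCond %{REQUEST_METHOD} POST
RewriteCond %{REQUEST_URI} wp-comments-post\.php
# Accept only http(s)://example.com/... or http(s)://www.example.com/...
# (an optional port number is allowed); NC makes the match case-insensitive.
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com(:[0-9]+)?(/|$) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^-?$
RewriteRule .* http://the-site-where-you-want-to-send-spammers.com/ [R=301,L]
```

Keep in mind that the referrer is supplied by the client, so a determined bot can still forge it, and visitors whose browsers or privacy tools strip the referrer will be turned away as well.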
And that is all there is to it! Bye bye spambots!
44 responses to “Block Spam by Denying Access to No-Referrer Requests”
@balisugar: htaccess and robots.txt perform two different functions. htaccess is responsible for per-directory configuration of various Apache server directives (such as rewriting, redirecting, and many others), while robots.txt directives merely instruct robots (such as Googlebot and Slurp) on how to go about crawling your site (which URLs to ignore, sitemap location, et al). Unfortunately, there are precious few robots that actually obey the directives specified in the robots.txt files, while they really have no choice but to follow the rules specified via htaccess files.
So, to answer your question, if you need to block a specific URL, referrer, or request, it is best to handle it via htaccess — robots.txt is the wrong tool for the job.
I am new to .htaccess and have to ask…
Q1: Can I use this for any page that is posting data?
Q2: If Q1 is YES, my page is one folder deep, ie:comments/comment-page.php
Do I do this:
RewriteCond %{REQUEST_URI} .comments/page.php\.php*
Or this:
RewriteCond %{REQUEST_URI} .comments\page.php\.php*
Or this:
RewriteCond %{REQUEST_URI} .http://www/example.com/comments/page.php\.php*
Or this:
RewriteCond %{REQUEST_URI} ./var/htdocs/web/comments/page.php\.php*
Any help would be great.
Cheers
@Web Designer: I got the same thing twice. I can only say IE7 is a serious problem, IE8 needs to be refreshed every time, MS is terrible… these aren’t even web browsers.
Cheers
@Json: Yes, you can use the REQUEST_URI variable to match just about anything in the request string, including individual files. The targeted string is a regular expression and will match any instance of itself within the requested URI. Something like this should work great:
RewriteCond %{REQUEST_URI} page-you-want-blocked.php
To verify that it works, try loading the page directly in a browser. For more information on blocking with the REQUEST_URI variable, check out the fifth method in this article.
My problem is similar to the one highlighted by Rick Beckman. Our spammer is submitting a GET request for the contact form on my site (with no referer) followed a few seconds later by a POST request. The POST request correctly identifies my website as the referer.
The only solution I can think of is to also block GET requests for the form if the referrer is blank. However, I’m a bit new to this, so I wondered if that was a bad idea?
I figure it would block anyone who has bookmarked the contact form page directly, but I could redirect them back to the homepage where they can follow the page link as intended.
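The idea described in this comment could be sketched roughly as follows. This is an illustration only: /contact.php is a hypothetical path standing in for the contact form, and the rules assume they live in the site’s root .htaccess file:

```apache
# Sketch: send referrer-less GET requests for the contact form to the homepage.
# /contact.php is a hypothetical path; adjust it to your own form's URL.
RewriteEngine On
RewriteCond %{REQUEST_METHOD} GET
RewriteCond %{REQUEST_URI} ^/contact\.php$
# An empty pattern match means no Referer header was sent (or it was blank)
RewriteCond %{HTTP_REFERER} ^$
RewriteRule .* / [R=302,L]
```

A temporary (302) redirect is used here so that browsers do not cache the bounce permanently for visitors who arrive from a bookmark.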
You’re not the only one with stuff like this :)
@Peekay: that would certainly work, unless you think there are quite a few folks bookmarking your site’s contact page. If so, I think a better option would be to implement some sort of simple captcha system to stop the automated junk.
Great tips. I am not a big programmer, but this kind of thing I think I understand, and I will implement it as soon as our new posting system is functioning.
In the past we got a huge swarm of spam and had a hard time dealing with it!
Since I run all my pages through the same script, I use the following to let the script know to look for method POST:
if ((($_SERVER['REQUEST_METHOD']) == 'POST') and (strPos($referer , $site_domain_name) === FALSE)) die();
I can’t figure out how to use your .htaccess method because of not running comments through a particular file… is there an .htaccess version of $_SERVER['PHP_SELF'] that would replace the wp-comments-post.php part?
BTW, I am grateful for your site and have implemented many of your techniques… thank you very much for sharing your expertise! .htaccess is such a great place to stop these jokers.
For the ones who get past the .htaccess stuff, I’ve had a good deal of success by sending forms back if any fields contain any of the words in a hash I’ve collected that includes profanity and words taken from spammer url’s (‘nude’ etc) that aren’t likely to appear in legitimate posts.
Anyway, thanks again, very useful stuff!
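For a site like the one described in this comment, where every page runs through the same script, one possibility is simply to leave out the REQUEST_URI condition so that the referrer check applies to every POST request. The sketch below assumes that is acceptable for the whole site and uses example.com as a placeholder domain:

```apache
# Sketch: apply the referrer check to every POST request, whatever the script.
# example.com is a placeholder; replace it with your own domain.
RewriteEngine On
RewriteCond %{REQUEST_METHOD} POST
RewriteCond %{HTTP_REFERER} !example\.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^-?$
# Refuse with 403 Forbidden, roughly what die() does in the PHP version
RewriteRule .* - [F]
```

If only some URLs should be protected, a RewriteCond on %{REQUEST_URI} can be added back with a pattern that matches just those paths.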
I’m going to apply the hack to my site cos Google Webmasters Tool is reporting that On average, pages in your site take 35.0 seconds to load and wp-comments-post.php seems to be the culprit as it’s loading in 30.2 secs :-(
Who knows? Maybe these naughty spammers are to blame.
Hi James, could you tell me how to stop trackback spamming? I’ve gotten so much MFA trackback spam lately. Is there any way to stop it without deleting the wp-trackback file?
thanks
~james