Perishable Press

4G Series: The Ultimate Referrer Blacklist, Featuring Over 8000 Banned Referrers

You have seen user-agent blacklists, IP blacklists, 4G Blacklists, and everything in between. Now, in this article, for your sheer and utter amusement, I present a collection of over 8000 blacklisted referrers.

Shortcut: skip the article and jump to Disclaimer and Download »

Referrer Spam Sucks

For the uninitiated, in the language of the Web, a referrer is the online resource from whence a visitor happened to arrive at your site. For example, if Johnny the Wonder Parrot was visiting the Mainstream Media website and happened to follow a link to your site (of all places), you would look at your access logs, notice Johnny’s visit, and speak out loud (slowly): “hmmm... it looks like the Mainstream Media website referred my good pal Johnny to my Alka-Seltzer sales page.” In such a bizarre case, the Mainstream Media website, or the specific page that linked to you, is referred to as (no pun intended) the referrer.

Sounds like a totally radical concept, right? I mean, who doesn’t want other sites sending them traffic? Not many, of course, unless the referrals are in actuality a type of spam known as, well, referrer spam. Eh? Referrer spam, you say? How does that work? Well, I’m so glad you asked. Allow me to explain..

Like Mynocks

Referrer spam is actually a barrage of URI requests, each carrying a fake referrer. Just imagine some pathetic dillweed out there, sitting alone in his bedroom, running a borrowed script that does something like this:

  • targets your site from some randomly generated hitlist
  • begins making hundreds of URI requests for random pages on your site
  • leaves fake referrer information for each request, claiming to have arrived by way of “harrypotterdogpanties.net”
  • continues making hundreds of requests with the fake referrer information
  • ad nauseam
  • ad nauseam
  • ad nauseam

In the process of doing this, the spammer is draining your resources, consuming your bandwidth, decreasing your site’s performance, and clogging your access and error logs with hundreds or thousands of bogus requests. This in turn may skew or obscure accurate statistical information and result in additional service charges and other headaches. In other words, referrer spam sucks donkey dong.

Big Spammin’

For the spammer, referrer spam pays off because it serves as a cheap way to get garbage spam sites to rank in the search engines. This technique is also referred to as “spamdexing,” which refers to spamming that is directed at the search engines. By artificially accessing your site via their fake spammy web pages, referrer spammers effectively populate your server’s access logs with hundreds of links back to their stinky spam site.

The actual payoff occurs when a percentage of the spammed sites publish their access logs (or statistics pages built from them) on the Web, where search engines can crawl them and count every logged referrer as a link to the spam site. This may not sound like much, but with a free, easily accessible referrer-spam script, referrer spammers can hit hundreds of thousands of sites. If even a tiny fraction of these sites publish their access logs, the number of links back to the spam site can be significant.

Make Them Stop

Unfortunately, there aren’t many options for stopping this sort of nonsense. Referrer spammers are targeting actual resources, so blocking malicious request strings is not an option. We could block individual IP addresses or even user-agents, but that also would be futile because such variables are easily spoofed.

So how do we keep these leeches from draining our sites? Easy. Blacklist the fake referrer sites themselves. And fortunately, there are many resources on the Web for obtaining extensive lists of spammy referrers. Including this one. Below you will find the convergence of two excellent lists of spammy referrers: one containing 276 referrers (404 link removed 2015/09/15) and another containing 7998 referrers (404 link removed 2013/01/13). This is well over 8000 referrers, so please use these lists wisely, according to your own security strategy.
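
To give a rough idea of what blacklisting a referrer looks like in practice, here is a minimal Python sketch that turns a list of referrer domains into mod_rewrite directives for an .htaccess file. The function name build_referrer_rules is my own invention for illustration, the sketch assumes Apache with mod_rewrite enabled, and it is not the format of the downloadable lists. Each generated rule answers matching requests with 403 Forbidden.

```python
def build_referrer_rules(domains):
    """Return .htaccess lines that send 403 Forbidden to any request
    whose Referer header contains one of the given domains.

    Assumption: entries are plain domain names made of letters, digits,
    hyphens, and dots, so only the dots need escaping for the regex.
    """
    # Normalize case, drop blanks and duplicates, keep a stable order.
    cleaned = sorted({d.strip().lower() for d in domains if d.strip()})
    if not cleaned:
        return []

    lines = ["<IfModule mod_rewrite.c>", "RewriteEngine On"]
    for index, domain in enumerate(cleaned):
        pattern = domain.replace(".", r"\.")
        # Conditions are chained with OR; the last one must not carry it.
        flags = "[NC]" if index == len(cleaned) - 1 else "[NC,OR]"
        lines.append("RewriteCond %{HTTP_REFERER} " + pattern + " " + flags)
    # F = respond 403 Forbidden, L = stop processing further rules.
    lines.append("RewriteRule .* - [F,L]")
    lines.append("</IfModule>")
    return lines


if __name__ == "__main__":
    # "spam.example" is a placeholder, not a real spam domain.
    print("\n".join(build_referrer_rules(["harrypotterdogpanties.net", "spam.example"])))
```

The output would be pasted into an .htaccess file. With thousands of entries the condition list gets long, so it is worth testing on a non-production copy of the site first.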

Disclaimer

These lists are provided “as-is” and with no guarantee of anything. If you decide to implement these lists, please be advised that I probably won’t have time to troubleshoot requests and diagnose issues. For the most part, I am providing these lists as a sort of novelty, and suggest that you build your own referrer blacklist based on your actual access logs. I do not recommend simply copying and pasting either of these lists in wholesale format. Hopefully they will serve as a comparative resource and as examples of potentially useful blacklisting accomplishments.
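
If you want to build your own list from your actual access logs, something along these lines may help as a starting point. It is only a sketch: it assumes your server writes Apache's Combined Log Format, and the names LOG_TAIL and referrer_hosts are hypothetical ones I made up for the example. It tallies the hostnames found in the referrer field so the noisiest ones float to the top for manual review.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Tail of a Combined Log Format line:
#   "request" status bytes "referer" "user-agent"
# Lines whose quoted fields contain escaped quotes will not match and are skipped.
LOG_TAIL = re.compile(r'"[^"]*" \d{3} \S+ "([^"]*)" "[^"]*"\s*$')


def referrer_hosts(log_lines, own_host):
    """Count referrer hostnames in access-log lines, ignoring requests
    with no referrer ("-") and referrals from the site itself."""
    counts = Counter()
    for line in log_lines:
        match = LOG_TAIL.search(line)
        if not match:
            continue  # not a Combined Log Format line
        # hostname is None for "-" or anything that is not a full URL.
        host = urlsplit(match.group(1)).hostname
        if host and host != own_host:
            counts[host] += 1
    return counts


if __name__ == "__main__":
    # The path and hostname below are placeholders; substitute your own.
    with open("access.log", encoding="utf-8", errors="replace") as log:
        for host, hits in referrer_hosts(log, "example.com").most_common(20):
            print(hits, host)
```

A high count alone does not prove a referrer is spam, so check each candidate before adding it to a blacklist.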

Download

So without further ado, here is the Ultimate Referrer Blacklist, featuring over 8000 of the Web’s spammiest referrers. Enjoy! :)

Know of other incredible referrer blacklists? Share ’em with us!

Jeff Starr
About the Author Jeff Starr = Creative thinker. Passionate about free and open Web.
8 responses
  1. Jessi Hance April 21, 2009 @ 12:04 pm

    Thank you! My employer’s website gets a lot of this crud. I’ve recommended this article to our webmasters.

  2. Jeff Starr

    Excellent! Glad to hear it may help provide some relief — I know how annoying referrer spam can be. Cheers :)

  3. Jonathan Ellse April 23, 2009 @ 9:58 am

    @ Jessi

    Yes, I agree. As a webmaster/site admin, I find a lot of time is wasted searching for someone who has been spamming comment forms, etc.

    @Jeff

    Great article, although every time I visit this site, I’m back on Requiem rather than Quintessential, which I prefer. Is this a bug, or meant to be so?

    Anyway, Great stuff! Keep up the good work!

  4. Jeff Starr

    @Jonathan: Yes, this time I have set Requiem as the default theme. I’ve been having some search-engine crawl issues that I can’t seem to pin down. After trying everything else, I decided to see if it was the Quintessential theme that was causing the issue. After a few weeks I should know for certain and will restore good ol’ Quint if the problem lies elsewhere.

  5. Alex Denning July 3, 2009 @ 10:14 am

    Jeff, great resource you’ve got here – I’ve already linked to it on ProBlogDesign, and I’m just writing a CatsWhoCode article with reference to this – again, great post!

  6. Jeff Starr

    Thanks Alex — much appreciated! :)

  7. hi jeff

    could you tell me please where i have to put the file or include it?

    does it all go in my root .htaccess?

    thanks

    • Jeff Starr

      Hey diljan,

      Correct, placing such blacklists in the root .htaccess file means that their rules are applied to the entire site. If, on the other hand, you place the rules in the .htaccess file of a subdirectory, they apply only to that subdirectory and everything beneath it, so any directories above it (such as root) will not be protected. Also, this is the 4G Blacklist; make sure to use the more current 5G instead:

      https://perishablepress.com/5g-blacklist-2013/

      Cheers
