How to Block Proxy Servers via htaccess

Published Sunday, April 20, 2008 @ 4:38 pm • 18 Responses

Not too long ago, a reader going by the name of bjarbj78 asked about how to block proxy servers from accessing her website. Apparently, bjarbj78 had taken the time to compile a proxy blacklist of over 9,000 domains, only to discover afterwards that the formulated htaccess blacklisting strategy didn’t work as expected:

deny from proxydomain.com proxydomain2.com

Blacklisting proxy servers by blocking individual domains seems like a futile exercise. Although there are a good number of reliable, consistent proxy domains that could be blocked directly, the vast majority of such sites are constantly changing. It would take a team of professionals working around the clock just to keep up with them all.

As explained in my reply to bjarbj78’s comment, requiring Apache to process over 9,000 htaccess entries for every request could prove disastrous:

The question is, even if you could use htaccess to block over 9,000 domains, would you really want to? If you consider the potential performance hit and excessive load on server resources associated with the perpetual processing of such a monstrous list, it may inspire you to seek a healthier, perhaps more effective alternative..

A better way to block proxy servers

Rather than attempt to block proxy servers by who they are (i.e., via their specified domain identity), it is far more expedient and effective to block proxy servers by what they do. By simply blacklisting the various HTTP request headers that proxy servers add to forwarded requests, it is possible to block virtually all proxy connections. Here is the code that I use for stopping 99% of the proxies that attempt to access certain sites:

# block proxy servers from site access
# http://perishablepress.com/press/2008/04/20/how-to-block-proxy-servers-via-htaccess/

RewriteEngine on
RewriteCond %{HTTP:VIA}                 !^$ [OR]
RewriteCond %{HTTP:FORWARDED}           !^$ [OR]
RewriteCond %{HTTP:USERAGENT_VIA}       !^$ [OR]
RewriteCond %{HTTP:X_FORWARDED_FOR}     !^$ [OR]
RewriteCond %{HTTP:PROXY_CONNECTION}    !^$ [OR]
RewriteCond %{HTTP:XPROXY_CONNECTION}   !^$ [OR]
RewriteCond %{HTTP:HTTP_PC_REMOTE_ADDR} !^$ [OR]
RewriteCond %{HTTP:HTTP_CLIENT_IP}      !^$
RewriteRule ^(.*)$ - [F]

To use this code, copy and paste it into your site’s root htaccess file. Upload it to your server, and test its effectiveness via the proxy service(s) of your choice. It may not be perfect, but compared to blacklisting a million proxy domains, it’s lightweight, concise, and very effective ;)
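
One caveat worth checking: mod_rewrite's %{HTTP:header} syntax looks up the request header by the name given, and proxy-related headers are normally sent with hyphens (for example X-Forwarded-For) rather than underscores. A variant along the following lines may therefore match more reliably. Treat it as a sketch: which headers a given proxy actually sends is an assumption here, not something tested against a list of proxies.

```apache
# Sketch only: the same blocking idea, written with hyphenated header names.
# The header spellings below are assumptions about what proxies send.
RewriteEngine on
RewriteCond %{HTTP:Via}              !^$ [OR]
RewriteCond %{HTTP:Forwarded}        !^$ [OR]
RewriteCond %{HTTP:X-Forwarded-For}  !^$ [OR]
RewriteCond %{HTTP:Proxy-Connection} !^$ [OR]
RewriteCond %{HTTP:Client-IP}        !^$
# Any non-empty header above triggers a 403 Forbidden response
RewriteRule ^ - [F]
```

Proxies that strip these headers before forwarding a request will pass through either version.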


Dialogue

18 Responses

1. Gabry

April 21, 2008 at 8:37 am

Hello, I was reading your page about the htaccess file to block proxy servers from surfing my site. Very interesting, but my host said that since I use the FrontPage editor it might block me from editing my site. Is there a way to avoid this? Thank you in advance.

2. H5N1

April 21, 2008 at 10:20 am

Is this already effective? :)
I tried to read this article via web-proxy without problem! :D
I thought this limitation was already implemented here for web-proxy…

3. Perishable

April 21, 2008 at 10:48 am

Hi H5N1 :)

No, I do not block proxy servers from Perishable Press. There are a number of readers who (for whatever reason) visit this site via proxy. It is important to me to facilitate site access for this select group of individuals, even at the expense of malicious spam and other attacks. Maybe someday I will change this policy, but for now, it is my hope that the code provided in this article will prove useful to other site owners and webmasters.

4. Willard

April 21, 2008 at 10:55 am

I was hoping you’d eventually create an article by itself on this subject! Good advice I must say. =)

I do have one question though.. there seems to be a small difference between what you posted the other day, and what you posted here.. specifically:

RewriteCond %{HTTP:XROXY_CONNECTION} !^$ [OR]

vs.

RewriteCond %{HTTP:XPROXY_CONNECTION} !^$ [OR]

Notice the additional P? I’m just wondering if that was added on purpose.

Thanks for the input if you have any!

5. Perishable

April 21, 2008 at 4:57 pm

@Gabry: I am unfamiliar with FrontPage protocols, however you could always try uploading the code and checking for access. Then, if FrontPage is blocked, try removing one line at a time until access is achieved. If successful, this method of removing a line (or two) will reduce the overall effectiveness of the htaccess blocking rules to some degree, but should still provide a significant amount of protection. Also, a cursory search of the required FrontPage protocol indicates that the required header may in fact be X_FORWARDED_FOR or even X-FORWARDED-FOR, which isn’t on the list. So, try the code as-is first and if you are blocked, then try removing the X_FORWARDED_FOR first. Finally, if that fails, try removing different lines one at a time and checking the results. Sorry I couldn’t provide more specific advice, but hopefully these clues will help get you going in the right direction.

6. Perishable

April 21, 2008 at 5:15 pm

@Willard: The reason for the change is based on research that suggests that XPROXY is the correct protocol for this purpose. However, after reading your comment and looking into it further, it seems that XROXY is also a commonly employed protocol/header for proxy servers. So, to be honest, I am considering adding the XROXY condition to the htaccess code just to be safe. Further, I am also considering adding three more common proxy protocols to the list as well, which, when added to the XROXY case, would give the following four additions:

RewriteCond %{HTTP:XROXY_CONNECTION} !^$ [OR]
RewriteCond %{HTTP:X-FORWARDED-FOR} !^$ [OR]
RewriteCond %{HTTP:FORWARDED-FOR} !^$ [OR]
RewriteCond %{HTTP:X-FORWARDED} !^$ [OR]

I am thinking that these additional directives will help improve the overall effectiveness of this proxy-blocking technique. I am not going to edit the article just yet, however, as I am hoping that someone with some deeper knowledge of the subject will chime in with some definite information on the topic. I apologize for any confusion in the matter. Thanks for sharing your concerns with us! :)

7. prislea

April 23, 2008 at 4:04 am

How can I exclude some IPs/proxies from the filter?

tks.

8. Perishable

April 23, 2008 at 8:58 am

Hi prislea, have you tried including an additional rewrite condition targeting the specific domain, for example:

RewriteCond %{HTTP_REFERER} !.*allowedproxydomain.com.*

I haven’t tried this yet, but it may help you to get going in the right direction :)
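
For excluding a specific IP address rather than a domain, one possible sketch (untested, like the suggestion above) is a negated REMOTE_ADDR condition placed ahead of the proxy-header conditions. The address 203.0.113.5 is a placeholder from the documentation range, not a real proxy, and only two of the article's header conditions are repeated here for brevity:

```apache
# Hypothetical whitelist: skip the block for one trusted address.
# 203.0.113.5 is a placeholder; replace it with the IP to allow.
RewriteEngine on
RewriteCond %{REMOTE_ADDR} !^203\.0\.113\.5$
RewriteCond %{HTTP:VIA}             !^$ [OR]
RewriteCond %{HTTP:X_FORWARDED_FOR} !^$
RewriteRule ^(.*)$ - [F]
```

Because the first condition carries no [OR] flag, it must hold in addition to the chain of [OR] conditions that follows, so requests from the listed address are never blocked by this rule.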

9. sam

April 24, 2008 at 12:33 am

Are these conditions supposed to block sites like hidemyass.com or similar sites?
Because I tried and it's not blocking it.

10. David

April 24, 2008 at 8:45 am

I’m looking at this page on blocking proxy servers, using a proxy server.

I tried the code, it doesn’t seem to work for proxylord.com.

11. Jeff Starr

April 24, 2008 at 9:56 am

@David: see comment #3 for an explanation as to why you are able to surf this site while using a proxy. Also, this code blocks proxies by targeting the HTTP headers they typically send. The block list is not comprehensive, so proxies that send none of the listed headers will not be blocked.

12. David

April 24, 2008 at 10:38 am

Yes, I see #3.
Is there a way to block the anonymous proxy server with the .htaccess codes?

Maybe it’s a Go Daddy thing.

13. Perishable

April 27, 2008 at 9:07 am

Hi David, it should be easy to block the anonymous proxy server via htaccess. Add the following code to your root htaccess file:

# deny domain access
RewriteEngine on
RewriteBase /
RewriteCond %{HTTP_USER_AGENT} anonymous [NC]
RewriteRule ^.* - [F,L]

..of course, this method blocks by targeting the user agent, which may or may not be the same as the domain name. Another way to block a specific proxy is to target the domain itself, as identified via referrer:

RewriteCond %{HTTP_REFERER} ^http://.*anonymous.*$ [NC]

This line should replace the RewriteCond line in the previous code. Remember to test thoroughly!

14. Eric

May 2, 2008 at 3:59 pm

Will this also block PayPal IPN? Untested on my end.. waiting for a payment to come through rather than converting all my IPN stuff to sandbox.

15. Perishable

May 4, 2008 at 7:23 am

Thanks for the feedback, Eric — keep us updated on the results..

16. air force ones

May 22, 2008 at 2:43 am

Hey Perishable. I have a good idea about how to block proxy servers. The operating system of most proxy servers is Linux, but the operating system of most visitors is Windows. So if we can block Linux, maybe we can block most proxy servers.

17. Perishable

May 25, 2008 at 6:31 am

Are you kidding? A good number of my visitors are Linux users. I definitely do not want to block them. I appreciate the idea, but think it would be an unwise move. The last thing I want to do is cater specifically to Windows users..


[ Comments are closed for this post. ]

About Perishable Press

Perishable Press is the virtual playground of Jeff Starr — visionary, founder and lead developer of Monzilla Media, a small web and graphic design company in the lush desert oasis of Moses Lake, Washington. Perishable Press features articles and tutorials on many aspects of digital design..
