Eight Ways to Blacklist with Apache’s mod_rewrite

Posted on February 3, 2009 in Function by

With the imminent release of the next series of (4G) blacklist articles here at Perishable Press, now is the perfect time to examine eight of the most commonly employed blacklisting methods achieved with Apache’s incredible rewrite module, mod_rewrite. In addition to facilitating site security, the techniques presented in this article will improve your understanding of the different rewrite methods available with mod_rewrite.

Blacklist via Request Method

[ #1 ] This first blacklisting method evaluates the client’s request method. Every time a client attempts to connect to your server, it sends a message indicating the type of connection it wishes to make. There are many different types of request methods recognized by Apache. The two most common methods are GET and POST requests, which are required for “getting” and “posting” data to and from the server. In most cases, these are the only request methods required to operate a dynamic website. Allowing more request methods than are necessary increases your site’s vulnerability. Thus, to restrict the types of request methods available to clients, we use this block of Apache directives:

<IfModule mod_rewrite.c>
 RewriteEngine On
 ServerSignature Off
 Options +FollowSymLinks
 RewriteCond %{REQUEST_METHOD} ^(delete|head|trace|track) [NC]
 RewriteRule ^(.*)$ - [F,L]
</IfModule>

The key to this rewrite method is the REQUEST_METHOD variable in the rewrite condition. First we invoke some precautionary security measures, and then we evaluate the request method against our list of prohibited types. Apache compares each client request method against the blacklisted expressions and denies any request that matches. Here we are blocking delete and head because they are unnecessary for this site, and blocking trace and track because they enable cross-site tracing attacks that sidestep the browser’s same-origin protections. Keep in mind that some legitimate clients, such as link checkers, rely on HEAD requests, so remove head from the list if you need to serve them. Of course, I encourage you to do your own research and establish your own request-method security policy.
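
The same idea can be turned around into a whitelist, which is sometimes easier to reason about than a list of forbidden methods. Here is a minimal sketch, assuming the site needs only GET, POST, and HEAD; that method list is an assumption, not a recommendation for every server:

```apache
<IfModule mod_rewrite.c>
 RewriteEngine On
 # Forbid any request whose method is not explicitly listed.
 # The GET|POST|HEAD list is an assumption; adjust it to what your site uses.
 RewriteCond %{REQUEST_METHOD} !^(GET|POST|HEAD)$
 RewriteRule ^ - [F,L]
</IfModule>
```

The trade-off is that a whitelist will also reject any method you forgot your applications depend on, so test it before relying on it.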

Blacklist via the Request

[ #2 ] The next blacklisting method is based on the client’s request. When a client attempts to connect to the server, it sends a full HTTP request string that specifies the request method, request URI, and transfer-protocol version. Note that additional headers sent by the browser are not included in the request string. Here is a typical example:

GET /blog/index.html HTTP/1.1

This long request string may be checked against a list of prohibited characters to protect against malicious requests and other exploitative behavior. Here is an example of sanitizing client requests by way of Apache’s THE_REQUEST variable:

<IfModule mod_rewrite.c>
 RewriteEngine On
 RewriteCond %{THE_REQUEST} ^.*(\\r|\\n|%0A|%0D).* [NC]
 RewriteRule ^(.*)$ - [F,L]
</IfModule>

Here we are evaluating the entire client-request string against a list of prohibited entities. While there are many character strings common to malicious requests, this example focuses on the prevention of HTTP response splitting, cross-site scripting attacks, cache poisoning, and similar dual-header exploits. Although these are some of the most common types of attacks, there are many others. I encourage you to check your server logs, do some research, and sanitize accordingly.

Blacklist via the Referrer

[ #3 ] Blacklisting via the HTTP referrer is an excellent way to block referrer spam, defend against penetration tests, and protect against other malicious activity. The HTTP referrer is identified as the source of an incoming link to a web page. For example, if a visitor arrives at your site through a link they found in the Google search results, the referrer would be the Google page from whence the visitor came. Sounds straightforward, and it is.

Unfortunately, one of the biggest spam problems on the Web involves the abuse of HTTP referrer data. In order to improve search-engine rank, spambots will repeatedly visit your site using their spam domain as the referrer. The referrer is generally faked, and the bots frequently visit via HEAD requests for the sake of efficiency. If the target site publicizes its access logs, the spam sites will receive a search-engine boost from links in the referrer statistics.

Fortunately, by taking advantage of mod_rewrite’s HTTP_REFERER variable, we can forge a powerful, customized referrer blacklist. Here’s our example:

<IfModule mod_rewrite.c>
 RewriteEngine On
 RewriteCond %{HTTP_REFERER} ^(.*)(<|>|'|%0A|%0D|%27|%3C|%3E|%00).* [NC,OR]
 RewriteCond %{HTTP_REFERER} ^http://(www\.)?.*(-|.)?adult(-|.).*$  [NC,OR]
 RewriteCond %{HTTP_REFERER} ^http://(www\.)?.*(-|.)?poker(-|.).*$  [NC,OR]
 RewriteCond %{HTTP_REFERER} ^http://(www\.)?.*(-|.)?drugs(-|.).*$  [NC]
 RewriteRule ^(.*)$ - [F,L]
</IfModule>

Same basic pattern as before: check for the availability of the rewrite module, enable the rewrite engine, and then specify the prohibited character strings using the HTTP_REFERER variable and as many rewrite conditions as necessary. In this case, we are blocking a series of potentially malicious characters in the first condition, and then blacklisting any referrer containing the terms “adult”, “poker”, or “drugs”. Of course, we may blacklist as many referrer strings as needed by simply emulating the existing rewrite conditions. Just don’t get carried away: I have seen some referrer blacklists that are over 4000 lines long!
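
One way to keep such a list short is to fold the keywords into a single alternation. The following is only a sketch, and it is looser than the example above: it matches a keyword anywhere in the referrer instead of requiring the http:// prefix, so treat it as a starting point and not an exact equivalent:

```apache
<IfModule mod_rewrite.c>
 RewriteEngine On
 # One condition for all keywords; add further terms inside the parentheses.
 RewriteCond %{HTTP_REFERER} (adult|poker|drugs) [NC]
 RewriteRule ^ - [F,L]
</IfModule>
```

Broad keywords can also catch legitimate referrers, so check your logs before adding new terms.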

Blacklist via Cookies

[ #4 ] Protecting your site against malicious cookie exploits is greatly facilitated by using Apache’s HTTP_COOKIE variable. HTTP cookies are chunks of data sent by the server to the web client upon initialization. The browser then sends the cookie information back to the server for each subsequent visit. This enables the server to authenticate users, track sessions, and store preferences. A common example of the type of functionality enabled by cookies is the shopping cart. Information about the items placed in a user’s shopping cart may be stored in a cookie, thereby enabling server scripts to respond accordingly.

Generally, a cookie consists of a unique string of alphanumeric text and persists for the duration of a user’s session. Apache’s mod_usertrack module, for example, can generate a unique tracking cookie for each visitor. Once a cookie has been set, it may be used as a database key for further processing, behavior logging, session tracking, and much more. Unfortunately, this useful technology may be abused by attackers to penetrate and infiltrate your server’s defenses. Cookie-based protocols are vulnerable to a variety of exploits, including cookie poisoning, cross-site scripting, and cross-site cooking. By adding malicious characters, scripts, and other content to cookies, attackers may find and exploit sensitive vulnerabilities.

The good news is that we may defend against most of this nonsense by using Apache’s HTTP_COOKIE variable to blacklist characters known to be associated with malicious cookie exploits. Here is an example that does the job:

<IfModule mod_rewrite.c>
 RewriteEngine On
 RewriteCond %{HTTP_COOKIE} ^.*(<|>|'|%0A|%0D|%27|%3C|%3E|%00).* [NC]
 RewriteRule ^(.*)$ - [F,L]
</IfModule>

This is as straightforward as it looks. Check for the required rewrite module, enable the rewrite engine, and deny requests for any HTTP_COOKIEs containing the specified list of prohibited characters. In this list you will see characters generally required to execute any sort of scripted attack: opening and closing angle brackets, single quotation marks, and a variety of hexadecimal equivalents. Feel free to expand this list with additional characters as you see fit. As always, recommendations are most welcome.

Blacklist via Request URI

[ #5 ] Use of Apache’s REQUEST_URI variable is frequently seen in conjunction with URL canonicalization. The REQUEST_URI variable targets the requested resource specified in the full HTTP request string. Thus, we may use Apache’s THE_REQUEST variable to target the entire request string (as discussed above), while using the REQUEST_URI variable to target the actual request URI. For example, the REQUEST_URI variable refers to the “/blog/index.html” portion of the following, full HTTP request line:

GET /blog/index.html HTTP/1.1

For canonicalization purposes, this is exactly the type of information that must be focused on and manipulated in order to achieve precise, uniform URLs. Likewise, the REQUEST_URI variable makes it easy to target, evaluate, and deny the kind of malicious URL requests that usually show up in your server’s access and error logs.

As you can imagine, blacklisting via REQUEST_URI is an excellent way to eliminate scores of malicious behavior. Here is an example that includes some of the same characters and strings that are blocked in the upcoming 4G Blacklist:

<IfModule mod_rewrite.c>
 RewriteEngine On
 # The forward slash is deliberately not listed: every REQUEST_URI begins with one
 RewriteCond %{REQUEST_URI} ^.*(,|;|:|<|>|">|"<|\\\.\.\\).* [NC,OR]
 RewriteCond %{REQUEST_URI} ^.*(\=|\@|\[|\]|\^|\`|\{|\}|\~).* [NC,OR]
 RewriteCond %{REQUEST_URI} ^.*(\'|%0A|%0D|%27|%3C|%3E|%00).* [NC]
 RewriteRule ^(.*)$ - [F,L]
</IfModule>

Again, same general pattern of directives as before, only this time we are specifying forbidden characters via the REQUEST_URI variable. Here we are denying any URL requests containing invalid characters, including different types of brackets, various punctuational characters, and some key hexadecimal equivalents. Of course, the possibilities are endless, and the blacklist should be customized according to your specific security strategy and unfolding blacklisting needs.

Blacklist via the User Agent

[ #6 ] Blacklisting via user-agent is a commonly seen strategy that yields questionable results. The concept of blacklisting user-agents revolves around the idea that every browser, bot, and spider that visits your server identifies itself with a specific user-agent character string. Thus, user-agents associated with malicious, unfriendly, or otherwise unwanted behavior may be identified and blacklisted in order to prevent future access. This is a well-known blacklisting strategy that has resulted in some extensive and effective user-agent blacklists.

Of course, the downside to this method involves the fact that user-agent information is easily forged, making it difficult to know for certain the true identity of blacklisted clients. By simply changing their user-agent to an unknown identity, malicious bots may bypass every blacklist on the Internet. Many evil “scumbots” indeed do this very thing, which explains the incredibly vast number of blacklisted user-agents. Even so, blacklisting remains useful against clients that never bother to change their defaults. For example, GNU’s Wget and the cURL command-line tool announce themselves by default, although both allow the user-agent string to be overridden with a command-line option, and many other clients have hard-coded user-agent strings that are difficult to change.

On Apache servers, user-agents are easily identified and blacklisted via the HTTP_USER_AGENT variable. Here is an example:

<IfModule mod_rewrite.c>
 RewriteEngine On
 RewriteCond %{HTTP_USER_AGENT} ^$                                                              [OR]
 RewriteCond %{HTTP_USER_AGENT} ^.*(<|>|'|%0A|%0D|%27|%3C|%3E|%00).*                            [NC,OR]
 RewriteCond %{HTTP_USER_AGENT} ^.*(HTTrack|clshttp|archiver|loader|email|nikto|miner|python).* [NC,OR]
 RewriteCond %{HTTP_USER_AGENT} ^.*(winhttp|libwww\-perl|curl|wget|harvest|scan|grab|extract).* [NC]
 RewriteRule ^(.*)$ - [F,L]
</IfModule>

This method works just like the others: check for the mod_rewrite module, enable the rewrite engine, and proceed to deny access to any user-agent that includes any of the blacklisted character strings in its name. As with our previous blacklisting techniques, here we are prohibiting angle brackets, single quotation marks, and various hexadecimal equivalents. Additionally, we include a handful of user-agent strings commonly associated with server attacks and other malicious behavior. We certainly don’t need anything associated with libwww-perl hitting our server, and many of the others are included in just about every user-agent blacklist that you can find. There are tons of other nasty user-agent scumbots out there, so feel free to beef things up with a few of your own.

Blacklist via the Query String

[ #7 ] Protecting your server against malicious query-string activity is extremely important. Whereas static URLs summon pages, their appended query strings transmit data and pass variables throughout the domain. Query-string information interacts with scripts and databases, influencing behavior and determining results. This relatively open channel of communication is easily accessible and prone to external manipulation. By altering data and inserting malicious code, attackers may penetrate and exploit your server directly through the query string.

Fortunately, we can protect our server against malicious query-string exploits with the help of Apache’s invaluable QUERY_STRING variable. By taking advantage of this variable, we can ensure the legitimacy and quality of query-string input by screening out and denying access to a known collection of potentially harmful character strings. Here is an example that will keep our query strings squeaky clean:

<IfModule mod_rewrite.c>
 RewriteEngine On
 RewriteCond %{QUERY_STRING} ^.*(localhost|loopback|127\.0\.0\.1).*                                [NC,OR]
 RewriteCond %{QUERY_STRING} ^.*(\.|\*|;|<|>|'|"|\)|%0A|%0D|%22|%27|%3C|%3E|%00).*                 [NC,OR]
 RewriteCond %{QUERY_STRING} ^.*(md5|benchmark|union|select|insert|cast|set|declare|drop|update).* [NC]
 RewriteRule ^(.*)$ - [F,L]
</IfModule>

As you can see, here we are using the QUERY_STRING variable to check all query-string input against a list of prohibited character strings. This strategy will deny access to any URL request that includes a query string containing localhost references, invalid punctuation, hexadecimal equivalents, and various SQL commands. Blacklisting these entities protects us from common cross-site scripting (XSS), remote shell attacks, and SQL injection. And, while this is a good start, it pales in comparison to the new query-string directives of the upcoming 4G Blacklist. ;)

Blacklist via IP Address

[ #8 ] Last but certainly not least, we can blacklist according to IP address. Blacklisting sites based on IP is probably the oldest method in the book and works great for denying site access to stalkers, scrapers, spammers, trolls, and many other types of troublesome morons. The catch is that the method only works when the perpetrators are coming from the same location. An easy way to bypass any IP blacklist is to simply use a different ISP or visit via proxy server. Even so, there is no lack of mindless creeps out there roaming the Internet, who sit there, using the same machine, day after day, relentlessly harassing innocent websites. For these types of lazy, no-life losers, blacklisting via IP address is the perfect solution. Here is a hypothetical example demonstrating several ways to blacklist IPs:

# block individual IPs
<IfModule mod_rewrite.c>
 RewriteEngine On
 RewriteCond %{REMOTE_ADDR} ^123\.456\.789\.1$ [OR]
 RewriteCond %{REMOTE_ADDR} ^456\.789\.123\.2$ [OR]
 RewriteCond %{REMOTE_ADDR} ^789\.123\.456\.3$
 RewriteRule ^(.*)$ - [F,L]
</IfModule>

# block ranges of IPs
<IfModule mod_rewrite.c>
 RewriteEngine On
 RewriteCond %{REMOTE_ADDR} ^123\. [OR]
 RewriteCond %{REMOTE_ADDR} ^456\.789\. [OR]
 RewriteCond %{REMOTE_ADDR} ^789\.123\.456\.
 RewriteRule ^(.*)$ - [F,L]
</IfModule>

# alt block IP method
<Limit GET POST PUT>
 order allow,deny
 allow from all
 deny from 123.
 deny from 123.456.
 deny from 123.456.789.0
</Limit>

In the first block, we are blacklisting three specific IP addresses using Apache’s mod_rewrite and its associated REMOTE_ADDR variable. Each of the hypothetical IPs listed represents a specific, individual address (the numbers are placeholders, not real addresses). Then, in the next code block, we are blocking three different ranges of IPs by omitting numerical data from the targeted IP string. In the first line we target any IP beginning with “123.”, which is an enormous number of addresses. In the second line, we block a different, more restrictive range by including the second portion of the address. Finally, in the third line, we block a different, much smaller range of IPs by including a third portion of the address.
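
On newer servers there is another option for ranges. The following is a minimal sketch, assuming Apache 2.4 or later, where a rewrite condition can test the client address against CIDR ranges through an expression instead of a hand-built regular expression. Both ranges are placeholders taken from the reserved documentation address blocks:

```apache
<IfModule mod_rewrite.c>
 RewriteEngine On
 # Assumes Apache 2.4+; the ranges below are placeholders, not real offenders.
 RewriteCond expr "%{REMOTE_ADDR} -ipmatch '192.0.2.0/24'" [OR]
 RewriteCond expr "%{REMOTE_ADDR} -ipmatch '198.51.100.0/24'"
 RewriteRule ^ - [F,L]
</IfModule>
```

CIDR notation makes it possible to block ranges that do not fall on a dot boundary, which a prefix pattern cannot express cleanly.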

Then, just for kicks, I threw in an alternate method of blocking IPs. This is an equally effective method that enables you to block IP addresses and ranges as specifically as necessary. Each deny line pattern-matches according to the specified IP string.
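
On Apache 2.4 and later, the order, allow, and deny directives are deprecated in favor of Require. A rough equivalent of the alternate method might look like the following sketch, assuming Apache 2.4+ with the standard authorization modules loaded; the addresses are placeholders from the reserved documentation ranges:

```apache
# Assumes Apache 2.4+ (mod_authz_core and mod_authz_host).
# All addresses are placeholders, not real offenders.
<RequireAll>
 Require all granted
 # a single address
 Require not ip 192.0.2.15
 # a range in CIDR notation
 Require not ip 198.51.100.0/24
 # a partial address, matching everything that begins with 203.0.113
 Require not ip 203.0.113
</RequireAll>
```

Everyone is granted access except clients matching one of the Require not lines.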

Dealing with Blacklisted Visitors

In each of these eight blacklisting techniques, we respond to all blacklisted visitors with the server’s default “403 Forbidden” error. This page serves its purpose and requires very little in terms of system resources to deliver. However, there is much more that you can do with blacklisted traffic. Here are a few ideas:

Redirect to home page
More subtle than the 403 error, this redirect strategy routes blocked traffic directly to the home page. To use, replace the RewriteRule directive (i.e., the entire line) with the following code:

RewriteRule ^(.*)$ http://your-domain.tld/ [R=302,L]

Redirect to external site
The possibilities here are endless. Just make sure you think twice about the destination, as any scum that you redirect to another site will be seen as coming from your own. Even so, here is the code that you would use to replace the RewriteRule directive in any of the examples above:

RewriteRule ^(.*)$ http://external-domain.tld/some-target/page.html [R=302,L]

Redirect them back to their own site
This is one of my favorites. It’s like having a magic shield that reflects attacks back at the attacker. Send a clear message by using this code as the RewriteRule directive in any of our blacklisting methods:

RewriteRule ^(.*)$ http://%{REMOTE_ADDR}/ [R=302,L]

Custom processing
For those of you with a little skill, it is possible to redirect your unwelcome guests to a fail-safe page that explains the situation to the client while logging all of the information behind the scenes. This is perhaps the most useful approach for understanding your traffic and developing an optimal security strategy. The code would look something like this, depending on your file name and its location:

RewriteRule ^(.*)$ /home/path/blacklisting-script.php [L]
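
Another way to approach custom processing is to keep the 403 response and hand it to a script through ErrorDocument. This is only a sketch: the /blacklisted.php path is hypothetical, and the user-agent condition merely stands in for whichever blacklist conditions you actually use:

```apache
# Hypothetical script path; point this at your own logging/explanation page.
ErrorDocument 403 /blacklisted.php

<IfModule mod_rewrite.c>
 RewriteEngine On
 # Let the error script itself through, otherwise the 403 page is forbidden too.
 RewriteCond %{REQUEST_URI} !^/blacklisted\.php$
 # Stand-in condition; substitute any of the blacklist conditions above.
 RewriteCond %{HTTP_USER_AGENT} (nikto|libwww-perl) [NC]
 RewriteRule ^ - [F,L]
</IfModule>
```

With this arrangement the client still receives a 403 status, and Apache passes details of the original request to the script through environment variables prefixed with REDIRECT_, such as REDIRECT_URL.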

Closure

This article presents eight effective techniques for protecting your server and preventing malicious behavior. While each of these methods may be used individually, they are designed to secure different aspects of your environment and thus provide a more complete type of firewall protection when combined into a synergized whole. Even when combining these techniques, however, keep in mind that blacklisting various protocols serves to complement a more robust and comprehensive security strategy. Once understood, these methods provide the average webmaster an easy, effective way of defending against unwanted behavior and enhancing the overall security of their sites.
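
As a rough illustration of combining techniques, several of the conditions from this article can share a single rule. The selection below is arbitrary and only a sketch; pick the conditions that suit your own site:

```apache
<IfModule mod_rewrite.c>
 RewriteEngine On
 # Conditions are chained with [OR]; the final condition must not carry [OR].
 RewriteCond %{REQUEST_METHOD}  ^(delete|trace|track)             [NC,OR]
 RewriteCond %{THE_REQUEST}     (\\r|\\n|%0A|%0D)                 [NC,OR]
 RewriteCond %{HTTP_COOKIE}     (<|>|'|%0A|%0D|%27|%3C|%3E|%00)   [NC,OR]
 RewriteCond %{QUERY_STRING}    (localhost|loopback|127\.0\.0\.1) [NC,OR]
 RewriteCond %{HTTP_USER_AGENT} ^$
 RewriteRule ^ - [F,L]
</IfModule>
```

A single block like this is easier to maintain than many separate ones, at the cost of making it harder to see which condition blocked a given request.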

45 Responses

  1. Shane Arthur says:

    Very interesting indeed. Just curious, If someone goes with a standard host like MidPhase and does a one-click install, what security measures is one likely to get by default as compared to what you mentioned here?

    Regards
    Shane

  2. Donace says:

    Looking forward to the new release!

    One point in regards to the last chunk of code about 403’s etc; is it possible to redirect them to their homepage with a ‘banner’ on top saying whoopsie daisy you were not meant to be there…and then take over their server?

    Nah i’m joking…(kinda)looks great!

    I’ll also point out AskApache’s security thing for more ideas, as I usually use it in conjunction with your list, and hey, keeping an eye on one is easier ;)
    (http://www.askapache.com/htaccess/htaccess-plugin-blocks-spam-hackers-and-password-protects-blog.html)

  3. Tuan Anh says:

    Thank you for the great list of methods. I like the return-back-to-attacker’s-website method, it’s so smart.

    And I’m wondering about blocking by request string, because maybe these strings are valid, e.g., when we write an article about md5 or benchmark. Does our method work in this situation?

  4. Kristian says:

    This was a great walkthrough of blacklist methods, and I suspect that it might be handy for me. Bookmarked into my Web Development folder.

  5. Jeff Starr says:

    @Shane Arthur: Good question, but difficult to know for sure without access to the server configuration, software, etc. That said, I am confident that most of the “big-name” hosts implement some pretty solid security strategies out of necessity; some of the smaller, independent hosts, I wouldn’t be so sure about..

    The different blacklisting methods described in this article are best suited as reinforcements to a more comprehensive strategy. I recommend checking your server access logs periodically and checking for foul play. If you see tons of nonsense getting through, the methods described in this article will help you craft both targeted and general defense mechanisms.

  6. Jeff Starr says:

    @Donace: hehe, you said it. I would like to devise a method that hunts down and disembowels any cracker or spammer scumbag that goes anywhere near my sites. You know, something that chops them up into little pieces and feeds the still-warm chunks of cracker flesh to a small pack of hungry Pomeranians! Now that would be a script worth paying for! ;)

  7. Jeff Starr says:

    @Tuan Anh: Thanks for the positive feedback. To answer your question, no, the methods described in the article are carefully designed to not interfere with any legitimate URL requests. For example, the md5 and benchmark strings are only blocked in query strings, where they have no business existing in the first place. For the root portion of URLs, we only target invalid characters and other entities frequently associated with malicious behavior.

  8. Jeff Starr says:

    @Kristian: Excellent — Glad to hear it! :)

  9. debu says:

    It’s really a good post. I appreciate it :-)

  10. AskApache says:

    Wow great article perishable, so clear/concise/accurate and compelling! Well done.

  11. Jeff Starr says:

    @AskApache: Thank you! Your site was a great help in researching this article. Keep up the good work! :)

  12. rc says:

    great stuff! My httpd seems to 403 choke on your first URI rule. gotta work on my regular expressions…

  13. Jeff Starr says:

    @rc: Yeh, that’s one of the downfalls of not having my own server. Most everything I do with Apache directives happens through the HTAccess file. Thus, articles such as this one are presented from that perspective. And as you know, there is a bit of a difference in syntax (among other things) between HTAccess directives and rules implemented via httpd.conf.

  14. rc says:

    sorry to be imprecise. I am jamming your rules into .htaccess (I don’t have access to httpd.conf (MediaTemple))

    the URI being 403ed is of the form http://domain.com/

  15. rc says:

    i think I found it. i believe you are improperly catching the final forward slash

  16. Jeff Starr says:

    Help me out here, rc — there’s quite a bit of code in the article.. which directives are at issue here?

  17. rc says:

    I’m referring to your first URI rule.

  18. rc phelps says:

    hi jeff,

    is the following httpd log file entry indicative of mischievous behavior:

    [Thu Apr 09 08:22:24 2009] [error] [client 204.234.223.2] Request exceeded the limit of 10 internal redirects due to probable configuration error. Use 'LimitInternalRecursion' to increase the limit if necessary. Use 'LogLevel debug' to get a backtrace.

    Curious that the state of Nebraska would be banging on my lowly webserver.

    thanks for any insight.

    rc phelps

  19. Jeff Starr says:

    @rc phelps: That error message is telling you that the URL request resulted in too many redirects, as specified by your server. This means that something requested a web page that was redirected by your server. That redirect was then redirected to another resource, and then that redirect was redirected, and so on until the maximum number was reached. This is probably due to some unruly HTAccess directives or something script-related. I.e., most likely not the result of mischievous behavior.

  20. kg says:

    I also had trouble with the first REQUEST_URI line

    RewriteCond %{REQUEST_URI} ^.*(,|;|:|<|>|">|"<|/|\\\.\.\\).* [NC,OR]

    I had to remove the forward slash from the list, like so

    RewriteCond %{REQUEST_URI} ^.*(,|;|:|<|>|">|"<|\\\.\.\\).* [NC,OR]

  21. Jeff Starr says:

    @kg: Thanks for the information. Escaping the character may also have worked ( \/ ):

    RewriteCond %{REQUEST_URI} ^.*(,|;|:|<|>|">|"<|\/|\\\.\.\\).* [NC,OR]

    Although I haven’t tested it..

  22. Vladimir says:

    Here’s another rule set that blocks many HTTP-scanners, maybe someone will find it useful:

    RewriteEngine On

    RewriteCond %{QUERY_STRING} [^?]*\? [OR]
    RewriteCond %{QUERY_STRING} (\.\./|\.\.\\) [OR]
    RewriteCond %{QUERY_STRING} (///) [OR]
    RewriteCond %{THE_REQUEST} "^(GET|POST) /?https?:" [OR]
    RewriteCond %{THE_REQUEST} "^(GET|POST|HEAD) //"
    RewriteRule (.*) $1 [F]

    The first RewriteCond checks if the query string has more than one question mark (this pattern is used in some attacks; moreover, extra question marks should be encoded to %3F), the second one tries to prevent directory traversal attacks (for both Windows and Linux hosts), the third one disallows three or more slashes in the query string (common pattern in many attacks), and the fourth and fifth ones stop proxy checkers.

  23. Jeff Starr says:

    Another excellent post, Vladimir — thanks for sharing with us. This is a great set of HTAccess security directives, some of which are already included in my 4G Blacklist in the “Query String Exploits” section. I like the check for double question marks, and the proxy-checking directives are just plain sexy. A couple of questions for you:

    1. What are your thoughts on simply blocking any instances of two periods (i.e., \.\. )?
    2. What exactly is going on in the RewriteRule? Seems like a possible typo?

    Thanks again for the comment :)

  24. Vladimir says:

    What are your thoughts on simply blocking any instances of two periods

    Well, maybe… This depends upon what comes in GET and what in POST. For example, when you search for something in WordPress, the string is passed in GET request. A visitor could make a typo and by accident put two periods and it would not be user-friendly to show a 403 page.

    And, if you try a directory traversal attack, you still need to use either forward or backward slash — cf. ..etc/password and ../etc/password. So it looks like a slash is a must in this type of attacks.

    What exactly is going on in the RewriteRule? Seems like a possible typo?

    RewriteRule (.*) $1 [F]

    That is, anything that matches RewriteCond’s gets banned ([F])

    Well, maybe

    RewriteRule .* - [F]

    is better, but both worked for me.

    Would you mind if I scan your site with Nessus and Nikto? This can give you more attack patterns. I will launch the scanner from 195.10.218.132, please do not ban me :-)

  25. Jeff Starr says:

    I am on the fence about blocking any instance of two simultaneous periods in the query string. Then the presence of a forward-vs-backslash along with the request is also an interesting dilemma. Currently, I only block the case when \.\.\/ is present in the query string, but I am thinking that blocking the backslash case is also a good idea. I almost blocked backslashes (either one or two) via mod_alias as well. I just can’t imagine why they would be needed unless encoded. Anyway, food for thought.

    I assumed that your RewriteRule was blocking any matching requests, but I had never seen that particular flavour before, so I thought I would ask. Very interesting.

    Go ahead and scan my site using the specified IP. I am currently running a series of tests myself, so it will be interesting to see the results given the current conditions. I won’t ban you ;)

  26. Rima says:

    Hi
    I have a WordPress blog and I’d like to hide the login URL totally.. or to be honest I don’t want visitors to know that I’m using WordPress because it’s not the only thing running the website.

    I have tried a wordpress plug-in that generates this for example

    RewriteRule ^logout wp-login.php?action=logout&_wpnonce=b4318ad0cd&stealth_out_key=cbmfojbhqsdaxjem1q0jyzyivq [L]

    well that’s nice, but when you go to /login it redirects and shows in the address bar wp-login?action=……. etc.

    I don’t want that to happen, I want the whole thing to be totally masked.

    Any advice?

  27. Jeff Starr says:

    Hi Rima, the URL of any webpage will always be available to the visitor. There is no way (of which I am aware) to completely hide it. Fascinating that you actually would want to deprive users of that information.

  28. 2hei says:

    Hi
    my site uses https; all http requests are rewritten to https.

    RewriteEngine on
    RewriteCond %{SERVER_PORT} !^443$
    RewriteRule ^/special-interface - [L]
    RewriteRule ^(.*)?$ https://%{SERVER_NAME}$1 [L,R=301]

    I find that a POST request over http, once rewritten to https, becomes a GET, and all posted data are lost. Do you have any solution? Or can’t mod_rewrite handle the POST method?

  29. Ciaran says:

    Brilliant Article. Very comprehensive and well written.

    I need to do the following.

    Rewrite all requests that try to POST to /form_submit.jsp where the referrer URL is:

    http://mysite.mydomain.com/my_new_thing.jsp?MessageID=-1&SID=_&SID=1234_56789

    The referrer URL above is perfectly valid & acceptable, what i need to block is any request where the query string contains SID=1234_56789

    Any help would be greatly appreciated.

  30. John says:

    I’m trying to block proxies with some success, but I’ve found out someone is using http://www.mywebtunnel.com/ which selects at random. One of them http://www.8cap.info gives the following entry in the apache logs :-

    . - - [15/Jul/2010:22:23:03 -0600] "GET / HTTP/1.1" 200 9032 "-" "Mozilla/5.0 (X11; U; Linux i686; en-GB; rv:1.9.2.4) Gecko/20100622 Fedora/3.6.4-1.fc13 Firefox/3.6.4"

    How can I ban an IP address of “.” (first field)?

    Thanks, John.

  31. John says:

    Found the problem: reverse DNS lookups were enabled in the logs, and the site had set their reverse lookup to a dot “.”

  32. Jeff Starr says:

    Thanks for the follow-up information John :)

  33. Paul says:

    I want to use the “Blacklist via the Referrer” to block my site from everybody except when they come from a certain web page.

    example: My site is only accessible if they come from a google site.
    Any help would be greatly appreciated.

  34. John says:

    To block a referrer from google I use this in the .htaccess

    RewriteEngine on
    RewriteCond %{HTTP_REFERER} ^http://google\.com [NC,OR]
    RewriteCond %{HTTP_REFERER} ^http://www\.google\.com
    RewriteRule ^.* - [F,L]

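    So I think the opposite would be the following. Note that the negated conditions have to be ANDed (no OR flag); with OR, every referrer fails to match at least one of the two patterns, so every request would be blocked:

    RewriteEngine on
    RewriteCond %{HTTP_REFERER} !^http://google\.com [NC]
    RewriteCond %{HTTP_REFERER} !^http://www\.google\.com [NC]
    RewriteRule ^.* - [F,L]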

  35. Larry Brown says:

    Hi Jeff: Thanks for the great article.
    Just in case you don’t know yet, I can explain Vladimir’s syntax for the RewriteRule:

    RewriteRule (.*) $1 [F]

    The .* matches the entire request string, so if the request string was “/index.html”, the .* will evaluate to “/index.html”, that is, the entire string. The parentheses around that expression mark it as something that can be referred to later with the $1 syntax. The $1 refers back to the string that was matched by the “.*”. First off, we can say that Vladimir’s $1 is, at best, superfluous, because he could have just put a “-” there, which would mean “don’t perform any substitution at all, it’s not needed”, and the [F] would send the requester to the “forbidden” page. Vladimir already said that this gives him the same behavior. However, the $1 has me confused a little. That would mean that if a user requests /index.html, Vladimir’s rule would expand out to:

    RewriteRule /index.html /index.html [F]

    I would have thought that the [F] string would unconditionally send the requester to the forbidden page, regardless of the “substitution” string (the 2nd string in the RewriteRule), and in fact I do think that is the behavior that Vladimir is getting, which is why his $1 is simply superfluous and not actually wrong. However, you give some examples of sending the requester off to a different place with: RewriteRule ^(.*)$ http://your-domain.tld/ [F,L], indicating that in at least some situations the RewriteRule with the [F] does take the substitution string into account and send the requester there. Perhaps because you are including the http:// on your substitution string it works that way, but in Vladimir’s case it tries to send the requester to somewhere on the local domain and thus ignores that since it’s all forbidden.

    In your examples you used “^(.*)$” to do that same job. That expression says: starting at the start of the request string, match any string all the way to the end; again, the parentheses mark it as something that can be referred to later. Your expression is overkill. You’re not referencing the matched string later with the $1 syntax, so there’s no need to include the parens; thus you could just say “^.*$”. And if you’re matching the entire string, there’s no need to specify the beginning and/or end, so you can just say “.*”. Thus, your rules of the form:

    RewriteRule ^(.*)$ - [F,L]

    Would work the same and be simpler (and therefore better) if written as:

    RewriteRule .* - [F,L]

    I’m fuzzy on Vladimir’s substitution, but at least I hope that I’ve explained the parens, the $1, and the lack of ^ and $.

  36. Jennifer says:

    Thanks for all the work put into all of your terrific articles. I have always been plagued with questions and confused when reading articles about mod_rewrite rules and conditions. So I’m going to risk asking for some explanation here because there are so many examples.

    I’m assuming that if one employs all of the techniques in this article that they would be in one statement within the .htaccess file. Is that correct?

    The reason I ask is that in every example there are always these common statements.
    The opening if statement
    RewriteEngine On
    RewriteRule ^(.*)$ - [F,L]
    and the closing if statement.

    The first example is the only exception to this because it includes these statements:
    ServerSignature Off
    Options +FollowSymLinks

    I’m assuming again that these statements could be used within a single block of code in conjunction with the conditions without error. Correct?

    So ultimately, what I’m asking is: can’t one use a single block of code in the .htaccess file, instead of all the redundancy?

    Example:
    Opening if statement
    RewriteEngine On
    ServerSignature Off
    Options +FollowSymLinks

    All Rewrite conditions from each example here

    RewriteRule ^(.*)$ - [F,L]

    Closing if statement

    Thanks for allowing me to ask what may seem like a simple-minded question.

  37. Jeff Starr says:

    Hi Jennifer,

    Yes very true, there are many ways to fashion your Apache directives, but this article treats the eight techniques as independently functioning examples that people can grab and modify without having to read the entire article.

    A good example of combining these different techniques is seen in these htaccess blacklists:

    Thanks for the great questions! :)
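
    As a rough sketch of the single-block idea Jennifer describes (this is not taken from the blacklists mentioned above), several of the article's conditions can share one RewriteRule by chaining them with [OR]. The referrer domain example.com and the shortened user-agent list are placeholders chosen for illustration.

    ```apache
    <IfModule mod_rewrite.c>
     RewriteEngine On
     # Conditions are chained with [OR]; the last one must not carry
     # the flag, otherwise it would be ORed with nothing
     RewriteCond %{REQUEST_METHOD} ^(delete|head|trace|track) [NC,OR]
     RewriteCond %{HTTP_REFERER} ^http://(www\.)?example\.com [NC,OR]
     RewriteCond %{HTTP_USER_AGENT} (HTTrack|nikto|libwww\-perl) [NC]
     # One rule denies any request that matched at least one condition
     RewriteRule ^(.*)$ - [F,L]
    </IfModule>
    ```

    The trade-off is that a single combined block is harder to debug than independent ones: one mistyped condition or a stray [OR] on the last line affects every check at once.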

  38. Byzon says:

    I have a website that has been heavily flooded over the last few days.
    The flood comes through scripts that generate fake URL requests to my site. I want to block them but I don’t know how. Can you please help me block these types of requests?

    109.193.54.12 - - [23/Jan/2011:20:16:32 -0500] "GET /index.php?app=forums&module=ajax&section=markasread&secure_key=4qgvsqett1glepq2o5p1exqazhndd6hl&i=1&forumid=471&secure_key=4qgvsqett1glepq2o5p1exqazhndd6hl HTTP/1.0" 403 432 "-" "Mozilla/5.0 (Windows; U; Windows NT 6.1; de; rv:1.9.2.13) Gecko/20101203 YFF35 Firefox/3.6.13"

    OR

    95.174.209.84 - - [23/Jan/2011:20:31:54 -0500] "GET /index.php?app=forums&module=ajax&section=markasread&secure_key=tkqnx2x8ceo1ipglifpmo7qepd7s1ogo&i=1&forumid=361&secure_key=tkqnx2x8ceo1ipglifpmo7qepd7s1ogo HTTP/1.0" 403 432 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; Maxthon; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C; .NET4.0E; InfoPath.2; Creative AutoUpdate v1.10.10)"

    OR

    79.170.163.222 - - [23/Jan/2011:20:30:13 -0500] "GET /public/style_images/test/topt-bg.gif HTTP/1.0" 200 9746 "http://forum.soundarea.org/index.php?/forum/236-gevkpsyrln/page__prune_day__100__sort_by__A-Z__sort_key__starter_name__topicfilter__all__st__1380" "Opera/9.80 (Windows NT 5.1; U; ru) Presto/2.7.62 Version/11.00"

    OR

    92.72.151.58 - - [23/Jan/2011:20:30:46 -0500] "GET /index.php?/forum/98-tuaeiwmbzh/page__prune_day__100__sort_by__A-Z__sort_key__starter_name__topicfilter__all__st__1540 HTTP/1.0" 200 26817 "-" "Opera/9.80 (Windows NT 6.1; U; ru) Presto/2.7.62 Version/11.00"

    OR

    2.37.132.232 - - [23/Jan/2011:20:30:45 -0500] "GET /index.php?/forum/638-vlyxkruhxe/page__prune_day__100__sort_by__Z-A__sort_key__last_poster_name__topicfilter__all__st__1420 HTTP/1.0" 200 23762 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; BTRS28059; GTB6.6; MRA 5.7 (build 3757); MRSPUTNIK 2, 3, 0, 288; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30

    Thank you in advance!

  39. Ben says:

    Jeff, I am looking at your code for blocking by User-Agent. Here is the rule that you have written for putting into a VHost file:

    <IfModule mod_rewrite.c>
     RewriteEngine On
     RewriteCond %{HTTP_USER_AGENT} ^$ [OR]
     RewriteCond %{HTTP_USER_AGENT} ^.*(<|>|'|%0A|%0D|%27|%3C|%3E|%00).* [NC,OR]
     RewriteCond %{HTTP_USER_AGENT} ^.*(HTTrack|clshttp|archiver|loader|email|nikto|miner|python).* [NC,OR]
     RewriteCond %{HTTP_USER_AGENT} ^.*(winhttp|libwww\-perl|curl|wget|harvest|scan|grab|extract).* [NC]
     RewriteRule ^(.*)$ - [F,L]
    </IfModule>

    What I want to know, is what does this line mean:

    RewriteCond %{HTTP_USER_AGENT} ^$ [OR]

    It appears to block any request where the user agent is not specified, or is specified only as “-”. Can you elaborate on what exactly this line means? I just read the mod_rewrite documentation and that is the only thing I can’t piece together.

    The issue is that I want to block bots, but we have a piece of software that we are writing to send requests to our server and it seems to send its user-agent header in as “-”. I have corrected the issue with one of our programmers and now the newest version of the software sends out its name as the user-agent header, but the old versions of the software (which are out in the field) still don’t have a proper user-agent string, and I can’t block them.

  40. Ben says:

    Ok after some further investigation, it seems that I have confirmed my suspicion, and that this line will block empty User Agent headers.

    RewriteCond %{HTTP_USER_AGENT} ^$ [OR]

    From this site: http://johannburkard.de/blog/www/spam/block-empty-user-agent.html

    I think it would be good if you put something in your tutorial here indicating what that line does, as some people may try to use it and be unaware that they are blocking empty User Agent headers. I realize the merits behind doing it, and that in a lot of cases it is probably a smart idea to do it if you are blacklisting by User Agent, however, I think it’s important for people to know that they are doing that, and based on reading your description of what I was doing with those lines of code, I didn’t realize that I was blocking empty User Agents.
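
    To make the effect of that single condition concrete, here is a minimal sketch that isolates it. The pattern ^$ matches only an empty string, so the rule fires when the User-Agent header is missing or empty; access logs typically print such requests with “-” in the user-agent field. This is an illustration of the one line discussed above, not a recommended blacklist.

    ```apache
    <IfModule mod_rewrite.c>
     RewriteEngine On
     # ^$ = start of string immediately followed by end of string,
     # i.e. the User-Agent header is absent or empty
     RewriteCond %{HTTP_USER_AGENT} ^$
     RewriteRule ^(.*)$ - [F,L]
    </IfModule>
    ```

    Anyone who needs to keep serving clients that send no User-Agent could leave this condition out of the larger block and keep only the name-based conditions.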

  41. Ali says:

    I was looking in the ‘block ranges of IP addresses’ to be able to redirect all Chinese visitors to my site to a different page (they can’t access Vimeo or YouTube and I want to offer them the download anyway, but only them; the others can access the normal video). But I can’t seem to get it happening. When I paste the code into the .htaccess file and upload, I get a 403 access denied, whatever IP-address I include. Any suggestions? Sorry I have to bother you (I’m no pro in this field :-), but this does seem like the best site around. Cheers, Ali

  42. Jeff Starr says:

    What is it specifically that you are adding to htaccess that is causing the error? (Please wrap each line of code with <code> tags before posting comment). If you are using any of the code from the article, keep in mind that they are meant as examples only – using them may require further customization.
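
    As an example of the kind of customization meant here: the article's blacklist rules end in [F], which always answers with a 403, so pasting them unchanged would deny rather than redirect. A hedged sketch of a redirect variant follows. The address range 203.0.113.x is a documentation-only placeholder and /download.html is a hypothetical target page; neither comes from the article.

    ```apache
    RewriteEngine On
    # Placeholder range; replace with the ranges to be redirected
    RewriteCond %{REMOTE_ADDR} ^203\.0\.113\.
    # Skip the target page itself to avoid a redirect loop
    RewriteCond %{REQUEST_URI} !^/download\.html$
    # Temporary redirect instead of the [F] used in the blacklist examples
    RewriteRule ^ /download.html [R=302,L]
    ```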

  43. stOrM! says:

    Very nice article!
    If you don’t mind a .htaccess noob asking a question related to block some spam freaks on my site…

    I do have a lot of spam coming from two different hosts:

    unassigned.psychz.net

    and

    173.234.159.194.rdns.ubiquityservers.com

    The last one uses a lot of different IP addresses, so my htaccess looks like this, e.g.:

    Deny from ubiquityservers.com
    Deny from rdns.ubiquityservers.com
    Deny from 69.147.227.178
    Deny from 69.147.227.194

    I don’t know if that is the correct syntax for stopping those two referrers from accessing my site. I guess it isn’t, since they still come back without any problem. So a little help or a suggestion would be highly appreciated…

    Kindest regards,
    s!

  44. Michael says:

    hi Jeff

    Some nice use of the rewrite and rewritecond. I’ve spent a bit of time looking for a solution with no luck, hopefully you may have some experience that will help.

    I wish to rewrite the request method.

    For example, if the request method is DELETE “/file/file-to-delete.txt”

    I would like to rewrite it with
    MOVE “/file/file-to-delete.txt” “/file/deleted/file-to-delete.txt”

    Any suggestions or links would be greatly appreciated.

    Regards
    Michael

  45. Nizzy says:

    You are missing the awesome mod_rewrite DB way, so that you can build your DB dynamically.

    http://perlcode.org/tutorials/apache/attacks.html
