Perishable Press

Building the 3G Blacklist, Part 2: Improving Security by Preventing Query-String Exploits

[ 3G Stormtroopers (Green Machine) ]

In this continuing five-article series, I share insights and discoveries concerning website security and protecting against malicious attacks. In this second article, I present an incredibly powerful method for eliminating malicious query string exploits. Subsequent articles will focus on key blacklist strategies designed to protect your site transparently, effectively, and efficiently. At the conclusion of the series, the five articles will culminate in the release of the next generation 3G Blacklist.

Improving Security by Preventing Query String Exploits

A vast majority of website attacks involve appending malicious query strings onto legitimate, indexed URLs. Any webmaster serious about site security is well familiar with the following generalized access-log entries:




Generally, sites that suffer query-string attacks receive hundreds or thousands of such URL requests during relatively short periods of time. Apparently, these URLs are generated automatically via scripts executed from decentralized networks of compromised computers. As discussed elsewhere, these automated “zombie” attacks leave tracks that are difficult to unify via some common element. For example, consider the following:

  • Attackers frequently and randomly disguise declared user agents
  • Scripts are generally executed via random sets of IP addresses
  • Referrer information is typically absent from site access logs
  • Attacks target a wide variety of well-known (indexed) URLs
  • Query strings are usually appended to unpredictable file names
  • Query strings consist of unpredictable character sequences

These disassociated characteristics make it difficult to predict, and thus protect against, future attacks. Fortunately, a large number of the URL requests employing malicious query strings contain a secondary URL, which is most likely the address of some nasty exploit script. These secondary URLs generally consist of apparently random, unpredictable sequences of characters and terms. Nonetheless, virtually all of these secondary query-string URLs include one of these three transfer protocols:

  • ftp://
  • http://
  • https://

But there’s a catch: when used in secondary URLs, these protocol prefixes are not always complete. Either intentionally or not, URLs in query strings often register improperly, with one or both slashes, the colon, or even portions of the http (or equivalent) omitted entirely. Thus, trying to match any of these character strings too closely will inevitably lead to false negatives.

On the other hand, legitimate inclusion of various transfer protocols is quite common, especially among content management platforms. For example, the deathly popular WordPress passes HTTP referrer data via query strings during various comment management tasks. In the process, the characters “http_” are included in the query string. Thus, matching any of the protocol strings too loosely will inevitably lead to false positives, and worse.

The key to blocking the vast majority of malicious query-string exploits is precise regex pattern matching that keeps both false negatives and false positives to a minimum. Fortunately, by testing the QUERY_STRING server variable with the RewriteCond directive of Apache’s mod_rewrite module, it is possible to block most of these exploits effectively and efficiently.
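To make the trade-off concrete, here is a minimal Python sketch that compares three candidate patterns against a handful of query strings. Python's `re` module stands in for Apache's regex engine, which behaves the same way for patterns this simple. The sample query strings and the names `SAMPLES`, `PATTERNS`, and `blocked_by` are invented for illustration and do not come from real logs.

```python
import re

# Hypothetical query strings, invented for illustration only.
SAMPLES = {
    "full_url":      "page=http://example.com/shell.txt?",
    "missing_slash": "page=http:/example.com/shell.txt?",
    "no_slashes":    "page=http:example.com/shell.txt?",
    "wp_referer":    "_wp_http_referer=/wp-admin/edit-comments.php",
}

# Three ways of matching the same protocol prefix.
PATTERNS = {
    "strict":  re.compile(r"http://", re.IGNORECASE),  # matched too closely
    "loose":   re.compile(r"http", re.IGNORECASE),     # matched too loosely
    "article": re.compile(r"http:", re.IGNORECASE),    # protocol plus colon
}


def blocked_by(pattern_name):
    """Return the names of the samples that the named pattern would block."""
    pattern = PATTERNS[pattern_name]
    return sorted(name for name, qs in SAMPLES.items() if pattern.search(qs))


if __name__ == "__main__":
    for name in PATTERNS:
        print(name, blocked_by(name))
```

With these samples, the strict pattern misses the two malformed prefixes (false negatives), the loose pattern also blocks the legitimate “http_” string (a false positive), and the protocol-plus-colon pattern blocks all three exploit-style strings while leaving the legitimate one alone.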

The Magic Bullet

To implement this elegant blacklist solution, place the following code into either your site’s root htaccess file or your server’s configuration file (Note: this method will also be included in the final/release version of the 3G Blacklist):

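<ifmodule mod_rewrite.c>
 # Assumes "RewriteEngine On" is already set for this context
 # Deny any request whose query string contains ftp:, http:, or https:
 RewriteCond %{QUERY_STRING} ftp\:   [NC,OR]
 RewriteCond %{QUERY_STRING} http\:  [NC,OR]
 RewriteCond %{QUERY_STRING} https\: [NC]
 RewriteRule .* -                    [F,L]
</ifmodule>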

Upload and test, test, test. I have been testing these directives here at Perishable Press for several weeks now, and have experienced excellent results. This single chunk of code is responsible for a dramatic 80% drop in the overall number of server attacks as recorded in my access and error logs. Highly recommended! ;)

Briefly, let’s examine how this code works. First, notice that the four rewrite directives are enclosed within an <ifmodule> container. This prevents the code from crashing your site should the required Apache module ( mod_rewrite ) prove unavailable. Within the container, three RewriteCond directives match three commonly deployed protocol prefixes: ftp:, http:, and https:. The [NC] flag makes each match case-insensitive, and the [OR] flag chains the conditions so that any single match is enough. As previously discussed, these patterns omit the two forward slashes to reduce the number of false negatives. Finally, whenever one of the conditions matches, the RewriteRule in the last line returns a 403 Forbidden response for the request. Nice, clean, and easy. ;)
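For readers who prefer to see the logic spelled out, the following Python sketch models how the directives are evaluated. It is only an approximation of mod_rewrite, not a replacement for testing on a real server: `status_for` is a hypothetical helper, the URLs are invented, and the 200 return value is merely a placeholder for “the request passes through untouched”.

```python
import re
from urllib.parse import urlsplit

# One compiled pattern per RewriteCond; re.IGNORECASE stands in for [NC].
CONDITIONS = [
    re.compile(pattern, re.IGNORECASE)
    for pattern in (r"ftp\:", r"http\:", r"https\:")
]


def status_for(url):
    """Return 403 when any condition matches the query string, else 200.

    200 is only a placeholder for "the request is passed along untouched".
    """
    query = urlsplit(url).query  # the part after "?", as in %{QUERY_STRING}
    if any(cond.search(query) for cond in CONDITIONS):  # [OR] chaining
        return 403  # RewriteRule .* - [F,L]
    return 200


if __name__ == "__main__":
    # Hypothetical request URLs, invented for illustration only.
    for url in (
        "/index.php?page=HTTP://example.com/x.txt",
        "/index.php?page=https:/example.com/x.txt",
        "/index.php?id=42",
        "/http:/index.php?id=42",
    ):
        print(status_for(url), url)
```

Two details are worth noticing. The pattern `http\:` does not match “https:”, because an “s” sits between “http” and the colon, so the third condition is not redundant. And only the query string is tested, so a protocol-like string in the path portion of the URL does not trigger the rule.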

Wrap Up..

Once in place, this method will effectively eliminate a significant number of malicious query-string attacks. Upon investigation of your access/error logs, you should see a dramatic decrease in the number of rogue 404 Not Found responses, and a dramatic increase in the number of returned 403 Forbidden responses. Overall, this is one of the most elegant, effective, and efficient blacklisting techniques I have ever had the pleasure of using. :)
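One way to check for that shift is to tally status codes in the access log before and after adding the rules. The Python sketch below assumes the log uses Apache's Common or Combined Log Format, where the status code follows the quoted request line. The log lines in `SAMPLE_LOG` are invented for illustration, and `count_statuses` is a hypothetical helper.

```python
import re
from collections import Counter

# Captures the status code that follows the quoted request line in
# Apache's Common/Combined Log Format. Lines in other formats are skipped.
LOG_RE = re.compile(r'^\S+ \S+ \S+ \[[^\]]*\] "[^"]*" (\d{3}) ')


def count_statuses(lines):
    """Tally HTTP status codes found in an iterable of access-log lines."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical log lines, invented for illustration only.
SAMPLE_LOG = [
    '203.0.113.5 - - [01/Jan/2008:00:00:01 +0000] "GET /index.php?page=http://example.com/x.txt? HTTP/1.1" 403 210',
    '203.0.113.9 - - [01/Jan/2008:00:00:02 +0000] "GET /about/?inc=ftp:example.com/y.txt HTTP/1.1" 403 210',
    '198.51.100.7 - - [01/Jan/2008:00:00:03 +0000] "GET /missing-page/ HTTP/1.1" 404 512',
    '198.51.100.8 - - [01/Jan/2008:00:00:04 +0000] "GET / HTTP/1.1" 200 5120',
]

if __name__ == "__main__":
    counts = count_statuses(SAMPLE_LOG)
    print("403:", counts["403"], "404:", counts["404"])
```

To use it on a real log, pass an open file object instead of `SAMPLE_LOG` and compare the tallies for a period before the change with a period after it.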

As always, if you are unable to access certain URLs on your site after implementing this method, immediately comment out the entire block, RewriteRule included, and try again. (Commenting out only the three RewriteCond lines would leave the RewriteRule unconditional, and it would then forbid every request.) If you then have access, check the URL for a query string containing one of the blocked character strings. Depending on what you find, uncomment and/or edit the code as necessary. If you need further assistance or have specific questions, leave a comment or contact me directly and I will do my best to help you out.
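The troubleshooting check described above can be sketched in a few lines of Python. This is a rough diagnostic aid rather than part of the blacklist itself: `blocked_strings_in` is a hypothetical helper, and the URLs in the usage example are invented.

```python
from urllib.parse import urlsplit

# The three character strings targeted by the RewriteCond directives.
BLOCKED = ("ftp:", "http:", "https:")


def blocked_strings_in(url):
    """List the blocked strings that occur in the URL's query string.

    The comparison is case-insensitive, mirroring the [NC] flag.
    """
    query = urlsplit(url).query.lower()
    return [blocked for blocked in BLOCKED if blocked in query]


if __name__ == "__main__":
    # Hypothetical URLs, invented for illustration only.
    print(blocked_strings_in("/wp-admin/post.php?redirect=https://example.com/"))
    print(blocked_strings_in("/page?id=7"))
```

An empty list suggests that the blocked request was caught by something other than these three conditions.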


Stay tuned for the continuation of Building the 3G Blacklist, Part 3: Improving Site Security by Selectively Blocking Rogue User Agents. If you have yet to do so, I highly encourage you to subscribe to Perishable Press. As always, thank you for your generous attention.

Jeff Starr
About the Author Jeff Starr = Web Developer. Book Author. Secretly Important.
12 responses
  1. I’m not sure I properly understand the following paragraph.

    But there’s a catch: when used in secondary URLs, these protocol prefixes are not always complete. Either intentionally or not, URLs in query strings often register improperly, with one or both slashes, the colon, or even portions of the http (or equivalent) omitted entirely. Thus, trying to match any of these character strings too closely will inevitably lead to false negatives.

    So, I probably don’t understand the idea of forbidding URLs matching the http/https/ftp: structure, when in the previous post (Part 1), you simply blocked any request containing the // item.

    Could you please explain the “catch” again, so I can understand the idea of this second part.

    By the way, it was a great read as usual Jeff. Very clean.

  2. Jeff Starr

    Hi Louis :)

    Keep in mind that this post deals with targeting and blocking protocol data (e.g., ftp:, http:, https:) in the query string of malicious URL requests. In the previous article, we targeted double slashes in the root portion of the URL. The directives used for matching query strings do not consider the root URL. Likewise, the directives used to match the root URL will not match against the query string. So basically they need to be treated separately. Hopefully this helps! :)

  3. Thanks for the explanation. Though, I still may not understand.

    You mean that RedirectMatch 403 // wouldn’t block a request like the following one?


    [ Edit: URL split into two lines to prevent breaking layout ]

    The RedirectMatch directive does not scan the whole URL?

    Although, if that is the case, wouldn’t it be more efficient to use this directive?

    RewriteCond %{QUERY_STRING} // [NC,OR]

    Sorry if my questions look strange. Sometimes the language is a real barrier :/

  4. Jeff Starr

    No, RedirectMatch does not match against query strings. And your second directive would match the double slashes in the query string only. This is okay, but the double-slash string is generally located before the query string. Targeting double slashes in the URL may be effective to some degree, but is not as effective as explicitly matching the protocol strings as discussed in the article. Btw, your questions are great — I appreciate the opportunity to explain :)

  5. OT:

    Hi, I’m a total noob when it comes to website security and I’m one of your regular blog readers. You’ve been mentioning security logs, and 404 logs (I’m not even sure if they’re the same logs you’re talking about). I’d like to know how to create those logs. And how many different logs are recommended?

  6. Hey Jeff,

    this comment above is very handy as it gathers the three posts on logs.

    You might want to make a quick post out of it :)

  7. Jeff Starr

    Hi marian, it is great to hear from you! :) I currently use three different types of logs to keep track of web traffic, attempted exploits, and other server interactions. First and foremost is your site’s server log, which depending on your server environment is usually accessible via your domain’s control panel. For example, in cPanel, you can access log information under the Logs section. There, you will find several ways to check server logs, error logs, statistics, and many other useful logs. In addition to the main server log, I am also keeping an eye on Apache’s rewrite activity. Likewise, I also watch my site’s PHP errors. Taken together, these three logs provide a fairly comprehensive look into your site’s server activity. Regardless of which types of logs you decide to use, the important thing is to play an active role in your site’s performance, functionality, and activity. I hope this helps — the topic of recording and using server data is as deep as it is wide ;)

  8. Jeff Starr

    Good idea, Louis! I want to post the series in order, but will definitely followup with a post explaining the various types of logging that webmasters can use to keep an eye on things. Thanks for the idea :)

  9. Thanks a lot for the info, it’s definitely a big help!

  10. Jeff Starr

    Absolutely my pleasure, marian — always glad to be of service :)

  11. How about the following directive?

    RewriteCond %{QUERY_STRING} ^.*=(ht|f)tp://.*$ [NC]
    RewriteRule .* - [F,L]

  12. Jeff Starr

    Looks like an effective way to block requests that include URLs in the query string. The 5G Firewall includes this functionality, although it uses different matching rules to do so.
