
Blank Space / Whitespace Character for .htaccess

Working on the next version of the G-Series Blacklist, I needed a way to match a wide variety of percent-encoded (UTF-8 hex) character strings. Those familiar with their site’s traffic will recognize this particular type of URI request string, which is typically associated with server scanning, exploit attempts, and other malicious behavior. As I explain in this post, pattern-matching and blocking the blank-space (whitespace) character in URL requests is an effective way to improve the security of your website.

Examples of blank-space characters in URL requests

Here is a selection of malicious URL patterns that I want to match and block using 6G blacklist techniques (via the UTF-8 (hex) encoder):

UTF-8 encoded request  ->  Decoded request

http://example.com/hack%20*/  ->  http://example.com/hack */
http://example.com/%3Ca%20href=  ->  http://example.com/<a href=
http://example.com/%5bNext%20URL%20in%20series%5d  ->  http://example.com/[Next URL in series]
http://example.com/XHTML%20Document%20Header%20Resource  ->  http://example.com/XHTML Document Header Resource
http://example.com/%22%20title=%22%22%20rel=%22nofollow  ->  http://example.com/" title="" rel="nofollow
http://example.com/Apache%20Module%20mod_authz_host  ->  http://example.com/Apache Module mod_authz_host
http://example.com/%27.%20get_permalink()%20.  ->  http://example.com/'. get_permalink() .
http://example.com/search/%20%20%20/page/13/  ->  http://example.com/search/   /page/13/
http://example.com/%20%20%20/page/8/  ->  http://example.com/   /page/8/
http://example.com/%20*/  ->  http://example.com/ */

This gives you an idea of what these encoded requests are targeting. According to the URL specification (RFC 1738), any character outside the following set must be encoded in order to appear legitimately within URLs, apart from reserved characters such as / ? : @ = & used for their reserved purposes:

Regular-use characters - allowed unencoded within URLs

$ - _ . + ! * ' ( ) ,

0 1 2 3 4 5 6 7 8 9

a b c d e f g h i j k l m n o p q r s t u v w x y z
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

Not included in this “safe-character” list, the humble white space (or blank space) must be encoded when included in the URL. As explained by the Network Working Group:

Characters can be unsafe for a number of reasons. The space character is unsafe because significant spaces may disappear and insignificant spaces may be introduced when URLs are transcribed or typeset or subjected to the treatment of word-processing programs.

Looking back at our target URLs, we find that the common denominator is the encoded whitespace character, %20. Sure, there are plenty of other encoded characters that could be targeted, but zeroing in on blank spaces in the URL is an effective way to catch and block many of these types of malicious requests.

How to match the blank-space/whitespace character with .htaccess

Now that we have a reason to do so, let’s use .htaccess to match and block all URL requests that include one or more whitespace characters. It’s as simple as adding these lines to your root .htaccess file:

<IfModule mod_alias.c>
 # Return 403 Forbidden for any URL path that contains whitespace
 RedirectMatch 403 \s
</IfModule>

So the punchline is that an escaped “s” character (\s) is the regex that matches blank spaces in .htaccess directives, via both mod_alias (RedirectMatch) and mod_rewrite (RewriteRule). Strictly speaking, \s matches any whitespace character, including tabs, not just the space. Here is an example using Apache’s mod_rewrite:

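<IfModule mod_rewrite.c>
 # Leave the home page alone
 RewriteCond %{REQUEST_URI} !^/$
 # Match any whitespace in the decoded request path
 RewriteCond %{REQUEST_URI} \s
 # Send matching requests to the home page (edit the URL to match your site)
 RewriteRule .* https://perishablepress.com/ [R=301,L]
</IfModule>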

This example will redirect any requests that include whitespace to the home page (edit to match your own URL). To block them instead, replace the RewriteRule with this:

RewriteRule .* - [F,L]
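
Put together, the blocking variant might look like the following sketch. It reuses the two conditions from the redirect example and swaps only the rule. It assumes that RewriteEngine On is already set elsewhere in the same .htaccess file, as it typically is on a WordPress site:

```apache
<IfModule mod_rewrite.c>
 # Assumption: "RewriteEngine On" already appears earlier in this file
 # Leave the home page alone, as in the redirect example
 RewriteCond %{REQUEST_URI} !^/$
 # Match any whitespace in the decoded request path
 RewriteCond %{REQUEST_URI} \s
 # Return 403 Forbidden instead of redirecting
 RewriteRule .* - [F,L]
</IfModule>
```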

Note that it doesn’t matter whether the initial requests are encoded or not: Apache decodes the URL path before matching, so the end result of any encoded request is the un-encoded, canonical URL (not including the query string), and targeting literal whitespace in the request URI is effective. That said, you should only use this method if you know what you are doing and are certain that none of your legitimate URLs contain whitespace.
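
If a few legitimate URLs on your site do contain spaces, one option is to exempt them before the whitespace check. Here is a minimal sketch, assuming a hypothetical /downloads/ directory whose file names contain spaces. The path is only an illustration, so swap in whatever applies to your own site:

```apache
<IfModule mod_rewrite.c>
 # Hypothetical exemption: skip anything under /downloads/
 RewriteCond %{REQUEST_URI} !^/downloads/
 # Match any whitespace in the decoded request path
 RewriteCond %{REQUEST_URI} \s
 RewriteRule .* - [F,L]
</IfModule>
```

Consecutive RewriteCond lines are ANDed by default, so a request is blocked only when it is outside the exempted directory and also contains whitespace.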

Matching whitespace in query strings

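Wrapping up, here is how to block blank spaces in the query-string portion of the URL, which neither of the previous two examples can do because RedirectMatch and %{REQUEST_URI} see only the URL path. Using mod_rewrite, we can target the %{QUERY_STRING} variable instead. Unlike the path, the query string is not decoded before matching, so the pattern checks for the encoded form (%20) as well as literal whitespace:

<IfModule mod_rewrite.c>
 RewriteCond %{REQUEST_URI} !^/$
 RewriteCond %{QUERY_STRING} (\s|%20)
 RewriteRule .* - [F,L]
</IfModule>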

No editing required: just drop it into your .htaccess file and you are good to go. As always, comments and questions welcome, and thanks for reading! :)
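
To cover both the path and the query string in one place, the two checks could be combined with the [OR] flag. Treat this as a sketch rather than a tested rule set. The %20 alternative is an assumption added here because the query string reaches mod_rewrite without being decoded:

```apache
<IfModule mod_rewrite.c>
 # Whitespace in the decoded URL path ...
 RewriteCond %{REQUEST_URI} \s [OR]
 # ... or in the raw query string (literal, or encoded as %20)
 RewriteCond %{QUERY_STRING} (\s|%20)
 RewriteRule .* - [F,L]
</IfModule>
```

Note that this does not match spaces written as + in query strings, which is how many search forms encode them.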

About the Author
Jeff Starr = Fullstack Developer. Book Author. Teacher. Human Being.

2 responses to “Blank Space / Whitespace Character for .htaccess”

  1. Excellent idea!!!

    I’m running my sites on a VPS with configserver firewall and was wondering how much nefarious stuff would be caught before .htaccess kicks in.

    I would imagine including the wrong things in .htaccess would either expose a person or actually shoot them in the foot.

  2. Excellent code! But I cannot make it work. When someone accesses mysite.com/whatever-content/%20%20, they receive an “Access Forbidden”. But if I access mysite.com/whatever-content/ (invisible space, no “%20%20”), it is redirected properly. Any ideas? This has me somewhat concerned, as there may be many people who cannot see my site … =(

    Thanks!
