Perishable Press HTAccess Spring Cleaning, Part 2

Before Summer arrives, I need to post the conclusion to my seasonal article, Perishable Press HTAccess Spring Cleaning, Part 1. As explained in the first post, I recently spent some time consolidating and optimizing the Perishable Press site-root and blog-root HTAccess files. Since the makeover, I have enjoyed better performance, fewer errors, and cleaner code. In this article, I share some of the changes made to the blog-root HTAccess file and briefly explain their intended purpose. Granted, most of the blog-root directives affected by the renovation involve redirecting broken/missing URLs, but there are some other gems mixed in as well. In sharing these deprecated excerpts, I hope to inspire others to improve their own HTAccess and/or configuration files. What an excellent way to wrap up this delightful Spring season! :)

Step 1: Eliminate Duplicate Code

Comparing my site-root HTAccess file to my blog-root HTAccess file, I noticed several duplicated code blocks. Because HTAccess directives also apply to all subordinate directories, the following directives are no longer necessary in the blog-root file: they are already included in the site-root HTAccess file:

PHP error display and logging rules:

# disable display of php errors
php_flag display_startup_errors off
php_flag display_errors off
php_flag html_errors off

# PHP error logging
php_flag  log_errors on
php_value error_log /the/path/to/php_error.log

Basic configuration and default character set:

# basic configuration
Options +FollowSymLinks
Options All -Indexes
ServerSignature Off
RewriteEngine on

# default character set
AddDefaultCharset UTF-8
AddLanguage en-US .html .htm .css .js

Step 2: Rethink Existing Code

An excellent way to stop comment spam involves blocking no-referrer requests. The following code had been in place since the publication of the corresponding article. Somewhere along the way, the method was revamped to exclude several major user-agents from the block. Of course, this is just silly, and I honestly don’t remember why I felt it was important. Needless to say, I replaced the following bloated method with the original version:

Block comment spam by denying access to no-referrer requests

# block no-referrer requests
<ifmodule mod_rewrite.c>
 RewriteCond %{REQUEST_METHOD}  POST
 RewriteCond %{REQUEST_URI}     .*wp-comments-post\.php
 RewriteCond %{HTTP_REFERER}    !.*perishablepress.com.* [OR,NC]
 RewriteCond %{HTTP_USER_AGENT} !^.*mozilla.*            [OR,NC]
 RewriteCond %{HTTP_USER_AGENT} !^.*google.*             [OR,NC]
 RewriteCond %{HTTP_USER_AGENT} !^.*slurp.*              [OR,NC]
 RewriteCond %{HTTP_USER_AGENT} !^.*msn.*                [NC]
 RewriteCond %{HTTP_USER_AGENT} ^$                       [NC]
 RewriteRule .*                 -                        [F,L]
</ifmodule>
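
For comparison, a leaner form of the same idea might look like the following. This is only a minimal sketch, assuming the goal is to deny comment POSTs that arrive without an on-site referrer or without a user-agent; it is not necessarily identical to the original version mentioned above.

```apache
# Sketch (assumed, not the author's exact rules):
# block comment POSTs with an off-site/missing referrer or an empty user-agent
<IfModule mod_rewrite.c>
 RewriteEngine On
 # only consider POST requests aimed at the WordPress comment handler
 RewriteCond %{REQUEST_METHOD}  POST
 RewriteCond %{REQUEST_URI}     wp-comments-post\.php
 # deny if the referrer is not this site OR the user-agent is blank
 RewriteCond %{HTTP_REFERER}    !perishablepress\.com [NC,OR]
 RewriteCond %{HTTP_USER_AGENT} ^$
 RewriteRule .*                 -                     [F,L]
</IfModule>
```

Because only one condition carries the OR flag, the test reads as "POST, and comment script, and (bad referrer or blank user-agent)", which avoids the tangle of user-agent exceptions in the version shown above.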

Step 3: Optimize Existing Code

Several years ago, while beginning my journey into the fascinating realms of HTAccess, I embraced the importance of protecting my image and multimedia content from the grubby hands of unscrupulous hotlinkers. At the time, I had established an admittedly barbaric anti-hotlinking strategy, which gradually evolved into this ghastly behemoth:

Anti-hotlinking directives

# anti-hotlinking
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://monzilla.biz/.*$                  [NC]
RewriteCond %{HTTP_REFERER} !^http://monzilla.biz$                     [NC]
RewriteCond %{HTTP_REFERER} !^http://www.monzilla.biz/.*$              [NC]
RewriteCond %{HTTP_REFERER} !^http://www.monzilla.biz$                 [NC]
RewriteCond %{HTTP_REFERER} !^http://perishablepress.com/.*$           [NC]
RewriteCond %{HTTP_REFERER} !^http://perishablepress.com$              [NC]
RewriteCond %{HTTP_REFERER} !^http://labs.perishablepress.com/.*$      [NC]
RewriteCond %{HTTP_REFERER} !^http://labs.perishablepress.com$         [NC]
RewriteCond %{HTTP_REFERER} !^http://www.perishablepress.com/.*$       [NC]
RewriteCond %{HTTP_REFERER} !^http://www.perishablepress.com$          [NC]
RewriteCond %{HTTP_REFERER} !^http://planetwordpress.planetozh.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://planetwordpress.planetozh.com$    [NC]
RewriteCond %{HTTP_REFERER} !^http://www.google.com/.*$                [NC]
RewriteCond %{HTTP_REFERER} !^http://www.google.com$                   [NC]
RewriteCond %{HTTP_REFERER} !^http://www.netvibes.com/.*$              [NC]
RewriteCond %{HTTP_REFERER} !^http://www.netvibes.com$                 [NC]
RewriteCond %{HTTP_REFERER} !^http://www.google.com/reader/view/.*$    [NC]
RewriteCond %{HTTP_REFERER} !^http://www.google.com/reader/m/view/.*$  [NC]
RewriteCond %{HTTP_REFERER} !^http://www.feedburner.com/.*$            [NC]
RewriteCond %{HTTP_REFERER} !^http://feeds.feedburner.com/perishablepress$         [NC]
RewriteCond %{HTTP_REFERER} !^http://feeds.feedburner.com/perishablepresscomments$ [NC]
RewriteRule .*\.(gif|jpg|jpeg|png|bmp|js|css|zip|mp3|avi|wmv|mpg|mpeg|tif|tiff|raw|swf)$ http://perishablepress.com/hotlink.jpe [R,NC,L]
# RewriteRule .*\.(gif|jpg|jpeg|png|bmp|js|css|zip|mp3|avi|wmv|mpg|mpeg|tif|tiff|raw|swf)$ - [F,NC]

Without getting into it, suffice it to say that this is major overkill. Since implementing and expanding this madness, I have studied the technique in depth and developed an optimal anti-hotlinking strategy. Thus, this HTAccess nightmare is now replaced with a much leaner, more accurate ruleset.
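
To give a rough idea of how much the list above can shrink, here is a hedged sketch of a compact equivalent. It is an illustration and not the actual replacement ruleset: it assumes the same allowed domains as the old rules, groups them with regular-expression alternation, and simply denies hotlinked images instead of redirecting them.

```apache
# Sketch (assumed, not the author's final ruleset): compact anti-hotlinking
<IfModule mod_rewrite.c>
 RewriteEngine On
 # allow blank referrers (direct requests, some proxies and feed readers)
 RewriteCond %{HTTP_REFERER} !^$
 # allow the site's own domains, with or without a subdomain
 RewriteCond %{HTTP_REFERER} !^https?://([a-z0-9-]+\.)?(perishablepress\.com|monzilla\.biz)(/|$) [NC]
 # allow the feed readers and aggregators named in the old rules
 RewriteCond %{HTTP_REFERER} !^https?://([a-z0-9-]+\.)?(google\.com|netvibes\.com|feedburner\.com|planetozh\.com)(/|$) [NC]
 # everything else requesting an image is refused
 RewriteRule \.(gif|jpe?g|png|bmp)$ - [F,NC,L]
</IfModule>
```

The optional subdomain group covers the `www`, `labs`, `feeds`, and `planetwordpress` hosts that the old rules listed one by one, so two conditions do the work of more than twenty.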

Step 4: Remove Deprecated Redirects

Over the course of time, any responsible webmaster will inevitably be faced with an excessively large number of HTAccess redirects. Whether permanent or temporary, individual redirects are important for resolving 404 errors resulting from misplaced or relocated files, misdirected external links, and so on. At some point, webmasters have two options: leave the rules in place and keep adding to them as needed, or purge and prune the rules as much as possible. Needless to say, I decided to clean things up a bit. After much testing and research, I managed to reduce my collection of HTAccess redirects by around 75%. Granted, after clearing things out, I experienced a significant increase in 404 errors. However, the situation is slowly improving as the search engines continue to update their databases. Here is a peek at the collection of 301 redirects that have been removed from my blog-root HTAccess file:

Perishable Press Blog-Root Deprecated Redirects

Audi 5000 G

That’s it for this fun-filled adventure into HTAccess land. Hopefully, this article has inspired you to optimize and streamline your own HTAccess strategy. With a little time, a focused mind, and a knowledgeable guide ;), improving your site’s performance is easily accomplished. — Happy Spring Cleaning! :)