Perishable Press HTAccess Spring Cleaning, Part 1

While developing the 3G Blacklist, I completely renovated the Perishable Press site-root and blog-root HTAccess files. Since the makeover, I have enjoyed better performance, fewer errors, and cleaner code. In this article, I share some of the changes made to the root HTAccess file and briefly explain their intended purpose and potential benefit. In sharing this information, I hope to inspire others to improve their own HTAccess and/or configuration files. In the next article, I will cover some of the changes made to the blog-root HTAccess file. As always, suggestions and questions are welcome; just drop a comment below! Have fun!! :)

Step 1: Remove Deprecated Code

Due to changes on my shared server, I no longer need to specify mod_gzip directives in my root HTAccess. mod_gzip is now enabled by default, so this hefty block of rules has been removed entirely:

# ENABLE GZIP
mod_gzip_on yes

# GZIP TEXT FILES
mod_gzip_item_include file \.txt$
mod_gzip_item_include mime ^text/plain$

# GZIP CSS FILES
mod_gzip_item_include file \.css$
mod_gzip_item_include mime ^text/css$

# GZIP PHP AND HTML FILES
mod_gzip_item_include file \.php$
mod_gzip_item_include file \.html$
mod_gzip_item_include mime ^text/html$

# GZIP JAVASCRIPT FILES
mod_gzip_item_include file \.js$
mod_gzip_item_include mime ^application/x-javascript$

# DISABLE GZIP FOR IMAGE FILES
mod_gzip_item_exclude mime ^image/

Step 2: Eliminate Redundant Code

A couple of years ago, using Apache’s ExpiresByType caching directive to “fix” IE-6’s well-known image-rollover flickering problem was all the rage. At the time, I adopted the following code to resolve the issue for one of my themes:

# PREVENT IMAGE FLICKER IN IE6
<IfModule mod_expires.c>
 ExpiresActive On
 ExpiresByType image/gif A2592000
 ExpiresByType image/jpg A2592000
 ExpiresByType image/png A2592000
</IfModule>

Since then, this code has been sitting there, taking up space in my root HTAccess file. Meanwhile, somewhere along the way, I also managed to adopt a decent set of caching rules:

# PERISHABLE PRESS CACHING RULES
<IfModule mod_expires.c>
 ExpiresActive On
 ExpiresByType text/html  "access plus 1 second"
 ExpiresByType image/gif  "access plus 1 month"
 ExpiresByType image/jpeg "access plus 1 month"
 ExpiresByType image/png  "access plus 1 month"
 ExpiresByType text/css   "access plus 1 month"
 ExpiresByType text/xml   "access plus 1 hour"
 ExpiresByType text/javascript          "access plus 1 month"
 ExpiresByType application/x-javascript "access plus 1 month"
</IfModule>

Obviously, with these elaborate caching rules in place, the previous “image-flickering” directives are effectively redundant. The only difference involves the period for which images are cached. For my purposes here, a month is probably long enough. Thus, the redundant “image-flickering” rules were eliminated.
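To compare the two notations, it helps to translate the older A-prefixed values into something readable. The short Python sketch below does the arithmetic for A2592000. The helper names are hypothetical, and the reading of the prefix letters (A for access time, M for modification time) is my understanding of the mod_expires syntax rather than anything stated above.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def parse_expires_code(code):
    """Split a value such as 'A2592000' into (base, seconds).

    Assumption: 'A' counts from the client's access time and
    'M' from the file's last modification time.
    """
    base = {"A": "access", "M": "modification"}[code[0]]
    return base, int(code[1:])

def seconds_to_days(seconds):
    """Convert a lifetime in seconds to days."""
    return seconds / SECONDS_PER_DAY

base, seconds = parse_expires_code("A2592000")
print(base, seconds, seconds_to_days(seconds))  # access 2592000 30.0
```

So the old image rules asked for 30 days counted from the time of access, which can be set side by side with the "access plus 1 month" lines in the newer block.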

Step 3: Remove Unnecessary Code

I will be the first person to admit that custom error pages delivered via HTAccess are like, totally radical and everything; but honestly, I don’t think they are always necessary. Here at Perishable Press, I employed a nice set of custom error pages for the longest time. Each of these custom error pages would then immediately redirect to a brief article that further explained the situation. Until fairly recently, these myriad redirects were not a problem; however, once the site began to attract more attention, the number of HTTP errors began to climb. Especially since I got serious about security, the vast number of blocked cracker exploits has greatly increased the number of 403 errors generated. Thus, in an effort to trim bandwidth and improve performance, I decided to remove all custom error pages, fancy redirects, whatever. Visitors now get the server defaults, and that’s just fine with me ;) Here is the block I removed:

# CUSTOM ERROR PAGES > 512 BYTES
ErrorDocument 400 /errors/400.html
ErrorDocument 401 /errors/401.html
ErrorDocument 403 /errors/403.html
ErrorDocument 404 /errors/404.html
ErrorDocument 500 /errors/500.html

..and I have been happy ever since. Another piece of pointless code (in my opinion) is this popular little nugget:

# FIX BASIC SPELING ERRORS
<IfModule mod_speling.c>
 CheckSpelling On
</IfModule>

Frankly, I have never seen evidence of this working. How effective is a misspelled spelling checker, anyway? Until someone proves otherwise, this code is simply not worthwhile.

Step 4: Consolidate Similar Code

Before this year’s HTAccess spring cleaning festival, I had amassed a significant collection of individual file-protection rules, similar to the following:

# PROTECT PRIVATE FILES
<Files errors.log>
 order allow,deny
 deny from all
</Files>

<Files stats.log>
 order allow,deny
 deny from all
</Files>

<Files crawl.log>
 order allow,deny
 deny from all
</Files>

<Files admin.dat>
 order allow,deny
 deny from all
</Files>

<Files trap.dat>
 order allow,deny
 deny from all
</Files>

Too much! Once I took the time to examine these rules, I noticed that the first three files each have a .log extension, while the last two files each have a .dat extension (see footnote 1). With a little pattern matching, I consolidated the 21 lines of code into five:

# PROTECT PRIVATE FILES
<Files ~ "^.*\.([Ll][Oo][Gg]|[Dd][Aa][Tt])">
 order allow,deny
 deny from all
</Files>

This is the part where everyone “oooohhs” and “aaahhhs” in perfect unison ;)
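For anyone who wants to sanity-check the consolidated pattern before trusting it, here is a minimal Python sketch. It uses Python's re module as a stand-in for Apache's regex engine, and assumes the expression is applied as an unanchored search against the file name; the function name is hypothetical.

```python
import re

# Pattern copied verbatim from the consolidated <Files ~ "..."> rule.
PRIVATE_FILES = re.compile(r"^.*\.([Ll][Oo][Gg]|[Dd][Aa][Tt])")

def is_protected(filename):
    """Return True if the consolidated rule would cover this file name."""
    # Assumption: the server searches for the pattern anywhere in the name.
    return PRIVATE_FILES.search(filename) is not None

# The five files from the original, one-rule-per-file version.
for name in ["errors.log", "stats.log", "crawl.log", "admin.dat", "trap.dat"]:
    print(name, is_protected(name))

print(is_protected("STATS.LOG"))         # character classes cover any case
print(is_protected("index.php"))         # ordinary files are left alone
print(is_protected("catalog.logo.png"))  # also matches: no trailing $ anchor
```

The last check is worth noticing: because the pattern has no trailing $, any name containing ".log" or ".dat" is covered, not only names that end with those extensions. For a deny rule that errs on the side of caution, which may well be acceptable.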

Step 5: Improve Existing Code

One important area to focus on while cleaning up your HTAccess files involves anything security-related. For example, in a previous article, I discussed several common ways to protect your site’s HTAccess files. The article concludes with an optimal protection technique that uses strong pattern matching to prevent external access. Following my own advice, I replaced this sad little weakling:

# WEAK HTACCESS PROTECTION
<Files .htaccess>
 order allow,deny
 deny from all
</Files>

..with this much-stronger protective measure:

# STRONG HTACCESS PROTECTION
<Files ~ "^.*\.([Hh][Tt][Aa])">
 order allow,deny
 deny from all
</Files>
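To see what the stronger rule buys, the Python sketch below compares the two approaches on a handful of made-up file names. It is a simplified model, not a description of Apache's internals: the weak rule is treated as an exact, case-sensitive name comparison, the strong rule as an unanchored regex search, and both function names are hypothetical.

```python
import re

# Pattern copied verbatim from the strong <Files ~ "..."> rule.
STRONG = re.compile(r"^.*\.([Hh][Tt][Aa])")

def weak_blocks(filename):
    """Simplified model of <Files .htaccess>: exact name only (assumption)."""
    return filename == ".htaccess"

def strong_blocks(filename):
    """Model of the regex rule: blocked if the pattern occurs in the name."""
    return STRONG.search(filename) is not None

# Hypothetical file names chosen for illustration.
for name in [".htaccess", ".HTAccess", ".htaccess.bak", ".htpasswd", "index.html"]:
    print(f"{name:15} weak={weak_blocks(name)!s:5} strong={strong_blocks(name)}")
```

Under these assumptions the strong rule also catches case variants and backup copies such as .htaccess.bak. It does not cover .htpasswd, since that name never contains ".hta", so a password file would need its own rule or a broader pattern.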

Step 6: Miscellaneous Changes

Finally, here are a few miscellaneous root-HTAccess omissions, explained via their associated comments:

# SITEMAP REDIRECTS (NO LONGER NECESSARY)
redirect 301 /press/sitemap.xml.gz http://perishablepress.com/sitemap.xml.gz
redirect 301 /press/sitemap.xml    http://perishablepress.com/sitemap.xml

# ROBOTS.TXT REDIRECT (NO LONGER NECESSARY)
redirect 301 /press/robots.txt http://perishablepress.com/robots.txt

# REDIRECT FUDGED EXTERNAL LINKS (NO LONGER CARE)
redirect 301 /,  http://perishablepress.com/
redirect 301 /$1 http://perishablepress.com/

# MINIMALIST THEME REDIRECTS (NO LONGER NECESSARY)
redirect 301 /minimalist.php http://perishablepress.com/
redirect 301 /default.php    http://perishablepress.com/

# PRE-3G BLACKLIST EFFORT (RESOLVED VIA 3G BLACKLIST)
<IfModule mod_alias.c>
 redirectmatch 301 \.\.\. http://perishablepress.com/
 redirectmatch 301 \/\,   http://perishablepress.com/
</IfModule>

Slow Down..

Hold up there, partner! — We’re not done yet! Stay tuned for the second part of this incredibly fascinating voyage — Perishable Press HTAccess Spring Cleaning, Part 2 — coming up next week!

Footnote:

  • 1 Note that the real file names have been changed for security reasons.