Perishable Press HTAccess Spring Cleaning, Part 1
While developing the 3G Blacklist, I completely renovated the Perishable Press site-root and blog-root HTAccess files. Since the makeover, I have enjoyed better performance, fewer errors, and cleaner code. In this article, I share some of the changes made to the root HTAccess file and briefly explain their intended purpose and potential benefit. In sharing this information, I hope to inspire others to improve their own HTAccess and/or configuration files. In the next article, I will cover some of the changes made to the blog-root HTAccess file. As always, suggestions and questions are welcome; just drop a comment below! Have fun!! :)
Step 1: Remove Deprecated Code
Due to changes on my shared server, I no longer need to specify mod_gzip directives in my root HTAccess. mod_gzip is now enabled by default, so this hefty block of rules has been removed entirely:
# ENABLE GZIP
mod_gzip_on yes
# GZIP TEXT FILES
mod_gzip_item_include file \.txt$
mod_gzip_item_include mime ^text/plain$
# GZIP CSS FILES
mod_gzip_item_include file \.css$
mod_gzip_item_include mime ^text/css$
# GZIP PHP AND HTML FILES
mod_gzip_item_include file \.php$
mod_gzip_item_include file \.html$
mod_gzip_item_include mime ^text/html$
# GZIP JAVASCRIPT FILES
mod_gzip_item_include file \.js$
mod_gzip_item_include mime ^application/x-javascript$
# DISABLE GZIP FOR IMAGE FILES
mod_gzip_item_exclude mime ^image/
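For anyone who still needs rules like these, here is a minimal sketch of a more defensive form. On a server that does not load mod_gzip at all, unguarded mod_gzip directives produce a 500 error because Apache does not recognize them; wrapping them in an IfModule container makes Apache skip them instead. The mod_deflate block is an assumption for Apache 2.x servers, where compression is usually handled by mod_deflate rather than mod_gzip; it was never part of my file. Note that a guard like this does nothing for a conflict with settings the host has already made at the server level.

```apache
# Applied only when mod_gzip is actually loaded (typically Apache 1.3)
<IfModule mod_gzip.c>
mod_gzip_on yes
mod_gzip_item_include mime ^text/html$
mod_gzip_item_include mime ^text/css$
mod_gzip_item_exclude mime ^image/
</IfModule>

# Assumed Apache 2.x counterpart (on 2.4 this directive is provided by mod_filter)
<IfModule mod_deflate.c>
AddOutputFilterByType DEFLATE text/plain text/html text/css application/x-javascript
</IfModule>
```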
Step 2: Eliminate Redundant Code
A couple of years ago, using Apache’s ExpiresByType caching directive to “fix” IE-6’s well-known image-rollover flickering problem was all the rage. At the time, I adopted the following code to resolve the issue for one of my themes:
# PREVENT IMAGE FLICKER IN IE6
<IfModule mod_expires.c>
ExpiresActive On
ExpiresByType image/gif A2592000
ExpiresByType image/jpg A2592000
ExpiresByType image/png A2592000
</IfModule>
Since then, this code has been sitting there, taking up space in my root HTAccess file. Meanwhile, somewhere along the way, I also managed to adopt a decent set of caching rules:
# PERISHABLE PRESS CACHING RULES
<IfModule mod_expires.c>
ExpiresActive On
ExpiresByType text/html "access plus 1 second"
ExpiresByType image/gif "access plus 1 month"
ExpiresByType image/jpeg "access plus 1 month"
ExpiresByType image/png "access plus 1 month"
ExpiresByType text/css "access plus 1 month"
ExpiresByType text/xml "access plus 1 hour"
ExpiresByType text/javascript "access plus 1 month"
ExpiresByType application/x-javascript "access plus 1 month"
</IfModule>
Obviously, with these elaborate caching rules in place, the previous “image-flickering” directives are effectively redundant. The only difference involves the period for which images are cached. For my purposes here, a month is probably long enough. Thus, the redundant “image-flickering” rules were eliminated.
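To make the overlap easier to see, here is a small sketch of the two ExpiresByType syntaxes side by side. In the short form, "A" means "time of access" and the number is a count of seconds; 2592000 seconds is exactly 30 days. The long form spells the same idea out in words. It is also worth noting that image/jpg is not the registered MIME type for JPEG files (that would be image/jpeg, as used in the newer rules), so the old jpg line probably never matched anything.

```apache
<IfModule mod_expires.c>
ExpiresActive On
# Short form: A = access time, then seconds (2592000 s = 30 days)
ExpiresByType image/gif A2592000
# Long form expressing the same 30-day lifetime
ExpiresByType image/png "access plus 30 days"
</IfModule>
```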
Step 3: Remove Unnecessary Code
I will be the first person to admit that custom error pages delivered via HTAccess are like, totally radical and everything; but honestly, I don’t think they are always necessary.
Here at Perishable Press, I employed a nice set of custom error pages for the longest time. Each of these custom error pages would then immediately redirect to a brief article that further explained the situation. Until recently, these myriad redirects were not a problem; however, once the site began to attract more attention, the number of HTTP errors began to climb. Especially since I got serious about security, the vast number of blocked cracker exploits has greatly increased the number of generated 403 errors.
Thus, in an effort to trim bandwidth and improve performance, I decided to remove all custom error pages, fancy redirects, whatever — visitors now get the server defaults, and that’s just fine with me ;)
# CUSTOM ERROR PAGES > 512 BYTES
ErrorDocument 400 /errors/400.html
ErrorDocument 401 /errors/401.html
ErrorDocument 403 /errors/403.html
ErrorDocument 404 /errors/404.html
ErrorDocument 500 /errors/500.html
..and I have been happy ever since. Another piece of pointless code (in my opinion) is this popular little nugget:
# FIX BASIC SPELING ERRORS
<IfModule mod_speling.c>
CheckSpelling On
</IfModule>
Frankly, I have never seen evidence of this working. How effective is a misspelled spelling checker, anyway? Until someone proves otherwise, this code is simply not worthwhile.
Step 4: Consolidate Similar Code
Before this year’s HTAccess spring cleaning festival, I had amassed a significant collection of individual file-protection rules, similar to the following:
# PROTECT PRIVATE FILES
<Files errors.log>
order allow,deny
deny from all
</files>
<Files stats.log>
order allow,deny
deny from all
</files>
<Files crawl.log>
order allow,deny
deny from all
</files>
<Files admin.dat>
order allow,deny
deny from all
</files>
<Files trap.dat>
order allow,deny
deny from all
</files>
Too much! Once I took the time to examine these rules, I noticed that the first three files each have a .log extension, while the last two files each have a .dat extension [1]. With a little pattern matching, I consolidated the 21 lines of code into five:
# PROTECT PRIVATE FILES
<Files ~ "^.*\.([Ll][Oo][Gg]|[Dd][Aa][Tt])">
order allow,deny
deny from all
</Files>
This is the part where everyone “oooohhs” and “aaahhhs” in perfect unison ;)
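One caveat: the pattern above is not anchored at the end, so it matches any file name that contains ".log" or ".dat" anywhere, including a hypothetical report.data.html. If only true .log and .dat extensions should be blocked, a stricter variant might look like the following sketch. It swaps in FilesMatch and adds a trailing "$"; both are my assumptions about what is wanted here, not the rule I actually run.

```apache
# PROTECT PRIVATE FILES (anchored variant)
# The trailing $ limits the match to names that END in .log or .dat
<FilesMatch "\.([Ll][Oo][Gg]|[Dd][Aa][Tt])$">
Order allow,deny
Deny from all
</FilesMatch>
```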
Step 5: Improve Existing Code
One important area to focus on while cleaning up your HTAccess files involves anything security-related. For example, in a previous article, I discussed several common ways to protect your site’s HTAccess files. The article concludes with an optimal protection technique that uses strong pattern matching to prevent external access. Following my own advice, I replaced this sad little weakling:
# WEAK HTACCESS PROTECTION
<Files .htaccess>
Order allow,deny
Deny from all
</Files>
..with this much-stronger protective measure:
# STRONG HTACCESS PROTECTION
<Files ~ "^.*\.([Hh][Tt][Aa])">
Order allow,deny
Deny from all
</Files>
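Note that the stronger pattern covers names containing ".hta" but not other dot-ht files such as .htpasswd. As a sketch of one way to widen it, the rule below blocks anything that begins with ".ht", much like the rule commonly shipped in Apache's default server configuration. The two IfModule branches are an assumption for readers on mixed servers: Apache 2.4 uses "Require all denied" (mod_authz_core), while older versions use the Order/Deny pair shown above.

```apache
# Blocks .htaccess, .htpasswd, and any other file starting with ".ht"
<Files ~ "^\.[Hh][Tt]">
# Apache 2.4 and later
<IfModule mod_authz_core.c>
Require all denied
</IfModule>
# Apache 2.2 and earlier
<IfModule !mod_authz_core.c>
Order allow,deny
Deny from all
</IfModule>
</Files>
```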
Step 6: Miscellaneous Changes
Finally, here are a few miscellaneous root-HTAccess omissions, explained via their associated comments:
# SITEMAP REDIRECTS (NO LONGER NECESSARY)
redirect 301 /press/sitemap.xml.gz https://perishablepress.com/sitemap.xml.gz
redirect 301 /press/sitemap.xml https://perishablepress.com/sitemap.xml
# ROBOTS.TXT REDIRECT (NO LONGER NECESSARY)
redirect 301 /press/robots.txt https://perishablepress.com/robots.txt
# REDIRECT FUDGED EXTERNAL LINKS (NO LONGER CARE)
redirect 301 /, https://perishablepress.com/
redirect 301 /$1 https://perishablepress.com/
# MINIMALIST THEME REDIRECTS (NO LONGER NECESSARY)
redirect 301 / minimalist.php https://perishablepress.com/
redirect 301 / default.php https://perishablepress.com/
# PRE-3G BLACKLIST EFFORT (RESOLVED VIA 3G BLACKLIST)
<IfModule mod_alias.c>
redirectmatch 301 \.\.\. https://perishablepress.com/
redirectmatch 301 \/\, https://perishablepress.com/
</IfModule>
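Had the sitemap and robots.txt redirects still been needed, the consolidation idea from Step 4 could have been applied to them as well. The following is only a sketch under that assumption: a single RedirectMatch rule whose captured group is reused in the target as $1.

```apache
<IfModule mod_alias.c>
# $1 is whatever the outer parenthesized group matched:
# sitemap.xml, sitemap.xml.gz, or robots.txt
RedirectMatch 301 ^/press/(sitemap\.xml(\.gz)?|robots\.txt)$ https://perishablepress.com/$1
</IfModule>
```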
Slow Down..
Hold up there, partner! — We’re not done yet! Stay tuned for the second part of this incredibly fascinating voyage — Perishable Press HTAccess Spring Cleaning, Part 2 — coming up next week!
Footnotes
- 1 Note that the real file names have been changed for security reasons.
6 responses to “Perishable Press HTAccess Spring Cleaning, Part 1”
Thank you Jeff for sharing these lines of HTAccess goodness.
I think you are the one who made me realise the logic of using Apache and its enormous power to handle many situations I would otherwise have handled with PHP.
So again, thank you for making these great posts on HTAccess.
Hey man, I was implementing the mod_gzip_item_include command, but it started pulling up 500 errors; the log stated the above is not defined or included?
any help appreciated; as always great work
@Louis: Thank you for the positive feedback. It is very encouraging to receive such appreciative comments :)
@Don: that is precisely the reason I needed to remove that particular chunk of htaccess code: my site suddenly locked up with 500 server errors. After investigating, I discovered that my host had pre-configured (in one form or another) the various item_include directives at the server level. Thus, declaring the specific values locally (via htaccess) was triggering the perpetual 500 error. Aside from that, you may also want to double-check that mod_gzip is installed on your server. If so, you may want to verify whether or not compression is working correctly by using Firebug/YSlow or any number of online compression checkers. I hope that helps!
Cheers Perishable; yea gzip was enabled issue; I tweaked a bit here and there and got it working, and yes as the mime types were already configured they were causing the conflict.
2) re: the caching method mentioned, even though I have the supercache plugin for WordPress I find that the mentioned rules actually do help as well. Just my opinion…hopefully you’ve bettered it :D
Glad to hear you got it working! As for the caching rules, yes they definitely improve performance and I certainly do continue to use them. For awhile there, I had some redundant caching rules in place (to prevent IE image flickering), but those have now been removed in favor of the full set of caching directives (as presented in the article). Cheers!