Optimizing WordPress Permalinks with htaccess

Okay, so summer’s over, the kids are back in school, and I’m finding all sorts of free time to continue writing and posting. One of my summer projects involved updating and optimizing one of my old project sites, DeadLetterArt.com. It was basically a huge clean-up session that included lots of content consolidation and permalink restructuring. So that’s the topic of this post: how to use htaccess to optimize WordPress permalinks. I’ll go through some htaccess techniques and explain how they can improve your WordPress-powered site.

Change year/month/day permalinks to year only

The first thing I wanted to do is restructure the site’s permalinks. When I designed the site back in 2006, long URLs were all the rage, so I went with the convention du jour:

/%year%/%monthnum%/%day%/%postname%/

This permalink structure makes your WordPress URLs look like this:

http://deadletterart.com/2008/09/10/impertinent-art-review/
http://deadletterart.com/2011/06/18/california/
http://deadletterart.com/2011/06/19/join-the-empire/

And so on. Obviously these days long URLs are out and “shorter is better” is the current trend. After some thought, I decided to keep the four-digit year and remove the month and day numbers.

http://deadletterart.com/2008/impertinent-art-review/
http://deadletterart.com/2011/california/
http://deadletterart.com/2011/join-the-empire/

Not exactly bitly-sized, but it’s an improvement that results in a flatter URL structure, which is commonly said to be good for SEO. There are numerous ways to make this change using htaccess, and after experimenting with several different approaches, I crafted this tasty little htaccess snippet:

# change year/month/day permalinks to year only
RedirectMatch 301 ^/([0-9]+)/([0-9]+)/([0-9]+)/(.*)$ http://deadletterart.com/$1/$4

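To use this on your own site, change deadletterart.com to match your own domain, and make sure WordPress itself is set to the new structure (/%year%/%postname%/) under Settings > Permalinks, otherwise the redirected URLs will just land on 404s. In the pattern, $1 is the year and $4 is everything after the day, so the month ($2) and day ($3) are simply dropped. No further edits are required, but you should test thoroughly that everything is working properly. Note that we’re using a 301 status code for the redirects, so the search engines know that the URL changes are permanent.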
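The pattern above accepts any three numeric segments, which is fine for most sites. If you want something pickier, here is a minimal sketch of a stricter variant. It assumes WordPress’s usual zero-padded two-digit month and day, and example.com is just a placeholder for your own domain:

```apache
# change year/month/day permalinks to year only (stricter variant)
# matches a 4-digit year, 2-digit month, 2-digit day, then the rest of the path
# only the year ($1) and the rest of the path ($2) are captured and reused
RedirectMatch 301 ^/([0-9]{4})/[0-9]{2}/[0-9]{2}/(.*)$ http://example.com/$1/$2
```

Because the month and day are no longer captured, the post slug moves from $4 to $2. Use one version or the other, not both.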

Redirect removed pages

After optimizing the permalink structure, I consolidated a bunch of content, mostly pages that are no longer needed. Of course, whenever you delete a post or page on your site, the search engines panic, and seem to request those pages over and over and over again. That’s way too many 404 requests for my comfort, so I made sure that these missing pages are redirected to someplace useful, like the home page of the site. This is simple to do with a few lines of htaccess, as seen here for a handful of removed pages:

RedirectMatch 301 /feeds/?$ http://deadletterart.com/
RedirectMatch 301 /submit/?$ http://deadletterart.com/
RedirectMatch 301 /contact/?$ http://deadletterart.com/
RedirectMatch 301 /sitemap/?$ http://deadletterart.com/
RedirectMatch 301 /guestbook/?$ http://deadletterart.com/
RedirectMatch 301 /slideshow/?$ http://deadletterart.com/

These redirects share a common pattern, so we can optimize our code by rewriting this in single-line format:

RedirectMatch 301 /(feeds|submit|contact|sitemap|guestbook|slideshow)/?$ http://deadletterart.com/

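Notice that we’re redirecting these page requests to the site’s home page, but you can change the target to whatever URL you wish. This can be good for SEO, as a 301 redirect helps preserve any page rank that the removed pages may have accumulated. You can funnel that love wherever you would like, such as a sales page or other key resource. Also note that these patterns are not anchored at the start, so /contact/?$ matches any URL that ends in /contact or /contact/, not just the top-level page.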
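If you only want to catch the top-level pages and leave deeper URLs alone, one option is to anchor the pattern at the start as well. This is a sketch rather than what I’m running on the DLa site, and example.com is a placeholder:

```apache
# redirect removed top-level pages to the home page
# ^ anchors the match to the start of the URL path,
# so /contact/ matches but /2011/contact/ does not
RedirectMatch 301 ^/(feeds|submit|contact|sitemap|guestbook|slideshow)/?$ http://example.com/
```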

Make dead pages go away

For pages that you would rather not redirect, and would rather just declare officially “gone”, well, there’s a code for that. As explained by Mark Pilgrim (404 link removed – what happened to diveintomark.org?), you can return a 410 Gone status code for all requests for pages that you’d rather forget about. For example:

RedirectMatch gone /nu\-
RedirectMatch gone /nuer
RedirectMatch gone /renu
RedirectMatch gone /nuwest
RedirectMatch gone /nustyle
RedirectMatch gone /big\-nu

As before, we can take advantage of mod_alias’ pattern-matching skillz to combine rules into a single line, like so:

RedirectMatch gone /(nu\-|nuer|renu|nuwest|nustyle|big\-nu)

So with this technique, we’re telling search engines (and anything else) that these resources are literally gone. This is an effective way to eliminate pages from the search engines, so be careful. A note about the pattern-matching: it’s a little bit different than in our previous example. This example leaves the following characters off the end of the pattern:

/?$

Without that end-of-URL anchor (and with no ^ at the start), this example will match any URL that contains /nustyle, such as:

/nustyle-1
/nustyle-1-2-3
/nustylebananaspiders

Admittedly the term “nustyle” is rare enough to avoid unwanted matches, but more common terms require some fiddling with the pattern-matching. Moving on...
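To give an idea of what that fiddling might look like, here is a sketch using a made-up page named “news” (purely hypothetical, not something from the DLa site). Anchoring both ends keeps the 410 from hitting every URL that happens to contain the word:

```apache
# exact page only: matches /news and /news/ and nothing else
RedirectMatch gone ^/news/?$

# alternative: the page plus everything beneath it, such as /news/2009/something/
# the slash boundary means /newsletter is not matched
RedirectMatch gone ^/news(/.*)?$
```

Pick whichever of the two fits the content you removed. The second pattern already covers everything the first one does.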

Redirect entire category to another site

Another part of the site’s restructuring involved branching a category into a new site. At the DLa site, a friend of mine and I had started posting a series of weird, random images in a category named “Chunks”. After about 15 pages worth of these bizarre chunks, we decided it would be a better idea to move ’em to a new site and then continue posting from there.

To do this, we created a “Chunks” category at the new chunks site, and re-posted everything. After that, redirecting the entire category requires a single line of htaccess:

RedirectMatch 301 ^/category/chunks/?(.*)$ http://echunks.com/category/chunks/$1

That’s about as clean as it gets. The only trick when redirecting category (and other) URLs to other WordPress sites is getting the post permalinks to match exactly. It takes some time, especially if you tend to punctuate post titles with apostrophes and such. For some reason that I haven’t figured out, apostrophes were not included in permalink URLs at one site, but they were included in the other. Just a heads up.
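One small caveat with the category rule: because the slash after “chunks” is optional, it would also match a path like /category/chunks-old (a hypothetical example). If that matters on your site, a slightly stricter sketch requires a slash boundary before anything that follows:

```apache
# redirect the chunks category and everything beneath it to the new site
# (/.*)? is either empty or starts with a slash, so /category/chunks-old is left alone
RedirectMatch 301 ^/category/chunks(/.*)?$ http://echunks.com/category/chunks$1
```

With this version a request for /category/chunks (no trailing slash) is sent to the same slash-less URL at the new site, where WordPress will most likely add the trailing slash with a second redirect.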

More stuff on the way

Stay tuned ;)