Redirecting Subdirectories to the Root Directory via HTAccess

One of the most useful techniques in my HTAccess toolbox involves URL redirection using Apache’s RedirectMatch directive. With RedirectMatch, you get the powerful regex pattern matching available in the mod_alias module combined with the simplicity and effectiveness of the Redirect directive. This hybrid functionality makes RedirectMatch the ideal method for highly specific redirection. In this tutorial, we will explore the application of RedirectMatch as it applies to one of the most difficult redirect scenarios: redirecting all requests for a specific subdirectory (or any subordinate directory or file) to the root (or any parent) directory. We will explore how to accomplish this redirect using PHP in a subsequent article.

The Scene

When developing sites, an excellent way to maintain a clean directory structure involves using subdirectories to organize content. A great example of this is seen in the placement of a site’s blog into its own subdirectory, such as this:

http://domain.tld/blog/

In this scenario, the blog directory serves as the home page for the blog itself. This works great if the content displayed on the site’s home page (i.e., at http://domain.tld/) is meant to be different from that displayed on the blog’s home page. So, for example, your site’s main page would greet visitors with a few photos and a nice welcome message. Then, from the welcome page you would link to your blog, which would feature your posts, pages, and other blog content.

Conversely, many sites prefer to display their blog content on the home page, or root, of the site. This, of course, is easily accomplished by simply placing your blog in the root directory. If you aren’t planning on growing, expanding, or restructuring your site in the future, and aren’t really concerned with maintaining a clean, organized directory structure, then by all means, go ahead and throw your life away by installing your blog in the root directory. If, on the other hand, you would like to display your blog content on the site’s home page, but would prefer to give the blog its own directory, read on..

The Challenge

The challenge facing the described strategy is duplicate content. By serving your blog from its own directory and delivering its content at the root directory, duplicate content is available in the following locations:

http://domain.tld/
http://domain.tld/blog/

When users visit either of these URLs, they will see the same content, namely, the main index page of your blog. This is confusing for human visitors and may also be penalized by the search engines as duplicate content. Indeed, it would be far better to either redirect the blog root to the site root or vice versa. Logically, of course, it makes more sense to use your domain as the home page, so we will craft our HTAccess solution based on the following configuration:

  • Blog installed/located in its own (sub)directory
  • Blog content displayed on the site’s home page
  • Requests for the blog directory redirected to the home page

Once established, this configuration will enable you to place your blog in its own directory and display your blog on the home page (root URL) of your site. Further — and this is the trick — only the /blog/ directory will be redirected to the home page; all other blog pages will continue to be available in their expected locations. Here is a sampling of hypothetical URLs demonstrating this configuration:

  • http://domain.tld/
    [home page of site, shows blog content, no redirect]
  • http://domain.tld/blog/
    [blog subdirectory, redirected to home page]
  • http://domain.tld/blog/an-example-post/
    [blog post, shows blog directory in URL]
  • http://domain.tld/blog/an-example-page/
    [blog page, shows blog directory in URL]
  • http://domain.tld/blog/an-archive-page/
    [archive page, shows blog directory in URL]

Bottom line: this redirect strategy facilitates subdirectory blog installation, eliminates duplicate content, and enables the display of your blog on the home page of your site. Let’s have a look..

The Solution(s)

WordPress

Depending on your server configuration and blogging platform, there are several ways to implement this strategy. First of all, WordPress provides a “built-in” mechanism for giving your blog its own directory. This method works best if implemented during or immediately after the installation of WordPress. With this technique, WordPress rewrites all of your permalinks so as to remove the /blog/ portion of your URLs. Thus, if you are launching a new WordPress-powered blog, this strategy is aesthetically superior to the HTAccess solution because it eliminates all references to the blog’s subdirectory; it will be as if you had installed your blog in the root directory to begin with. But keep in mind that there are potential shortcomings with the WordPress method, such as those encountered when trying to integrate additional content into your site. WordPress’ activated rewrite mechanism may interfere with the proper functionality of galleries, blogs, and other peripheral content. It should also be noted that implementing WordPress’ redirection feature on an established blog will break any external links to your blog’s pages. Depending on the number of inbound links, this could result in a large number of 404 errors. But relax! If all of this WordPress redirection stuff makes your skin crawl, read on for a better solution.

Apache/HTAccess

In general, I prefer to handle redirection at the server level. Using Apache’s redirect and rewrite directives, it is possible to craft highly specific redirects for virtually any configuration. In this case, where we need to redirect requests for the /blog/ (sub)directory to the site’s root directory, Apache’s RedirectMatch directive (part of the mod_alias module) accomplishes the task perfectly. Given the previously described scenario, the following Apache directive should be placed in the site’s root HTAccess file (or, alternatively, in the server’s configuration file):

RedirectMatch 301 ^/blog/$ http://domain.tld/

Pure magic. Once in place on an Apache-powered server, this single line will redirect all requests for the /blog/ directory to the site’s root directory. As outlined previously, all subordinate URLs will include the /blog/ directory in the address string and continue to function as expected. This is a concise, direct, effective solution that is as simple as possible. Only a single, target URL is affected, enabling you to easily integrate new features into your site without overzealous rewrite interference.
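As a small variation, and assuming you also want requests that omit the trailing slash (/blog) sent to the home page, the slash can be made optional. The IfModule wrapper is an optional safeguard that I have added for illustration; it simply skips the directive if mod_alias is not loaded. Treat this as a sketch rather than a drop-in rule:

```apache
# Sketch only: domain.tld is the placeholder domain used throughout this article
<IfModule mod_alias.c>
 # Matches exactly /blog or /blog/ and nothing beneath it
 RedirectMatch 301 ^/blog/?$ http://domain.tld/
</IfModule>
```

On many servers mod_dir already appends the missing slash for real directories, so the optional slash may be redundant, but it should do no harm.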

As for the functional specifics of the RedirectMatch technique, the process is very straightforward:

  1. Match each requested URL path against the specified regular expression
  2. Redirect all matches to the specified target URL
  3. Deliver a permanent (301) redirect status with each redirect response
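
Mapped onto the directive itself, those three steps correspond to its three arguments. The annotations below are mine, added purely as a reading aid:

```apache
# RedirectMatch <status> <regex> <target URL>
#   301                 step 3: the permanent redirect status
#   ^/blog/$            step 1: the regex tested against the requested URL path
#   http://domain.tld/  step 2: where matching requests are sent
RedirectMatch 301 ^/blog/$ http://domain.tld/
```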

Also, notice how RedirectMatch differs from the similar Redirect directive. With RedirectMatch, exact pattern matching is possible using regular expressions. Conversely, Redirect uses prefix matching, which affects any URL whose path begins with the specified string. Regular expressions aren’t allowed in Redirect directives, but they are allowed with RedirectMatch.
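
To make the difference concrete, here is a side-by-side sketch of the two directives applied to our /blog/ example. They are alternatives, so you would use one or the other, not both:

```apache
# Option A: Redirect (prefix match)
# Catches /blog/ AND everything beneath it. The rest of the path is
# appended to the target, so /blog/an-example-post/ would be sent to
# http://domain.tld/an-example-post/
Redirect 301 /blog/ http://domain.tld/

# Option B: RedirectMatch (regex match)
# Catches only /blog/ itself; /blog/an-example-post/ is left alone
RedirectMatch 301 ^/blog/$ http://domain.tld/
```

For the configuration described in this article, Option B is the one we want, since the posts and pages beneath /blog/ must stay where they are.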

Variations

In addition to redirecting the subdirectory of your blog to the site’s root directory, you can also use the RedirectMatch directive for many other case-specific redirects. For example, I recently shared a technique for redirecting all requests for a nonexistent file to the actual file. In that article, I prescribe a technique for redirecting all misdirected requests for the site’s favicon.ico back to the actual file located in the root directory of the site:

# REDIRECT FAVICON REQUESTS
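<IfModule mod_rewrite.c>
 RewriteEngine on
 # Skip requests that already point at the favicon in the root directory
 RewriteCond %{REQUEST_URI} !^/favicon\.ico [NC]
 # Act only on requests that contain favicon.ico elsewhere in the path
 RewriteCond %{REQUEST_URI} favicon\.ico [NC]
 # Send those misdirected requests to the real file
 RewriteRule (.*) http://domain.tld/favicon.ico [R=301,L]
</IfModule>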

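Of course, this method works perfectly well, but may be simplified greatly using our new friend, RedirectMatch. Note that the pattern must match only favicon requests located outside of the root directory; a pattern that matched /favicon.ico itself would redirect the file to its own URL in an endless loop:

RedirectMatch 301 ^/.+/favicon\.ico$ http://domain.tld/favicon.ico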

Comparing the two different techniques reveals a wealth of information, and I highly encourage it. But, rather than get into all of that here, let’s move on with another variation.

Unlike the redirect scenario addressed earlier in this article, let’s imagine a case where we would like to redirect not only the blog directory, but all of its files and subdirectories as well, such that:

  • http://domain.tld/blog/
  • http://domain.tld/blog/file-01.html
  • http://domain.tld/blog/file-02.html
  • http://domain.tld/blog/sub-01/
  • http://domain.tld/blog/sub-02/

..and so on will all be redirected to some specified target location. This target location can be anything (the home page, a single file, another subdirectory, etc.) and can reside on any server. The possibilities and uses for such a redirect are endless. Here is how it looks in the HTAccess file:

RedirectMatch 301 ^/blog/.*$ http://domain.tld/target.html
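
If you would rather keep each request’s trailing path instead of funneling everything to a single page, the matched portion can be captured in parentheses and reused in the target as $1. This is a sketch only; the /archive/ directory is a hypothetical destination that I made up for illustration:

```apache
# /blog/file-01.html  ->  http://domain.tld/archive/file-01.html
# /blog/sub-01/       ->  http://domain.tld/archive/sub-01/
# (/archive/ is a hypothetical target directory)
RedirectMatch 301 ^/blog/(.*)$ http://domain.tld/archive/$1
```

This is handy when a directory has been renamed or moved and every file beneath it should follow along.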

Other variations on this technique include specifying temporary (302) redirects rather than permanent (301) by editing as follows:

# This is a permanent redirect
RedirectMatch 301 ^/blog/.*$ http://domain.tld/target.html
# This is a temporary redirect
RedirectMatch 302 ^/blog/.*$ http://domain.tld/target.html

Likewise, you may also write:

# This is a permanent redirect
RedirectMatch permanent ^/blog/.*$ http://domain.tld/target.html
# This is a temporary redirect
RedirectMatch temp ^/blog/.*$ http://domain.tld/target.html
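
As far as I know, the same status argument accepts two more keywords described in Apache’s mod_alias documentation: seeother (303) and gone (410). With gone, no target URL is given, because the resource is being declared permanently removed rather than moved. Treat the following as a sketch and check it against your Apache version before relying on it:

```apache
# Use one or the other, not both
# This is a "See Other" (303) redirect
RedirectMatch seeother ^/blog/.*$ http://domain.tld/target.html
# This is a "Gone" (410) response; note that the target URL is omitted
RedirectMatch gone ^/blog/.*$
```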

The End (for now..)

You know I could go on and on with this stuff, but I am getting hungry now, so I will leave it here and grab a sandwich. I may sharpen things up a bit or add some more once I get back, but probably not. Instead, I think I will just call it good and see if anyone actually makes it through the entire article to read this. My money says that a few will, but not the majority. I mean, come on, who has time to waste with all of this geeky nonsense anyway? Certainly not me! ;)