Perishable Press

Redirecting Subdirectories to the Root Directory via HTAccess

One of the most useful techniques in my HTAccess toolbox involves URL redirection using Apache’s RedirectMatch directive. With RedirectMatch, you get the powerful regex pattern matching available in the mod_alias module combined with the simplicity and effectiveness of the Redirect directive. This hybrid functionality makes RedirectMatch the ideal method for highly specific redirection. In this tutorial, we will explore the application of RedirectMatch as it applies to one of the most difficult redirect scenarios: redirecting all requests for a specific subdirectory (or any subordinate directory or file) to the root (or any parent) directory. We will explore how to accomplish this redirect using PHP in a subsequent article.

The Scene

When developing sites, an excellent way to maintain a clean directory structure involves using subdirectories to organize content. A great example of this is seen in the placement of a site’s blog into its own subdirectory, such as this:

http://domain.tld/blog/

In this scenario, the blog directory serves as the home page for the blog itself. This works great if the content displayed on the site’s home page (i.e., at http://domain.tld/) is meant to be different than that displayed on the blog’s home page. So, for example, your site’s main page would greet visitors with a few photos and a nice welcome message. Then, from the welcome page you would link to your blog, which features your posts, pages, and other content.

Conversely, many sites prefer to display their blog content on the home page, or root, of the site. This, of course, is easily accomplished by simply placing your blog in the root directory. If you aren’t planning on growing, expanding, or restructuring your site in the future, and aren’t really concerned with maintaining a clean, organized directory structure, then by all means, go ahead and throw your life away by installing your blog in the root directory. If, on the other hand, you would like to display your blog content on the site’s home page, but would prefer to give the blog its own directory, read on..

The Challenge

The challenge facing the described strategy is duplicate content. By serving your blog from its own directory and delivering its content at the root directory, duplicate content is available in the following locations:

http://domain.tld/
http://domain.tld/blog/

When users visit either of these URLs, they will see the same content, namely, the main index page of your blog. This is confusing for human visitors and may also draw a duplicate-content penalty from the search engines. Indeed, it would be far better to either redirect the blog root to the site root or vice versa. Logically, of course, it makes more sense to use your domain as the home page, so we will craft our HTAccess solution based on the following configuration:

  • Blog installed/located in its own (sub)directory
  • Blog content displayed on the site’s home page
  • Requests for the blog directory redirected to the home page

Once established, this configuration will enable you to place your blog in its own directory and display your blog on the home page (root URL) of your site. Further — and this is the trick — only the /blog/ directory will be redirected to the home page; all other blog pages will continue to be available in their expected locations. Here is a sampling of hypothetical URLs demonstrating this configuration:

  • http://domain.tld/
    [home page of site, shows blog content, no redirect]
  • http://domain.tld/blog/
    [blog subdirectory, redirected to home page]
  • http://domain.tld/blog/an-example-post/
    [blog post, shows blog directory in URL]
  • http://domain.tld/blog/an-example-page/
    [blog page, shows blog directory in URL]
  • http://domain.tld/blog/an-archive-page/
    [archive page, shows blog directory in URL]

Bottom line: this redirect strategy facilitates subdirectory blog installation, eliminates duplicate content, and enables the display of your blog on the home page of your site. Let’s have a look..

The Solution(s)

Here are two possible solutions for this scenario..

WordPress

Depending on your server configuration and blogging platform, there are several ways to implement this strategy. First of all, WordPress provides a “built-in” mechanism for giving your blog its own directory. This method works best if implemented during or immediately after the installation of WordPress. With this technique, WordPress rewrites all of your permalinks so as to remove the /blog/ portion of your URLs. Thus, if you are launching a new WordPress-powered blog, this strategy is aesthetically superior to the HTAccess solution because it eliminates all references to the blog’s subdirectory; it will be as if you had installed your blog in the root directory to begin with.

But keep in mind that there are potential shortcomings with the WordPress method, such as those encountered when trying to integrate additional content into your site. WordPress’ activated rewrite mechanism may interfere with the proper functionality of galleries, blogs, and other peripheral content. It should also be noted that implementing WordPress’ redirection feature will break any external links to your blog’s pages. Depending on the number of inbound links, this could result in a large number of 404 errors. But relax! If all of this WordPress redirection stuff makes your skin crawl, read on for a better solution.

Apache/HTAccess

In general, I prefer to handle redirection at the server level. Using Apache’s powerful redirect and rewrite functionality, it is possible to craft highly specific redirects for virtually any configuration. In this case, where we need to redirect requests for the /blog/ (sub)directory to the site’s root directory, Apache’s RedirectMatch directive accomplishes the task perfectly. Given the previously described scenario, the following Apache directive should be placed in the site’s root HTAccess file (or, alternatively, in the server’s configuration file):

RedirectMatch 301 ^/blog/$ http://domain.tld/

Pure magic. Once in place on an Apache-powered server, this single line will redirect all requests for the /blog/ directory to the site’s root directory. As outlined previously, all subordinate URLs will include the /blog/ directory in the address string and continue to function as expected. This is a concise, direct, effective solution that is as simple as possible. Only a single, target URL is affected, enabling you to easily integrate new features into your site without overzealous rewrite interference.
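
One edge case worth noting: the pattern ^/blog/$ only matches when the trailing slash is present. Apache usually adds that slash itself for real directories (via mod_dir), but if you would like the rule to cover both forms explicitly, a minimal sketch, assuming the same domain.tld placeholder used above, could look like this:

```apache
# Matches /blog and /blog/ only; deeper URLs such as
# /blog/an-example-post/ are left alone
RedirectMatch 301 ^/blog/?$ http://domain.tld/
```

The ? makes the final slash optional, while the ^ and $ anchors still limit the match to the directory itself.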

As for the functional specifics of the RedirectMatch technique, the process is very straightforward:

  1. Match the URL path of each request against the specified regular expression
  2. Redirect all matches to the specified target URL
  3. Deliver a redirect status of permanent (301) with each redirect response

Also, notice how RedirectMatch differs from the similar Redirect directive. With RedirectMatch, exact pattern matching is possible using regular expressions. Conversely, Redirect uses prefix matching, which affects any URL whose path begins with the specified string (with the remainder of the path appended to the target). Regular expressions aren’t allowed in Redirect directives, but they are allowed with RedirectMatch.
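
To make the difference concrete, here is a rough side-by-side sketch. The two rules are alternatives, not meant to be used together, and domain.tld is the same placeholder as before:

```apache
# Redirect: prefix match. This catches /blog/ AND everything beneath it,
# appending the rest of the path to the target, so that
# /blog/an-example-post/ is sent to http://domain.tld/an-example-post/
Redirect 301 /blog/ http://domain.tld/

# RedirectMatch: regex match. With the ^ and $ anchors, only a request
# for exactly /blog/ is redirected; everything beneath it is untouched
RedirectMatch 301 ^/blog/$ http://domain.tld/
```

For the scenario described in this article, only the second form leaves the individual blog posts and pages alone.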

Variations

In addition to redirecting the subdirectory of your blog to the site’s root directory, you can also use the RedirectMatch directive for many other case-specific redirects. For example, I recently shared a technique for redirecting all requests for a nonexistent file to the actual file. In that article, I prescribe a technique for redirecting all misdirected requests for the site’s favicon.ico back to the actual file located in the root directory of the site:

# REDIRECT FAVICON REQUESTS
<IfModule mod_rewrite.c>
 RewriteEngine On
 RewriteCond %{REQUEST_URI} !^/favicon\.ico [NC]
 RewriteCond %{REQUEST_URI} favicon\.ico [NC]
 RewriteRule (.*) http://domain.tld/favicon.ico [R=301,L]
</IfModule>

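Of course, this method works perfectly well, but it may be simplified greatly using our new friend, RedirectMatch. Note that the pattern must match only favicon.ico requests in subdirectories, never /favicon.ico itself, or the redirect would loop:

RedirectMatch 301 ^/.+/favicon\.ico$ http://domain.tld/favicon.ico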

Comparing the two different techniques reveals a wealth of information, and I highly encourage it. But, rather than get into all of that here, let’s move on with another variation.

Unlike the redirect scenario addressed earlier in this article, let’s imagine a case where we would like to redirect not only the blog directory, but all of its files and subdirectories as well, such that:

  • http://domain.tld/blog/
  • http://domain.tld/blog/file-01.html
  • http://domain.tld/blog/file-02.html
  • http://domain.tld/blog/sub-01/
  • http://domain.tld/blog/sub-02/

..and so on will all be redirected to some specified target location. This target location can be anything (the home page, a single file, another subdirectory, etc.) and on any server. The possibilities and uses for such a redirect are endless. Here is how it looks in the HTAccess file:

RedirectMatch 301 ^/blog/.*$ http://domain.tld/target.html
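
If, instead of funneling everything to a single target, you would rather keep the rest of the path intact, the regex can capture it and hand it back to the target as $1. A sketch, assuming a hypothetical /journal/ directory as the new home for the content:

```apache
# (.*) captures everything after /blog/ and $1 re-inserts it, so that
# /blog/sub-01/file-01.html is sent to
# http://domain.tld/journal/sub-01/file-01.html
RedirectMatch 301 ^/blog/(.*)$ http://domain.tld/journal/$1
```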

Other variations on this technique include specifying temporary (302) redirects rather than permanent (301) by editing as follows:

# This is a permanent redirect
RedirectMatch 301 ^/blog/.*$ http://domain.tld/target.html
# This is a temporary redirect
RedirectMatch 302 ^/blog/.*$ http://domain.tld/target.html

Likewise, you may also write:

# This is a permanent redirect
RedirectMatch permanent ^/blog/.*$ http://domain.tld/target.html
# This is a temporary redirect
RedirectMatch temp ^/blog/.*$ http://domain.tld/target.html
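
Beyond permanent and temp, the status argument also accepts seeother (303) and gone (410). The gone case is a little different, as no target URL is given. A quick sketch, where the /blog/moved/ and /blog/old/ paths are just hypothetical examples:

```apache
# 303 See Other: the resource has been replaced
RedirectMatch seeother ^/blog/moved/.*$ http://domain.tld/target.html
# 410 Gone: the resource has been removed for good (note: no target URL)
RedirectMatch gone ^/blog/old/.*$
```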

The End (for now..)

You know I could go on and on with this stuff, but I am getting hungry now, so I will leave it here and grab a sandwich. I may sharpen things up a bit or add some more once I get back, but probably not. Instead, I think I will just call it good and see if anyone actually makes it through the entire article to read this. My money says that a few will, but not the majority. I mean, come on, who has time to waste with all of this geeky nonsense anyway? Certainly not me! ;)

Jeff Starr
About the Author Jeff Starr = Designer. Developer. Producer. Writer. Editor. Etc.
33 responses
  1. Jeff Starr

    @Donace: ah, I see.. not sure if that is possible with pure Apache/htaccess, but I will keep it in mind and see if anything comes up..

  2. Jeff Starr

    @Joshua: Excellent idea! As you say, there are quite a few htaccess generator scripts floating around, but I have yet to see anything aimed specifically at mod_rewrite. If I had the time, I would definitely work on something like this..

  3. Thanks very much for this post. I have been searching far and wide for this final result: RedirectMatch 301 ^/blog/.*$ http://domain.tld/

    As I had my old WordPress blog pointing to a folder and had an htaccess file in there to rename files. So when I tried several other techniques, it was all good for the folder, but anything beyond that was ERROR!

    Thanks again!

  4. Jeff Starr

    @Jay: Happy to help! The power of Apache’s RedirectMatch is one of my best-kept secrets. It’s like the swiss-army knife of HTAccess. Glad to hear it solved your issue as well. Cheers! :)

  5. Hi,

    I’ve been using htaccess and mod_rewrite for a few years, but have never seen a solution to the following.

    Let’s say I have content to serve in a sub-dir of my web root, for example in /c. That directory corresponds to a Zend Framework app, so going to http://mydomain.com/zf-quickstart/ is fine, and I access the index controller.

    But if I want to go to http://mydomain.com/ and still be ‘hitting’ the index controller, is that possible? In other words, can I remove the “zf-quickstart” part of the url in htaccess, so that it appears as if my web root is the base directory for the application, and not http://mydomain.com/zf-quickstart/?

    Any advice would be greatly appreciated, this is driving me nuts, Im not sure its even possible!

    thanks
    paul

  6. Typo in my previous post: second line should read:

    Let’s say I have content to serve in a sub-dir of my web root, for example in zf-quickstart.

  7. SystemTraderFX March 2, 2009 @ 7:09 pm

    Thanks Jeff!

    This tutorial was useful in redirecting a subdirectory to another domain.

    Never knew about RedirectMatch!
    Much easier to use than RewriteRules!

  8. John PLumridge March 26, 2009 @ 10:48 am

    I moved my blosxom.cgi script from the subfolder ‘blosxom’ to the root webserver directory, public_html, because my server allows cgi scripts to run from any directory. Incidentally, I changed the blosxom.cgi file name to index.pl, which means that the home page is reached with just the domain name.
    This is how I accomplished it with an apache rewrite in the .htaccess file:

    RedirectMatch 301 ^/blosxom/blosxom.cgi/(.*)$ http://roobi.nfshost.com/index.pl/$1

    Thanks for your help!

  9. Thank you so much.

    clear. well written. precise.

  10. Jeff Starr

    My pleasure, niv — glad to be of service :)

  11. Hi.

    Thanks for explaining this. I think I understand it well enough to be confused about how to do what I need.

    In my case
    If you type or link to any sub directory at
    mysite.biz
    I do NOT want to redirect you.
    I want you to get to where you linked to.

    If you type or link to the home directory at

    mysite.biz
    I want you to go to
    mysite.com

    mysite.biz/sub no redirect
    mysite.biz …redirect to mysite.com

    What ever I try does not work…
    Everything seems to redirect or nothing seems to redirect.

    Can you help… Thanks again.

    Jason

  12. Jeff Starr

    @Jason: Great comment! Reads more like an E.E. Cummings poem than a request for help. Each case is different, and without access to the code and setup, it is difficult to treat such generalized cases effectively and efficiently. Even so, I can’t help but try a little shot in the dark:

    RedirectMatch 301 ^/$ http://mysite.com/

    Place in the web-accessible root directory of your mysite.biz domain.
