Universal www-Canonicalization via htaccess
In my previous article on comprehensive canonicalization for WordPress, I offered my personally customized technique for ensuring consistently precise and accurate URL delivery. That method targets WordPress exclusively (although the logic could be adapted for general use), and requires a bit of editing to fit the code to each particular configuration. In this follow-up tutorial, I present a basic www-canonicalization technique that accomplishes the following:
- requires or removes the www prefix for all URLs
- absolutely no editing when requiring the www prefix
- minimal amount of editing when removing the www prefix
- minimal amount of code used to execute either technique
I have found this “universal” www-canonicalization technique extremely useful in its simplicity and elegance. Especially when requiring the www prefix, nothing could be easier: simply copy, paste, done. Absolutely no hard-coding necessary!
Require the www prefix
To ensure that all URLs of a given domain present with the www prefix, open the domain’s root htaccess file and add the following chunk of code (no editing required!):
# universal www canonicalization via htaccess
# require www prefix for all urls of any domain - no editing required
# https://perishablepress.com/press/2008/04/30/universal-www-canonicalization-via-htaccess/
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteCond %{HTTP_HOST} ^([^.]+\.[a-z]{2,6})$ [NC]
RewriteRule ^(.*)$ http://www.%1/$1 [R=301,L]
Again, this htaccess code will ensure that all of your URLs display with the www prefix. The first three lines are comments explaining the purpose of the code. The next two lines enable Apache’s mod_rewrite engine and specify the base path for the operation. Note that you may not need to include the RewriteEngine directive if it already appears earlier in the htaccess file. The final three lines of code provide the desired canonical functionality as follows:
RewriteCond %{HTTP_HOST} !^www\. [NC]
- This directive is a condition that checks whether the requested host name lacks the www prefix. If the host already begins with www., the condition fails and no rewriting occurs. The [NC] flag makes the match case-insensitive.
RewriteCond %{HTTP_HOST} ^([^.]+\.[a-z]{2,6})$ [NC]
- This directive is a condition that matches the general pattern of a domain name. The regular expression matches any string of characters other than a dot, followed by a literal dot (.) and an alphabetic string of two to six characters. For example, the common form of a domain name, domain.tld, will be matched by the regex. The condition is designed to match any domain name of that form, thus the term “universal” in the title of this post. ;)
RewriteRule ^(.*)$ http://www.%1/$1 [R=301,L]
- This directive is where the actual URL rewriting takes place. Whenever both of the previous conditions prove true, the RewriteRule directs Apache to redirect the request to a URL that includes the www prefix. The ^(.*)$ pattern matches the path that follows the domain name (and top-level domain). The http://www.%1/$1 substitution serves as the template for the new URL, where %1 is the host captured by the second condition and $1 is the captured path. The [R=301,L] flags signal that the redirect is permanent (i.e., 301), and that this is the last rule to be processed in this sequence of Rewrite rules.
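To make the interplay of the two conditions concrete, the block below repeats the same directives with a few hand-traced requests as comments. The host names (example.com and so on) are placeholders that do not appear in the article, and the traces assume the rules live in the site's root htaccess file. The traces also show where the “universal” pattern stops matching: hosts with more than two labels, and top-level domains longer than six letters, fall through without a redirect.

```apache
# Annotated sketch: same directives as above, with hand-traced examples.
# Host names below are placeholders (assumptions), not from the article.
RewriteEngine On
RewriteBase /

# Condition 1: true only when the host does NOT start with "www."
#   example.com      -> true  (continue to condition 2)
#   www.example.com  -> false (no redirect; request served as-is)
RewriteCond %{HTTP_HOST} !^www\. [NC]

# Condition 2: true only for "label.tld" with a 2-6 letter TLD;
# the matched host is available to the rule as %1.
#   example.com          -> true,  %1 = example.com
#   blog.example.com     -> false (extra label, no redirect)
#   example.co.uk        -> false (two-part suffix, no redirect)
#   example.photography  -> false (TLD longer than six letters)
RewriteCond %{HTTP_HOST} ^([^.]+\.[a-z]{2,6})$ [NC]

# Rule: $1 is the requested path without its leading slash.
#   http://example.com/about/ -> 301 -> http://www.example.com/about/
RewriteRule ^(.*)$ http://www.%1/$1 [R=301,L]
```

If your domain falls into one of the non-matching cases, the second condition would need to be loosened or replaced with a hard-coded host name.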
Remove the www prefix
To ensure that all URLs of a given domain present without the www prefix, open the domain’s root htaccess file and add the following chunk of code:
# universal www canonicalization via htaccess
# remove www prefix for all urls - replace all domain and tld with yours
# https://perishablepress.com/press/2008/04/30/universal-www-canonicalization-via-htaccess/
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_HOST} !^domain\.tld$ [NC]
RewriteRule ^(.*)$ http://domain.tld/$1 [R=301,L]
When using this code to remove the www prefix, this technique requires two simple edits: change both instances of “domain” and “tld” to match the target domain name and top-level domain name, respectively. For example, if your domain was located at “sweetdomain.com”, you would edit the code as follows:
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_HOST} !^sweetdomain\.com$ [NC]
RewriteRule ^(.*)$ http://sweetdomain.com/$1 [R=301,L]
And that’s all there is to it. Again, this code will remove the www prefix from your URLs. Essentially, this works along the same general lines as the previous method, only this time the condition matches any request whose host name is not exactly the bare (non-www) domain. For all such matches, the code then redirects to the same path on the non-www host.
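If editing the domain by hand is undesirable, a commonly used alternative captures whatever follows the www. prefix and redirects to it. This is a sketch of a different technique, not the method described above: it only strips a leading www. and, unlike the hard-coded version, does not redirect other non-canonical hosts (an alias domain, for instance) to a single canonical name. Like the other examples here, it assumes plain http.

```apache
# Sketch: remove the www prefix for any domain, no editing required.
# Unlike the hard-coded version, only hosts starting with "www." are redirected.
RewriteEngine On
RewriteBase /
# Capture everything after "www." as %1
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
# Redirect to the captured host, keeping the requested path ($1)
RewriteRule ^(.*)$ http://%1/$1 [R=301,L]
```

With this variant, a request for http://www.sweetdomain.com/page/ would be redirected to http://sweetdomain.com/page/, while requests to sweetdomain.com itself pass through untouched.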
Ground Control to Major Tom
After uploading either of the methods, remember to test your URLs vigorously. If you haven’t already discovered the immense power of Apache’s mod_rewrite, rest assured that even the slightest error will immediately crash your website and celebrate by serving an unlimited supply of 500-Error hors d'oeuvres to all of your visitors. Or something. The point, again, is to upload and test as many different URL configurations as possible. Nothing should go wrong, but never assume that it won’t! ;)
12 responses to “Universal www-Canonicalization via htaccess”
I use this code in my .htaccess too. It works like a charm!
Thanks for the code confirmation, Geld — glad to know that everything is functioning properly. Cheers!
I have a problem with my mod_rewrite:
Google has indexed this page
http://www.domain.com///index.html and I would like it with only a single slash… any idea?
Try this:
RedirectMatch 301 ^///index.html$ http://www.domain.com/index.html
Add that to your root htaccess (or Apache config file) and check the results. Of course, I haven’t tested this code specifically, but it should solve the issue. You may need to remove one of the forward slashes from the match condition for it to work.
Cheers,
Jeff
There is a simpler way:
RewriteRule ^ - [E=via:http]
RewriteCond %{HTTPS} =on
RewriteRule ^ - [E=via:https]
RewriteCond %{HTTP_HOST} !^www
RewriteRule (.*) %{ENV:via}://www.%{HTTP_HOST}/$1 [L,R=301]
Rules ensure http, https and about.
@Cezary Tomczyk: Thanks for sharing! Looking forward to trying it out! :)
Here is the method I use, similar to the one provided in the article:
RewriteCond %{HTTP_HOST} !^www\.[a-z-]+\.[a-z]{2,6} [NC]
RewriteCond %{HTTP_HOST} ([a-z-]+\.[a-z]{2,6})$ [NC]
RewriteRule ^/(.*)$ http://%1/$1 [R=301,L]
Works like a dream.
@August Klotz: Yes, your method works well, but only with domains of the form http://www.one.two, nothing more. Domains sometimes contain more parts ;-)
—
Cezary Tomczyk
Thanx for this great and efficient code. And if I understand it correctly it will automatically 301 redirect my original (old) urls?
Hi Jeff! :D
I tried this rule but Firefox says (I am translating into English):
“This site does not redirect in correct mode”.
Where do you think could be the problem?
Here the rule I tried to use:
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_HOST} !^aldolat.it$ [NC]
RewriteRule ^(.*)$ http://aldolat.it/$1 [R=301,L]
@Persoonlijke lening: Yes, precisely. 301 redirect according to your canonicalizational preferences (either remove or add www prefix).
@Aldo: Try it without the RewriteBase directive:
RewriteEngine On
RewriteCond %{HTTP_HOST} !^domain\.tld$ [NC]
RewriteRule ^(.*)$ http://domain.tld/$1 [R=301,L]
Thanx for this technique; very useful to me.