Pimp Your 404: Presentation and Functionality

I have been wanting to write about 404 error pages for quite a while now. They have always been very important to me, with customized error pages playing an integral part in every well-rounded web-design strategy. Rather than try to re-invent the wheel with this, I think I will just go through and discuss some thoughts about 404 error pages, share some useful code snippets, and highlight some suggested resources along the way. In a sense, this post is nothing more than a giant “brain-dump” of all things 404 for future reference. Hopefully you will find it useful in pimping your own 404.

When requested page is not found by server, error message is returned; this is the essence of the 404 — Ancient Chinese proverb

What is a 404 and why should I care?

Technically, you don’t need to even think about 404 errors. They are handled automatically by the server. But the problem is that, by default, your server is going to deliver one sick-looking message:

[ Screenshot: Default Apache 404 Error Page ]
Default Apache 404 error page

While that sort of message is fine for uber-geeky tech sites, chances are that “normal” visitors are going to crap themselves if they see such a thing, thinking:

What does that mean? Did I break something? I don’t have time for this. I’m outta here!

You don’t want that, and neither do your visitors. Granted, most visitors are experienced enough to “deal with it,” but there are many that just don’t understand. This is why many designers take the time to customize their 404 error pages to make them a little more “user-friendly” and not so bloody frightening. It’s not like we’re a bunch of robots, after all.

[ Screenshot: A 'user-friendly' 404 Error Page ]
A nice, “user-friendly” 404 error page
(404 link removed 2013/05/13)

Bottom line, here are some reasons why you should give a flying flip about 404 error pages:

  • Default 404 errors are ugly and scary to regular people
  • Default 404 errors may result in a higher bounce rate, because people are scared
  • Custom 404 pages tell the reader that you care so much about them, and that it’s “all good”
  • User-friendly 404 pages draw the reader into your site, instead of scaring them away
  • Default 404 errors are useless, but your custom 404 pages can actually help the user find what they are looking for
  • You get the idea..

In short, a well-designed 404 page keeps the user engaged and helps build trust. It just makes for an all-around better site.

[ Screenshot: Another 'user-friendly' 404 Error Page ]
Another “user-friendly” 404 error page

I’m sold, how do I do it?

Here are some tried and true best practices for creating that perfect 404 error page:

Keep it familiar
Your custom error page should look like a well-integrated, natural part of your website. One of the reasons why default 404 error pages are so hideous is that they look so alien to your visitors. Unless your site is nothing but black serif fonts on plain white background, the default just isn’t going to “blend in.” So keep it real, and make sure that your 404s (and other error pages) share the same design as the rest of your site.
Explain the situation
Everyone in their right mind understands that errors happen. There is no need to apologize for anything, but you do want to explain the situation. This doesn’t have to be anything too serious, just a brief sentence or two telling the user that “something happened” and that the “requested page was not found.” Try to match the tone and flow of your site. If your site is formal, best to stay that way. If your site is hilarious, don’t blow it on the 404.
Provide some guidance
A user sitting there staring at a 404 error page is obviously lost. Try to provide some guidance with a search bar, a site map, or perhaps a recipe for some strong, mixed drinks. There is a good chance that the visitor is looking for some of your popular content, so you may want to suggest a few popular site destinations. It is also helpful to include a contact form (or a link to one), so that the user may report the error or ask a question about that darling resource they couldn’t find.
Display the requested URL
As the user sits there scratching their head, it may be helpful to show them the URL that was received by the server. For example, the user may think that they entered “http://yoursite.com/blondie/”, but in reality they might have entered something like “http://yoursite.com/blodie/” instead. Echoing back the originally requested URL makes it easier for the user to spot any mistakes. Later in the article, we’ll see a nice way to do this.
Be mindful of file size
As design guru Chris Coyier points out, keeping an eye on the overall size of your 404 page is a good idea. Your 404 will be delivered for every missing request, not just the ones triggered by your visitors. This includes everything from bizarre favicon requests and non-existent robots.txt files to missing scripts, stylesheets, images, and everything in-between. Needless to say, all of these 404s can add up quickly, so if bandwidth is an issue, be sure to keep an eye on total 404 size.
Take advantage
Just because life is sucking for the people who can’t find your content doesn’t mean that you can’t benefit. When visitors trigger 404 errors, collect some data about the event so you can fix the issue and prevent further occurrences. For each 404 error, there is a great deal of information that is available to you, such as the requested URL, referrer info, IP address, and much more. Once you have this data recorded somewhere, cleaning things up is much easier. Later in the article, we’ll look at some cool code snippets to help you implement some keen functionality for your 404s.
Have some fun
A confused visitor is a scared visitor. The 404 page is a great opportunity to lighten things up with a little humor. Nothing too condescending, but just enough to let ‘em know that you’re on their side, and that you hate being lost just as much as they do. A quick laugh can gloss over just about anything, and you never know — if your page is clever enough, it might be featured in one of those ridiculously popular 404 error-page galleries ;)
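
For the “Display the requested URL” tip, here is a minimal sketch in PHP. The helper name requestedUrlForDisplay() is made up for this example, and it assumes your server fills in HTTP_HOST and REQUEST_URI (Apache with PHP normally does). The one step you really don’t want to skip is the escaping, because the requested URL is whatever the visitor (or some bot) decided to send:

```php
<?php
// Hypothetical helper (not from any library): rebuild the URL the server
// received and escape it so it is safe to print inside HTML.
function requestedUrlForDisplay(array $server) {
	$https  = isset($server['HTTPS']) && $server['HTTPS'] !== '' && $server['HTTPS'] !== 'off';
	$scheme = $https ? 'https' : 'http';
	$host   = $server['HTTP_HOST'] ?? 'localhost';
	$uri    = $server['REQUEST_URI'] ?? '/';

	// never echo a requested URL without escaping it first
	return htmlspecialchars($scheme . '://' . $host . $uri, ENT_QUOTES, 'UTF-8');
}

// usage inside your 404 template:
// echo '<p>Requested URL: <code>' . requestedUrlForDisplay($_SERVER) . '</code></p>';
```

Passing $_SERVER in as an argument (instead of reading it inside the function) is just a convenience that makes the helper easy to try out with made-up values.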

Later in the article, we’ll check out some classic examples of effective and useful 404 pages. For now, let’s see how to set ‘em up..

How do I implement a custom 404 page?

That depends on how your site is set up. If you are running WordPress, simply create a theme file named “404.php” and add whatever code and content you wish. Also in WordPress is the intra-page error, which is triggered when no content is found for a specific type of page view. Those familiar with the WordPress loop will recognize this as the portion of code located after the final else condition. This is not technically a 404 error (more like an intra-WordPress 404), but it may also be customized and optimized for your visitors.

[ Screenshot: The code responsible for the 'intra-WordPress 404' ]
This portion of loop code is responsible for the “intra-WordPress” 404 error message

For non-WordPress sites, the easiest way to override the default 404 page and deliver your own customized page is to tweak your .htaccess with the following directive:

ErrorDocument 404 /error/404.php

This directive will replace the default 404 error page with the one specified (“/error/404.php” in our example). Edit the path according to the location of your custom file. Apache handles the rest. Thanks Apache.
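
If you are not on WordPress and need something to actually put at /error/404.php, a bare-bones sketch might look like the following. The function name, site name, and links are all placeholders, so swap in your own markup. When the file is served through a local ErrorDocument path like the one above, Apache should keep the 404 status on the response, so the script only has to build the page:

```php
<?php
// Hypothetical page builder for a standalone /error/404.php file.
// $links maps link text to a URL path, e.g. array('Home' => '/').
function notFoundPage($siteName, array $links) {
	$items = '';
	foreach ($links as $label => $href) {
		$items .= '<li><a href="' . htmlspecialchars($href, ENT_QUOTES, 'UTF-8') . '">'
		        . htmlspecialchars($label, ENT_QUOTES, 'UTF-8') . "</a></li>\n";
	}
	$title = htmlspecialchars($siteName, ENT_QUOTES, 'UTF-8');

	return "<!DOCTYPE html>\n"
	     . "<html><head><title>Page not found - " . $title . "</title></head>\n"
	     . "<body>\n"
	     . "<h1>Page not found</h1>\n"
	     . "<p>The requested page was not found. Try one of these instead:</p>\n"
	     . "<ul>\n" . $items . "</ul>\n"
	     . "</body></html>\n";
}

// usage at the bottom of /error/404.php:
// echo notFoundPage('Your Site Name', array('Home' => '/', 'Archives' => '/archives/'));
```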

Check out my previous article for more information on customizing error messages with HTAccess. You can do much more than custom 404s!

Simply brilliant, let’s see some code snippets

As mentioned, the underlying functionality of your 404 pages is just as important as the user-friendly interface. You could have the swellest 404 on teh Web, but unless you are actively working to resolve the errors, their frequency will inevitably increase. There are many awesome ways to track errors and enhance the underlying functionality of your 404 page. Here are some of my favorites.

Smart 404 for WordPress
Michael Tyson shares this excellent method to make WordPress’ 404 handler a little bit smarter:

I changed my template’s 404 page to do a search for what the viewer was really after, and redirect them there. If it can’t find an exact match, it’ll perform a search with keywords extracted from the URL. If it finds a single result, it’ll redirect, otherwise it’ll put up a few results as suggestions on the 404 page.

Sounds nice, doesn’t it? And extremely helpful as well. Here is the code to place into the top of your active theme’s 404.php file, immediately preceding the get_header() tag:

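<?php global $wp_the_query;

// build a search phrase from the last URL segment: "_" and "-" become spaces, a trailing ".html" is dropped
$search = preg_replace(array("@[_-]@", "@\.html$@"),array(" ",""),urldecode(basename($_SERVER["REQUEST_URI"])));

// first try: ask WordPress for a post with that name
$posts = $wp_the_query->query(array("name" => $search));

if (count($posts) == 1) {
	wp_redirect(get_permalink($posts[0]->ID), 301);
	exit();
}

// second try: run a regular keyword search
$posts = $wp_the_query->query(array("s" => $search));

// exactly one hit is probably what the visitor wanted, so redirect; otherwise keep $posts for suggestions
if ( count($posts) == 1 ) {
	wp_redirect(get_permalink($posts[0]->ID), 301);
	exit();
} ?>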

That’s the juice, right there. No editing required. Then, once that is in place, you can provide the user with some helpful suggestions. Just place this code somewhere within the <body> of your 404.php page:

<?php if (count($posts) > 0) : ?>
<ul>
<?php foreach ($posts as $post) : ?>

<li><a href="<?php echo esc_url(get_permalink($post->ID)); ?>"><?php echo esc_html($post->post_title); ?></a></li>

<?php endforeach; ?>
</ul>
<?php endif; ?>

With that code in place, the user will see a list of potentially relevant posts that may include something useful or of interest. There is quite a bit that can be done with this technique, so have some fun with it.
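
To see what that preg_replace() line is doing, here is the keyword extraction pulled out into a standalone function. Treat it as a sketch rather than part of Michael’s code: the function name is made up, and it adds one step the original doesn’t have, namely dropping the query string with parse_url() before grabbing the last URL segment.

```php
<?php
// Hypothetical helper: turn a requested URI into search keywords using the
// same replacement rules as the 404.php snippet above.
function searchTermsFromUri($uri) {
	// drop the query string so "?x=1" does not leak into the keywords
	$path = (string) parse_url($uri, PHP_URL_PATH);
	$slug = urldecode(basename($path));

	// underscores and hyphens become spaces; a trailing ".html" is removed
	return preg_replace(array('@[_-]@', '@\.html$@'), array(' ', ''), $slug);
}

// usage: $search = searchTermsFromUri($_SERVER['REQUEST_URI']);
```

So a request for “/blog/my-cool_post.html?x=1” gets searched as “my cool post”, which is a lot closer to what the visitor was after than the raw URL.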

Automatic notification emails with all the trimmings

Delivering user-friendly 404 pages to your visitors is great, but don’t let that stop you from eliminating as many lost pages as possible. Cleaning up loose ends makes your site tighter, cleaner, and more usable. One way to keep an eye on your 404 errors is to simply crack open a log and dig in. In other situations, you may prefer to have the error information emailed to you in real-time. Here is a sweet little script that will do just that — send you an informative email every time a user triggers a 404 error. The email will contain a ton of related information, including everything from IP address and server name to requested URI and user agent. This strategy isn’t recommended for high-volume sites, but for smaller blogs and niche sites, it may be just what the 404 doctor ordered.

Here is the script that makes it happen:

<?php 
// 404 auto-mailer script from Perishable Press
function errorEmailAlerts() {

	header("HTTP/1.1 404 Not Found");
	header("Status: 404 Not Found");

	// configure next two lines with your info

	$site  = "Your Site Name";
	$email = "your-email@address.com";

	// gather some data (not every value is set on every request, so default to empty)

	$http_host    = $_SERVER['HTTP_HOST'] ?? '';
	$server_name  = $_SERVER['SERVER_NAME'] ?? '';
	$remote_ip    = $_SERVER['REMOTE_ADDR'] ?? '';
	$remote_host  = $_SERVER['REMOTE_HOST'] ?? '';
	$request_uri  = $_SERVER['REQUEST_URI'] ?? '';
	$cookie       = $_SERVER['HTTP_COOKIE'] ?? '';
	$http_ref     = $_SERVER['HTTP_REFERER'] ?? '';
	$query_string = $_SERVER['QUERY_STRING'] ?? '';
	$user_agent   = $_SERVER['HTTP_USER_AGENT'] ?? '';
	$error_date   = date("D M j Y g:i:s a T");

	// prepare teh email (mail headers are separated by CRLF)

	$subject = "404 Alert";

	$headers  = "Content-Type: text/plain"."\r\n";
	$headers .= "From: ".$site." <".$email.">"."\r\n";

	$message  = "404 Error Report for ".$site."\n";
	$message .= "Date: ".$error_date."\n";
	$message .= "Server Name: ".$server_name."\n";
	$message .= "Requested URL: http://".$http_host.$request_uri."\n";
	$message .= "Query String: ".$query_string."\n";
	$message .= "Cookie: ".$cookie."\n";
	$message .= "Referrer: ".$http_ref."\n";
	$message .= "User Agent: ".$user_agent."\n";
	$message .= "IP Address: ".$remote_ip." - ".$remote_host."\n";
	$message .= "Whois: http://ws.arin.net/cgi-bin/whois.pl?queryinput=".$remote_ip;

	// send teh email

	mail($email, $subject, $message, $headers);

} ?>

Include this script in your active theme’s functions.php file and edit the first two variables with your specific information. Then, place the following function call at the very top of your theme’s 404.php file:

<?php errorEmailAlerts(); ?>

Once in place and properly configured, this function will ensure that the proper 404 header is sent to the client, collect as much useful information as possible, and then send you an email with all the trimmings. Larger, high-volume sites may want to consider using a more robust method of logging and analyzing error data, but smaller, low-traffic sites and blogs will certainly benefit from tracking their 404 errors in real time.
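
If email is too noisy for your site, one lighter alternative is to append each 404 to a plain log file and read it whenever you feel like it. The following is only a sketch: the function name, the field order, and the log path are my assumptions, not part of the script above.

```php
<?php
// Hypothetical helper: format one tab-separated log line for a 404 event.
// $server is normally $_SERVER; $time is a Unix timestamp.
function build404LogLine(array $server, $time) {
	$fields = array(
		gmdate('Y-m-d H:i:s', $time),      // timestamp in UTC
		$server['REMOTE_ADDR'] ?? '-',
		$server['REQUEST_URI'] ?? '-',
		$server['HTTP_REFERER'] ?? '-',
		$server['HTTP_USER_AGENT'] ?? '-',
	);

	// strip tabs and line breaks so one request always equals one line
	foreach ($fields as $i => $value) {
		$fields[$i] = preg_replace('/[\t\r\n]+/', ' ', $value);
	}
	return implode("\t", $fields) . "\n";
}

// usage in 404.php (the path is a placeholder; keep the file outside the web root):
// file_put_contents('/path/to/404.log', build404LogLine($_SERVER, time()), FILE_APPEND | LOCK_EX);
```

A tab-separated file like this opens cleanly in a spreadsheet, which makes it easy to sort by requested URL and spot the repeat offenders.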

Whitelist specific user-agents and URL-request patterns

As you begin to monitor your errors, you will inevitably discover three different types of 404:

  • 404 errors caused by malicious behavior (exploit scanning, etc.)
  • 404 errors caused by benign requests for non-existent resources
  • legitimate 404 errors caused by missing resources or mistyped URLs

Of these three types of 404 errors, the first is by far the most common. The Web is full of unscrupulous bastards who insist on spamming, cracking and exploiting even the humblest of sites. Here at Perishable Press, I devote many resources to the fighting of this type of malicious behavior, including blacklists, scripts, and education.

For the second type of 404 error, we are referring mainly to persistent requests made by otherwise benign scripts that check for the presence of legitimate, proprietary extensions, files, and other things. A good example of this is seen in location mapping systems (2012/05/21: 404 link removed), which typically request the following strings when crawling your site:

  • http://domain.tld/path/resource/_vpi.xml
  • http://domain.tld/path/resource/_vti_bin/

These resources exist on servers that have been configured to accommodate visitors running programs such as Weblin. When they are discovered on a domain, they facilitate the discovery of site resources; when they are not found on a domain, the requests result in 404 errors. These errors can be quite numerous and may look malicious even though they are not.

In addition to specific file requests, there are also certain user agents that may exhibit unusual behavior when crawling your site. Good examples of this include bots that don’t understand fragment identifiers and search engines that try guessing for the presence of expected resources.

To prevent these sorts of benign requests from polluting your 404 monitoring process, you can either blacklist them or add them to a whitelist. To whitelist select agents, we can insert the following snippet into our previous errorEmailAlerts() function:

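function errorEmailAlerts() {

	// read both values straight from $_SERVER, since nothing else is defined yet
	$user_agent  = $_SERVER['HTTP_USER_AGENT'] ?? '';
	$request_uri = $_SERVER['REQUEST_URI'] ?? '';

	// real user-agent strings contain more than the bot name, so match substrings
	if (
		(strpos($user_agent, 'ia_archiver') === false) && 
		(strpos($user_agent, 'Yahoo! Slurp') === false) && 
		(strpos($request_uri, '/_vpi.xml') === false) && 
		(strpos($request_uri, '/_vti_bin') === false)
		) {

		// function contents go here

	}

}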

Once in place, this conditional statement will check for any “whitelisted” user agents and/or request strings in the URL. You can easily whitelist additional items by emulating the existing code. I am thinking this is self-explanatory, but don’t hesitate to ask any specific questions in the comments section.
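
Once the whitelist grows past a handful of entries, that chain of conditions gets unwieldy. One way to keep it manageable is to move the entries into arrays and loop over them. This is a sketch with a made-up function name; it uses case-insensitive substring matching, which is my choice rather than anything required:

```php
<?php
// Hypothetical helper: returns true when a 404 should be ignored.
// Both lists hold plain, non-empty substrings; matching is case-insensitive.
function isWhitelisted404($userAgent, $requestUri, array $agents, array $paths) {
	foreach ($agents as $agent) {
		if (stripos($userAgent, $agent) !== false) {
			return true;
		}
	}
	foreach ($paths as $path) {
		if (stripos($requestUri, $path) !== false) {
			return true;
		}
	}
	return false;
}

// usage inside errorEmailAlerts(), after the header() calls:
// $agents = array('ia_archiver', 'Yahoo! Slurp');
// $paths  = array('/_vpi.xml', '/_vti_bin');
// if (isWhitelisted404($_SERVER['HTTP_USER_AGENT'] ?? '', $_SERVER['REQUEST_URI'] ?? '', $agents, $paths)) {
// 	return; // skip the email for this request
// }
```

Adding another bot or request pattern is then a one-line change to the relevant array.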

Fabulous, let’s see some more examples of great 404 pages

Finally, let’s wrap things up with a look at some great examples of 404 pages. Here are some of my favorite 404 error pages from around the Web.

[ Screenshot: http://patterntap.com/404 ]
http://patterntap.com/404

[ Screenshot: http://www.expansionbroadcast.com/404 ]
http://www.expansionbroadcast.com/404 (404 link removed 2014/02/11)

[ Screenshot: http://slonky.com/404 ]
http://slonky.com/404

[ Screenshot: http://monzilla.biz/web/404 ]
http://monzilla.biz/web/404

[ Screenshot: http://www.productplanner.com/404 ]
http://www.productplanner.com/404

[ Screenshot: http://thehpage.com/404 ]
http://thehpage.com/404

[ Screenshot: http://wakinglimb.com/404/ ]
http://wakinglimb.com/404/

[ Screenshot: http://24-7media.de/404 ]
http://24-7media.de/404

[ Screenshot: http://southcreative.com.au/404.shtml ]
Link 404 as of 2013/05/26

[ Screenshot: http://trentwalton.com/404 ]
http://trentwalton.com/404