Content Negotiation for XHTML Documents via PHP and htaccess

In this article, I discuss the different MIME types available for XHTML and explain a method for serving your documents with the optimal MIME type, depending on the capabilities of the user agent. Using either htaccess or PHP for content negotiation, we can serve complete, standards-compliant markup for our document’s header information. This is especially helpful when dealing with Internet Explorer while serving a DOCTYPE of XHTML 1.1 along with the recommended XML declaration.

According to the RFC standards 1 produced by the IETF 2, web documents formatted as XHTML 3 may be served as any of the following three MIME types:

  • text/xml
  • application/xml
  • application/xhtml+xml

Yet, while all three of these MIME types are technically correct, use of either text/xml or application/xml may yield unexpected, inconsistent, and undesired results 4. Thus, application/xhtml+xml is the recommended MIME type for XHTML documents. Such documents must adhere to XML formatting specifications. In general, well-formed XML involves:

  • properly nested elements
  • properly closed elements
  • properly quoted attributes
  • element and attribute names in lowercase (an XHTML requirement rather than a general XML rule)
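
As a rough illustration of these rules, the following sketch uses PHP's bundled DOM extension to test whether a fragment of markup is well-formed. The function name is_well_formed is hypothetical, and the check assumes the DOM (libxml) extension is enabled. Note that an XML parser verifies well-formedness only; it will not flag XHTML-specific rules such as lowercase names.

```php
<?php
// Hypothetical helper: returns true when $markup parses as well-formed XML.
// Assumes the DOM extension is available and $markup is a non-empty string.
function is_well_formed($markup) {
    // Collect parser errors internally instead of emitting PHP warnings
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $ok = $doc->loadXML($markup);
    // Discard collected errors and restore the previous error mode
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $ok;
}
```

For instance, a fragment with properly nested and closed elements passes, while overlapping tags, an unclosed element, or an unquoted attribute value cause the parse to fail.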

When delivered as application/xhtml+xml, well-formed XHTML documents are processed correctly by any supportive user agent. Unfortunately, although most browsers understand the application/xhtml+xml MIME type, Internet Explorer does not 5. Rather than show the contents of such documents, IE displays a blank page and a download prompt. To prevent this behavior, many developers and designers serve their XHTML documents with the following, incorrect MIME type:

text/html

When served with this MIME type, XHTML documents are processed as HTML instead of XML. This may circumvent IE’s MIME deficiency, and even facilitate non-compliant markup, but it is technically incorrect and will elicit warnings upon attempted validation 6. Further, W3C guidelines advise against text/html MIME types for XHTML 1.1 7.

Content Negotiation

At this point, developers have the following options when using XHTML:

  • Adhere to web standards, serve the correct MIME type (application/xhtml+xml), and abandon incompatible browsers (i.e., IE), or
  • Abandon web standards, serve an incorrect MIME type (text/html), and ensure that all browsers (including IE) understand the content.

Of course, a better solution involves using content negotiation 8 to deliver an optimal MIME type regardless of user agent. Content negotiation enables us to serve XHTML as application/xhtml+xml to standards-compatible browsers (such as Firefox, Opera, Safari), while serving text/html to incompatible browsers (such as Internet Explorer). And best of all, implementing content negotiation is relatively easy.

On Apache servers, setting the HTTP headers required for the correct MIME type is as simple as adding the following directives to your htaccess (or httpd.conf) file 9:

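# The extensions below are an assumption (AddType requires at least one);
# MultiViews then chooses between same-named files such as page.xhtml and page.html
Options +MultiViews
AddType application/xhtml+xml;qs=0.8 .xhtml
AddType text/html;qs=0.9 .html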

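Note the qs values, which are “source quality” parameters expressing how strongly the server prefers each MIME type. During negotiation, Apache multiplies each qs value (on a scale of 0.000 to 1.000) by the quality that the client assigns to that type in its Accept header, and serves the variant with the highest product. Setting qs to 0.8 for application/xhtml+xml and 0.9 for text/html means that application/xhtml+xml is delivered only to agents that explicitly rate it above text/html; agents that express no preference between the two will receive the preferred (as indicated by the higher qs value) text/html MIME type. 9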
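
To make the arithmetic concrete, here is a simplified model of that selection written in PHP. It is a sketch only, not Apache's actual algorithm: the helper names (parse_accept, client_q, pick_variant) are hypothetical, and it ignores language, charset, and encoding negotiation as well as Apache's special handling of wildcards.

```php
<?php
// Simplified model of server-driven negotiation (NOT Apache's full algorithm).
// All function names here are hypothetical.

// Parse an Accept header into an array of media-range => q value.
function parse_accept($header) {
    $ranges = array();
    foreach (explode(",", $header) as $part) {
        $params = array_map("trim", explode(";", $part));
        $type = strtolower(array_shift($params));
        if ($type === "") continue;
        $q = 1.0; // q defaults to 1 when omitted
        foreach ($params as $param) {
            if (stripos($param, "q=") === 0) {
                $q = (float) substr($param, 2);
            }
        }
        $ranges[$type] = $q;
    }
    return $ranges;
}

// Client q for one concrete type: exact match first, then type/*, then */*.
function client_q($ranges, $type) {
    $major = substr($type, 0, strpos($type, "/"));
    foreach (array($type, $major . "/*", "*/*") as $candidate) {
        if (isset($ranges[$candidate])) return $ranges[$candidate];
    }
    return 0.0; // type not acceptable to the client
}

// Pick the variant with the highest product of client q and server qs.
function pick_variant($header, $variants) {
    $ranges = parse_accept($header);
    $best = null;
    $bestScore = 0.0;
    foreach ($variants as $type => $qs) {
        $score = client_q($ranges, $type) * $qs;
        if ($score > $bestScore) {
            $best = $type;
            $bestScore = $score;
        }
    }
    return $best;
}

// The qs values from the directives above
$variants = array("application/xhtml+xml" => 0.8, "text/html" => 0.9);
```

Under this model, a client sending only */* scores 0.8 for application/xhtml+xml against 0.9 for text/html and so receives text/html, whereas a hypothetical header of "application/xhtml+xml, text/html;q=0.5" scores 0.8 against 0.45 and receives application/xhtml+xml.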

Content Negotiation via PHP

Setting optimal MIME types via content negotiation is also possible using PHP. Using the server variables contained in the $_SERVER array, PHP will evaluate the HTTP Accept header of the user agent and set the appropriate MIME type via the header function. In the excellent article, MIME Types and Content Negotiation 4, Juicy Studio demonstrates a basic technique for setting the MIME type via PHP:

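<?php header("Vary: Accept");
// Treat a missing Accept header as an empty string to avoid an undefined-index notice
$accept = isset($_SERVER["HTTP_ACCEPT"]) ? $_SERVER["HTTP_ACCEPT"] : "";
if (stristr($accept, "application/xhtml+xml")) 
	header("Content-Type: application/xhtml+xml; charset=utf-8");
else
	header("Content-Type: text/html; charset=utf-8"); 
?>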

When present at the beginning of any PHP document, this code will set the MIME type to application/xhtml+xml for supportive user agents, and text/html for everything else. For example, Firefox, Safari, and Opera will interpret your XHTML pages as application/xhtml+xml, while Internet Explorer processes them as text/html. This technique also eliminates those rather unpleasant “Conflict between Mime Type and Document Type” warnings from the W3C validator!
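
One caveat of a plain substring test is that it also matches a client that lists application/xhtml+xml with a quality of zero, meaning “do not send this type.” The following sketch, which is an assumption on my part and not part of the Juicy Studio technique, reads the q value before deciding. The function name xhtml_q is hypothetical, and for brevity it only recognizes a q parameter that directly follows the media type.

```php
<?php
// Hypothetical helper: the q value a client assigns to application/xhtml+xml.
// Simplification: only a q parameter directly after the media type is read.
function xhtml_q($accept) {
    $pattern = '~application/xhtml\+xml\s*(?:;\s*q\s*=\s*([0-9.]+))?~i';
    if (!preg_match($pattern, (string) $accept, $m)) {
        return 0.0; // type not listed at all
    }
    // q defaults to 1 when the parameter is absent
    return isset($m[1]) ? (float) $m[1] : 1.0;
}
```

The condition in the script above could then become xhtml_q($accept) > 0, so that a client explicitly refusing application/xhtml+xml falls through to text/html.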

Taking Advantage of PHP Content Negotiation

Of course, a potential shortcoming of the previous content-negotiation technique is the inevitable mismatch between MIME type and DOCTYPE. For example, there will be a discrepancy if your page explicitly declares any XHTML DOCTYPE (e.g., XHTML 1.0 Strict, XHTML 1.1, etc.) while text/html is served as the MIME type for certain user agents.

Fortunately, we can expand the scope of the previous PHP script and take full advantage of its content-negotiation functionality. Consider the following code, which demonstrates a typical implementation of the content-negotiation script for an XHTML-1.1 document 3:

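<?php header("Vary: Accept");
// Treat a missing Accept header as an empty string to avoid an undefined-index notice
$accept = isset($_SERVER["HTTP_ACCEPT"]) ? $_SERVER["HTTP_ACCEPT"] : "";
if (stristr($accept, "application/xhtml+xml")) 
	header("Content-Type: application/xhtml+xml; charset=utf-8");
else
	header("Content-Type: text/html; charset=utf-8");
?>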
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
	<head profile="http://gmpg.org/xfn/11">
		<title>Perishable Press: Digital Design and Dialogue ~</title>
.
.
.
	</head>

For the script to work, it must precede all other content, resulting in an unwanted blank line generated in the source-code output. This blank line, together with the XML declaration that follows it, sits ahead of the DOCTYPE and can create problems for browsers such as Internet Explorer, which will jump into “quirks” mode 10, despite the otherwise complete DOCTYPE declaration. Even worse (in my mind) is the idea of serving XHTML 1.1 as text/html, especially after going through the steps required to mark up and deliver the content as application/xhtml+xml. It just defeats the whole purpose of formatting via XHTML 1.1 in the first place. We may as well avoid the fuss and stick with good ol’ HTML 4.01 as text/html instead.

Yet, for those of us insisting on XHTML, we can improve our basic content-negotiation script such that:

  • The optimal MIME type is sent via PHP header
  • The correct DOCTYPE is declared in the document head
  • The empty line is eliminated from appearing at the top of the page (before the first line of markup)
  • The recommended <?xml> declaration is included when serving XHTML 1.1

Let’s take a look at a couple of different implementations, one for XHTML 1.0 and another for XHTML 1.1.

PHP Content Negotiation for XHTML 1.0

To serve the optimal MIME type when using XHTML 1.0 for the DOCTYPE, replace your static markup with the following:

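<?php header("Vary: Accept");
// Treat a missing Accept header as an empty string to avoid an undefined-index notice
$accept = isset($_SERVER["HTTP_ACCEPT"]) ? $_SERVER["HTTP_ACCEPT"] : "";
if (stristr($accept, "application/xhtml+xml")) {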
	header("Content-Type: application/xhtml+xml; charset=utf-8"); ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
	<head>
<?php } else { 
	header("Content-Type: text/html; charset=utf-8"); ?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
	<head>
<?php } ?>
.
.
.
[ place title, link, and other <head> elements here ]
.
.
.
	</head>

This code will ensure that you are serving XHTML 1.0 as application/xhtml+xml to supportive user agents, and HTML 4.01 as text/html to the dumb kids. No further editing should be required, although the code itself remains flexible enough to facilitate further customization and adaptation ;)

PHP Content Negotiation for XHTML 1.1

To serve the optimal MIME type when using a DOCTYPE of XHTML 1.1, replace your static markup with the following:

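<?php header("Vary: Accept");
// Treat a missing Accept header as an empty string to avoid an undefined-index notice
$accept = isset($_SERVER["HTTP_ACCEPT"]) ? $_SERVER["HTTP_ACCEPT"] : "";
if (stristr($accept, "application/xhtml+xml")) {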
	header("Content-Type: application/xhtml+xml; charset=utf-8");
	echo "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"."\n"; ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
	<head profile="http://gmpg.org/xfn/11">
<?php } else { 
	header("Content-Type: text/html; charset=utf-8"); ?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
	<head>
<?php } ?>
.
.
.
[ place title, link, and other <head> elements here ]
.
.
.
	</head>

This code will ensure that you are serving XHTML 1.1 as application/xhtml+xml to supportive user agents, and HTML 4.01 as text/html to everyone else. As before, no further editing should be required. The key for this piece of code is the inclusion of the <?xml> declaration for XHTML 1.1 DOCTYPEs. Many developers and designers omit this recommended 11 piece of information because of the challenges presented when including it. Fortunately, this code makes it easy to successfully include the <?xml> declaration when serving XHTML 1.1 as application/xhtml+xml. Even better, the declaration is omitted when serving text/html for HTML 4.01, which of course does not use the <?xml> declaration in the first place.
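
For those who prefer to keep the decision separate from the output, the two implementations above could be folded into a single function along these lines. This is only a sketch: the name negotiate_prolog and its $version parameter are hypothetical, and it covers just the MIME type, XML declaration, and DOCTYPE, leaving the opening html and head tags to the template.

```php
<?php
// Hypothetical helper: decides what to send without sending anything,
// so the decision can be checked in isolation.
// Returns array(MIME type, prolog markup).
function negotiate_prolog($accept, $version = "1.1") {
    $doctypes = array(
        "1.0" => '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">',
        "1.1" => '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">',
    );
    $wantsXhtml = stripos((string) $accept, "application/xhtml+xml") !== false;
    if ($wantsXhtml && isset($doctypes[$version])) {
        $prolog = $doctypes[$version];
        if ($version === "1.1") {
            // The XML declaration is added for XHTML 1.1 only, as in the article
            $prolog = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" . $prolog;
        }
        return array("application/xhtml+xml", $prolog);
    }
    // Everything else falls back to HTML 4.01 Strict as text/html
    return array(
        "text/html",
        '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">'
    );
}
```

A page would call the function with the Accept header (or an empty string when the header is absent), pass the first element to header() as the Content-Type along with Vary: Accept, and echo the second element before any other output.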

Wrap up

Hopefully, this article will help designers and developers deploy XHTML with greater accuracy, precision, and adherence to standards. Too often, XHTML pages are served incorrectly as text/html, primarily because of Internet Explorer’s sorry lack of support for application/xhtml+xml. Using the methods presented in this article, it is possible to negotiate user-agent support and subsequently serve the optimal DOCTYPE and corresponding MIME type. As always, please share any comments, questions, or concerns via the comments section below.

References