Content Negotiation for XHTML Documents via PHP and htaccess
In this article, I discuss the different MIME types available for XHTML and explain a method for serving your documents with the optimal MIME type, depending on the capabilities of the user agent. Using either .htaccess or PHP for content negotiation, we can serve complete, standards-compliant markup for our document’s header information. This is especially helpful when dealing with Internet Explorer while serving a DOCTYPE of XHTML 1.1 along with the recommended XML declaration.
According to the RFC standards [1] produced by the IETF [2], web documents formatted as XHTML [3] may be served as any of the following three MIME types:
text/xml
application/xml
application/xhtml+xml
Yet, while all three of these MIME types are technically correct, use of either text/xml or application/xml may yield unexpected, inconsistent, and undesired results [4]. Thus, application/xhtml+xml is the recommended MIME type for XHTML documents. Such documents must adhere to XML formatting specifications. In general, well-formed XHTML involves:
- properly nested elements
- properly closed elements
- properly quoted attributes
- element and attribute names in lowercase (an XHTML requirement rather than a rule of XML well-formedness)
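The first three rules are exactly what an XML parser enforces, so they can be checked mechanically before a page is served as application/xhtml+xml. Here is a minimal sketch, assuming PHP's DOM extension (libxml) is available; isWellFormedXml() is a hypothetical helper name, not part of any library.

```php
<?php
// Minimal sketch: report whether a markup string is well-formed XML.
// Assumes the DOM extension (libxml) is loaded; isWellFormedXml() is a
// hypothetical name chosen for this example. Pass a non-empty string.
function isWellFormedXml(string $markup): bool
{
    // Collect parser errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $document = new DOMDocument();
    // LIBXML_NONET keeps the parser from fetching anything over the network.
    $ok = $document->loadXML($markup, LIBXML_NONET);

    // Restore the previous error-handling state.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok;
}
```

A fragment with crossed tags, an unclosed br, or an unquoted attribute value fails this check. Uppercase element names pass it, because the lowercase rule comes from XHTML itself rather than from XML.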
When delivered as application/xhtml+xml, well-formed XHTML documents are processed correctly by any supportive user agent. Unfortunately, although most browsers understand the application/xhtml+xml MIME type, Internet Explorer (through version 8) does not [5]. Rather than show the contents of such documents, IE displays a blank page and a download prompt. To prevent this behavior, many developers and designers serve their XHTML documents with the following, incorrect MIME type:
text/html
When served with this MIME type, XHTML documents are processed as HTML instead of XML. This may circumvent IE’s MIME deficiency, and even facilitate non-compliant markup, but it is technically incorrect and will elicit warnings upon attempted validation [6]. Further, W3C guidelines advise against the text/html MIME type for XHTML 1.1 [7].
Content Negotiation
At this point, developers have the following options when using XHTML:
- Adhere to web standards, serve the correct MIME type (application/xhtml+xml), and abandon incompatible browsers (i.e., IE), or
- Abandon web standards, serve an incorrect MIME type (text/html), and ensure that all browsers (including IE) understand the content.
Of course, a better solution involves using content negotiation [8] to deliver an optimal MIME type regardless of user agent. Content negotiation enables us to serve XHTML as application/xhtml+xml to standards-compatible browsers (such as Firefox, Opera, Safari), while serving text/html to incompatible browsers (such as Internet Explorer). And best of all, implementing content negotiation is relatively easy.
On Apache servers, setting the HTTP headers required for the correct MIME type is as simple as adding the following directives to your .htaccess (or httpd.conf) file [9]:
# AddType needs at least one file extension; .xhtml and .html are assumed here
Options +MultiViews
AddType application/xhtml+xml;qs=0.8 .xhtml
AddType text/html;qs=0.9 .html
Note the qs=0.8, which is a “source quality” parameter: a weight (on a scale of 0.000 to 1.000) that the server assigns to a variant. During negotiation, Apache multiplies each variant's qs value by the quality value that the agent's Accept header gives to that MIME type, and serves the variant with the highest result. An agent that does not accept application/xhtml+xml therefore never receives it, and an agent that expresses no preference between the two types receives text/html, which carries the higher qs value [9].
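To see how a source quality value interacts with the agent's own preferences, the selection can be modeled in a few lines of PHP. This is a simplified sketch, not Apache's actual implementation: it assumes only that each variant's qs is multiplied by the q value the Accept header assigns to that type, and it leaves out the other dimensions Apache negotiates on (language, charset, encoding). The names parseAccept() and chooseVariant() are hypothetical.

```php
<?php
// Simplified model of variant selection, NOT Apache's real algorithm.
// parseAccept() and chooseVariant() are hypothetical names for this sketch.

// Turn an Accept header into a map of media range => client q value.
function parseAccept(string $accept): array
{
    $ranges = [];
    foreach (explode(',', $accept) as $part) {
        $params = array_map('trim', explode(';', $part));
        $range = strtolower(array_shift($params));
        if ($range === '') {
            continue;
        }
        $q = 1.0; // a range listed without q counts as 1.0
        foreach ($params as $param) {
            if (preg_match('/^q\s*=\s*([0-9.]+)$/i', $param, $m)) {
                $q = (float) $m[1];
            }
        }
        $ranges[$range] = $q;
    }
    return $ranges;
}

// Pick the variant with the highest (client q * source quality) product.
// $variants maps a MIME type to its qs value; returns null if nothing fits.
function chooseVariant(array $variants, string $accept): ?string
{
    $ranges = parseAccept($accept);
    $best = null;
    $bestScore = 0.0;
    foreach ($variants as $type => $qs) {
        $major = strstr($type, '/', true) . '/*';
        // Most specific match first: exact type, then type/*, then */*.
        $q = $ranges[$type] ?? $ranges[$major] ?? $ranges['*/*'] ?? 0.0;
        $score = $q * $qs;
        if ($score > $bestScore) {
            $best = $type;
            $bestScore = $score;
        }
    }
    return $best;
}
```

With the two qs values from the directives above, an agent sending only text/html or only a wildcard gets text/html in this model, while an agent that rates application/xhtml+xml well above text/html gets the XHTML variant.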
Content Negotiation via PHP
Setting optimal MIME types via content negotiation is also possible using PHP. Using the server variables contained in the $_SERVER array, a PHP script can inspect the HTTP Accept header sent by the user agent and set the appropriate MIME type via the header() function. In the excellent article, MIME Types and Content Negotiation [4], Juicy Studio demonstrates a basic technique for setting the MIME type via PHP:
<?php header("Vary: Accept");
if (stristr($_SERVER["HTTP_ACCEPT"], "application/xhtml+xml"))
header("Content-Type: application/xhtml+xml; charset=utf-8");
else
header("Content-Type: text/html; charset=utf-8");
?>
When present at the beginning of any PHP document, this code will set the MIME type to application/xhtml+xml for supportive user agents, and text/html for everything else. For example, Firefox, Safari, and Opera will interpret your XHTML pages as application/xhtml+xml, while Internet Explorer processes them as text/html. This technique also eliminates those rather unpleasant “Conflict between Mime Type and Document Type” warnings from the W3C validator!
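One caveat with the substring test: it treats any mention of application/xhtml+xml as support, including an explicit application/xhtml+xml;q=0, and it assumes the Accept header is always present. A slightly stricter check might look like the following sketch. The helper names acceptsXhtml() and contentTypeHeader() are hypothetical, and the header parsing is deliberately simplified.

```php
<?php
// Sketch of a stricter check than the substring test. Both function names
// are hypothetical, and the Accept parsing is intentionally simplified.
function acceptsXhtml(?string $accept): bool
{
    if ($accept === null || $accept === '') {
        return false; // no Accept header: fall back to text/html
    }
    foreach (explode(',', $accept) as $part) {
        $params = array_map('trim', explode(';', $part));
        if (strtolower(array_shift($params)) !== 'application/xhtml+xml') {
            continue; // some other media range; keep looking
        }
        foreach ($params as $param) {
            if (preg_match('/^q\s*=\s*([0-9.]+)$/i', $param, $m)) {
                return (float) $m[1] > 0.0; // q=0 means "not acceptable"
            }
        }
        return true; // listed without a q value, so q defaults to 1
    }
    return false; // never listed explicitly (wildcards do not count here)
}

// Build the Content-Type header line for the negotiated MIME type.
function contentTypeHeader(?string $accept): string
{
    $type = acceptsXhtml($accept) ? 'application/xhtml+xml' : 'text/html';
    return "Content-Type: $type; charset=utf-8";
}
```

At the top of a page, before any output, this could be used as header("Vary: Accept"); followed by header(contentTypeHeader($_SERVER["HTTP_ACCEPT"] ?? null));. Ignoring wildcards is a design choice in this sketch, so that an agent sending only */* receives text/html.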
Taking Advantage of PHP Content Negotiation
Of course, a potential shortcoming of the previous content-negotiation technique is the inevitable mismatch between MIME type and DOCTYPE. For example, there will be a discrepancy if your page explicitly declares any XHTML DOCTYPE (e.g., XHTML 1.0 Strict, XHTML 1.1, etc.) while text/html is served as the MIME type for certain user agents.
Fortunately, we can expand the scope of the previous PHP script and take full advantage of its content-negotiation functionality. Consider the following code, which demonstrates a typical implementation of the content-negotiation script for an XHTML 1.1 document [3]:
<?php
header("Vary: Accept");
if (stristr($_SERVER["HTTP_ACCEPT"], "application/xhtml+xml"))
header("Content-Type: application/xhtml+xml; charset=utf-8");
else
header("Content-Type: text/html; charset=utf-8");
?>
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head profile="http://gmpg.org/xfn/11">
<title>Perishable Press: Digital Design and Dialogue ~</title>
.
.
.
</head>
For the script to work, it must precede all other content, resulting in an unwanted blank line generated in the source-code output. This blank line will create problems for browsers such as Internet Explorer, which will jump into “quirks” mode [10], despite the otherwise complete DOCTYPE declaration. Even worse (in my mind) is the idea of serving XHTML 1.1 as text/html, especially after going through the steps required to mark up and deliver the content as application/xhtml+xml. It just defeats the whole purpose of formatting via XHTML 1.1 in the first place. We may as well avoid the fuss and stick with good ol’ HTML 4.01 as text/html instead.
Yet, for those of us insisting on XHTML, we can improve our basic content-negotiation script such that:
- The optimal MIME type is sent via the PHP header() function
- The correct DOCTYPE is declared at the top of the document
- The empty line is eliminated from the top of the page (before the first line of markup)
- The recommended <?xml> declaration is included when serving XHTML 1.1
Let’s take a look at a couple of different implementations, one for XHTML 1.0 and another for XHTML 1.1.
PHP Content Negotiation for XHTML 1.0
To serve the optimal MIME type when using XHTML 1.0 for the DOCTYPE, replace your static markup with the following:
<?php header("Vary: Accept");
if (stristr($_SERVER["HTTP_ACCEPT"], "application/xhtml+xml")) {
header("Content-Type: application/xhtml+xml; charset=utf-8"); ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<?php } else {
header("Content-Type: text/html; charset=utf-8"); ?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<?php } ?>
.
.
.
[ place title, link, and other <head> elements here ]
.
.
.
</head>
This code will ensure that you are serving XHTML 1.0 as application/xhtml+xml to supportive user agents, and HTML 4.01 as text/html to the dumb kids. No further editing should be required, although the code itself remains flexible enough to facilitate further customization and adaptation ;)
PHP Content Negotiation for XHTML 1.1
To serve the optimal MIME type when using a DOCTYPE of XHTML 1.1, replace your static markup with the following:
<?php header("Vary: Accept");
if (stristr($_SERVER["HTTP_ACCEPT"], "application/xhtml+xml")) {
header("Content-Type: application/xhtml+xml; charset=utf-8");
echo "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"."\n"; ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head profile="http://gmpg.org/xfn/11">
<?php } else {
header("Content-Type: text/html; charset=utf-8"); ?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<?php } ?>
.
.
.
[ place title, link, and other <head> elements here ]
.
.
.
</head>
This code will ensure that you are serving XHTML 1.1 as application/xhtml+xml to supportive user agents, and HTML 4.01 as text/html to everyone else. As before, no further editing should be required. The key for this piece of code is the inclusion of the <?xml> declaration for XHTML 1.1 DOCTYPEs. Many developers and designers omit this recommended [11] piece of information because of the problems it can cause in some browsers. Fortunately, this code makes it easy to successfully include the <?xml> declaration when serving XHTML 1.1 as application/xhtml+xml. Even better, the declaration is omitted when serving text/html for HTML 4.01, which of course does not require the <?xml> declaration in the first place.
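If the same opening lines are needed on many pages, one option is to build them as data and let each page decide when to send them. The following is only a sketch of that idea, using the XHTML 1.1 and HTML 4.01 DOCTYPEs shown above; buildProlog() is a hypothetical helper, and whether the agent supports XHTML is passed in as a flag, so any detection method can be used.

```php
<?php
// Sketch only: buildProlog() is a hypothetical helper. It returns the MIME
// type and the opening markup as data, so nothing is sent until the caller
// chooses to send it.
function buildProlog(bool $xhtml): array
{
    if ($xhtml) {
        return [
            'type'   => 'application/xhtml+xml',
            // The XML declaration is included only on the XHTML branch.
            'markup' => '<?xml version="1.0" encoding="UTF-8"?>' . "\n"
                . '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" '
                . '"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">' . "\n"
                . '<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">' . "\n",
        ];
    }
    return [
        'type'   => 'text/html',
        // The HTML 4.01 branch carries no XML declaration.
        'markup' => '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
            . '"http://www.w3.org/TR/html4/strict.dtd">' . "\n"
            . '<html>' . "\n",
    ];
}
```

A page would call buildProlog() with the result of its Accept-header test, pass the returned type to header() together with the charset, and then echo the returned markup as its first output.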
Wrap up
Hopefully, this article will help designers and developers deploy XHTML with greater accuracy, precision, and adherence to standards. Too often, XHTML pages are served incorrectly as text/html, primarily because of Internet Explorer’s sorry lack of support for application/xhtml+xml. Using the methods presented in this article, it is possible to negotiate user-agent support and subsequently serve the optimal DOCTYPE and corresponding MIME type. As always, please share any comments, questions, or concerns via the comments section below.
References
- [1] See: RFC 3023 and RFC 3236
- [2] IETF: The Internet Engineering Task Force
- [3] Bare-Bones HTML/XHTML Document Templates
- [4] MIME Types and Content Negotiation
- [5] Sending XHTML as text/html Considered Harmful
- [6] W3C Markup Validation Service
- [7] XHTML Media Types
- [8] About Content Negotiation
- [9] Apache: Content Negotiation
- [11] Prolog and Document Type Declaration