Introduction to XHTML

Peter Coxhead [1]


 2 Strictly Conforming XHTML
3 XHTML Compatibility Issues


XHTML is essentially an 'XML-ized' version of HTML. It supports all the essential HTML elements and attributes. However, because it uses XML syntax, every document must be well-formed. This stricter syntax is much more easily processed by tools, e.g. search engines, semantic web tools and screen scrapers for price comparisons. One important constraint to remember is that, since XML names are case sensitive, all XHTML element and attribute names must be in lower case.
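To make the two constraints concrete, here is a minimal sketch in Python that uses the standard library's XML parser to test well-formedness and lower-case naming. The helper names `is_well_formed` and `all_names_lower_case` and the sample strings are illustrative assumptions, not part of any XHTML tool, and the lower-case check assumes markup without namespace declarations.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML, i.e. is well-formed."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


def all_names_lower_case(markup: str) -> bool:
    """Return True if every element and attribute name is in lower case.

    Assumes the markup is well-formed and declares no namespaces.
    """
    root = ET.fromstring(markup)
    for element in root.iter():
        if element.tag != element.tag.lower():
            return False
        if any(name != name.lower() for name in element.attrib):
            return False
    return True


# 'Tag soup' that HTML browsers tolerate: <p> and <br> are never closed.
sloppy = "<div><p>First paragraph<br><p>Second paragraph</div>"
# The same content written as well-formed XHTML.
tidy = "<div><p>First paragraph<br /></p><p>Second paragraph</p></div>"
# Well-formed XML, but not acceptable XHTML: names are in upper case.
shouting = '<DIV><P CLASS="intro">Text</P></DIV>'

print(is_well_formed(sloppy))          # False
print(is_well_formed(tidy))            # True
print(all_names_lower_case(tidy))      # True
print(all_names_lower_case(shouting))  # False
```

Note that well-formedness alone says nothing about whether the elements are legal XHTML; that is the job of validation against a DTD.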

There are a number of different versions of the XHTML standard:

Only XHTML 1.0 Strict will be covered in this module. My selection of core XHTML constructs is given in "Summary of Core XHTML".

There are many XHTML references online. Miroslav Nic's site can be recommended; the Wikipedia entry for XHTML has a number of links to specifications, references and online XHTML validators. The XHTML 1.0 standard is quite readable. See the References/Bibliography section below.

2 Strictly Conforming XHTML 1.0

The XHTML 1.0 specification introduced a new type of correctness to add to XML's well-formedness and validity: strictly conforming. This involves meeting a number of requirements (paraphrased or copied from the XHTML 1.0 standard):

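The root element must contain an xmlns declaration for the XHTML namespace:

    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">

There must be a DOCTYPE declaration prior to the root element, and its public identifier must reference one of the XHTML 1.0 DTDs:

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">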

Thus a simple example of a strictly conforming XHTML 1.0 Strict document is the following:

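<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head><title>Virtual Library</title></head>
  <body>
    <p>Moved to <a href=""></a>.</p>
  </body>
</html>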
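Some of the structural requirements for strict conformance can be checked mechanically. The following is a rough sketch, assuming Python's standard `xml.dom.minidom` parser; the function `conformance_problems` and the two sample documents are hypothetical, and the sketch deliberately does not perform DTD validation, which needs a real validator.

```python
from xml.dom import minidom

XHTML_NAMESPACE = "http://www.w3.org/1999/xhtml"
# Formal Public Identifiers of the three XHTML 1.0 DTDs.
XHTML_1_0_PUBLIC_IDS = {
    "-//W3C//DTD XHTML 1.0 Strict//EN",
    "-//W3C//DTD XHTML 1.0 Transitional//EN",
    "-//W3C//DTD XHTML 1.0 Frameset//EN",
}


def conformance_problems(document: str) -> list:
    """List violations of a few structural requirements (hypothetical helper).

    Only the root element, namespace and DOCTYPE are inspected;
    validity against the DTD is NOT checked.
    """
    # parseString raises an exception if the document is not well-formed.
    dom = minidom.parseString(document.encode("utf-8"))
    problems = []
    root = dom.documentElement
    if root.tagName != "html":
        problems.append("root element is not html")
    if root.namespaceURI != XHTML_NAMESPACE:
        problems.append("root element lacks the XHTML namespace declaration")
    doctype = dom.doctype
    if doctype is None:
        problems.append("no DOCTYPE declaration before the root element")
    elif doctype.publicId not in XHTML_1_0_PUBLIC_IDS:
        problems.append("DOCTYPE does not name an XHTML 1.0 DTD")
    return problems


good = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head><title>Virtual Library</title></head>
  <body><p>Moved.</p></body>
</html>"""

# Well-formed, but with neither a DOCTYPE nor a namespace declaration.
bad = "<html><head><title>Virtual Library</title></head><body><p>Moved.</p></body></html>"

print(conformance_problems(good))
print(conformance_problems(bad))
```

An empty list only means that these particular checks passed; an online XHTML validator remains the proper test of conformance.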

3 XHTML Compatibility Issues

When a web server sends a document to a client, e.g. a browser, it also sends information on what kind of document it is. There are several ways in which this information can be presented, but the trigger is often the document's file extension. XHTML can be 'served' (i.e. presented) to a browser in two quite distinct modes: as HTML (content type text/html) or as XML/XHTML (content type application/xhtml+xml).

In the first mode, browsers typically handle XHTML in much the same way that they handle HTML. They do not enforce well-formedness, validity or conformity to any greater degree than they do with normal HTML (i.e. very little!).

In the second mode, browsers should enforce all three of these requirements. However, XHTML served as XML/XHTML is currently poorly supported by web browsers. In particular, Internet Explorer, up to and including version 7, does not support it at all. Hence, for compatibility reasons, it is usually best to write XHTML so that it can be interpreted as pure HTML and to put it into documents with the extension '.html'. This can be done without losing the advantages of writing proper XHTML and is now the recommended approach for all web page development.
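The difference between the two modes can be imitated with Python's standard library, which offers both a lenient HTML parser and a strict XML parser. This is only a sketch of the idea, not a model of any real browser: the `CONTENT_TYPES` mapping, the `TagCollector` class and the `process` function are all assumptions made for illustration.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Assumed mapping from file extension to the content type a server announces.
# Real servers are configurable; these pairings are only common conventions.
CONTENT_TYPES = {".html": "text/html", ".xhtml": "application/xhtml+xml"}


class TagCollector(HTMLParser):
    """Lenient HTML-style parser: records start tags and never complains."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def process(markup: str, extension: str) -> str:
    """Mimic, very roughly, how markup is treated under each content type."""
    content_type = CONTENT_TYPES[extension]
    if content_type == "text/html":
        # First mode: errors in the markup are silently tolerated.
        collector = TagCollector()
        collector.feed(markup)
        collector.close()
        return "rendered"
    # Second mode: the markup must at least be well-formed XML.
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return "error"
    return "rendered"


broken = "<p>An unclosed line break<br><b>and mismatched tags</p></b>"
print(process(broken, ".html"))   # rendered
print(process(broken, ".xhtml"))  # error
```

The same broken markup is accepted in the first mode and rejected in the second, which is why sloppy pages that appear to work as HTML can fail outright when served as XML.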

However, it does require writing a somewhat 'dumbed down' version of XHTML. A full list of guidelines will be found in Appendix C of the XHTML 1.0 standard. Some of the most important are briefly described here:
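As an illustration of the kind of guideline found in Appendix C, the sketch below flags two constructs that HTML-only browsers may mishandle: an empty element written without a space before the trailing '/>' (guideline C.2) and the minimized form used for an element that is not declared empty (guideline C.3). The function `compatibility_warnings` is a hypothetical helper based on a simple regular expression, so it is an approximation rather than a full checker.

```python
import re

# Elements declared EMPTY in the XHTML 1.0 Strict DTD.
EMPTY_ELEMENTS = {
    "area", "base", "br", "col", "hr", "img", "input", "link", "meta", "param",
}

# Matches any tag written in the minimized form, e.g. <br/> or <p />.
MINIMIZED_TAG = re.compile(r"<([a-z][a-z0-9]*)\b[^<>]*/>")


def compatibility_warnings(markup: str) -> list:
    """Return warnings for two Appendix C guidelines (illustrative only)."""
    warnings = []
    for match in MINIMIZED_TAG.finditer(markup):
        tag, name = match.group(0), match.group(1)
        if name not in EMPTY_ELEMENTS:
            # C.3: write <p></p>, not <p />, for elements that may have content.
            warnings.append(f"{tag}: use <{name}></{name}> instead")
        elif not tag[-3].isspace():
            # C.2: write <br />, not <br/>.
            warnings.append(f"{tag}: put a space before '/>'")
    return warnings


risky = "<p>Line one<br/>Line two</p><p />"
safe = "<p>Line one<br />Line two</p><p></p>"

for warning in compatibility_warnings(risky):
    print(warning)
print(compatibility_warnings(safe))
```

Both versions are well-formed XML; the warnings concern only how older HTML parsers interpret them.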


References/Bibliography

See the other handouts for this module, which are available online.

Goodman, Danny (2007). Dynamic HTML: The Definitive Reference. O'Reilly. ISBN 0-596-52740-3. [This is an excellent and detailed reference manual for browser compatibility of all kinds of HTML, CSS, JavaScript and DOM related issues. However, it is purely a reference and not suitable for learning these topics in the first instance.]

Nic, Miroslav (2002). XHTML 1.0 reference with examples. [Excellent online reference for XHTML.]

W3C (2002). XHTML 1.0 The Extensible HyperText Markup Language (Second Edition).

Wikipedia (undated). Wikipedia Entry for XHTML.


[1] This document is based on an original by Alan Sexton © 2007, modified by permission.
