Summary of Core XHTML

Peter Coxhead

Contents

Structure
head elements: meta, title, link, style, script, noscript
body elements
Block vs. inline elements
Simple Blocks: h1 to h6, div, p, pre, br
Lists: ul, li, ol
Tables: table, tr, td
Text Styles: span, em, strong, code
Other Inline Elements: a, img
User Interaction: input, select
Scripting: script, noscript
Validity

Structure

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <!-- head elements go here -->
  </head>
  <body>
    <!-- body elements go here -->
  </body>
</html>

head elements

meta supplies information about the document itself, such as its character encoding.

title defines the text shown in the title bar of the window.

link specifies the location of an external CSS stylesheet.

style defines styles within the document. Individual elements can also be styled via a style attribute. To use an external stylesheet instead, see link above.

script may be used as a head or a body element. Defines JavaScript, either within the document or as an external file.

noscript may be used as a head or a body element. Browsers which do not support JavaScript, or which have JavaScript support disabled, process the HTML placed in a noscript element; browsers with JavaScript enabled ignore it.
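As a rough sketch of how these elements fit together, a head section might look like the following. The page title, the file names styles.css and scripts.js, and the class name note are hypothetical placeholders, not names taken from this summary.

```html
<head>
  <!-- All names and file paths below are hypothetical placeholders. -->
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <title>Example page</title>
  <!-- External stylesheet -->
  <link rel="stylesheet" type="text/css" href="styles.css" />
  <!-- Styles defined within the document -->
  <style type="text/css">
    p.note { font-style: italic; }
  </style>
  <!-- JavaScript held in an external file -->
  <script type="text/javascript" src="scripts.js"></script>
</head>
```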

body elements

Block vs. inline elements

An important distinction is between block elements and inline elements. By default, some HTML elements create separate 'blocks' when laid out in a page. For example p elements create distinct paragraphs, each starting on a new line. Other HTML elements are 'inline', which basically means that they are laid out in the same way as text. For example, a strong element inside a paragraph (p) doesn't affect the position of its contents; it just changes their appearance.

(Styles could in principle be used to override such default behaviour, but this would be very confusing to anyone reading the resulting HTML.)
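A minimal illustration of the distinction, using made-up text: each p below starts a new block, while the strong element stays within the flow of its paragraph.

```html
<p>This paragraph starts on a new line.</p>
<p>So does this one, but the <strong>strong element</strong> stays in
the flow of the text and only changes its appearance.</p>
```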

Lists and tables are effectively special kinds of blocks, where the 'basic' blocks are list items (li) and table cells (td). However, these have to be enclosed in outer elements (such as ul for unordered lists or tr for a table row) to be displayed properly.

Simple Blocks

h1 to h6 are pre-defined block elements which should be used for headings and subheadings. Start with a single h1 and keep the levels in a sensible order in the document.

div defines a block in the document, that is, a section which by default starts on a new line and forces a new line afterwards. It is often used so that a different style may be applied to a section of the page.

p is a pre-defined block element, corresponding to a 'paragraph'. Most browsers will put extra space before and after a p element compared to a div. Unlike div elements, which can be nested, paragraphs should not contain other block elements in XHTML.

pre is a pre-defined block element. Unlike other HTML elements, it retains the white space of the source HTML file and is displayed in a fixed-width font, which makes it useful for laying out code with the correct indentation.

br can be used within blocks of any kind to force a newline.
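The simple block elements might be combined along the following lines. This is only a sketch: the class name intro and all of the text are invented for illustration.

```html
<h1>Page title</h1>
<!-- "intro" is a hypothetical class name used for styling -->
<div class="intro">
  <h2>First section</h2>
  <p>A paragraph of ordinary text,<br />
  continued on a new line after the br.</p>
  <!-- White space inside pre is kept exactly as written -->
  <pre>if (ready) {
    start();
}</pre>
</div>
```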

Lists

ul defines an unordered list with bullets.

ol defines an ordered list with numbers.
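For example, with invented content, each list encloses its items in li elements:

```html
<!-- Unordered list: items shown with bullets -->
<ul>
  <li>Apples</li>
  <li>Pears</li>
</ul>
<!-- Ordered list: items shown with numbers -->
<ol>
  <li>Open the file</li>
  <li>Edit it</li>
  <li>Save it</li>
</ol>
```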

Tables

Simple tables have rows defined by tr and cells defined by td. The summary attribute should give a brief text description of the table for users who rely on spoken text access to web pages.
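A small table with two rows of two cells could be sketched as follows; the contents and the summary text are made up.

```html
<table summary="Prices of two kinds of fruit">
  <tr>
    <td>Apples</td>
    <td>1.20</td>
  </tr>
  <tr>
    <td>Pears</td>
    <td>1.50</td>
  </tr>
</table>
```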

Note that only elements which define 'table components' can appear inside table or tr elements; 'general markup' can only appear inside table cells. Thus the following is not valid:
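A hypothetical illustration of this kind of invalid markup, with text placed directly inside a tr and a p placed directly inside the table rather than inside a td:

```html
<!-- NOT valid: shown only to illustrate the error -->
<table summary="Invalid example">
  <tr>
    Stray text placed directly inside the row
    <td>Apples</td>
  </tr>
  <p>A paragraph placed directly inside the table</p>
</table>
```

Moving the stray text and the paragraph inside td elements would make the markup valid.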

Text Styles

span defines an inline piece of text (i.e. a 'span'), usually so that different text styling can be applied.

em, strong and code are three of a number of pre-defined text styles, used respectively for emphasis, strong emphasis and code shown in a fixed-width font. These elements should probably be avoided now in favour of a styled span.
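The following sketch shows all four elements in one paragraph. The class name warning is hypothetical and would need a matching rule in a stylesheet.

```html
<p>Use <em>emphasis</em> sparingly, <strong>strong emphasis</strong>
even more so, and show names such as <code>getElementById</code> as
code. A <span class="warning">styled span</span> takes its appearance
from the stylesheet.</p>
```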

Other Inline Elements

a defines a hyperlink OR an anchor point in the document to which a link can be made.
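Both uses might look like this. The id value validity is an invented example; the sketch assumes the anchor point is identified by an id attribute.

```html
<!-- A hyperlink to another page -->
<p><a href="http://www.w3.org/">W3C home page</a></p>

<!-- An anchor point within this document ... -->
<h2><a id="validity">Validity</a></h2>

<!-- ... and a link which jumps to it -->
<p><a href="#validity">Jump to the Validity section</a></p>
```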

img shows an image on the page: the src attribute gives the source file, and the width and height attributes set its size (in pixels). Browsers will normally use the pixel size of the source file if width and height are omitted, or scale the image to the size given if it differs from that of the source file. The alt attribute should give a brief alternative text description of the image for users who rely on non-visual access to web pages.
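A minimal example, assuming a hypothetical image file logo.png; as an inline element, the img is placed inside a block such as p.

```html
<p><img src="logo.png" width="120" height="60"
        alt="Department logo" /></p>
```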

User Interaction

input defines a displayed inline item which is used to interact with the user. An important use of input is in form elements, not covered here.

The type attribute can take a number of values to specify the kind of input item. Examples are button, text, checkbox, radio.

To respond to user interaction, either explicit JavaScript event handlers must be provided in attributes such as onclick, or the input element must be part of a form.

Adding the attribute disabled="disabled" will disable the input.
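The input types mentioned above could be sketched as follows. The id values and button labels are invented, and the onclick handler is just a one-line illustration of an explicit event handler.

```html
<p>
  <input type="text" id="username" value="" />
  <input type="checkbox" id="remember" checked="checked" />
  <!-- Responds to a click through an explicit event handler -->
  <input type="button" value="Greet" onclick="alert('Hello');" />
  <!-- Shown, but cannot be used -->
  <input type="button" value="Unavailable" disabled="disabled" />
</p>
```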

select defines an inline drop-down menu list; it's a particular kind of input item. When an option is chosen, its value attribute becomes the string value of the select item as a whole, accessed in JavaScript through the value field of the select object. Typically the value of an option is set to a shortened version of the text shown in the menu.
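As an illustration, with made-up option texts and values, the menu below displays the value of whichever option is chosen:

```html
<p>
  <!-- "fruit", "app" and "pea" are hypothetical names -->
  <select id="fruit" onchange="alert(this.value);">
    <option value="app">Apples</option>
    <option value="pea">Pears</option>
  </select>
</p>
```

Here the value of each option is a shortened form of the text shown in the menu, and this.value in the handler refers to the value field of the select object.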

Validity

XHTML files can be validated via the W3C Markup Validation Service, by either inputting a URL or uploading a file. Do validate your XHTML!

However, there are at present difficulties with including JavaScript in valid XHTML web pages (external files are fine).

Well-formed and valid XHTML documents can be presented to a browser (user agent) in several modes, e.g. as HTML, as XHTML or as pure XML. The file extension is normally the main trigger; for example, files whose names end in ".html" will usually be treated as HTML and files whose names end in ".xml" or ".xhtml" as XHTML/XML.

The problem at present is that some of the major browsers, particularly Internet Explorer, do not correctly process XHTML served as XHTML/XML. So for compatibility, it is best to use the extension ".html" or ".htm", which normally causes an XHTML page to be served as HTML. This means that JavaScript should be written as in the example in the first bullet point above.

However, if the DOCTYPE and XML namespace declaration identify the document as XHTML, the W3C Markup Validation Service will then, rightly, report validation errors if the JavaScript contains & or uses < or > in ways which look like tags.
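A sketch of the kind of embedded script which provokes such errors is shown below. The function inRange is hypothetical; a browser treating the page as HTML would be expected to run it normally, while a validator treating the page as XHTML objects to the ampersands and the less-than sign.

```html
<script type="text/javascript">
  // Hypothetical function: the ampersands and the less-than sign in
  // the return statement are what the validator reports as errors.
  function inRange(n) {
    return n > 0 && n < 10;
  }
</script>
```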

The best solution is to put all JavaScript into external files, with at most function calls embedded into the XHTML. Alternatively, accept that some JavaScript will cause validation errors until browsers catch up.
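The external-file approach might look like the following, assuming a hypothetical file scripts.js which defines a function inRange. Only the function call remains in the XHTML, so the markup contains no characters for the validator to object to.

```html
<head>
  <title>Example page</title>
  <!-- scripts.js is a hypothetical external file defining inRange() -->
  <script type="text/javascript" src="scripts.js"></script>
</head>
<body>
  <p><input type="button" value="Check"
            onclick="alert(inRange(5));" /></p>
</body>
```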