Doctypes

Be warned, this is “here be dragons” territory…

To cope with both old (and badly written) code and well-formed up-to-date code, modern browsers have two modes: Quirks mode for dealing with the former, allowing non-compliant code to be displayed without (usually) too much ill effect, and Standards mode for displaying compliant documents according to current standards. There is a third mode called Almost Standards, which is used by Mozilla, Safari and Opera 7.5 with certain doctypes; in this mode some vertical sizing is handled according to pre-CSS2 rules. Standards mode in IE6 and older versions of Opera could really be called Almost Standards, because these browsers do not comply fully with CSS2 as regards vertical sizing anyway. Towards the bottom of this page is a chart which explains the other differences between the behaviour of the various browsers according to mode.
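
If you want to check which mode a browser has actually picked for a page, a small script can help. This is only a sketch: it relies on the document.compatMode property, which reports "BackCompat" in Quirks mode and "CSS1Compat" otherwise, and which not every older browser exposes. The function name describeRenderMode is made up for this example.

```javascript
// Maps the value of document.compatMode to a readable mode name.
// describeRenderMode is a hypothetical helper, not a browser API.
function describeRenderMode(compatMode) {
  if (compatMode === "BackCompat") {
    return "Quirks";
  }
  if (compatMode === "CSS1Compat") {
    // The property does not distinguish Standards from Almost Standards.
    return "Standards or Almost Standards";
  }
  // Covers browsers that do not expose the property at all.
  return "Unknown";
}

// Only runs in a browser; elsewhere `document` is not defined.
if (typeof document !== "undefined") {
  console.log(describeRenderMode(document.compatMode));
}
```

Note that the property cannot tell Standards and Almost Standards apart, so it only answers the question "Quirks or not?".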

XHTML documents deserve an aside: they really should not be delivered as MIME type text/html at all, but as application/xhtml+xml, and when they are delivered that way, doctypes are redundant. Unfortunately, no-one using IE6 would then be able to read your pages at all, so in practice they are delivered as text/html and get treated as regular HTML documents. That reality makes a nonsense of many of the claims made by XHTML evangelists.

Ironically, since the point of Document Type Declarations (doctypes) is to reduce the need for hacks, they are in themselves arguably hacks. Their job is to tell browsers which mode to use when interpreting the documents they are reading. Hacks or not, they are vital, because if you have coded for one doctype but fail to identify it correctly to your visitors’ browsers, your carefully crafted pages could end up looking very different from the way you intended.

All modern browsers use doctype sniffing (aka switching) and look to the very first line of an (X)HTML document for a doctype. You may put some PHP code (for example) that will be interpreted server side ahead of this declaration, but any code which will remain visible in the source must follow the doctype. Fail to do that and browsers will revert to Quirks mode, and your CSS2 coding may no longer be rendered the way you intended.
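
As a rough illustration of that rule, here is a sketch of a check you could run against the source a server actually sends out. The function name doctypeComesFirst is invented for this example, and allowing leading whitespace is an assumption on my part; it does not replicate any browser's real sniffing logic.

```javascript
// Returns true when nothing but whitespace precedes the doctype.
// doctypeComesFirst is a hypothetical helper; real browsers apply
// their own (more involved) sniffing rules.
function doctypeComesFirst(source) {
  return /^\s*<!DOCTYPE\s/i.test(source);
}

const good = '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"\n' +
  '   "http://www.w3.org/TR/html4/strict.dtd">\n<html></html>';
const bad = "<!-- generated by my CMS -->\n" + good;

console.log(doctypeComesFirst(good));
console.log(doctypeComesFirst(bad));
```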

Doctypes have two components. The first is compulsory: it is an identifier, much as the numbers attached to versions of PHP are. The second (optional) part is a URL pointing to a document which contains information and rules which apply to this identifier. The two components must match if you use both. You can leave the URL out, but – depending on the level of HTML you have coded for – doing so will cause some browsers to drop from Standards to Almost Standards or Quirks mode, even if you have declared a Strict doctype.
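
To make the two components concrete, the sketch below pulls them out of a declaration with a regular expression. The function name parseDoctype and the shape of the object it returns are my own inventions, and the pattern only handles the PUBLIC form with double-quoted values.

```javascript
// Splits a PUBLIC doctype into its identifier and optional URL.
// parseDoctype is a hypothetical helper for illustration only.
function parseDoctype(declaration) {
  const pattern =
    /^<!DOCTYPE\s+(\S+)\s+PUBLIC\s+"([^"]*)"(?:\s+"([^"]*)")?\s*>$/i;
  const match = pattern.exec(declaration.trim());
  if (match === null) {
    return null; // not a declaration this sketch understands
  }
  return {
    rootElement: match[1],
    publicId: match[2],
    // The URL is optional, so report null when it is absent.
    systemUrl: match[3] === undefined ? null : match[3],
  };
}

const parsed = parseDoctype(
  '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"\n' +
  '   "http://www.w3.org/TR/html4/strict.dtd">'
);
console.log(parsed.publicId);
console.log(parsed.systemUrl);
```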

Oops. Another term to identify!

In the context of browsers, we talk about Quirks, Standards and Almost Standards modes. When we are dealing with HTML documents, the relevant terms are Strict and Transitional. When we declare that a document conforms to Strict standards, we are telling browsers to interpret it according to the latest (currently CSS2) standards. Any code which is non-CSS2 compliant will be ignored, and if we attempt to validate a document declared as Strict which includes such code, it will fail. A Transitional document will not fail validation merely because it includes older code. Modern browsers will (usually) recognize any CSS2 code which is included while also accommodating older code, and older browsers, provided they are not too old, will mostly feel right at home.

At first glance there are good reasons to declare a Transitional doctype, but Strict DTDs do encourage the complete separation of structure and presentation which you must have in order to get the full benefit from CSS. And, as some people are about to learn with IE7, which is dropping support for the star html hack in Standards mode, you adopt non-compliant techniques at your own risk.

If you want to dig into this further, there is a detailed chart of browser behavior relative to doctypes here and W3C has a list of all the currently valid doctypes. These are the most relevant from which to make your choice:

HTML 4.01 – Strict, Transitional, Frameset

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
   "http://www.w3.org/TR/html4/loose.dtd">
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
   "http://www.w3.org/TR/html4/frameset.dtd">

XHTML 1.0 – Strict, Transitional, Frameset

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">

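Since the identifier and the URL must match when both are used, it can be handy to keep the six pairs above in a lookup table and check a declaration against it. This is a sketch only: the names MATCHED_PAIRS and isMatchedPair are made up for the example, and the table covers just the doctypes listed in this post.

```javascript
// The six identifier/URL pairs listed above, keyed by identifier.
// MATCHED_PAIRS and isMatchedPair are hypothetical names.
const MATCHED_PAIRS = {
  "-//W3C//DTD HTML 4.01//EN":
    "http://www.w3.org/TR/html4/strict.dtd",
  "-//W3C//DTD HTML 4.01 Transitional//EN":
    "http://www.w3.org/TR/html4/loose.dtd",
  "-//W3C//DTD HTML 4.01 Frameset//EN":
    "http://www.w3.org/TR/html4/frameset.dtd",
  "-//W3C//DTD XHTML 1.0 Strict//EN":
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd",
  "-//W3C//DTD XHTML 1.0 Transitional//EN":
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd",
  "-//W3C//DTD XHTML 1.0 Frameset//EN":
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd",
};

// True only when the identifier is in the table and the URL is its partner.
function isMatchedPair(publicId, systemUrl) {
  return Object.prototype.hasOwnProperty.call(MATCHED_PAIRS, publicId) &&
    MATCHED_PAIRS[publicId] === systemUrl;
}

console.log(isMatchedPair(
  "-//W3C//DTD HTML 4.01//EN",
  "http://www.w3.org/TR/html4/loose.dtd"
));
```

The call at the end pairs the Strict identifier with the Transitional URL, which is exactly the kind of copy-and-paste mismatch the check is meant to catch.
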
And I guess we now need an explanation of how to choose between plain old HTML and XHTML (extensible hypertext markup language). There are several potential advantages in using XHTML, primarily that it is easier to read and write than HTML and that the “extensible” part of the name refers to the ability to mix in other XML elements (from MathML for example). But because XHTML cannot be properly served up to some browsers (and not at all to IE6), XHTML ends up being treated by browsers exactly as if it were regular HTML. Thus any advantages it has remain theoretical at the present time, except to authors working in very specific areas. Never use XHTML 1.1, because it is not fully compatible with HTML. And if you have read further on your own and come across XML declarations for XHTML, be aware that prefacing an XHTML doctype with an XML declaration will force IE6 into Quirks mode, regardless of the doctype you use.

One parting note before I go take some aspirin: there are elements which are not valid for Strict doctypes. These include center, font, strike, u and – the one which might affect more webmasters – iframe, popular in the delivery of geo promos and live webcam promos. If you want Strict documents to validate, you should use objects in place of iframes:

<iframe src="page.html" width="400" height="300">Alternative text in case the iframe cannot be displayed</iframe>

becomes

<object type="text/html" data="page.html" width="400" height="300">Alternative text in case the object cannot be displayed</object>

There is also a fairly long list of attributes which are not permitted with Strict doctypes. That list can be found here, and the one many will want to be aware of is target, which is among the no-no’s. To use target and still validate, this piece of JavaScript will solve the problem.
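
For reference, one common way of doing this (not necessarily the script linked above) is to mark outgoing links with rel="external" in the markup and let a script add the target once the page has loaded. In the sketch below, the function name openExternalLinksInNewWindow and the rel="external" convention are assumptions of mine, not part of any standard.

```javascript
// Adds target="_blank" to every link marked rel="external".
// The markup itself never contains a target attribute, so it still
// validates as Strict. The function name is hypothetical.
function openExternalLinksInNewWindow(doc) {
  const links = doc.getElementsByTagName("a");
  for (let i = 0; i < links.length; i++) {
    const link = links[i];
    const rel = link.getAttribute("rel") || "";
    // rel may hold several space-separated values.
    const isExternal = rel.split(/\s+/).includes("external");
    if (link.getAttribute("href") && isExternal) {
      link.target = "_blank";
    }
  }
}

// Only wire it up in a browser, once the page has finished loading.
if (typeof window !== "undefined" && typeof document !== "undefined") {
  window.addEventListener("load", function () {
    openExternalLinksInNewWindow(document);
  });
}
```

A link then looks like <a href="http://example.com/" rel="external">, which validates as Strict. Visitors with JavaScript switched off simply get the link in the same window.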

Although browsers may read the URL you have included in your doctype, they are not required to do so. Since the treatment of entities such as &nbsp; is defined in that DTD, it is therefore safer to use the numeric equivalents in your code. A full list is here.
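
If you have a lot of existing markup to convert, a small script can do the swapping. The sketch below covers only a handful of entities, and the names NAMED_TO_CODE_POINT and toNumericEntities are invented for this example. Anything not in the table (including &amp;, &lt; and &gt;) is left untouched.

```javascript
// A few named entities and their numeric code points.
// Extend the table from a full entity list as needed.
const NAMED_TO_CODE_POINT = {
  nbsp: 160,
  copy: 169,
  reg: 174,
  eacute: 233,
  mdash: 8212,
  euro: 8364,
};

// Replaces known named entities with numeric references.
// toNumericEntities is a hypothetical helper.
function toNumericEntities(markup) {
  return markup.replace(/&([a-zA-Z][a-zA-Z0-9]*);/g, function (whole, name) {
    return Object.prototype.hasOwnProperty.call(NAMED_TO_CODE_POINT, name)
      ? "&#" + NAMED_TO_CODE_POINT[name] + ";"
      : whole; // unknown names pass through unchanged
  });
}

console.log(toNumericEntities("Price:&nbsp;5&euro; &amp; up"));
// prints "Price:&#160;5&#8364; &amp; up"
```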
