HTML & XHTML

What is "HTML"?

Since the olden days web pages have been written using a code called HTML, which stands for HyperText Mark-up Language. "HyperText" means that it is something more than just plain text (most famously, text that can link to other documents), "Mark-up" means that the text is "marked up" by adding codes (which give it the hyper-ness) and "Language" tells us that it is a specific language of codes, not just any old thing.

HTML is one of the key things which makes the World Wide Web possible. Without it we'd still be reading plain text files like our primitive cave-person ancestors. Luckily for us, HTML is also pretty easy to use.

Using HTML means we wrap parts of a text in tags which tell the browser how to treat that text when the page is viewed. For instance, if some text is written between tags like this:


Pack my box with <em>five dozen</em> liquor jugs


the text will be displayed in a browser like this, with the words "five dozen" emphasised (usually in italics):


Pack my box with five dozen liquor jugs


What happened here is that the web-browser software worked through the text "Pack my box with ", casually displaying each letter, until it reached the <em> tag. This tag is an instruction that says "every letter from now on must be emphasised", so the obedient browser does as it is told and each subsequent letter is emphasised—usually with italic text. Then the </em> is discovered and the web-browser understands that this means "stop emphasising" so it does as it is told and the following text is not emphasised.

In effect this is a small computer program which tells the computer to start doing something and then tells it to stop. Of course every web page is composed of many HTML instructions like this, but they all follow the same principle that when the browser software arrives at an instruction (a tag) it does what it's told until it is instructed to stop. When you see a "/" in an HTML tag you can usually read that as a stop sign.
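The start-and-stop behaviour described above can be imitated in a few lines of Python. This is only a toy sketch, not how a real browser is built: the names EmphasisTracker and render are invented for this illustration, and asterisks stand in for italic text. It uses the HTMLParser class from Python's standard library to walk through the mark-up piece by piece.

```python
from html.parser import HTMLParser


class EmphasisTracker(HTMLParser):
    """Toy 'browser' that marks emphasised text with asterisks."""

    def __init__(self):
        super().__init__()
        self.emphasising = False  # are we between <em> and </em>?
        self.pieces = []          # rendered output, piece by piece

    def handle_starttag(self, tag, attrs):
        if tag == "em":
            self.emphasising = True   # instruction: start emphasising

    def handle_endtag(self, tag):
        if tag == "em":
            self.emphasising = False  # the "/" stop sign: stop emphasising

    def handle_data(self, data):
        # Plain text passes through; emphasised text is wrapped in asterisks.
        self.pieces.append(f"*{data}*" if self.emphasising else data)


def render(markup):
    """Feed the mark-up through the toy browser and return the result."""
    tracker = EmphasisTracker()
    tracker.feed(markup)
    tracker.close()  # flush any text still waiting in the parser's buffer
    return "".join(tracker.pieces)


print(render("Pack my box with <em>five dozen</em> liquor jugs"))
# prints: Pack my box with *five dozen* liquor jugs
```

The single emphasising flag is the whole trick: the opening tag switches it on, the closing tag switches it off, and every piece of text in between is treated accordingly.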

Muddy waters

Way back when the rules of HTML were written, things were quite relaxed. So relaxed, in fact, that badly constructed mark-up would often still work well enough to be displayed and understood. The programmers who made the various web browsers also liked to add features which were not in the official specification for HTML, and while this made new tricks possible it also led to a lot of confusion.

The atmosphere was like that of a lawless frontier town, where anything went as long as it worked for somebody.

XHTML

With XHTML we have a standard which attempts to clear up some of the confusion while establishing rules which guide us towards creating better code. The "X" stands for "eXtensible" and comes from the fact that this language is a reformulated version of HTML, written to conform to the stricter but more flexible rules of XML (eXtensible Mark-up Language). When discussing HTML here we should keep in mind that what we mean is XHTML, since XHTML 1.0 is HTML 4.01 reformulated to follow the rules of XML.
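One way to get a feel for those stricter rules is to hand some mark-up to an ordinary XML parser and see whether it complains. The sketch below uses Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the two sample strings are made up for this illustration. Note that this only checks the basic XML rules (every tag closed, tags properly nested), not whether a page is fully valid XHTML.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if an XML parser accepts the mark-up, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Old habits: an unclosed <br> and tags closed in the wrong order.
relaxed = "<p>Pack my box<br>with <b><em>five dozen</b></em> liquor jugs</p>"

# The same content following XML rules: every tag closed, properly nested.
strict = "<p>Pack my box<br />with <b><em>five dozen</em></b> liquor jugs</p>"

print(is_well_formed(relaxed))  # prints: False
print(is_well_formed(strict))   # prints: True
```

A forgiving browser would most likely display both versions, which is exactly the relaxed behaviour that XHTML asks us to stop relying on.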

XHTML tells us to clean up our act, to throw out our bad habits and start going straight. But this doesn't mean we have to be dull (those who read The Dot and the Line as children will know the territory). XHTML rules allow us to add depth to our content while giving us more freedom and flexibility. The old way will still work, but by following the standards we ensure that our readers have a better experience while we have an easier and more satisfying job.