Contents lists and indexes
Discussion
When any document reaches the point where clarity demands headings, it's also arrived at the need for a contents list or index.1 Even short documents can benefit from a brief list of links to sections within the document, and long documents should certainly have such a list.
While usability guidelines often suggest that readers do not like to scroll through long documents,2 there are many times when other practicalities require us to post long pages, so we should make life easier for our reader by providing a contents list.
Obviously, we don't want our contents list to be a dead-text listing of what the reader will find when s/he scrolls the page. We should exploit hypertext and make sure that each item listed is a hot-link to that place in the document (similar to our footnote technique). This allows readers to go straight to the part of the document which interests them and, when they get there, to bookmark that section if they wish.
We can also make the contents list navigable in both directions (again, like our footnotes), so that headings link back to the contents list. In many browsers this will also provide additional visual clues to navigation.3
Another big advantage of providing a hypertext contents list is that we get—for no extra effort and completely free of charge—the ability to address a specific chapter or subheading from anywhere else on the internet.
Technique
Our contents list (or index) should appear at the top of a page and the best structural mark-up is the list—either unordered or ordered. We use the list because it has a built-in hierarchy which we can exploit.4
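As a rough sketch of that built-in hierarchy, a contents list for a document with one subheading might nest a second list inside a list item. The section names and href values here are hypothetical, chosen only for illustration, and the reciprocal id/name attributes are left out for brevity.

```html
<!-- Hypothetical contents list: section names and href values are illustrative only -->
<ul>
  <li><a href="#introduction">Introduction</a></li>
  <li>
    <a href="#method">Method</a>
    <!-- The nested list shows that this item sits beneath "Method" -->
    <ul>
      <li><a href="#method-sampling">Sampling</a></li>
    </ul>
  </li>
  <li><a href="#conclusion">Conclusion</a></li>
</ul>
```

An ordered list (ol) would nest in just the same way if numbered entries were preferred.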
Contents lists are fairly straightforward to construct, but we need to be methodical about it. We should review how lists are made, understand document structure and headings in particular, and ensure we are familiar with the anatomy of anchors.
As with footnotes, each item to be featured in the contents list requires a unique name. With small loosely-structured documents (like this one) the choice of name is straightforward and uncomplicated. But with long documents with several levels of heading we find that while the principle remains the same, the complexities increase because of the difficulty of navigating around such large amounts of text.5
The scheme used for choosing values for the attributes will depend on preferred usage, and we'll go into several possibilities. And since the link structure is the same for short and long documents, let's begin with the short stuff even if we intend to edit longer documents.
Simple documents
The section heading for "Technique" (above, an <h2> heading) has been given an anchor with id and name attributes, each with the value "technique", like this:
<h2><a id="technique" name="technique">Technique</a></h2>
Of course we also want to reciprocate with the item in the contents list, so we give that the id/name of "techniqueref" and add an href to the link on our heading anchor.6 The anchor on the heading now looks like this:
<h2><a href="#techniqueref" id="technique" name="technique">Technique</a></h2>
To complete the pair of reciprocal anchors we do the same thing in our contents list code, but reverse the values so it finishes up like this:
<li>
<a href="#technique" id="techniqueref" name="techniqueref">Technique</a>
</li>
The actual order of the attributes within the anchors doesn't matter at all, but something to watch out for is that the referencing link is properly made with an octothorp (#) in the href value. The anchor's target values (the values we give to id and name) must not have an octothorp. Neither will work if improperly formed.
A diagram might help clarify how the href and id/name relate to each other.
This should now be as clear as is the summer's sun. If not, we should review it, since if we can't grasp this bit then we'll be in big trouble when we try to work with complex documents.
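For review, the two anchors can be set side by side. This is only a sketch that repeats the "technique" values used above; the surrounding list and the placeholder comment are assumptions added to make the fragment complete.

```html
<!-- Contents list item: href points DOWN to the heading,
     id/name are the target for the link coming back up -->
<ul>
  <li>
    <a href="#technique" id="techniqueref" name="techniqueref">Technique</a>
  </li>
</ul>

<!-- ... body text would sit here ... -->

<!-- Heading: href points UP to the contents item,
     id/name are the target for the link coming down -->
<h2>
  <a href="#techniqueref" id="technique" name="technique">Technique</a>
</h2>
```

Each href carries the octothorp and names the other anchor's id/name; the id and name values themselves never carry it.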
Complex documents
We shouldn't begin to build a contents list until the structure of a document is complete and all the items which will eventually be referred to by the contents list have been given their basic name/id anchors, as described below. When these jobs are done it may be possible to generate a contents list automatically from such a well formatted document. Some software will even add the anchors to headings and give them names out of a hat (though this won't give us human-readable code).
The steps are broadly:
- Mark-up the headings
- Deal with the standard parts
- Work through the headings, level by level
- Tackle any elements which are outside the flow
- Build the index
Once the standard parts of the document have been dealt with, we go through the various levels of heading in passes which deal with one level at a time, then we go through the elements which are outside the normal flow of the document and treat them as required. This way we tackle the layers of complexity in manageable slices and can quickly spot troublesome areas before getting too deep.
Marking-up the headings
Standard parts
With complex documents we can usually deal quickly with the standard parts of the document. We should use a standard set of names in our indexing code, and our chosen set is:
- contents
- introduction
- acknowledgements
- foreword
- summary
- abstract
- (the main text is here)
- conclusion
- footnotes
- bibliography
- notes
These are presented in roughly the order they normally appear in a document, though for a more detailed description, see the notes on structure.
It may well be that these headings are created automagically on the page by virtue of the database output being designed that way. If so, well, what a deal. We've saved ourselves some work. But, sadly, we didn't have the satisfaction of getting an easy job out the door before getting stuck into the tricky stuff.
If any of these standard parts appear in the document, they should be treated exactly as in the example under simple documents, above, and, without exception, they should be level <h2></h2> headings.
Working through the headings—the index-ID method
When it comes to chapters, headings, subheadings, sections, tables, figures and boxes, etc. we need something more orderly.
Because id and name attribute values need to start with an alphabetic character, let's knock down one skittle by deciding to prefix each value applied to a heading (other than the standard ones, above) with "index".
If none of our documents had heading levels deeper than <h2></h2> then we would simply identify them as "index1", "index2", etc. and get home in time for the football. But we're assuming that we have to handle documents with several levels plus boxes and tables. We've already dealt with the standard parts as above. We reserve <h1></h1> for our title, which leaves us with a hierarchy from <h2></h2> onwards.
Experience tells us that we seldom reach <h6></h6> so it's fairly safe to limit ourselves to four steps.7 And experience likewise tells us that we won't be needing more than two figures to number any heading (if we ever did then we should probably be using something other than headings, such as a table).
So, to identify any heading from level 2 to level 5, including its position in the document's hierarchy, we use this structure:
index-00-00-00-00
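The dashes are purely to aid readability. The pairs of digits represent the levels of heading from 2 to 5: the first pair counts the h2 headings, the second pair the h3 headings, the third pair the h4 headings and the fourth pair the h5 headings.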
So, for example, this index:
index-02-01-06-00
represents the sixth h4 beneath the first h3 beneath the second h2. The last two digits tell us that there are no headings at this level beneath this particular h4. When embedded in an anchor, this index would look like this:
<h4><a id="index-02-01-06-00" name="index-02-01-06-00">heading</a></h4>
The first level
The headings at the first level under the title will all be <h2></h2>, so each will be given an index like this:
index-01-00-00-00
Only the first pair of digits will be incremented since only these represent level h2. We include the remaining digits so that all our links have the same structure, which makes reading, searching and processing a little easier.
Subsequent levels
Once we have marked-up all the h2 headings in the main document flow, we go to the first h3. Because we follow our rules for headings we know that the h3 must be beneath an h2 in the hierarchy, so the index we give this heading will be based on the heading it is beneath, like this:
<h2><a id="index-01-00-00-00" name="index-01-00-00-00">heading</a></h2>
<h3><a id="index-01-01-00-00" name="index-01-01-00-00">subheading</a></h3>
Thereafter, we increment only the second pair of digits for each h3 under that specific h2. When we come to the next h2 and find that it has an h3 beneath it, the counter returns to the start, like this:
<h2><a id="index-02-00-00-00" name="index-02-00-00-00">heading</a></h2>
<h3><a id="index-02-01-00-00" name="index-02-01-00-00">subheading</a></h3>
And so we continue, with each sequence of headings taking the index of the one above it in the hierarchy and then incrementing its own counter.
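To see the counters at work over a slightly longer run, here is a sketch of a hypothetical outline that goes down to level h4. The heading texts are placeholders and the outline itself is invented for illustration; only the numbering scheme comes from the description above.

```html
<!-- Hypothetical outline: heading texts are placeholders -->
<h2><a id="index-01-00-00-00" name="index-01-00-00-00">first h2</a></h2>
<h3><a id="index-01-01-00-00" name="index-01-01-00-00">first h3 under it</a></h3>
<h3><a id="index-01-02-00-00" name="index-01-02-00-00">second h3 under it</a></h3>
<!-- An h4 takes the index of the h3 above it and increments the third pair -->
<h4><a id="index-01-02-01-00" name="index-01-02-01-00">first h4 under the second h3</a></h4>
<h4><a id="index-01-02-02-00" name="index-01-02-02-00">second h4 under the second h3</a></h4>
<!-- A new h2 increments the first pair; every lower counter returns to 00 -->
<h2><a id="index-02-00-00-00" name="index-02-00-00-00">second h2</a></h2>
<h3><a id="index-02-01-00-00" name="index-02-01-00-00">first h3 under it</a></h3>
```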
Anyone familiar with outliner methods used in word processors and stand-alone outliners will recognize this scheme as familiar territory. All we're doing is applying this to a formal link structure.
While doing this indexing we should make sure that our index numbers follow any given in the document headings themselves. The code would still work if we didn't, but it would be confusing and frustrating to edit and would miss taking advantage of an established hierarchy.
Sometimes we need to include footnotes at the end of a section rather than the end of the page, or we need to include other standard parts under a section while maintaining the flow. In these cases we combine the section index with the standard part name, such as this:
<h2><a id="index-02-00-00-00" name="index-02-00-00-00">heading</a></h2>
<h3><a id="index-02-01-00-00-footnotes" name="index-02-01-00-00-footnotes">footnotes</a></h3>
We code the footnotes themselves in the standard way but prefixed with the index of the section. See the footnotes discussion for more details.
Any standard parts lower in the hierarchy than an <h2> heading will always be level <h3>. The index of the section comes first, followed by the name of the standard element.
Sometimes we need to include headings in one or another of the standard sections, such as the introduction or where there are multiple appendices. In these cases we use either of two different methods. In our prelims we can use the same numeric indexing scheme but starting at "index-00" instead of "index-01":
<h2><a id="introduction" name="introduction">Introduction</a></h2>
<h3><a id="index-00-01-00-00" name="index-00-01-00-00">introduction subheading</a></h3>
Alternatively, we use the name of the section, like this:
<h2><a id="appendix-01-00-00-00" name="appendix-01-00-00-00">Appendix 1</a></h2>
<h3><a id="appendix-01-01-00-00" name="appendix-01-01-00-00">subheading</a></h3>
Tackling elements which are outside the document flow
Boxes and tables require slightly different treatment. Boxes are outside the normal flow of the document, regardless of where they are situated in the linear order of the text. Tables are most often regarded as being within the document flow, but they are sufficiently different from the main heading structure that we treat them separately.
Headings within boxes are given the prefix "box" instead of "index", but are otherwise numbered in sequence just like headings in the main text. We should take care to match the box heading numbers to the box numbers. The title of a box is anchored like this:
<h2><a href="#box-01-00-00-00ref" id="box-01-00-00-00" name="box-01-00-00-00"><span class="itemnumber">Box 1: </span>The box title</a></h2>
When boxes contain standard elements, such as footnotes, then we construct our anchor in just the same way but we suffix the name and id with the name of the standard part, like this:
<h3><a href="#box-01-00-00-00-footnotesref" id="box-01-00-00-00-footnotes" name="box-01-00-00-00-footnotes">Notes</a></h3>
Notice that these examples also include the reciprocal href which links back to the contents list. It is the same index as the heading but with the added suffix "ref" and, as with any referencing link, preceded by an octothorp (#).
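The contents-list entries that these box headings link back to are not shown in the text. Following the reciprocal pattern from the simple example, they would presumably look something like the sketch below. The enclosing list, the nesting of the notes entry under the box entry, and the link texts are all assumptions.

```html
<!-- Assumed contents-list entries for the box headings above -->
<ul>
  <li>
    <!-- id/name carry the "ref" suffix; href points at the box title anchor -->
    <a href="#box-01-00-00-00" id="box-01-00-00-00ref" name="box-01-00-00-00ref">Box 1: The box title</a>
    <ul>
      <li>
        <a href="#box-01-00-00-00-footnotes" id="box-01-00-00-00-footnotesref" name="box-01-00-00-00-footnotesref">Notes</a>
      </li>
    </ul>
  </li>
</ul>
```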
Building the contents list
[watch this space... I know how to do it, but it takes bloody ages to write it up sensibly. I suppose it'll be worth it in the end.]
Footnotes
1. I use the terms "contents list" and "index" fairly interchangeably. There is little call for formal indexes in web pages since the text is essentially searchable on far more terms than would ever be included in a written index. However, when it comes to the actual linking code, "index" seems the better choice.
2. I believe much of this advice is outdated. In the early days we were all very concerned with clickable buttons and page-turners to give something of the feel of the printed book, but with greater familiarity and the availability of better tools and better screens, we've grown more accustomed to reading long texts and are better equipped to do so.
3. Of course this technique will mean that headings are by default displayed in bright blue in most browsers, or bright red for visited links. That is unless the style is overridden by the site designer. Many people (including myself) would prefer headings were not bright blue or red, so we write our style sheet to overcome this. But this means that while on the one hand we provide an aid to navigation, on the other hand we remove a strong visual clue to its existence. Once discovered, the fact that headings link back to the contents list will be found to be useful in long documents, so we hope that the reader will forgive us for breaking this particular rule of page accessibility.
4. If we used plain paragraphs or a table (as one might think would suit a table of contents) we would soon find that on the one hand we were limited when it came to showing nested subheadings, and on the other we had let ourselves in for much more complication than we need.
5. Handling long documents is another reason to make sure that the tools in use are the best for the job. If a text editor is preferred (and it certainly is here) it should allow us to work with the text unwrapped—each paragraph a separate line—and should be capable of reformatting text in various ways.
6. We try to make sure that we are as consistent as possible, so that if we use one naming convention for something then we should use the same convention for something similar. In this case I've appended "ref" to the id and name values when these are the values in the referring link rather than the object being referred to. This is the same convention as with footnotes.
7. I have worked on hundreds of long complex multi-level documents and have never required the use of a level 6 heading.
