Workflow
We usually start preparing texts using tools which suit the purpose. We might begin using pen and paper, or write into a word processor or a computer-based notebook. Everyone has their own preference and method which is familiar and works for them, and this is exactly as it should be.
However, when wishing to add our texts to the pools of content which constitute the Web, we find that we must soon start using methods tailored to that end. The exact point at which we start directing our effort towards HTML is still a matter of choice, but at some point we must do so.
These notes are largely concerned with the specifics of preparing text for entry into a database (more specifically an ActionApps database), but much of what is here applies more generally too.
Making the break
For simple articles, entering the text directly into the database fields may be the easiest method. In such rare cases we have no need for editors or word processors and the short article can be completed and made available online with satisfying immediacy.
For documents which are never required to be seen anywhere except on the web it is often convenient to create them in an HTML-aware text editor. Such text editors usually have features similar to word processors but are more focussed on the task of preparing HTML.1
Longer, complex documents usually start life in a word processor. The powerful tools are just what is required when writing and, if there is a need to circulate documents for comment and revision, the word processor format is much more widely familiar than HTML.
In most cases the point where we switch to creating the web version of our text will be the point where the rounds of writing, correction and revision are done. At this point we have an article where the content is finished and complete.
Example workflow
-
We start with our article as a word-processor document, edited to perfection.
As well as the content of our article, we should also have established its structure.
-
We transfer the text to an HTML editor ready for formatting.
We'll need to work with plain text, so we just select the entire text of the document, copy it, and then paste it into a new text-editor document.2
[it would probably help to provide some examples at this stage, so I will do so when I can get to it]
-
Next we work through the text, adding the HTML code as required. It's often easier to do this in several passes:
- the first pass adds the basic block-level paragraph, list and heading tags to provide the outline structure;
- the second pass covers the inline tags such as citations, emphasis and external anchors;
- the third pass concentrates on creating links within the page, such as for footnotes and contents lists.
At each stage the page code should be validated, and it is probably a good idea to save a version of the file at each stage in case we need to go back a step.
It doesn't matter what the layout of the code looks like, just that its structure is correct and supports the content.
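As a rough illustration of the first pass and of a quick check between passes, here is a minimal Python sketch. It assumes the pasted plain text separates paragraphs with blank lines, and the function names `first_pass` and `is_well_formed` are invented for this example. The check uses the standard library XML parser, so it only confirms that tags nest and close correctly; it is a stand-in for, not a replacement for, a proper validator, and it will reject named entities such as `&nbsp;`.

```python
from html import escape
import re
import xml.etree.ElementTree as ET


def first_pass(plain_text: str) -> str:
    """Wrap each blank-line-separated paragraph in <p> tags.

    Headings and lists still have to be marked up by hand afterwards.
    """
    paragraphs = re.split(r"\n\s*\n", plain_text.strip())
    return "\n".join(
        # Collapse internal line breaks and escape &, < and >.
        f"<p>{escape(' '.join(p.split()), quote=False)}</p>"
        for p in paragraphs
        if p.strip()
    )


def is_well_formed(fragment: str) -> bool:
    """Return True if the tags in an XHTML fragment nest and close correctly."""
    try:
        # Wrap in a temporary root element so several blocks parse as one tree.
        ET.fromstring(f"<div>{fragment}</div>")
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    coded = first_pass("First paragraph.\n\nSecond paragraph,\nsplit over two lines.")
    print(coded)
    print(is_well_formed(coded))
```

Running the check after each pass, before saving that stage's version of the file, gives an early warning of an unclosed or mis-nested tag.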
-
We copy our coded text into the database.
We are only concerned with what is between the <body> and </body> tags of the document, and we don't want the body tags themselves.
Depending on how our database is configured, we might simply copy the entire contents of the <body> and paste it into one field of the database, such as the "Main text" field, or we might need to copy portions of our coded text into different fields, such as those for "Main text" and "Footnotes".
When we copy our text into the database, we need to consider that some fields require block-level code while others require inline code. We also need to make sure that we don't leave bits of code behind, such as closing tags.
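Selecting the body contents by hand is usually quick enough, but the step can also be sketched in Python. The function name `body_contents` is hypothetical, and the sketch assumes the page contains a single, ordinary pair of body tags; it says nothing about how a particular database expects its fields to be filled.

```python
import re


def body_contents(page: str) -> str:
    """Return what lies between the body tags, without the tags themselves."""
    match = re.search(
        r"<body\b[^>]*>(.*?)</body\s*>",
        page,
        flags=re.IGNORECASE | re.DOTALL,
    )
    if match is None:
        raise ValueError("no <body>...</body> pair found")
    return match.group(1).strip()


if __name__ == "__main__":
    page = (
        "<html><head><title>Example</title></head>"
        "<body>\n<h2>Title</h2>\n<p>Text.</p>\n</body></html>"
    )
    print(body_contents(page))
```

Whatever the function returns is what would be pasted into a field such as "Main text"; splitting it further between fields still has to be done by eye.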
-
Once the text is installed we can set the various other options our database might include, such as topic, category, author, publish date, etc.
-
Refer to the checklist for a reminder of some of the issues which need to be taken into account.
Once in the database, the item there should be regarded as the definitive or reference version. The reason is that once in the database the article is in XHTML format (close in structure to XML) and as such can be converted to virtually any other format with little further editing. File formats such as .doc, .rtf, .pdf, etc. are partially interchangeable but all require additional work before being converted into most other formats, and a lot of work to become valid XHTML again. As the site is developed it will be possible to extract articles in a variety of formats, not just HTML.
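To give a feel for why well-formed XHTML converts so readily, here is a small Python sketch that turns a coded fragment back into plain text using only the standard library XML parser. The function name `xhtml_to_text` is invented for this example, and the sketch assumes the fragment consists of block-level elements with no loose text between them; it is not a description of how the site itself extracts articles.

```python
import xml.etree.ElementTree as ET


def xhtml_to_text(fragment: str) -> str:
    """Convert block-level XHTML into plain-text paragraphs."""
    # Wrap in a temporary root element so several blocks parse as one tree.
    root = ET.fromstring(f"<div>{fragment}</div>")
    blocks = []
    for block in root:
        # Gather the text of the block and its inline children,
        # then collapse runs of whitespace into single spaces.
        text = " ".join("".join(block.itertext()).split())
        if text:
            blocks.append(text)
    # Text sitting directly between blocks is ignored in this sketch.
    return "\n\n".join(blocks)


if __name__ == "__main__":
    print(xhtml_to_text("<h2>Title</h2><p>Some <em>emphasised</em> text.</p>"))
```

The same parsed tree could equally be walked to produce another markup format, which is the practical benefit of keeping the reference version in XHTML.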
1. This is my own preferred method, using the marvellous BBEdit.
2. Why don't we simply export HTML from the word processor? Well, this might be a good method but most often the exported code is excruciatingly bad and experience shows that to get to the lean and valid code we're after it takes longer to clean up the rubbish code than to start from scratch.
