On the proper use of HTML

HTML is intended to be a device-independent medium for expressing content. It was not designed for specifying style or making documents distinctive. Making a document look exactly the way you want on every display is a very difficult problem, which HTML does not attempt to solve. Recently, under pressure from various sources, support for specifying style and appearance has been added; this support, however, is not nearly good enough for the task it sets out to do, and it generally makes a document look worse.

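A few pointers for making web pages: have content, use tags only for content, optimize for speed, provide context, and be general.

A few common traps to avoid: formatted text and too much color.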

Have Content

Only have a web page if you have something to say. The fact that you are provided with some amount of space where you can put stuff that the world can access does not, in fact, mean that you ought to do so. If there is nothing that you really want to tell the general public, do not put up a web page with no information on it.

Do not get other people to make web pages for you. Unless they know you very well, they are unlikely to be able to express what you want to say better than you can. The exception to this rule is if they are doing a bunch of web pages that should have the same format; in this case, it is clearer for them to do everyone's web page, provided they know what everyone has to say.

Having made a page for the purpose of content, let that content dictate how you arrange your site. Make it easy to find the information that people are looking for. Design your site with the expectation that your reader has a very short attention span, and needs some indication that a page has some interesting content before reading it. Give this impression, and then do not disappoint your reader.

Use Tags Only for Content

Since your intent is to provide information to readers who will read your page if and only if the content interests them, you should do absolutely nothing that might either distract an interested reader or entice an uninterested reader.

First of all, get to know <p> very well. If something is not plain text in a paragraph, there should be a good reason for it. These reasons include such things as:

A tag indicates a very specific difference between how that part of the text should be seen and how the rest of the document should be seen. For example, a heading is generally a sentence fragment, and describes the text that follows it. An italicized phrase, if read, would be in a different tone of voice. An obvious consequence of these rules is that most of the document is not in a tag (besides <p>), since it does not differ from itself.
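As a rough sketch of what this looks like in practice, here is a fragment where nearly everything is plain text in <p>, and each other tag marks a real difference in how the text should be read. The subject matter is made up for illustration.

```html
<!-- The heading is a fragment describing the text that follows it -->
<h2>Feeding the Starter</h2>

<!-- Most of the page is ordinary paragraphs -->
<p>Feed the starter once a day. Use <em>equal weights</em> of flour
and water, not equal volumes.</p>

<!-- No tags beyond <p> here, because nothing differs from the rest -->
<p>Keep the jar loosely covered between feedings.</p>
```

The <em> is there because the phrase would be spoken in a different tone of voice, not because italics look nice.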

Optimize for Speed

The web is slow. As hardware improves to give people better connections, more people use the web, so it stays slow. Since you are assuming your audience will run out of patience soon, you should try to get as much content out as possible before people get bored.

The first thing to do is put size attributes (width and height) on <img> tags, so the browser can lay out the page before the image arrives. If you have a picture you wish people to see, be sure the text on the page can be displayed before the image is done loading. If people are reading interesting text, they may not notice how long it takes the picture to load.
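A minimal sketch of an image with its size declared. The file name and the dimensions are hypothetical; the width and height values are in pixels and should match the real size of your image.

```html
<p>The footbridge was finished in the spring.</p>

<!-- width and height let the browser reserve the space immediately,
     so the surrounding text is readable while the image loads -->
<img src="bridge.jpg" width="400" height="300"
     alt="The finished footbridge over the creek">

<p>It replaced the old crossing a little way downstream.</p>
```

The alt text also gives readers something to go on if the picture never arrives at all.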

Provide a quick path from your front page to whatever a person is there to find. This doesn't mean that every page should have a link from your front page; if a page needs some introduction, don't put the whole thing on your front page (that makes it take longer to find other things), but make sure someone who knows what he's looking for can find it without looking at (and waiting for) other pages.

Remember that people won't bookmark everything. Make sure a single bookmark for your site can get a reader to any interesting page as fast as possible.

Provide Context

In my bookmark file, I have (among other things) these two items: TR 9401:1995 - Entity Management and HTML 4.0 Specification. I know what both of them are, with a bit of thought, but on seeing only the titles, the first is difficult to guess, whereas the second is entirely obvious. Remember that the contents of the <title> tag are often used with no context. My bookmark file has very few pages whose titles don't provide context; this is not a coincidence.
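To make the point concrete, here is a sketch of a title that still makes sense in a bookmark list. The guide and chapter names are invented for the example.

```html
<!DOCTYPE html>
<html>
<head>
<!-- Without context, a bookmark reading only "Chapter 3" says nothing:
     <title>Chapter 3</title> -->

<!-- With context, the title names both the page and the work it belongs to -->
<title>Chapter 3: Pruning - A Guide to Growing Apple Trees</title>
</head>
<body>
<p>Page content goes here.</p>
</body>
</html>
```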

Remember that the web is not linear; if you have a page which is actually interesting, it is likely to be found by a search engine; arriving at this page out of the blue should not confuse the reader. It should also be easy to get to the rest of the information you provide even after getting there from a bookmark.
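One way to handle this, sketched with made-up page names and links: say near the top what the page belongs to, and link to the rest of the site from the page itself.

```html
<h1>Pruning Apple Trees</h1>

<!-- A reader arriving from a search engine learns where they are -->
<p>This page is part of
<a href="index.html">A Guide to Growing Apple Trees</a>.</p>

<p>Page content goes here.</p>

<!-- A reader arriving from a bookmark can reach the related pages -->
<p>See also: <a href="planting.html">planting</a> and
<a href="pests.html">pests</a>.</p>
```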

Be General

I generally use the Linux version of Netscape 3.0 or Lynx 2.7, but I also use browsers on a few other machines. I frequently resize Netscape if it's getting in the way. This all means that if a page was specifically designed to be viewed on the author's browser, it will probably look terrible to me. If it has special Internet Explorer tags, I won't see them. I tend to keep JavaScript turned off, and don't have Java set up at all on my machine.

HTML is supposed to be easy for browsers to display; generally they'll be able to make a page look good if they know what you're trying to say. Help them out by not getting in the way, and you'll reach a larger audience.


Avoid Formatted Text

Formatted text is bad for several reasons: it's generally less likely to look good on everyone's browser, formatting generally doesn't provide content, and it's often slower.

It's very tempting to mess with formatting until it looks exactly how you want. If you find yourself doing this, resize your browser window a couple times and see how it looks. Probably, it won't look the way you want. Now think about the people who have a different window size (probably almost everyone out there, since you just reset it randomly...). You're not impressing anyone, and you aren't adding content.

I just looked at a site which had a very nicely implemented page with two columns. I read the left column, and then scrolled back up to the top of the page to read the right-hand column (already a bad sign) and realized that I had no idea how the two columns were intended to relate to each other. Presumably, the right column was to be read after the left one, but there's no reason it couldn't have been one column, maybe with an <hr> in the middle.
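A sketch of the single-column alternative, with placeholder text standing in for the two columns' contents. The reading order is simply the order of the document, whatever the window size.

```html
<h1>Club News</h1>

<!-- What was the left column comes first -->
<p>The first topic goes here.</p>

<!-- The rule marks the break the column boundary used to mark -->
<hr>

<!-- What was the right column follows -->
<p>The second topic goes here.</p>
```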

Minimize Color

Color is a very good way of conveying information. Text of different colors is very easy to separate, and it's easy to search a document for a certain color. That's why browsers use color for links and such. But you shouldn't use different colors yourself, because a color doesn't specify any particular content unless the user knows what it means, and you can't read the user's mind.

A major problem with color is that it's very hard to know what colors will stand out on what backgrounds, especially because it differs between systems and monitors and people. Your site with red text on a green background may be fine for me, but if you want to reach anyone who's red-green colorblind, that won't work.
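As an illustration, compare marking something important with a color against marking it with a tag that says what you mean. The sentence is made up; <font> is shown only as the thing to avoid.

```html
<!-- Relies on color alone: the point is lost if red does not stand out
     for this reader, on this monitor, against this background -->
<p><font color="red">The meeting has moved to Thursday.</font></p>

<!-- Says that the text is important and leaves the rendering to the browser -->
<p><strong>The meeting has moved to Thursday.</strong></p>
```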

On a similar note, don't use backgrounds. They make the text less noticeable, determine the background color, and require a text color that contrasts with all of the colors in the background. Anything worth using despite these problems is too good to put text on top of.


A writing of Daniel Barkalow