I've gotten nerdy about the semantic web lately because of a project: the migration of Tandberg Knowledge Base articles. We had tons of bad (and sometimes outright incorrect) HTML in the KB, the result of humans typing HTML directly, and I believe it is a good time to tidy up that HTML with my Magical Script.

OK, so here is a list of HTML tags I do NOT want to see in any of my future work:

<b>, <i> and <tt> – These tags have been misused for ages, and people still (mis-)use them actively these days (that includes me, too). Although they are still valid in HTML 4 and XHTML 1.0, their use is discouraged.

Why? The W3C defines them as Typographic Elements. They exist only for typographic styling and do not carry any meaning.

So? Use <em> and <strong>, with CSS to style them accordingly. The W3C defines these tags as Idiomatic Elements, and they carry semantic meaning: <em> indicates emphasis and <strong> indicates stronger emphasis. That can also help screen readers put emphasis in the voice. For styling purposes only, you may use <b> or <i>, or even better, something like <span class="bold">, just to separate presentation from content.
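Here is a minimal sketch of the difference. The class name "bold" and the sample sentences are made up for illustration; nothing here comes from the actual KB.

```html
<!-- Sketch only: the class name "bold" is a hypothetical example. -->
<style>
  em     { font-style: italic; }  /* how emphasis looks */
  strong { font-weight: bold; }   /* how stronger emphasis looks */
  .bold  { font-weight: bold; }   /* looks only, no emphasis implied */
</style>

<!-- Semantic: the words really are emphasized -->
<p>You <em>should</em> save your work, and you <strong>must not</strong> skip this step.</p>

<!-- Presentational: bold text that carries no extra meaning -->
<p>Section <span class="bold">Overview</span></p>
```

The point of the split is that the two bold-looking pieces of text can later be restyled independently, because only one of them actually means "this is important".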

<font> – Another of the biggest enemies of our time.

Why? It is highly presentational.

So? Use CSS to style text: <span class="blah">some text</span>
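A rough before-and-after, assuming the old markup set a color and a typeface. The "blah" class is only a placeholder name; a real stylesheet would want something more descriptive.

```html
<!-- Before (what I do not want to see):
     <font color="red" face="Arial">some text</font> -->

<!-- After: the look lives in the stylesheet, not in the markup.
     "blah" is a placeholder class name. -->
<style>
  .blah { color: red; font-family: Arial, sans-serif; }
</style>

<p><span class="blah">some text</span></p>
```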

<big>, <small> – These are highly presentational, too. Use CSS instead.
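For example, relative font sizes in CSS do the same job. This is only a sketch, and the class names "bigger" and "smaller" are my own invention.

```html
<!-- Sketch: "bigger" and "smaller" are hypothetical class names. -->
<style>
  .bigger  { font-size: larger; }   /* one step up from the parent's size */
  .smaller { font-size: smaller; }  /* one step down from the parent's size */
</style>

<p>
  This is <span class="bigger">a bit bigger</span>
  and this is <span class="smaller">a bit smaller</span>.
</p>
```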

<s>, <strike> – Both draw a horizontal line through text, and both are deprecated in HTML 4. Use CSS when the strike is purely presentational, and <del> to semantically mark a real change to the text.

<u> – For underlining text. It is highly presentational. Use CSS instead.
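A small sketch of the two cases side by side. The class name "strike" and the sample text are assumptions for illustration only.

```html
<!-- Sketch: "strike" is a hypothetical class name. -->
<style>
  .strike { text-decoration: line-through; }  /* decoration only */
</style>

<!-- A real change to the text: the old wording was removed and replaced -->
<p>The meeting is on <del>Monday</del> <ins>Tuesday</ins>.</p>

<!-- Purely visual strike, with no editing history implied -->
<p><span class="strike">decorative strike-through</span></p>
```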

<marquee> – This is a non-standard tag, but it has very good browser support. It is used for animating text across the page.

Why? It is non-standard and annoying.

So? Avoid it. If you have a good reason to animate your text, use JavaScript. Still, this should be avoided.

<nobr> – Ugly as it sounds. It is used to avoid text being wrapped.

Why? It is non-standard even though supported by all major browsers.

So? Use CSS: white-space: nowrap; instead.
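Something like the following should do it. The class name "nowrap" is just my choice for the sketch.

```html
<!-- Before: <nobr>Tandberg Knowledge Base</nobr> -->

<!-- After: "nowrap" is a hypothetical class name. -->
<style>
  .nowrap { white-space: nowrap; }  /* keep the phrase on one line */
</style>

<p>Please search the <span class="nowrap">Tandberg Knowledge Base</span> first.</p>
```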

<q> – Inline quotation. Most browsers render the <q> tag with quotation marks around its content. It carries the semantic meaning of a quotation.

Why? I know, it is a very good tag and I do like it. But… obviously not for IE versions earlier than 8.0, which do not add the quotation marks.

So? Type your quotation marks!

<basefont>, <blink>, <plaintext>, <wbr>, <xmp>, <comment> – Never heard of those? Neither have I. If you do know any of them, please forget them. These tags are EVIL by nature.

