Today I read an excellent article by Evan Goer, posted on Mezzoblue. The topic: should I move to XHTML or stay with HTML 4.01? The author gives no firm answer other than "it depends on your needs", because there are trade-offs and XHTML applications are just not there yet. I thought I should bring to the table some of my (admittedly limited) experience with XHTML and how it has been useful to me.

Two objections have been discussed in the article:

  1. XHTML and the promises XML brings to it are not there yet.
  2. If your web site lets people leave comments on your articles, they may break the validation by entering tag soup.

I'd love to counter those two objections with my own experience.

  • First, I want to share a problem I had to solve a few months ago. I had started my blog on the blogger.com service, and at some point it became so buggy that I could not keep using it. I had to switch to another blogging tool. But my content was stored in blogger.com's database, on their servers, and I could not manage to get it back. All I had was my HTML archives, each listing a hundred articles on a single web page. Fortunately, I had written these articles in XHTML. A friend of mine, Olivier Meunier, used freely available XML-based technologies (namely XSLT) to parse these pages and split them into individual articles, so that I could store them in my own database (as opposed to blogger.com's). I was back in control of my content. Would that have been possible with HTML 4.01? Maybe. But it would have been a much longer and more painful process. This little story shows something that is often forgotten about XHTML: because well-formed markup can be processed with any XML tool, your content stays usable in the long term. Furthermore, because I use strict markup, I separate structure from presentation, which lets me offer stylesheet switching (presentation independence) and better accessibility.
  • Second, the tag-soup-in-comments issue. This is a real-world problem, and it goes hand in hand with the current lack of XHTML-compliant tools. That is changing with the arrival of the excellent DotClear blogging tool (written by the same Olivier Meunier). For now, the DotClear website is only in French (but that should change), so non-French readers will have to trust me on this one. DotClear has been written from scratch to produce XHTML-compliant weblogs. This means that it includes an article-validation system (making sure that your own content won't break validation) and a very interesting component, called Wiki2XHTML, which transforms Wiki-like content into valid XHTML. Both the blogger and his readers can use Wiki syntax to write articles and comments, which limits the risk of tag soup (and of JavaScript-related security issues, too). Syntax validation is also enabled for comments, making sure that the whole site stays valid. I should mention that DotClear is free and released under the Mozilla Public License 1.1.
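
To make the two points above more concrete, here is a minimal sketch of both ideas. It is not the XSLT that was actually used on my archives, nor DotClear's validation code: it swaps in Python's standard-library XML parser, and the archive layout (one `div` with class `post` per article, with the title in an `h2`) as well as the function names are assumptions made up for the illustration. Note also that the second function only checks well-formedness, which is weaker than validating against the XHTML DTD.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Serialize XHTML elements with a default namespace instead of a prefix.
ET.register_namespace("", XHTML_NS)

# Hypothetical archive page: the "post" class and the h2 titles are
# assumptions for this sketch, not the real blogger.com layout.
ARCHIVE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Archive</title></head>
  <body>
    <div class="post"><h2>First article</h2><p>Hello, world.</p></div>
    <div class="post"><h2>Second article</h2><p>More text.</p></div>
  </body>
</html>"""


def split_articles(xhtml_text):
    """Return one (title, serialized markup) pair per article in the page."""
    root = ET.fromstring(xhtml_text)  # only works because the page is well-formed
    ns = {"x": XHTML_NS}
    articles = []
    for div in root.iterfind(".//x:div[@class='post']", ns):
        title = div.findtext("x:h2", default="", namespaces=ns)
        articles.append((title, ET.tostring(div, encoding="unicode")))
    return articles


def is_well_formed(fragment):
    """Check that a comment parses as XML once wrapped in a root element."""
    try:
        ET.fromstring("<div>" + fragment + "</div>")
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    for title, markup in split_articles(ARCHIVE):
        print(title, len(markup))
    print(is_well_formed("<p>Nice post!</p>"))
    print(is_well_formed("<p><b>Oops</p>"))
```

The point of the sketch is that a generic XML parser is enough in both cases: the archive can be split without any HTML-specific guesswork, and a comment with an unclosed tag is rejected before it can break the page.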

Hopefully, by now you have understood my points: XHTML is useful today, because it gives you control over your content in the future. Also, modern tools such as DotClear let you easily leverage the power of XHTML right now. Does that make HTML 4.01 obsolete? Not yet. But the balance now tips more toward XHTML than toward HTML 4.01. Heck, we have to switch to XHTML 1.0 Strict anyway: the spec is already almost 4 years old! ;-)