About This Website

Have you ever looked at the source code of these pages and wondered which content management system would generate such clean XHTML, and why anyone would use it? Here is the answer.

Why not HTML?

Even though these pages are relatively simple, writing them directly in HTML has a number of substantial drawbacks.

Some of these problems can be solved by investing work in HTML parsing and validation tools, and thus, XML processing. But tackling the fundamental problems requires a different source representation. HTML is not a convenient language for authoring documents.

XML Processing

Input files are validated by xmllint (part of libxml2 <http://xmlsoft.org/>) against a custom DTD (modeled after the ideas in Notes on XML DTD Design). After that, a special-purpose Perl script reads all documents, processes them, and writes all the XHTML documents which are part of the website. The script is implemented using the XML::DOM Perl module <http://search.cpan.org/author/TJMATHER/XML-DOM/>; see XML Processing with DOM and Perl. To catch errors in the processor, the generated files are validated using xmllint again, this time against the official XHTML 1.0 <http://www.w3.org/TR/xhtml1/> Strict DTD.
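The validate, convert, validate sequence described above can be sketched as follows. This is only an illustration in Python, not the site's actual tooling (which is a Perl script): the helper names, the `convert` callable, and the DTD file names passed in are all assumptions. The `--noout` and `--dtdvalid` options are regular xmllint options.

```python
import subprocess
from pathlib import Path


def xmllint_command(document, dtd):
    """Build an xmllint invocation that validates `document` against `dtd`.

    --noout suppresses the serialized document; --dtdvalid names the DTD.
    """
    return ["xmllint", "--noout", "--dtdvalid", str(dtd), str(document)]


def validate(document, dtd):
    """Return True if xmllint accepts the document (needs xmllint on PATH)."""
    result = subprocess.run(xmllint_command(document, dtd))
    return result.returncode == 0


def build(sources, convert, source_dtd, xhtml_dtd):
    """Validate every source, convert it, then validate the generated XHTML.

    `convert` is a hypothetical callable that takes a source path and
    returns the path of the XHTML file it wrote.
    """
    # First pass: reject invalid input before any output is written.
    for source in sources:
        if not validate(source, source_dtd):
            raise ValueError(f"invalid source document: {source}")
    # Second pass: convert, then check the processor's own output.
    for source in sources:
        output = convert(Path(source))
        if not validate(output, xhtml_dtd):
            raise ValueError(f"processor produced invalid XHTML: {output}")
```

Validating the output as well as the input means that a bug in the processor shows up as a build failure instead of a broken page on the public server.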

The internal hyperlinks of this website are generated automatically. Each page (and each file which is part of the website) is assigned a unique name by the author, and other pages can reference it by that name. With a special declaration in the document header, the author can add a document to a list. Such lists can be included in other documents and are updated automatically whenever the HTML pages are generated.
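The name-based linking and list mechanism could look roughly like the sketch below. It is a Python illustration of the idea only; the class `Site`, its methods, and the example names are hypothetical and do not reflect the actual Perl processor or the DTD's element names.

```python
from html import escape


class Site:
    """Registry of uniquely named pages and the lists they belong to."""

    def __init__(self):
        self.paths = {}   # unique name -> output path
        self.titles = {}  # unique name -> page title
        self.lists = {}   # list name -> member page names, in order added

    def add_page(self, name, path, title, member_of=()):
        """Register a page; `member_of` mimics the header declaration."""
        if name in self.paths:
            raise ValueError(f"duplicate page name: {name}")
        self.paths[name] = path
        self.titles[name] = title
        for list_name in member_of:
            self.lists.setdefault(list_name, []).append(name)

    def link(self, name):
        """Render a hyperlink to the page registered under `name`."""
        if name not in self.paths:
            raise KeyError(f"unknown page name: {name}")
        href = escape(self.paths[name], quote=True)
        return f'<a href="{href}">{escape(self.titles[name])}</a>'

    def render_list(self, list_name):
        """Render all members of a list as an XHTML unordered list."""
        items = "".join(
            f"<li>{self.link(name)}</li>"
            for name in self.lists.get(list_name, [])
        )
        return f"<ul>{items}</ul>"
```

Because pages refer to each other by name and not by path, a file can be moved or retitled in one place, and every link and list entry picks up the change at the next build.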

At the moment, results of the conversion are not cached, so the entire website has to be regenerated after each change. This needs optimization for larger websites.
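One possible shape for such an optimization, given as a sketch and not as something the site implements: record a digest of each source file at build time and regenerate only pages whose source changed, plus pages that include content from a changed page (for example through a list). The function names and the `depends_on` mapping are assumptions made for this illustration.

```python
import hashlib


def digest(text):
    """Return a stable fingerprint of a document's source text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def stale_pages(sources, depends_on, cache):
    """Return the sorted names of pages that need to be regenerated.

    sources:    page name -> current source text
    depends_on: page name -> names of pages whose content it includes
    cache:      page name -> digest recorded at the previous build
    """
    # Pages whose own source differs from the cached digest (or is new).
    stale = {
        name for name, text in sources.items()
        if cache.get(name) != digest(text)
    }
    # Conservatively propagate staleness along dependencies until stable.
    grew = True
    while grew:
        grew = False
        for name, deps in depends_on.items():
            if name not in stale and set(deps) & stale:
                stale.add(name)
                grew = True
    return sorted(stale)
```

The propagation step is deliberately conservative: it may rebuild a page that did not strictly need it, but it never leaves an outdated list or link in place.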

Version Control

Since April 2007, version control for the document source files and the programs is provided by Git <http://git-scm.com/>. Until February 2004, GNU arch <http://www.gnu.org/software/gnu-arch/> was used, and Subversion <http://subversion.apache.org/> after that.

Currently, there is only one development branch. After a change has been made and previewed, the rendered documents are transmitted to the public web server using rsync <http://samba.anu.edu.au/rsync/>. However, it would be straightforward to establish multiple branches with parallel development (if there were anybody else working on these pages).

Florian Weimer