
XML utilities


Spui stands for SAX Parser User Interface, but it is also the name of a well-known little square in the center of Amsterdam.

Spui contains a graphical user interface, a command line interface and an error handler that reports errors, for use with any validating SAX2 Java parser. With this combination you can validate an XML file against a DTD or a W3C XML Schema (the latter with Xerces2-J only). In addition, the package contains two content handlers:

You need a validating, SAX2-compliant Java parser and the JAXP classes, which were probably installed with the parser.
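The combination of a validating SAX2 parser and an error-reporting error handler can be sketched in plain JAXP as follows. This is a minimal illustration, not Spui's own code: the class name ValidateSketch, the nested CollectingHandler and the helper methods are hypothetical, and the sketch assumes that whatever parser JAXP finds on the class path supports DTD validation.

```java
import java.io.File;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

// Hypothetical sketch; none of these names come from the Spui packages.
public class ValidateSketch {

    /** Records every problem the parser reports instead of stopping at the first one. */
    static class CollectingHandler implements ErrorHandler {
        final List<String> messages = new ArrayList<>();

        private void record(String level, SAXParseException e) {
            messages.add(level + " at line " + e.getLineNumber() + ": " + e.getMessage());
        }

        @Override public void warning(SAXParseException e) { record("warning", e); }

        // Validity errors are recoverable, so parsing continues after recording them.
        @Override public void error(SAXParseException e) { record("error", e); }

        // Well-formedness errors are fatal: record, then abort the parse.
        @Override public void fatalError(SAXParseException e) throws SAXParseException {
            record("fatal", e);
            throw e;
        }
    }

    /** Parses the source with DTD validation switched on and returns the reported problems. */
    static List<String> validate(InputSource source) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);      // validate against the DTD named in the DOCTYPE
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        CollectingHandler handler = new CollectingHandler();
        reader.setErrorHandler(handler);
        try {
            reader.parse(source);
        } catch (SAXParseException e) {
            // Fatal error: already recorded by the handler.
        }
        return handler.messages;
    }

    /** Convenience wrapper for validating a document held in a string. */
    static List<String> validateString(String xml) throws Exception {
        return validate(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String systemId = new File(args[0]).toURI().toString();
        for (String message : validate(new InputSource(systemId))) {
            System.out.println(message);
        }
    }
}
```

An empty list means the parser reported no problems. Validation against a W3C XML Schema needs additional parser configuration that this sketch does not show.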

Spui comes in two alternative packages:

Installation: Download the jar file of your choice, Spui.jar or Spui-cli.jar, and put it in your class path.

Running the graphical user interface:

Running the command line interface:

With Java 1.4, you may instead place xml-apis.jar, xercesImpl.jar and resolver.jar in a directory on the java.endorsed.dirs path, rather than in the places described above.

If you have downloaded the Spui-cli package, you should replace Spui.jar with Spui-cli.jar in the commands for the command line interface.


The files Spui.jar and Spui-cli.jar include the class files of the gnu.getopt package. The complete gnu.getopt package including source files and documentation can be obtained from the web site of the author, Aaron M. Renn: java-getopt-1.0.9.tar.gz or java-getopt-1.0.9.jar.

Parser package with user interface

This package has been renamed to Spui-cli.

Parser document handler with graphical user interface

This package has been renamed to Spui.

Normalizer document handler

This package has been removed. Its functionality is now provided by the XMLSerializer class in the Spui package.

Normalize-spaces XSLT file

This XSLT file strips white space from the elements listed in its xsl:strip-space directive. It does not list all DocBook elements that contain ignorable white space. Instead, it lists the same elements that are listed in a similar directive (\defineXMLDBstripspace) in the DocbookInContext mapping file. Because of this specific list of elements, the tool works only with DocBook files.

Download the XSLT file. Put it in a suitable place and run it as required by your XSLT processor. For example, in Xalan it is used as follows: java org.apache.xalan.xslt.Process -in your-file.xml -xsl normalize_space.xsl -out your-file.normalized-xsl.xml.
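The same kind of transformation can also be run from Java through the JAXP transformation API. The sketch below is an illustration only: it does not use the downloadable normalize_space.xsl but an inline identity stylesheet whose xsl:strip-space directive names just two example elements, and the class name StripSpaceSketch and the method normalize are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical sketch; the element list is an example, not the list in normalize_space.xsl.
public class StripSpaceSketch {

    // Identity transform plus xsl:strip-space for two example elements.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:strip-space elements='article section'/>"
        + "<xsl:template match='@*|node()'>"
        + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Copies the input document, dropping whitespace-only text in the listed elements. */
    static String normalize(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String input = "<article>\n  <section>\n    <para> a b </para>\n  </section>\n</article>";
        System.out.println(normalize(input));
    }
}
```

Only whitespace-only text nodes directly inside article and section are removed. The text inside para, including its leading and trailing spaces, is left alone because para is not in the list.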

Entity resolution

The Spui packages have an option for the entity resolver which takes the values 'system' and 'catalog'; any other value is tried as the resolver class name. The value 'catalog' uses Norman Walsh's entity resolver, which handles XML catalogs, SGML catalogs and Apache XCatalogs. Both the resolver implementation offered by Sun, in the package com.sun, and the resolver implementation offered by Apache xml-commons, in the package org.apache.xml, are recognized.

For this to work, resolver.jar must also be in the same directory as the Spui jar file (for the java -jar invocation) or in your class path (otherwise). With Java 1.4, you may instead place resolver.jar in a directory on the java.endorsed.dirs path. Some JREs require that you also have the package javax.xml.transform in your class path; this package is bundled with xalan.jar. XSLT processors have their own option to specify the desired entity resolver.
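The way such an option might be mapped to a resolver can be sketched as follows. This is not the code of the entity resolution factory in the Spui packages: the class ResolverChoiceSketch and its methods are hypothetical, and the two catalog resolver class names are assumptions about what the Apache xml-commons and Sun resolver jars provide.

```java
import org.xml.sax.EntityResolver;

// Hypothetical sketch of mapping an option value to a SAX EntityResolver.
public class ResolverChoiceSketch {

    // Assumed class names for the 'catalog' value; check them against your resolver.jar.
    static final String[] CATALOG_CLASSES = {
        "org.apache.xml.resolver.tools.CatalogResolver",
        "com.sun.resolver.tools.CatalogResolver"
    };

    /**
     * Returns null for 'system', meaning the parser keeps its default behaviour
     * of resolving entities by their system identifiers.
     */
    static EntityResolver choose(String option) throws Exception {
        if ("system".equals(option)) {
            return null;
        }
        if ("catalog".equals(option)) {
            for (String name : CATALOG_CLASSES) {
                try {
                    return instantiate(name);
                } catch (ClassNotFoundException e) {
                    // This implementation is not on the class path; try the next one.
                }
            }
            throw new ClassNotFoundException("no catalog resolver found on the class path");
        }
        // Any other value is tried as a resolver class name.
        return instantiate(option);
    }

    /** Loads the named class and creates an instance through its no-argument constructor. */
    static EntityResolver instantiate(String className) throws Exception {
        Class<?> resolverClass = Class.forName(className);
        return (EntityResolver) resolverClass.getDeclaredConstructor().newInstance();
    }
}
```

A caller would pass a non-null result to XMLReader.setEntityResolver before parsing. With the 'catalog' value and no resolver.jar available, the sketch fails with a ClassNotFoundException.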

My entity resolution factory, which is contained in the Spui packages, includes software developed by the Apache Software Foundation (http://www.apache.org/).

See Oasis Open XML Catalogs for details about XML catalogs, and links to Norman Walsh's entity resolver classes and other implementations.

Writing an application for a SAX-compliant XML parser

I give a short overview of the Simple API for XML (SAX). I describe how a SAX-compliant parser and a SAX application interact, and how one should proceed to write a SAX application. The description focuses on the Python implementation of SAX. The examples are written in Python.

This document is now fairly old (winter 1998/1999). It describes SAX1 and the examples use outdated code. The general idea, however, is still valid.