XML 2001 Schema Panel Results

The teams produced the following results.

The Team Schemas

Document Type Definition Team Schemas, by The DTD Team

The DTD group modelled a speech with commentary. We tried to model a simple text document type, and came to the conclusion that text documents are not simple. We have assumed that all of the participants in this exercise can read DTDs and have not described every model in detail. Instead, below, we describe the unusual and key points demonstrated in the DTD. We can produce full DTD documentation if desired (but it will be pretty bulky).

As we model it, a speech consists of metadata followed by a body. The metadata must contain a who, may contain a date and a biblio, and may contain multiple occasions. These may occur in any order.

biblio contains one or more bibitems, and each bibitem must contain both an author and an mtitle, in any order.
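The "any order" rules for the metadata and for bibitem can be pictured as count checks that ignore order. The following is only a minimal Python sketch, not part of the DTD team's submission. The element name occasion, the exact occurrence bounds (for example, exactly one who), and the helper check_children are assumptions made for illustration.

```python
from collections import Counter

# (minimum, maximum) occurrences per child element; None means unbounded.
# The exact bounds and the name "occasion" are assumptions read off the prose.
METADATA_RULES = {
    "who": (1, 1),
    "date": (0, 1),
    "biblio": (0, 1),
    "occasion": (0, None),
}
BIBITEM_RULES = {"author": (1, 1), "mtitle": (1, 1)}


def check_children(child_names, rules):
    """Check an unordered content model: order is ignored, only counts matter."""
    counts = Counter(child_names)
    errors = [f"unexpected element {name}" for name in counts if name not in rules]
    for name, (low, high) in rules.items():
        found = counts[name]
        if found < low or (high is not None and found > high):
            errors.append(f"{name}: found {found}")
    return errors
```

Called as check_children(["occasion", "who", "date", "occasion"], METADATA_RULES), the sketch reports no errors, because only the counts are inspected and not the order.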

The body contains one or more units. Units are the natural units of the speech, somewhat like written paragraphs. A unit contains character data and all of the elements in the manner class (classes are started in the DTD, but can be expanded or redefined in the document), intermingled as needed. Also, explanations and opinions can appear at any point in the unit (including inside any elements inside the unit unless otherwise specified).

The manner class initially contains manner, but note that it has been enhanced in the internal subset of the DTD in one of the documents.

opinion is a recursively nested structure containing one or more sections, but it may not contain opinions. Note that it may contain explanation because explanation is an inclusion at the same level as opinion and is not excluded.

One special character, an ndash, has been allowed any place other characters are allowed.

The DTD is heavily commented, which we are aware some readers don't like. For their benefit we have included a copy of the complete DTD, without comments, in a comment at the end of the DTD.

Please let me know if you have any comments or questions on the DTD (bakeoff.dtd) or either of the two documents (ham.sgm and onBooks.sgm).

RELAX NG Team Schemas, by The RELAX NG Team

Here's the RELAX NG schema for the schema comparison panel. We've chosen a relatively simple example, so that the audience will get an idea of how the schema languages compare for simple tasks as well as for more complex tasks.

The problem is to write a schema for a book list. A book list is represented by a bookList element which contains zero or more book elements. A book element can contain three kinds of field: title, author and price. A book must contain exactly one title field, one or more author fields and zero or one price fields. For any book, a particular field can be represented by either an attribute or by child elements: the only restriction is that no book element can use both an attribute and a child element to represent the same field. Child elements may occur in any order. Elements must all be in the namespace "http://www.example.org"; attributes must not be namespace qualified. Author and title fields can contain arbitrary strings; price fields must contain decimal numbers. The schema is closed: no attributes and elements other than those specified above are allowed.
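To make the book list rules concrete, here is a hedged Python sketch that checks them directly with the standard xml.etree.ElementTree module. It is an illustration of the constraints, not the RELAX NG team's schema. The function names, the error messages, and the regular expression used for "decimal number" are assumptions, and stray text between elements is not checked.

```python
import re
import xml.etree.ElementTree as ET

NS = "http://www.example.org"  # namespace required for all elements
# (minimum, maximum) number of values per field; None means unbounded.
FIELDS = {"title": (1, 1), "author": (1, None), "price": (0, 1)}
# Assumed lexical form of a decimal number: optional sign, digits, optional fraction.
DECIMAL = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")


def local_name(tag):
    """Return the local name if the tag is in NS, else None."""
    prefix = "{" + NS + "}"
    return tag[len(prefix):] if tag.startswith(prefix) else None


def check_book(book):
    """Return a list of rule violations for one book element."""
    errors = []
    # Closed schema: only unqualified title/author/price attributes are allowed.
    # ElementTree spells a qualified attribute as "{uri}name", so it fails this test.
    for name in book.attrib:
        if name not in FIELDS:
            errors.append(f"unexpected attribute {name}")
    # Closed schema: only namespaced title/author/price children, holding text only.
    children = {field: [] for field in FIELDS}
    for child in book:
        field = local_name(child.tag)
        if field not in FIELDS:
            errors.append(f"unexpected element {child.tag}")
        elif len(child) or child.attrib:
            errors.append(f"{field} element must contain only text")
        else:
            children[field].append(child.text or "")
    for field, (low, high) in FIELDS.items():
        values = children[field]
        if field in book.attrib:
            if values:
                errors.append(f"{field} given as both attribute and child element")
                continue
            values = [book.attrib[field]]
        if len(values) < low or (high is not None and len(values) > high):
            errors.append(f"wrong number of {field} fields: {len(values)}")
        if field == "price":
            for value in values:
                if not DECIMAL.fullmatch(value.strip()):
                    errors.append(f"price is not a decimal number: {value}")
    return errors


def check_book_list(xml_text):
    """Return all rule violations found in a bookList document."""
    root = ET.fromstring(xml_text)
    if local_name(root.tag) != "bookList":
        return [f"root element must be bookList in {NS}"]
    errors = [f"unexpected attribute {name}" for name in root.attrib]
    for child in root:
        if local_name(child.tag) == "book":
            errors.extend(check_book(child))
        else:
            errors.append(f"unexpected element {child.tag}")
    return errors
```

One consequence of the rules shows up in the sketch: since an element cannot carry two attributes with the same name, a book that represents author as an attribute has exactly one author.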

Schematron Team Schemas, by The Schematron Team

Please find attached a Schematron schema for RDF, as our proposed schema.

We may update it slightly over the next week, but it is basically done. Sorry it is not pretty-printed.

As for test files, we have used the RDF test suite, which is not so big. (I have an open question to the RDF people about some tests: it seems that RDF requires that elements in other namespaces used in some positions not have local attributes with the same name as unqualified RDF attributes, which would be completely against the point of the namespaces spec. Hmmm.)

I wonder if the other teams would agree to accept the RDF test suite (perhaps sans the disputed errors) rather than the two sample files: we can even do a simple count of how many legit RDF documents pass and how many illegit RDF documents fail.
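The proposed count could be tallied along these lines. This is a minimal Python sketch only. The validate callable stands in for whatever schema processor a team uses and is hypothetical, as are the function name and the result keys.

```python
def score(validate, valid_docs, invalid_docs):
    """Count valid documents that are accepted and invalid ones that are rejected.

    validate is a hypothetical callable: it takes one document and returns
    True if the schema accepts it, False otherwise.
    """
    accepted = sum(1 for doc in valid_docs if validate(doc))
    rejected = sum(1 for doc in invalid_docs if not validate(doc))
    return {
        "valid_accepted": accepted,
        "valid_total": len(valid_docs),
        "invalid_rejected": rejected,
        "invalid_total": len(invalid_docs),
    }
```

A schema that accepts everything scores perfectly on the valid documents and zero on the invalid ones, which is why both counts are reported.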

The RDF team also has a "refactored" syntax document out. Other teams may find it easier to limit themselves to that. Dan Connolly fixed up an old RDF DTD I made a couple of years ago, and it might be useful as the basis for a new one. I believe that there was some kind of draft XML Schema for RDF in the Schema WG at one stage.

I think RDF is a good example because it has been around a long time, the existing DTDs etc. provide good headstarts, it has a test suite, and it is a horrible mess apparently created by people who refused to believe that validation against standard executable schema languages was useful and who therefore designed the language with no thought for how it could be integrated into the rest of the XML world.

rdf-schema.sch

XML Schema Team Schemas, by The XML Schema Team

Sorry for the delay. Herewith our schema and task specification: in brief, it's the MathML2 DTD. I realise this leaves the DTD folk with nothing to do, but they might like to critique the W3C MathML Working Group's work :-)

We chose this large example to give us and the other groups the chance to make use of existing forward conversion tools (we started with the DTD-to-Schema tool started by Dan Connolly and enhanced by Mary Holstege), and to explore some of the issues involved in defining the document types of large vocabulary applications.

Note that the test3.xml file actually has one error in it.

The Common Schemas

Document Type Definition Team: TechMemo.zip and Orderform.zip. The DTD team also provided a schema for the RELAX NG team's book list schema: Booklist.zip.
RELAX NG Team: schema-panel.zip
Schematron Team: Schematron-schemas.zip. The Schematron team also provided schemas for the RELAX NG team's book list schema (in the ZIP file).
XML Schema Team: xsd-3schemas.zip. The XML Schema team also provided a schema for the RELAX NG team's book list schema: booklist.xsd.