Document Management

Norman Walsh

MarkLogic Corporation

Introduction
Non-technical challenges
Reuse in XML documents
Validation
Managing documents
Workflow
Technical hodge-podge
Q&A

An understanding of the challenges and opportunities afforded by the promise of reusable XML documents.

Modern publishing environments demand reuse and repurposing of content to maximize its value.

What do we mean by reuse and repurpose?
Non-technical challenges
Technical challenges

Reuse is using the same content in different documents.

Write two documents that share several common figures
Write two books that share several chapters
Write two help sets that share several topics
Write two web pages that share the same boilerplate (copyrights, legal notices, etc.)

Repurposing is presenting the same content in different media.

Publish a document on US Letter and A4 paper
Publish a document in print and on the web
Publish a document in print, on the web, and as an EPUB
Publish a document in print and as an “app”
Publish a document as an iPhone app and an Android app

Some reuse involves repurposing; some repurposing involves reuse. These words don't have a strict, technical meaning.

Most writing happens in a particular context.
That context is based on expectations about the reader:
- If you're reading chapter 5, you've read chapters 1-4
- If you're reading chapter 5, it's preceded by chapters 1-4
- If you're reading the Unix guide, you're on a Unix system

The author may have other context in mind

The document is printed on paper
Figures are always on the right hand side of a spread
Procedures never break across page boundaries
The document is printed in black-and-white

etc.

Reuse and repurposing place content in new contexts
In the worst case, into contexts that are incompatible with the context in which it was written:
- “In the preceding chapter, we…”
- “As the figure on the right shows, …”
To avoid the worst case, reuse is limited by context

Do these notions of context seem coherent? Are there other notions of context (on the document management side, as distinct from the delivery side) that have been overlooked?

Discuss.

Maximizing reuse requires learning to write differently
Sometimes it requires using new tools
Sometimes it breaks established boundaries of authorship
- Writing books becomes writing topics
Sometimes it breaks established boundaries of control
- Presentation and formatting are often removed from the author's control

It may be challenging to convince authors that the necessary changes have benefits that justify the costs

Solving these problems is highly dependent on the particular circumstances.

Edicts from above?
Chocolate?

To the largest extent possible, make sure everyone who will be impacted by a project is involved in the planning and development stages, to ensure that everyone's committed to the goals.

Store the components you want to reuse in separate “files”
Write the “main” document so that it references those components
Resolve those references and process the resulting document

In the discussion that follows, we'll mostly be talking about a single composite document. If you can build one, you can build more than one with the same techniques.

Graphics, and other non-XML resources, are the easy case:

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>...</title>
</head>
<body>
...
<img src="somegraphic.png" alt="Some graphic" />
...
</body>
</html>

Or, in DocBook:

<mediaobject>
  <alt>Some graphic</alt>
  <imageobject>
    <imagedata fileref="somegraphic.png"/>
  </imageobject>
</mediaobject>

Graphics aren't properly part of the “XML content” of the document
Most XML processes don't care
Some processes (XML to PDF) will care

In the XML case, “resolving references” is a transformation:

[Figure: main document plus referenced components → one composite document]

There are roughly three ways to reuse XML:

XML entities
XInclude
Construction from stand-off markup
- DITA maps
- DocBook assemblies

Or some proprietary mechanism, which is likely to resemble one of those.

Octets (bits on disk) are interpreted as characters (based on some media type), those characters are parsed to produce some sort of a data model, and most XML tools work on that data model.

XML entities are resolved by the parser. They operate at a much lower level than other techniques:

<!DOCTYPE doc [
<!ENTITY chap2 SYSTEM "chap2body.xml">
]>
<doc>
<chapter>First chapter...</chapter>
<chapter>
&chap2;
</chapter>
</doc>

Where chap2body.xml contains (an extParsedEnt):

<para>paragraph</para>
<para>paragraph</para>

After parsing, this is the document other tools see:

<doc>
<chapter>First chapter...</chapter>
<chapter>
<para>paragraph</para>
<para>paragraph</para>
</chapter>
</doc>

XML entities:
Work with almost any parser
Are invisible to most XML processes
Are a kind of textual substitution
Require a doctype declaration and processors which read “external markup declarations”.
Apply validation to the entire, expanded document if validation is applied

Constraints on XML entities:
The document you start with must have a literal root element
Included documents cannot have their own doctype declarations
Expansion must succeed; errors are fatal
The entity you include can have multiple root nodes
Can only include whole files

XInclude processing takes place after parsing.

<doc xmlns:xi="http://www.w3.org/2001/XInclude">
<chapter>First chapter...</chapter>
<xi:include href="chap2.xml"/>
</doc>

Where chap2.xml contains:

<chapter>
<para>paragraph</para>
<para>paragraph</para>
</chapter>

After applying XInclude processing, this is the document other tools see:

<doc xmlns:xi="http://www.w3.org/2001/XInclude">
<chapter>First chapter...</chapter>
<chapter xml:base="chap2.xml">
<para>paragraph</para>
<para>paragraph</para>
</chapter>
</doc>

XInclude:
Requires an XInclude processor (or appropriate configuration option)
Is logically a transformation like any other. There's a pre-XIncluded document and a post-XIncluded document.
Operates on two or more distinct, separate documents (well, usually)
May apply validation to either the individual documents, or the composite document, or both.
- N.B. DTD validation cannot practically be applied to the composite document
Can address subsections of a file via XPointer
Is recursive: all or nothing

Constraints on XInclude:
Both the including and the included document must be well-formed
XInclude is agnostic to the presence or absence of doctype declarations

Fallback can be used to recover from resource errors:

<doc xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include href="http://example.com/tiger.svg">
  <xi:fallback>
    <xi:include href="kitten.svg">
      <xi:fallback>…</xi:fallback>
    </xi:include>
  </xi:fallback>
</xi:include>
</doc>
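
A fallback doesn't have to contain another include; it can hold ordinary content. A minimal sketch, assuming a hypothetical legal-notice.xml that may be missing at build time:

```xml
<doc xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include href="legal-notice.xml">
  <xi:fallback>
    <!-- Used only if legal-notice.xml (a hypothetical file) can't be retrieved -->
    <para>Legal notice unavailable.</para>
  </xi:fallback>
</xi:include>
</doc>
```

Fallback only covers resource errors. A file that is retrieved but turns out not to be well-formed is still a fatal error.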

One common use of XInclude (in software documentation anyway) is to include examples. Sometimes that means you want the text of the document, not its XML essence.

<programlisting language="xml">
  <xi:include href="chap2body.xml" parse="text"/>
</programlisting>
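
After XInclude processing, the included file is just character data. Assuming chap2body.xml is the two-paragraph file from the entities example, the serialized result would look roughly like this (exact whitespace depends on the file):

```xml
<programlisting language="xml">
  &lt;para&gt;paragraph&lt;/para&gt;
&lt;para&gt;paragraph&lt;/para&gt;
</programlisting>
```

The markup in the included file has become text, so downstream tools see no para elements here.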

Suppose we wanted to accurately reproduce the original entities example:

<para>paragraph</para>
<para>paragraph</para>

Where chap2.xml actually contains:

<chapter>
<para>paragraph</para>
<para>paragraph</para>
</chapter>

XPointer lets you reach into a document.

<doc xmlns:xi="http://www.w3.org/2001/XInclude">
<chapter>First chapter...</chapter>
<chapter>
  <xi:include href="chap2.xml"
              xpointer="xpath(/*/*)"/>
</chapter>
</doc>

After applying XInclude processing in this case, other tools see:

<doc xmlns:xi="http://www.w3.org/2001/XInclude">
<chapter>First chapter...</chapter>
<chapter>
  <para xml:base="chap2.xml">paragraph</para>
  <para xml:base="chap2.xml">paragraph</para>
</chapter>
</doc>

Standard schemes:

#foo or id(foo), the element with the ID “foo”
element(/1/2), the second child of the root element
element(foo/2/3), the third child of the second child of the element with the ID “foo”
xmlns(db=http://docbook.org/ns/docbook), defines a namespace for a subsequent expression

A registry of extension schemes is maintained at http://www.w3.org/2005/04/xpointer-schemes/.

The registry lists quite a few schemes, but support varies from one implementation to the next
Of them, xpath is probably the most widely supported
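
To make the schemes concrete, here is a sketch of two includes. The first uses the element() scheme against the chap2.xml shown earlier; the second combines xmlns() with xpath() against a hypothetical namespaced file, docbook-chap.xml. Whether the second one works depends on your processor.

```xml
<doc xmlns:xi="http://www.w3.org/2001/XInclude">
<chapter>
  <!-- element() scheme: the second child of the root element of chap2.xml -->
  <xi:include href="chap2.xml" xpointer="element(/1/2)"/>

  <!-- xmlns() binds the db prefix for the xpath() part that follows.
       docbook-chap.xml is a hypothetical file, and xpath() is an
       extension scheme that not every processor supports. -->
  <xi:include href="docbook-chap.xml"
              xpointer="xmlns(db=http://docbook.org/ns/docbook)
                        xpath(//db:para[1])"/>
</chapter>
</doc>
```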

DITA and DocBook provide stand-off markup for building documents from components
- DITA calls them maps
- DocBook calls them assemblies
There's no “root” document that pulls in the components
Instead, the assembly describes how the pieces are pulled together

<assembly xmlns="http://docbook.org/ns/docbook">
  <resources>
    <resource xml:id="r" fileref="rsrc.xml"/>
    …
  </resources>

  <structure xml:id="h" type="helpsystem">…
  </structure>
  <structure xml:id="b" type="book">…</structure>

  <relationships>…</relationships>
  <transforms>…</transforms>
</assembly>

<assembly xmlns="http://docbook.org/ns/docbook"
          xmlns:xlink="http://www.w3.org/1999/xlink">
  <resources>
    <resource xml:id="xidi.overview"
              fileref="xidi-overview.xml"/>
    <resource xml:id="scr.book.build"
              fileref="scr-book-build.xml"/>
    …
  </resources>

  <structure xml:id="xidi.help.system"
             type="helpsystem"
             defaultformat="helpsystem">
    <output format="pdf" file="xidi-help-system.pdf"/>
    <output format="helpsystem ohj"/>
    <filterout condition="manual.only"/>
    <title>XIDI Help System</title>
    <info>
      <abstract>
        <para>This is the help system…
        </para>
      </abstract>
    </info>
    <revhistory>
      <revision>
        <revnumber>0.1</revnumber>
        <date>1 August 2009</date>
      </revision>
    </revhistory>
    <module>
      <output file="sys-toc.html"/>
      <toc/>
      <toc role="procedures"/>
    </module>
    <module xml:id="help.xidi.overview" >
      <output file="overview.html"/>
      <title>XIDI Help System Overview</title>
      <module resourceref="help.overview.intro"
              contentonly="true" omittitles="true"/>
      <module resourceref="xidi.overview">
        <output file="ovr-xidi.html"/>
      </module>
    </module>
  </structure>

  <structure xml:id="user.guide" type="book">
    <output renderas="book"/>
    <output format="html"
            file="xidi-user-guide.html"/>
    <output format="pdf"
            file="xidi-user-guide.pdf"/>
    <title>XIDI User Guide</title>
    <toc/>
    <toc role="figures"/>
    <toc role="tables"/>
    <toc role="procedures"/>
    <module resourceref="xidi.overview"
            renderas="chapter"/>
    <module resourceref="xidi.create.intro"
            renderas="chapter"/>
  </structure>

  <relationships>
    <relationship linkend="xidi.help.system"
                  type="path">
      <association>New User Introduction</association>
      <instance linkend="help.xidi.overview"/>
      <instance linkend="help.svn.overview"/>
      <instance linkend="help.ex.new.help.sys"/>
    </relationship>

    <relationship type="collection">
      <association>Advanced User Topics</association>
      <instance linkend="xidi.parameters.syntax"/>
      <instance linkend="svn.properties"/>
    </relationship>
  </relationships>

  <transforms>
    <transform grammar="dita"
               fileref="dita2docbook.xsl"/>
    <transform name="tutorial"
               fileref="docbook2tutorial.xsl"/>
  </transforms>
</assembly>
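
For comparison, a DITA map plays roughly the same role as a structure in an assembly. This is only a sketch: the .dita file names are hypothetical stand-ins for the resources above, and a real map would normally carry a doctype declaration.

```xml
<map>
  <title>XIDI User Guide</title>
  <!-- Each topicref pulls in one topic; the file names are hypothetical -->
  <topicref href="xidi-overview.dita"/>
  <topicref href="xidi-create-intro.dita"/>
</map>
```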

In most cases, in order for partners to exchange documents, both partners must understand all of the markup in the exchanged documents.
In other words, I can't usefully exchange DocBook with someone expecting TEI.
Blind interchange describes the situation where partners exchange documents without prior knowledge of all the markup in them
It requires adhering to a set of constraints that allow one element to be a “subtype” of another with the guarantee that processing the subtype like its “supertype” will do something useful
It is a feature of DITA
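
In DITA, the subtype relationship is recorded in the class attribute, which lists an element's ancestry from most general to most specific. A small sketch (the class values are normally defaulted by the DTD or schema rather than typed by authors):

```xml
<!-- A generic list item from the base topic vocabulary -->
<li class="- topic/li ">Press the button.</li>

<!-- A task step is a specialization of li. A processor that has never
     heard of task/step can still treat it as topic/li. -->
<step class="- topic/li task/step ">
  <cmd class="- topic/ph task/cmd ">Press the button.</cmd>
</step>
```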

Most processes, especially in publishing, are transformative: XML to HTML, XML to PDF, XML to EPUB, etc.
Those transformations are written by people who believe they understand the structure of the documents to be transformed
If the structure differs from expectations, the results will be ugly at best, catastrophically misleading at worst
The more complex the process, the more important it is to understand the incoming markup
Validation is the easiest way to catch markup errors

Ideally, while you're typing your documents
Absolutely, before you do anything else with them!

There are three significant grammar-based schema technologies:

Document Type Definitions (DTDs)
W3C XML Schemas
RELAX NG grammars

There are other, non-grammar-based technologies, of which Schematron is probably the best known.

DTDs:
Widely available (supported by almost all tools)
Normatively part of the XML specification
- But validation is optional
Not written in XML-document syntax
- Poor support for documentation
- Not usable in some environments
Supports entities (a text-based macro language)
Not namespace aware
Very limited data type support

W3C XML Schema:
Supported by many tools
Also developed at the W3C
Written in XML-document syntax
Namespace aware
Extensive but not extensible data type support
Hierarchical data types (typed object graphs)
Grammars must be unambiguous

RELAX NG:
Supported by some tools
Developed at OASIS
Written in XML-document syntax
- With a very popular, official compact (non-XML) syntax
Namespace aware
Supports all the XML Schema data types, plus is extensible
Grammars may be ambiguous
No obvious support for typed object graphs

<doc xmlns="http://www.xmlsummerschool.com/example/ns"
     status="draft">
<head>
  <title>A Sample Document</title>
  <date>2011-09-22T09:00:00+01:00</date>
  <author>Norman Walsh</author>
</head>
<body>
  <p>Paragraph. <em>Important</em> paragraph.</p>
  <p>Paragraph.<fn><p>Redundant, ain't he?</p>
  </fn></p>
</body>
</doc>

What makes one of our documents one of ours and not something else? When is a purchase order not a cocktail recipe?

A doc consists of a head and a body, in that order
A head contains a title, date, and author, in any order
A body only contains p elements
A p contains text, em, or fn elements mixed together

The “rules” about a document exist in a spectrum from simple, structural rules all the way to business process/workflow rules.

Paragraphs in footnotes can't themselves have footnotes
Dates have to be real (ISO 8601) dates
Dates have to be expressed in UTC
Documents can have at most four footnotes
Documents with the status “final” can only be published on Thursdays
Author names have to be in the master author database
Documents can have at most four footnotes per page

<!ELEMENT doc (head, body)>
<!-- Documentation, what documentation? -->

<!ATTLIST doc
          xmlns   CDATA          #FIXED
            "http://www.xmlsummerschool.com/example/ns"
          status  (draft|final)  #IMPLIED>

```
<!ELEMENT head (title, date, author)>
```
- Forces a fixed order, but the rule says “in any order”.
```
<!ELEMENT head (title & date & author)>
```
- The & connector is SGML; it is not allowed in XML DTDs.
```
<!ELEMENT head (title | date | author)>
```
- Allows exactly one of title, date, or author.
```
<!ELEMENT head (title | date | author)+>
```
- Allows multiple titles, dates, and authors; doesn't require one of each.

<!ELEMENT title (#PCDATA)*>
<!ELEMENT date (#PCDATA)*>
<!ELEMENT author (#PCDATA)*>

Allows any string as a date

```
<!ELEMENT body (p+)>
```

```
<!ELEMENT p (#PCDATA|em|fn)*>
```
```
<!ELEMENT em (#PCDATA|em|fn)*>
```
```
<!ELEMENT fn (p+)>
```
But is this really sufficient?

<p>This is some text.
<fn><p>With footnote text.
    <fn><p>Which is also text.</p></fn></p>
</fn>
Is that what we intended?</p>

There's nothing in DTDs to exclude nesting.

XML Schemas are XML documents, so they have to have a root element.

<schema xmlns="http://www.w3.org/2001/XMLSchema"
 xmlns:d="http://www.xmlsummerschool.com/example/ns"
 elementFormDefault="qualified"
 targetNamespace="http://www.xmlsummerschool.com/example/ns">

<annotation>
  <documentation>
    <p xmlns="http://www.w3.org/1999/xhtml">
      This is documentation.
    </p>
  </documentation>
</annotation>

<!-- declarations go here -->
</schema>

<complexType name="Document">
  <sequence>
    <element name="head" type="d:Head"/>
    <element name="body" type="d:Body"/>
  </sequence>
  <attribute name="status" type="d:Status"/>
</complexType>
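
The complex type on its own doesn't declare any element. Presumably the schema also needs a global element declaration along these lines (the slides don't show one, so treat this as an assumption):

```xml
<!-- Binds the root element name to the Document type defined above;
     this goes inside the schema element, alongside the type definitions -->
<element name="doc" type="d:Document"/>
```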

<simpleType name="Status">
  <restriction base="string">
    <enumeration value="draft"/>
    <enumeration value="final"/>
  </restriction>
</simpleType>

<complexType name="Head">
  <all>
    <element name="title" type="string"/>
    <element name="date" type="dateTime"/>
    <element name="author" type="string"/>
  </all>
</complexType>

<complexType name="Body">
  <sequence minOccurs="0" maxOccurs="unbounded">
    <element ref="d:p"/>
  </sequence>
</complexType>

<element name="p">
  <complexType mixed="true">
    <choice minOccurs="0" maxOccurs="unbounded">
      <element ref="d:em"/>
      <element ref="d:fn"/>
    </choice>
  </complexType>
</element>

<element name="em">
  <complexType mixed="true">
    <choice minOccurs="0" maxOccurs="unbounded">
      <element ref="d:em"/>
      <element ref="d:fn"/>
    </choice>
  </complexType>
</element>

<element name="fn">
  <complexType mixed="true">
    <choice minOccurs="1" maxOccurs="unbounded">
      <element name="p">
        <complexType mixed="true">
          <choice minOccurs="0" maxOccurs="unbounded">
            <element ref="d:em"/>
          </choice>
        </complexType>
      </element>
    </choice>
  </complexType>
</element>

(Anyone see the bug?)

RELAX NG grammars are XML documents, so they have to have a root element.

<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
 ns="http://www.xmlsummerschool.com/example/ns"
 datatypeLibrary
   ="http://www.w3.org/2001/XMLSchema-datatypes">
<div>
  <p xmlns="http://www.w3.org/1999/xhtml">This
  is some documentation.
  The div wrapper is just for grouping.
  </p>

  <start>
    <ref name="doc"/>
  </start>
</div>

<!-- declarations go here -->
</grammar>

<define name="doc">
  <element name="doc">
    <attribute name="status">
      <choice>
        <value>draft</value>
        <value>final</value>
      </choice>
    </attribute>
    <group>
      <ref name="head"/>
      <ref name="body"/>
    </group>
  </element>
</define>

<define name="head">
  <element name="head">
    <interleave>
      <ref name="date"/>
      <ref name="title"/>
      <ref name="author"/>
    </interleave>
  </element>
</define>

<define name="date">
  <element name="date">
    <data type="dateTime"/>
  </element>
</define>
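
The remaining definitions follow the same pattern. As a sketch, here are title, body, and p in the XML syntax, mirroring the compact syntax version; author, em, fn, and the limited variants are left out:

```xml
<define name="title">
  <element name="title">
    <text/>
  </element>
</define>

<define name="body">
  <element name="body">
    <oneOrMore>
      <ref name="p"/>
    </oneOrMore>
  </element>
</define>

<define name="p">
  <element name="p">
    <!-- Mixed content: text, em, and fn in any order, any number of times -->
    <zeroOrMore>
      <choice>
        <text/>
        <ref name="em"/>
        <ref name="fn"/>
      </choice>
    </zeroOrMore>
  </element>
</define>
```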

One of the appealing features of RELAX NG is its compact syntax.

default namespace
  = "http://www.xmlsummerschool.com/example/ns"
namespace h = "http://www.w3.org/1999/xhtml"

[
   h:p [ "This is some documentation. The div"
         " wrapper is just for grouping." ]
]
div {
   start = doc
}

doc =
   element doc {
      attribute status { "draft" | "final" },
      (head, body)
   }

head =
   element head {
      (date & title & author)
   }

date   = element date   { xsd:dateTime }
title  = element title  { text }
author = element author { text }

body =
   element body {
      (p+)
   }

p =
   element p {
      (text | em | fn)*
   }

em =
   element em {
      (text | em | fn)*
   }

fn =
   element fn {
      (limitedp+)
   }

limitedp =
   element p {
      (text | limitedem)*
   }

limitedem =
   element em {
      (text | limitedem)*
   }

Schematron can be used to evaluate extra-grammatical constraints
Essentially arbitrary XPath expressions are evaluated in the context of appropriate elements
Schematron rules can be embedded in RELAX NG and XML Schema documents for convenience
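
As a sketch of what embedding looks like, here is the RELAX NG date definition with the UTC rule attached. Tool support for extracting and running embedded rules varies, and the ex and xs prefixes are assumed to be bound elsewhere.

```xml
<define name="date"
        xmlns="http://relaxng.org/ns/structure/1.0"
        xmlns:s="http://purl.oclc.org/dsdl/schematron">
  <element name="date">
    <!-- RELAX NG ignores elements from other namespaces, so the rule
         rides along as an annotation. The ex prefix is assumed to be
         bound by an s:ns element elsewhere in the grammar. -->
    <s:pattern>
      <s:rule context="ex:date">
        <s:assert
     test="timezone-from-dateTime(xs:dateTime(.))
             = xs:dayTimeDuration('PT0H')"
        >Dates must be expressed in UTC.</s:assert>
      </s:rule>
    </s:pattern>
    <data type="dateTime"/>
  </element>
</define>
```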

Recall our earlier constraints:

Dates have to be expressed in UTC
Documents can have at most four footnotes

<s:schema
 xmlns:s="http://purl.oclc.org/dsdl/schematron">
 <s:ns prefix="ex"
    uri="http://www.xmlsummerschool.com/example/ns"/>
…

Dates in UTC:

…
   <s:pattern name="Dates in UTC">
     <s:rule context="ex:date">
       <s:assert
    test="timezone-from-dateTime(xs:dateTime(.))
            = xs:dayTimeDuration('PT0H')"
       >Dates must be expressed in UTC.</s:assert>
     </s:rule>
   </s:pattern>
…

At most four footnotes.

…
   <s:pattern name="At most four footnotes">
      <s:rule context="/*">
         <s:assert test="count(//ex:fn) &lt;= 4"
         >At most four footnotes are allowed.
         </s:assert>
      </s:rule>
   </s:pattern>
…
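
The same approach covers the footnote nesting rule that the DTD couldn't express. A sketch, meant to sit in the same schema as the patterns above:

```xml
<s:pattern xmlns:s="http://purl.oclc.org/dsdl/schematron">
  <s:rule context="ex:fn">
    <!-- ex is the prefix bound by the s:ns element at the top of the schema -->
    <s:assert test="not(.//ex:fn)"
    >Footnotes must not contain footnotes.</s:assert>
  </s:rule>
</s:pattern>
```

Because the test looks at all descendants, it also catches a footnote hidden inside an em inside a footnote paragraph.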

Consider this document:

<doc xmlns="http://www.xmlsummerschool.com/example/ns"
     xmlns:xi="http://www.w3.org/2001/XInclude"
     status="draft">
<head>
  <title>A Sample Document</title>
  <date>2011-09-22T09:00:00+01:00</date>
  <author>Norman Walsh</author>
</head>
<xi:include href="body.xml"/>
</doc>

Is it valid?

Before XInclude processing?
After XInclude processing?
Both before and after?
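
Whether the pre-XInclude form is valid depends on whether the schema knows about xi:include. One way to allow both forms, sketched in RELAX NG on the assumption that body is defined as in the earlier grammar (the pattern name body.or.include is made up for this example):

```xml
<define name="body.or.include"
        xmlns="http://relaxng.org/ns/structure/1.0"
        xmlns:xi="http://www.w3.org/2001/XInclude">
  <choice>
    <!-- After XInclude processing: a real body element -->
    <ref name="body"/>
    <!-- Before XInclude processing: a reference to one -->
    <element name="xi:include">
      <attribute name="href"/>
    </element>
  </choice>
</define>
```

The doc pattern would then reference body.or.include instead of body. The cost is that the schema can no longer tell you whether XInclude processing has actually happened.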

A single book might consist of a few dozen “resources”
A set of books might consist of a few hundred resources
The documentation for three products across 14 languages and six configurations might consist of many thousands of resources

Options for managing resources:
Filesystem
Source code control system
Database
Content management system

Filesystem:
Conceptually easy and familiar
Search, backup, etc. all work exactly like the other files on your system
Versioning, locking, conflict resolution all absent

Familiar to programmers, source code control systems provide a layer of versioning, locking, and conflict resolution on top of the filesystem
Examples: Subversion is centralized; Mercurial and Git are decentralized.
Works mostly like the filesystem

Databases provide a whole new range of capabilities: indexing, searching, etc.
Not generally like a filesystem, may require new practices
Traditional relational databases are not a good fit for XML. Just. Don't. Go. There.
XML and (some) NoSQL databases are a better fit.
MarkLogic, ahem, makes an excellent database for XML.

Content management systems:
Usually built on top of a database
Provide yet more features for management and workflow
Often provide features for designing and implementing management workflows (for example, no document can be published until QA has signed off on it)

Part of your management system
- For example, MarkLogic Document Library Services & Content Processing Framework
- RSuite or another CMS
Make or Ant
XSLT or XQuery
XProc

Make:
Traditional Unix, text-based tool
Filesystem based
Tracks dependencies and keeps things “up-to-date”
Drives command-line tools

all: webtech.html

webtech.html: webtech.inc dbstyle.xsl \
              graphics/figure1.png
	$(XSLT) $< dbstyle.xsl $@

webtech.inc: webtech.xml
	$(XINCLUDE) < $< > $@

Ant:
Java and XML based tool
Filesystem based
Allows authors to build flow graphs
Drives Java or command-line tools; extensible in Java

<project name="example" default="pubdoc" basedir=".">
  <description>An example ant file</description>

  <property name="build.dir" value="output"/>

  <target name="init">
    <mkdir dir="${build.dir}"/>
  </target>

  <target name="pubdoc" depends="init,xinclude">
    <xslt in="webtech.inc" style="dbstyle.xsl"
          out="${build.dir}/webtech.html"/>
  </target>

  <target name="xinclude">
    <xslt in="webtech.xml" style="xinclude.xsl"
          out="webtech.inc"/>
  </target>
</project>

XSLT or XQuery:
XML technologies
Capable of nearly arbitrary transformation
Built into some databases and content management systems

XProc:
XML based, designed for XML processing
Allows authors to write simple, mostly declarative pipelines with a rich, and extensible, vocabulary of steps

<p:pipeline xmlns:p="http://www.w3.org/ns/xproc"
            version="1.0">

<p:xinclude/>

<p:xslt>
  <p:input port="stylesheet">
    <p:document href="dbstyle.xsl"/>
  </p:input>
</p:xslt>

</p:pipeline>
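
Since validation should happen before anything else, a pipeline is a natural place to enforce it. A sketch that validates the post-XInclude document before transforming it; docbook.rng is a hypothetical schema file name, and p:validate-with-relax-ng is an optional step that a given processor may not implement:

```xml
<p:pipeline xmlns:p="http://www.w3.org/ns/xproc"
            version="1.0">

<p:xinclude/>

<!-- Fails the pipeline if the composite document is invalid -->
<p:validate-with-relax-ng>
  <p:input port="schema">
    <p:document href="docbook.rng"/>
  </p:input>
</p:validate-with-relax-ng>

<p:xslt>
  <p:input port="stylesheet">
    <p:document href="dbstyle.xsl"/>
  </p:input>
</p:xslt>

</p:pipeline>
```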

Entities and other resources are accessed via URIs
Proxies and resolvers can intercede
Most resolvers use XML Catalogs

<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"
         prefer="public">

  <system systemId
="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
          uri="/share/doctypes/xhtml1-strict.dtd"/>

  <system systemId
="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
          uri="/share/doctypes/xhtml1-transitional.dtd"/>

</catalog>
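
Catalogs can also match by prefix and can redirect resources that aren't entities. A sketch, in which the stylesheet URI and the local paths are hypothetical:

```xml
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"
         prefer="public">

  <!-- One rule for every system identifier under the XHTML 1.0 DTD directory -->
  <rewriteSystem
      systemIdStartString="http://www.w3.org/TR/xhtml1/DTD/"
      rewritePrefix="/share/doctypes/"/>

  <!-- uri entries redirect things that aren't entities, such as a
       stylesheet; both the name and the local path are hypothetical -->
  <uri name="http://example.com/xsl/dbstyle.xsl"
       uri="/share/xsl/dbstyle.xsl"/>

</catalog>
```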

XML namespaces provide a global naming mechanism
This facilitates the mixing of different vocabularies:
- MathML and SVG in DocBook
- XInclude in TEI
- Recipe markup in a purchase order
Generally speaking, this requires tools to understand the mixture

How do you validate documents that use multiple namespaces?
One approach is to include the mixtures in the schema: the DocBook 5.0 schema knows that MathML can occur in equations, for example
NVDL, the Namespace-based Validation Dispatching Language, is another approach
An NVDL document describes how to decompose a mixed document into individual documents that can be validated independently

<rules xmlns="http://purl.oclc.org/dsdl/nvdl/ns/structure/1.0"
       startMode="docbook">

<mode name="docbook">
  <namespace ns="http://docbook.org/ns/docbook">
    <validate schema="rng/docbook.rng"
              useMode="attach"/>
    <validate schema="sch/docbook.sch"
              useMode="attach"/>
  </namespace>
</mode>

<mode name="attach">
  <anyNamespace>
    <attach/>
  </anyNamespace>
</mode>

</rules>

The floor is yours...

Publishing

22 September 2011

Agenda

Introduction

Learning Objectives

Some reuse scenarios

Some repurposing scenarios

Non-technical challenges

Understanding context

More context

Context impacts reuse

Interlude

Authors are revolting

Solutions to the non-technical challenges

Reuse in XML documents

The mechanics of reuse

Reusing graphics

Why are graphics easy?

Reusing XML

XML reuse techniques

Aside: How XML tools work

Reuse with XML entities

XML entities after parsing

XML entities

XML entity constraints

Reuse with XInclude

After XInclude

XInclude

XInclude constraints

XInclude Fallback

XIncluding plain text

XInclude challenge

XInclude + XPointer

After XInclude

XPointer schemes

Reuse with stand-off markup

DocBook assembly markup

DocBook assembly example

Blind interchange

Validation

Why validate?

When do you validate?

Schema Languages

DTDs

W3C XML Schema

RELAX NG

A Document Example

But what are the rules?

More rules

DTD Rules

DTD Rules (continued)

DTD Rules (continued)

XML Schema Rules

XML Schema Rules (continued)

XML Schema Rules (continued)

XML Schema Rules (continued)

XML Schema Rules (continued)

XML Schema Rules (continued)

RELAX NG Rules

RELAX NG Rules (continued)

RELAX NG Rules (continued)

RELAX NG Rules (compact syntax)

RELAX NG Rules (compact syntax, continued)

RELAX NG Rules (compact syntax, continued)

RELAX NG Rules (compact syntax, continued)

Schematron

Schematron example

Schematron example (continued)

Schematron example (continued)

Revisiting XInclude

Managing documents

What's to manage?

Options

Filesystem management

SCCS management

Database management

Content management system