Copyright © 1999 W3C® (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
XML Schema: Structures is part 1 of a two-part draft of the specification for the XML Schema definition language. This document proposes facilities for describing the structure and constraining the contents of XML 1.0 documents. The schema language, which is itself represented in XML 1.0, provides a superset of the capabilities found in XML 1.0 document type definitions (DTDs).
This is a W3C Working Draft for review by members of the W3C and other interested parties in the general public.
It has been reviewed by the XML Schema Working Group and the Working Group has agreed to its publication. Note that not all sections of the draft represent the current consensus of the WG. Different sections of the specification may well command different levels of consensus in the WG. Public comments on this draft will be instrumental in the WG's deliberations.
Please review and send comments to www-xml-schema-comments@w3.org (archive).
This draft incorporates only minor changes from the previous version, mostly in the area of content model features: see Rich Content Models (§3.4.6), Mixed Content (§3.4.7) and Element Declaration (§3.4.9).
Three major components of this document are marked below as out-of-date and/or under construction: major efforts by task forces from within the WG are still underway with respect to these, and their reports are linked from this draft. We felt it was important to present this work to the public, in keeping with our obligation to produce drafts for public inspection and comment on a regular basis, despite the "Under Construction" signs posted below.
Sections which are not the status quo, that is, those on which the working group has not yet reached consensus, are marked with an asterisk (*) at the end of the section title. But please note that all the facilities described herein are in a preliminary state of design. The Working Group anticipates substantial changes, both in the mechanisms described herein, and in additional functions yet to be described. The present version should not be implemented except as a check on the design and to allow experimentation with alternative designs. The Schema WG will not allow early implementation to constrain its ability to make changes to this specification prior to final release.
A list of current W3C working drafts can be found at http://www.w3.org/TR/. They may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress".
This document sets out the structural part (XML Schema: Structures) of the XML Schema definition language.
Chapter 2 presents a Conceptual Framework (§2) for XML Schema: Structures, including an introduction to schema constraints, types, schema composition, and symbol spaces. The abstract and concrete syntax of XML Schema: Structures are introduced, along with other terminology used throughout the specification.
Chapter 3, Schema Definitions and Declarations (§3), reconstructs the core functionality of XML 1.0, plus a number of extensions, in line with our stated requirements [XML Schema Requirements]. This chapter discusses the declaration and use of datatypes, archetypes, elements, content models, attributes, attribute groups, model groups, refinement, entities and notations.
Chapter 4 presents Schema Composition and Namespaces * (§4), including the validation of namespace qualified instance documents, import, inclusion and export of declarations and definitions, schema paths, access to schemas, and related rules for schema-based validity.
Chapter 5 is a placeholder for Documenting schemas * (§5), which will eventually provide a standardized means for including documentation in the definition of a schema.
Chapter 6 discusses Conformance -- OUT OF DATE * (§6), including the rules by which instance documents are validated, and responsibilities of schema-aware processors.
The normative addenda include a (normative) DTD for Schemas * (§B) and a (normative) Schema for Schemas * (§A), which is an XML Schema schema for XML Schema: Structures, a Glossary (normative) * (§C) [not yet written] and References (normative) * (§D). Non-normative appendixes include a Sample Schema (non-normative) * (§G) and Acknowledgments (non-normative) * (§F).
This Working Draft document was produced using an [XML] DTD and an [XSLT] stylesheet.
The following highlighting is used to present technical material in this document:
[Definition:] A term is something we use a lot.
<-- Category: sample-concrete-syntax-paradigm -->
<example
attribute = NMTOKEN
required-attribute = ID>
<-- Content: (daughter1 , daughter2*) -->
</example>
Example
A non-normative example illustrating use of the schema language, or a related instance.
<schema name='http://www.muzmo.com/XMLSchema/1.0/mySchema' >
And an explanation of the example.
The following highlighting is used for non-normative commentary in this document:
Issue (dummy): A recorded issue.
Ed. Note: Notes shared among the editorial team.
NOTE: General comments directed to all readers.
The purpose of XML Schema: Structures is to provide an inventory of XML markup constructs with which to write schemas.
The purpose of an XML Schema: Structures schema is to define and describe a class of XML documents by using these constructs to constrain and document the meaning, usage and relationships of their constituent parts: datatypes, elements and their content, attributes and their values, entities and their contents, and notations. Schema constructs may also provide for the specification of additional information such as default values. Schemas are intended to document their own meaning, usage, and function through a common documentation vocabulary. Thus, XML Schema: Structures can be used to define, describe and catalogue XML vocabularies for classes of XML documents.
Any application that consumes well-formed XML can use the XML Schema: Structures formalism to express syntactic, structural and value constraints applicable to its document instances. The XML Schema: Structures formalism will allow a useful level of constraint checking to be described and validated for a wide spectrum of XML applications. However, the language defined by this specification does not attempt to provide all the facilities that might be needed by any application. Some applications may require constraint capabilities not expressible in this language, and so may need to perform their own additional validations.
The definition of XML Schema: Structures is a part of the W3C XML Activity. It is in various ways related to other ongoing parts of that Activity and to other W3C WGs.
Ed. Note: Need to reference Cambridge Communique as soon as it's published.
XML Schema: Structures defines its own Information Set Contributions.
XML Schema: Structures will have requirements for subsequent Information Set Working Drafts.
The terminology used to describe XML Schema: Structures is defined in the body of this specification. The terms defined in the following list are used in building those definitions and in describing the actions of XML Schema: Structures processors:
This specification uses a number of terms that are common to many of the fields of endeavor that have influenced the development of XML Schema. Unfortunately, it is often the case that these terms do not have the same definitions in all of those fields. This section attempts to provide definitions of terms as they are used to describe the conceptual framework, and the remainder of the specification.
Since XML schemas are themselves specified as XML documents or elements within documents, it is useful to clarify the relationships between certain kinds of XML documents and elements:
Note that it is possible to specify a schema to which schemas themselves must conform, and this is given in (normative) Schema for Schemas * (§A). An XML 1.0 DTD to which schemas must conform is also provided in (normative) DTD for Schemas * (§B).
Any schema is ipso facto an element information item. It follows that the rules specified herein for validity apply to all of the following kinds of XML element information items:
Likewise, rules for schemas in general apply to the particular schema for schemas, which is an instance conforming to itself.
The [XML] specification describes two kinds of constraints on XML documents: well-formedness and validity constraints. Informally, the well-formedness constraints are those imposed by the definition of XML itself (such as the rules for the use of the < and > characters and the rules for proper nesting of elements), while validity constraints are the further constraints on document structure provided by a particular DTD.
Three kinds of normative statements about the impact of XML Schema: Structures components on instances are distinguished in this specification:
NOTE: Schema Information Set Contributions are not as new as might at first appear: XML 1.0 validation augments the XML 1.0 information set in similar ways, e.g. by providing values for attributes not present in instances, and by implicitly exploiting type information for normalization or access, e.g. consider the effect of NMTOKENS on attribute whitespace, and the semantics of ID and IDREF. By including Schema Information Set Contributions, we are trying to make explicit something XML 1.0 left implicit.
XML Schema: Structures not only reconstructs the DTD constraints of XML 1.0 using XML instance syntax, it also adds the ability to define new kinds of constraints. For example, although the author of an XML 1.0 DTD may declare an element type as containing character data, elements, or mixed content, there is no mechanism with which to constrain the contents of elements to only character data of a particular form, such as only integers in a specified range.
This specification supports the expression of just such constraints by including in the mechanism for the declaration of elements the option of specifying that its contents must consist of a valid string expression of a particular datatype. A number of other mechanisms are added which improve the expressive power, usability and maintainability of schemas as a means of defining the structure of XML documents.
The purpose of a schema is to identify a set of components for use in XML documents and to provide the rules for their correct combination.
The schema language is itself a set of elements and attributes. We will describe these, and show how they are used. But first, a quick example of an XML document.
Example
<?xml version='1.0'?>
<PurchaseOrder orderDate="1999-05-20">
  <shipTo type="US">
    <name>Alice Smith</name>
    <street>123 Maple Street</street>
    <city>Mill Valley</city>
    <state>CA</state>
    <zip>90952</zip>
  </shipTo>
  <shipDate>1999-05-25</shipDate>
  <comment>Get these things to me in a hurry, my lawn is going wild!</comment>
  <Items>
    <Item pno="333-333">
      <productName>Lawnmower, model BUZZ-1</productName>
      <quantity>1</quantity>
      <price>148.95</price>
      <comment>Please confirm this is the electric model</comment>
    </Item>
    <Item pno="444-444">
      <productName>Baby Monitor, model SNOOZE-2</productName>
      <quantity>1</quantity>
      <price>39.98</price>
    </Item>
  </Items>
</PurchaseOrder>
The purchase order consists of a main element with several subordinate elements. Most of the subelements have simple atomic types such as string or date, drawn from the repertoire of built-in datatypes defined in [XML Schemas: Datatypes], but some are complex. We use the archetype element when declaring elements which allow elements in their content and/or may carry attributes. For example, we can define an archetype called Address as follows:
Example
<archetype name="Address" >
  <element name="name" type="string" />
  <element name="street" type="string" />
  <element name="city" type="string" />
  <element name="state" type="string" />
  <element name="zip" type="number" />
  <attribute name="type" type="string" />
</archetype>
The consequence of this definition is that an element whose type is declared to be Address must consist of five elements and may have one attribute. Though each has a distinct name, four of the elements and the attribute will simply contain a string in a document instance while one will contain a number.
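As an informal illustration of what this definition asks of an instance, the following Python sketch checks a shipTo element against the Address archetype as described above. It is not part of the schema language. The function name `conforms_to_address` and the two constant tables are assumptions made for this illustration, the children are assumed to appear in declaration order, datatype checking of their contents is omitted, and the standard library's `xml.etree.ElementTree` merely stands in for a schema-aware processor.

```python
import xml.etree.ElementTree as ET

# Child element names declared by the Address archetype, in declaration order.
ADDRESS_CHILDREN = ["name", "street", "city", "state", "zip"]
# Attributes the archetype declares (optional, per the text above).
ADDRESS_ATTRIBUTES = {"type"}


def conforms_to_address(element):
    """Return True if `element` has exactly the five Address children, in
    order, and carries no attribute other than those the archetype declares."""
    child_names = [child.tag for child in element]
    if child_names != ADDRESS_CHILDREN:
        return False
    return set(element.attrib) <= ADDRESS_ATTRIBUTES


ship_to = ET.fromstring(
    '<shipTo type="US">'
    "<name>Alice Smith</name><street>123 Maple Street</street>"
    "<city>Mill Valley</city><state>CA</state><zip>90952</zip>"
    "</shipTo>"
)
incomplete = ET.fromstring("<shipTo><name>Alice Smith</name></shipTo>")

print(conforms_to_address(ship_to))     # True
print(conforms_to_address(incomplete))  # False
```

The structural check (which children, which attributes) is kept apart from any check on the character data itself, mirroring the division of labour between archetypes and datatypes.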
If we're going to use the same element in a number of places, we can declare it once and refer to it by name elsewhere:
Example
<element name="comment" type="string" />
This declaration restricts the comment element to text content and no attributes.
We can define a PurchaseOrderType for our PurchaseOrder element, referring to the definitions of Address and comment as above, as:
Example
<archetype name="PurchaseOrderType">
  <element name="shipTo" type="Address" />
  <element name="shipDate" type="date" />
  <element ref="comment" minOccurs='0' />
  <element name="Items" type="Items" />
  <attribute name="orderDate" type="date" />
</archetype>
The shipDate element daughter of PurchaseOrderType is declared above as having an atomic type, as in the Address example above. The comment daughter is declared by reference to a global element declaration. Similarly, the shipTo and Items daughters are declared as having complex types which must be defined elsewhere in the current schema. The comment daughter and the orderDate attribute are optional; the others are obligatory.
Issue (type-decl-syntax): Further integration of the concrete syntax for type definitions is desirable, e.g. by using 'type' for both archetypes and datatypes, but the details of a consistent and clear way to do this have not yet been agreed.
Since an element declaration's type can identify either a datatype or an archetype, and there are separate symbol spaces for these two, the possibility of ambiguity arises. This is resolved in favour of the archetype, e.g. even if a datatype called Address existed (either built-in or user-defined), the above declaration for shipTo would refer to the user-defined archetype of that name.
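The resolution rule just described can be pictured as a lookup that consults the archetype symbol space before the datatype symbol space. The following Python sketch is illustrative only: the two dictionaries, their contents and the function name `resolve_type` are assumptions, and the datatype named Address is the hypothetical clash mentioned in the text, not something defined by this specification.

```python
# Two separate symbol spaces: one for archetypes, one for datatypes.
archetypes = {"Address": "user-defined archetype Address"}
datatypes = {
    "string": "built-in datatype string",
    "date": "built-in datatype date",
    # Hypothetical datatype that happens to share the archetype's name.
    "Address": "datatype Address",
}


def resolve_type(name):
    """Resolve the type named by an element declaration.

    The archetype symbol space is consulted first, so an archetype
    shadows any datatype of the same name.
    """
    if name in archetypes:
        return archetypes[name]
    if name in datatypes:
        return datatypes[name]
    raise KeyError(f"no archetype or datatype named {name!r}")


print(resolve_type("Address"))  # user-defined archetype Address
print(resolve_type("date"))     # built-in datatype date
```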
Issue (note-two-sses): The separation of the datatype and archetype name symbol spaces is primarily motivated by the decision to allow unqualified reference to the ab initio and built-in datatypes. Should this decision be reversed, as was suggested in the report of the simplification Task Force, then the unification of the two symbol spaces could proceed with minimal negative impact. The potential for error which arises from unexpected shadowing of an old datatype by a new archetype would be removed.
[Definition:] A definition creates a new archetype or datatype; [Definition:] a declaration enables the appearance in a document instance of an element or attribute with a specific name and type. In the schema, we see both the definition of several types, and also several elements and attributes declared as usages of these types. For example, Address is defined to be an archetype, while within the definition of Address we see five declarations of elements and one attribute declaration. These declarations are not themselves types, but rather an association between a name and constraints which govern the appearance of that name in documents governed by the containing schema.
In the case of attribute declarations, the constraints are on the allowed value, always by reference to a datatype:
Example
<attribute name="orderDate" type="date" />
In the case of element declarations, the constraints are on the allowed content and attributes, by reference to an archetype or a datatype (in which case no attributes are allowed):
Example
<element name="shipTo" type="Address" />
<element name="comment" type="string" />
Because Address is defined in the schema to have certain elements as its content and to allow a certain attribute, any shipTo element appearing in an instance must include those elements and may have that attribute, while any comment element may have any text content but may not have any attributes.
As well as naming a datatype or archetype in an attribute or element declaration, we can embed the type definition immediately within the element declaration:
Example
<archetype name='Items'>
  <element name='Item' minOccurs='0' maxOccurs='*'>
    <archetype>
      <element name='productName' type='string' />
      <element name='quantity' type='integer'>
        <minExclusive>0</minExclusive>
      </element>
      <element name='price' type='number' />
      <element ref='comment' minOccurs='0' />
    </archetype>
  </element>
</archetype>
Here not only is the archetype of the Item element given inline, but the datatype referenced by its quantity daughter (the built-in integer datatype) is also qualified inline by adding a subrange constraint.
Taken together, the examples above constitute a complete schema for the initial PurchaseOrder example instance. They are drawn together in a single complete schema in Sample Schema (non-normative) * (§G).
The next chapter, Schema Definitions and Declarations (§3), sets out the XML Schema: Structures approach to schemas and formal definitions of their component parts. Here we informally summarize the key constructs used in defining schemas. A 'Yes' in the 'Name appears in instances?' column indicates that the name will appear in instances; other names are for schema use only.
XML Schema: Structures Feature | Purpose | Named? | Name appears in instances? |
---|---|---|---|
The Schema (§3.1) | A wrapper element containing all the definitions and declarations comprising a schema. | Yes | No |
Datatype Definition (§3.4.1) | An atomic type (content constraint), such as 'integer', that applies to character data in an instance document, whether it appears as an attribute value or the contents of an element. The mechanisms for defining datatypes are set out elsewhere, in XML Schemas: Datatypes. | Yes | No |
Archetype Definition (§3.4.2) | A complete set of constraints for elements in instance documents, applying to both contents and attributes. | Yes | No |
Element Declaration (§3.4.9) | An association between a name for an element and a type. An element declaration for 'A' is comparable to a DTD declaration <!ELEMENT A .....>. | Yes (local or global) | Yes |
Attribute Declaration (§3.4.3) | An association between a name for an attribute and a datatype, together with occurrence constraints such as 'required' or 'default'. The association is local to its surrounding archetype. | Yes (local) | Yes |
Content type | Either a datatype or a content model. A content type applies to the contents of elements in an instance document (but not their attribute values). It provides a unifying abstraction for the constraints which apply to the contents of elements, but introduces no additional features. | No | No |
Element Content Model (§3.4.5) | A constraint that applies to the contents of elements in an instance document. Content models do not include attribute declarations. | No | No |
Rich Content Models (§3.4.6) | Components for constructing content models which allow only element content. Includes facilities for grouping and sequencing, as well as for declaration of and reference to elements. | No (but see below) | No |
Attribute Group Definition * (§3.4.4) | An association between a name and a reusable collection of attribute declarations. | Yes | No |
Named Model Group * (§3.4.8) | Model groups are part of the content model building block abstraction, but are unnamed and cannot be referenced for reuse. A named model group is an association between a name and a model group, allowing for reuse. | Yes | No |
Archetype Refinement * (§3.5) | One archetype may be defined as refining one or more other archetypes, acquiring content type and/or attributes therefrom. | Yes | No |
Schema Import (§4.2.2) | Extends the current schema with definitions and/or declarations from elsewhere, retaining the association with their origin. | No | No |
Schema Inclusion (§4.2.4) | Integrates definitions and/or declarations from elsewhere into the schema being defined, as if they had been defined locally. | No | No |
As indicated in the third column of the table above, most of the components listed have names, which provide for references within the schema, and sometimes from one schema to another. For example, an attribute declaration can refer to a named datatype, such as 'integer'. A content model can refer to an element, and so on.
If all such names were assigned from the same 'pool', then it would be impossible to have e.g. a datatype named 'integer' and an element with the name 'integer' in the same schema. [Definition:] Accordingly we introduce the idea of a symbol space (avoiding 'name space' to avoid confusion with 'Namespaces in XML' [XML-Namespaces]).
There is a single distinct symbol space within a given schema for each of the abstractions named above other than 'attribute' and 'element': within a given symbol space, names are unique, but the same name may appear in more than one symbol space without conflict. In particular note that the same name can refer to both a type and an element, without conflict or necessary relation between the two.
Attributes and local element declarations are special, in that every archetype defines its own attribute symbol space and local element symbol space, which are distinct from each other. In addition, top-level elements (whose declarations are not contained within an archetype definition) reside in their own symbol space.
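One way to picture symbol spaces is as a collection of independent name tables, one per kind of component, plus one attribute table per archetype. The Python sketch below is a reading aid under stated assumptions, not a description of any processor: the class `SymbolSpaces`, its `define` method, the keys used to label each space, and the archetype name Customer are all hypothetical.

```python
class SymbolSpaces:
    """Keeps one table of names per symbol space."""

    def __init__(self):
        self._spaces = {}

    def define(self, space, name):
        """Record `name` in `space`; a second use within that space is an error."""
        names = self._spaces.setdefault(space, set())
        if name in names:
            raise ValueError(f"{name!r} is already defined in {space!r}")
        names.add(name)


spaces = SymbolSpaces()
spaces.define("datatype", "integer")
spaces.define("top-level element", "integer")    # different space: no conflict
spaces.define(("attribute", "Address"), "type")  # attribute space of archetype Address
spaces.define(("attribute", "Customer"), "type")  # attribute space of another archetype

# Reusing a name inside one symbol space is rejected.
try:
    spaces.define("datatype", "integer")
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

print(duplicate_rejected)  # True
```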
XML Schema: Structures is presented here primarily in the form of an [Definition:] abstract syntax, which provides a formal specification of the information provided for each declaration and definition in the schema language. The abstract syntax is presented using a simplified BNF. Defined terms are to the left. Their components are to the right, with a small amount of meta-syntax: ()s for grouping, | to separate alternatives, ? for optionality, * and + for iteration. Terms in italics are primitives, not expanded here, either because they are defined elsewhere (e.g. URI, defined by [URI]) or because they can only be grounded once a concrete syntax is decided on (e.g. choice).
An abstract syntax production prefixed with a number in brackets (e.g. [3]) is normative; other abstract syntax is either for purposes of explanation, or is a duplicate (for convenience) of a normative definition to be found elsewhere.
The abstract syntax illustrates the expressive power of the language, and the relationships among its component parts. The abstract syntax can be used to evaluate the expressive power of XML Schema: Structures, but not its look and feel. In particular, please note that neither ordering within or between productions nor choice of names is significant, and that any particular concrete syntax is not constrained by these.
The [Definition:] concrete syntax of XML Schema: Structures, that is, the exact element and attribute names used in a schema, is a key feature of its proposed design. The concrete syntax is the form in which the schema language is used by schema authors. Though its elements and attributes are often different from the terms of the abstract syntax BNF, the features and expressive power of the two are congruent. The concrete syntax profoundly affects the convenience and usability of the schema language.
We include a preliminary concrete syntax in this draft, via examples, paradigms and in (normative) Schema for Schemas * (§A) and (normative) DTD for Schemas * (§B). Unlike the previous version, in which the intention was to stay quite close to the abstract syntax, in this version we have begun to take convenience and clarity into account.
The principal purpose of XML Schema: Structures is to provide a means for defining schemas that constrain the contents of instances and augment the information sets thereof.
A schema contains some preamble information and a set of definitions and declarations.
[Abstract syntax: Schema top level]
preamble consists of an xmlSchemaRef specifying the URI for XML Schema: Structures; the targetNamespace specifying the URI of the namespace which this schema is about; and a schemaVersion specification for private version documentation purposes and version management.
See Schema Composition and Namespaces * (§4) for discussion of schemas, instances and namespaces.
Ed. Note: The whole matter of instance/schema connections is still under discussion: the WG has not reached consensus in this area. The referenced section does give some indication of where our thinking in this area is going.
<-- Category: root -->
<schema
model = "open" | "refinable" | "closed"
targetNS = CDATA
version = CDATA
xmlns = "http://www.w3.org/1999/XMLSchema">
<-- Content: (import* , include* , export? , (attrGroup | comment | datatype | element | externalEntity | group | notation | textEntity | archetype | unparsedEntity)*) -->
</schema>
<-- Category: top-level -->
<comment>
<-- Content: text -->
</comment>
Example
<!DOCTYPE schema PUBLIC '-//W3C//DTD XML Schema Version 1.0//EN'
                        'http://www.w3.org/TR/1999/WD-xmlschema-1-19991105/structures.dtd' >
<schema targetNS='http://purl.org/metadata/dublin_core'
        version='M.n'
        xmlns='http://www.w3.org/1999/XMLSchema'>
  ...
</schema>
Note that the abstract syntax xmlSchemaRef is realised via a default namespace declaration in the concrete syntax.
Although the schema above is a complete XML document, schema need not be the document element, but can appear within other documents. Indeed there is no requirement that a schema be derived from a (text) document at all: it could be built 'by hand' via e.g. a DOM-conformant API.
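To make the last point concrete, here is a minimal sketch of building the schema element programmatically, using Python's standard `xml.dom.minidom` as one example of a DOM-style API. The attribute values are taken from the example above. The single comment element declaration added as content is an arbitrary choice for illustration, and nothing here implies that this draft prescribes any particular API.

```python
from xml.dom.minidom import getDOMImplementation

SCHEMA_NS = "http://www.w3.org/1999/XMLSchema"

# Create a document whose document element is <schema> in the schema namespace.
impl = getDOMImplementation()
doc = impl.createDocument(SCHEMA_NS, "schema", None)
schema = doc.documentElement

# minidom does not emit namespace declarations by itself, so the default
# namespace declaration is set explicitly alongside the other attributes.
schema.setAttribute("xmlns", SCHEMA_NS)
schema.setAttribute("targetNS", "http://purl.org/metadata/dublin_core")
schema.setAttribute("version", "M.n")

# Add one top-level element declaration as sample content.
comment_decl = doc.createElementNS(SCHEMA_NS, "element")
comment_decl.setAttribute("name", "comment")
comment_decl.setAttribute("type", "string")
schema.appendChild(comment_decl)

# Serialize only to inspect the result; no text document was parsed.
print(schema.toxml())
```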
The schema's model property is discussed in Archetype Refinement * (§3.5). The schema's export, import and include properties are discussed in Schema Composition and Namespaces * (§4).
The schema's declarations and definitions, discussed in detail in Schema Definitions and Declarations (§3), provide for the creation of new schema components:
[Table: Summary of Definitions and Declarations]
Example
The following illustrates the basic model for declaring or defining all XML Schema: Structures components:
<datatype name='myDatatype'> ... </datatype>
<archetype name='myType'> ... </archetype>
<element name='myElement'> ... </element>
<attrGroup name='myAttrGroup'> ... </attrGroup>
<group name='myModelGroup'> ... </group>
<notation name='myNotation' ... />
<textEntity name='myTextEntity'> ... </textEntity>
<externalEntity name='myExternalEntity' ... />
<unparsedEntity name='myUnparsedEntity' ... />
When creating a component, we establish an association between its name and the specification for that component. Each new component therefore creates a new entry in the symbol space for that kind of component.
The Unique Definition (§6.2.1) Constraint on Schemas obtains.
Issue (no-evolution): This draft does not deal with the requirement "for addressing the evolution of schemata" (see [XML Schema Requirements]).
NOTE: We have not so far seen any need to reconstruct the XML 1.0 notion of root. For the connection from document instances to schemas, see Associating Instance Document Constructs with Corresponding Schemas (§4.2.5) and Schema Validity * (§6.1).
Uniform means are provided for reference to a broad variety of schema constructs, both within a single schema and to features imported (Schema Import (§4.2.2)) from external schemas. The name used to reference any component of XML Schema: Structures from within a schema consists of an NCName and an optional schemaRef, a reference to an external schema. In a few cases, some qualification may be added to a reference: this is made clear as the individual reference forms are introduced below.
[Abstract syntax: Component Names and References]
The abstract syntax above characterizes the reference mechanisms used in this specification.
Example
<element name='elem1' type='Address'/>
<element name='elem2' type='BLOCKQUOTE' schemaAbbrev='XHTML'/>
<attribute name='attr1' type='quantity' schemaName='http://www.w3.org/xsl.xsd'/>
The first of these is a local reference; the other two refer to schemas elsewhere. The BLOCKQUOTE example assumes the schemaAbbrev XHTML has been declared for import; the quantity example similarly assumes that the given (imaginary as of this writing) URL has been declared for import. See Schema Import (§4.2.2) for a discussion of importing.
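The three reference forms in this example can be read as a small resolution procedure: a bare name is local, a schemaAbbrev is looked up among the declared imports, and a schemaName names an imported schema directly. The Python sketch below illustrates that reading only. The function `resolve_reference`, both lookup tables and the example.org URL are hypothetical, and the actual import mechanism is the one discussed in Schema Import (§4.2.2).

```python
# Hypothetical import table: schemaAbbrev -> name (URI) of an imported schema.
IMPORTED_BY_ABBREV = {"XHTML": "http://www.example.org/xhtml.xsd"}
# Names of all schemas assumed to have been declared for import.
IMPORTED_NAMES = {
    "http://www.example.org/xhtml.xsd",
    "http://www.w3.org/xsl.xsd",
}


def resolve_reference(attrs):
    """Map a declaration's attributes to a (schema, local_name) pair.

    `schema` is None when the reference is local to the current schema.
    """
    local_name = attrs["type"]
    if "schemaAbbrev" in attrs:
        return IMPORTED_BY_ABBREV[attrs["schemaAbbrev"]], local_name
    if "schemaName" in attrs:
        schema_name = attrs["schemaName"]
        if schema_name not in IMPORTED_NAMES:
            raise LookupError(f"{schema_name!r} has not been declared for import")
        return schema_name, local_name
    return None, local_name


elem1 = resolve_reference({"name": "elem1", "type": "Address"})
elem2 = resolve_reference(
    {"name": "elem2", "type": "BLOCKQUOTE", "schemaAbbrev": "XHTML"}
)
attr1 = resolve_reference(
    {"name": "attr1", "type": "quantity", "schemaName": "http://www.w3.org/xsl.xsd"}
)

print(elem1)  # (None, 'Address')
```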
The Consistent Import (§6.2.2) Constraint on Schemas obtains.
The One Reference Only (§6.2.2) Constraint on Schemas obtains.
The identify definition wrt schema-validity obtains.
The Preorder Priority for Included Definitions (§6.2.7) Constraint on Schemas also obtains.
Like XML 1.0 DTDs, XML Schema: Structures provides facilities for constraining the contents of elements and the values of attributes, and for augmenting the information set of instances, e.g. with defaulted values and type information. [Definition:] We refer hereafter to the combination of schema constraints and information set contributions with the abbreviation SC. Compared to DTDs, XML Schema: Structures provides for a richer set of SCs, and improved capabilities for sharing SCs across sets of elements and attributes.
We start with [Definition:] the simple datatypes whose expression in XML documents consists entirely of character data. As in the current draft of XML Schemas: Datatypes, wherever we speak of datatypes in this draft, we shall mean these simple datatypes.
[Abstract syntax: Datatypes]
XML Schema: Structures incorporates the datatype specification mechanisms defined by [XML Schemas: Datatypes] in order to express SCs on attribute values and the contents of elements consisting entirely of character data.
The production for datatypeSpec above serves to indicate where this chapter connects with XML Schemas: Datatypes. exportControl is defined in Exporting Schema Constructs (§4.2.1). The concrete syntax displayed below is copied from [XML Schemas: Datatypes]. Most of the elements are for specifying facets: they are all optional and may appear in any order after the basetype element.
The other productions provide for using datatypes once they have been defined, see below under contentType and attribute.
We assume that it is appropriate to allow for some local specialization of datatypes at the point of use, and provide for that here (specialize).
As explained in References to Schema Constructs (§3.3), a schemaRef, if included, allows for the referenced definition to be located in some other schema.
<-- Category: top-level -->
<datatype
export = "true" | "false"
name = NMTOKEN>
<-- Content: (basetype , ((minExclusive | minInclusive) | (maxExclusive | maxInclusive) | (maxAbsoluteValue , minAbsoluteValue)? | encoding | enumeration | length | maxLength | pattern | period | precision | scale)*) -->
</datatype>
<-- Category: datatype -->
<basetype
name = NMTOKEN
schemaAbbrev = NMTOKEN
schemaName = CDATA />
<-- Category: datatype -->
<minExclusive>
<-- Content: text -->
</minExclusive>
<-- Category: datatype -->
<minInclusive>
<-- Content: text -->
</minInclusive>
<-- Category: datatype -->
<maxExclusive>
<-- Content: text -->
</maxExclusive>
<-- Category: datatype -->
<maxInclusive>
<-- Content: text -->
</maxInclusive>
<-- Category: other -->
<minAbsoluteValue>
<-- Content: text -->
</minAbsoluteValue>
<-- Category: datatype -->
<maxAbsoluteValue>
<-- Content: text -->
</maxAbsoluteValue>
<-- Category: datatype -->
<encoding>
<-- Content: text -->
</encoding>
<-- Category: datatype -->
<enumeration>
<-- Content: literal+ -->
</enumeration>
<-- Category: datatype -->
<literal>
<-- Content: text -->
</literal>
<-- Category: datatype -->
<length>
<-- Content: text -->
</length>
<-- Category: datatype -->
<maxLength>
<-- Content: text -->
</maxLength>
<-- Category: datatype -->
<pattern>
<-- Content: lexical+ -->
</pattern>
<-- Category: datatype -->
<lexical>
<-- Content: text -->
</lexical>
<-- Category: datatype -->
<period>
<-- Content: text -->
</period>
<-- Category: datatype -->
<precision>
<-- Content: text -->
</precision>
<-- Category: datatype -->
<scale>
<-- Content: text -->
</scale>
Example
<datatype name='posInt'>
  <basetype name='integer'/>
  <minExclusive>0</minExclusive>
</datatype>
<attribute name='foo' type='posInt'/>
<attribute name='baz' type='integer'/>
<attribute name='fontSize' type='quantity' schemaName='http://www.w3.org/xsl.xsd' fixed='12pt'/>
The first attribute example references the definition above it. The second references a datatype pre-defined by XML Schemas: Datatypes. The third references a datatype in an (imaginary) XSL schema and fixes its value.
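As a further illustration of the facet elements in the synopsis above, the following sketch defines one datatype by enumeration and one by an inclusive range. The names shirtSize, percentage and shirt are hypothetical, and the enclosing schema element is shown bare, without whatever attributes a real schema would carry.

```xml
<schema>
  <!-- Hypothetical datatype: a string limited to three literal values -->
  <datatype name='shirtSize'>
    <basetype name='string'/>
    <enumeration>
      <literal>small</literal>
      <literal>medium</literal>
      <literal>large</literal>
    </enumeration>
  </datatype>

  <!-- Hypothetical datatype: an integer between 0 and 100 inclusive -->
  <datatype name='percentage'>
    <basetype name='integer'/>
    <minInclusive>0</minInclusive>
    <maxInclusive>100</maxInclusive>
  </datatype>

  <!-- Both datatypes used as attribute types -->
  <archetype name='shirt' content='empty'>
    <attribute name='size' type='shirtSize'/>
    <attribute name='cotton' type='percentage'/>
  </archetype>
</schema>
```

Under these assumptions, size='medium' and cotton='80' would satisfy the declarations, while size='huge' and cotton='120' would not.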
NOTE: See previous note on the type definition issue.
The satisfy-dt definition wrt schema-validity obtains.
The Datatype Info (§6.2.3.1) Schema Information Set Contribution obtains.
[Definition:] Archetype specifications gather together all SCs pertinent to elements in instance documents, their attributes and their contents. They are called archetypes because there may be more than one element declaration that shares the same SCs (see Element Declaration (§3.4.9)), and which therefore can be constrained by a common archetype.
[Abstract syntax productions: Archetypes]
The first three productions above provide the basic structure of the specification, and the last two provide for reference to the things specified. But note that the name of an archetype is not ipso facto the name of elements whose appearance in instances will be associated with the SCs of that archetype. The connection between an element name and an archetype is made by an elementDecl, see below.
Alongside Attribute Declaration (§3.4.3) for permitted attributes, SCs for contents are specified in an archetype (contentType). For elements which may contain only character data, this is by reference to a Datatype Definition (§3.4.1). Note that doing this by way of datatypeRef allows for specialization and even defaulting in a manner similar to attribute values. For other kinds of elements, an Element Content Model (§3.4.5) is required.
Issue (elt-default): The extension of defaulting to element content is tentative.
<-- Category: top-level -->
<archetype
content = "textOnly" | "mixed" | "elemOnly" | "empty" | "any"
default = CDATA
fixed = CDATA
model = "open" | "refinable" | "closed"
name = NMTOKEN
order = "choice" | "seq" | "all" | "many"
schemaAbbrev = NMTOKEN
schemaName = CDATA
type = NMTOKEN>
<-- Content: (refines* , ((element | group)* | datatypeQual?) , (attrGroupRef | attribute)*) -->
</archetype>
<-- Category: archetype -->
<datatypeQual>
<-- Content: ((minExclusive | minInclusive) | (maxExclusive | maxInclusive) | (maxAbsoluteValue , minAbsoluteValue)? | encoding | enumeration | length | maxLength | pattern | period | precision | scale)* -->
</datatypeQual>
<-- Category: archetype -->
<refines
name = NMTOKEN
schemaAbbrev = NMTOKEN
schemaName = CDATA />
Example
<archetype name='length1' type='number'>
  <minInclusive>0</minInclusive>
  <attribute name='unit' type='NMTOKEN'/>
</archetype>
<element name='width' type='length1'/>

<width unit='cm'>2.54</width>

<archetype name='length2'>
  <element name='size' type='number'>
    <minInclusive>0</minInclusive>
  </element>
  <element name='unit' type='NMTOKEN'/>
</archetype>
<element name='depth' type='length2'/>

<depth>
  <size>2.54</size><unit>cm</unit>
</depth>
Two approaches to defining an archetype for length: the first uses character data content constrained by a qualified reference to a built-in datatype, plus one attribute; the second uses two elements.
The way in which the concrete syntax defined and illustrated above realises the abstract syntax is not straightforward, because it is optimised to make simple cases simple. The datatypeQual option is allowed only if a type attribute is present. Similarly, the schemaName or schemaAbbrev, the default and the fixed attributes are allowed only if a type attribute is present. Finally, if a type attribute is present, it must reference a datatype, and the content attribute must be textOnly (or absent, in which case it defaults to textOnly). This is to handle the main alternation in the abstract syntax for contentType, which allows either (possibly locally qualified) reference to a datatype or a content model.
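The rules above can be illustrated with a short sketch. The archetype names quantity, dimensions and broken are hypothetical, the enclosing schema element is shown without attributes, and the first archetype follows the synopsis in wrapping its facet in datatypeQual.

```xml
<schema>
  <!-- Allowed: type references a datatype, so datatypeQual and default
       are permitted and content defaults to textOnly -->
  <archetype name='quantity' type='integer' default='1'>
    <datatypeQual>
      <minInclusive>0</minInclusive>
    </datatypeQual>
    <attribute name='unit' type='NMTOKEN'/>
  </archetype>

  <!-- Allowed: no type attribute, so a content model is given instead -->
  <archetype name='dimensions' content='elemOnly' order='seq'>
    <element name='width' type='integer'/>
    <element name='height' type='integer'/>
  </archetype>

  <!-- Not allowed under the rules above: default without a type attribute -->
  <archetype name='broken' content='elemOnly' default='1'>
    <element name='width' type='integer'/>
  </archetype>
</schema>
```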
NOTE: See previous note on the type definition issue.
The AttrGroup Unique (§6.2.3.2) Constraint on Schemas obtains.
The AttrGroup Identified (§6.2.3.2) Constraint on Schemas obtains.
The attr-decl-set definition wrt schema-validity obtains.
The attr-fullname definition wrt schema-validity obtains.
The Attribute Locally Unique (§6.2.3.2) Constraint on Schemas obtains.
The satisfy-as definition wrt schema-validity obtains.
The Archetype Info (§6.2.3.2) Schema Information Set Contribution obtains.
Attribute declarations associate a name (which will appear as an attribute in start tags in instances) with SCs for the presence and value thereof.
[Abstract syntax productions: Attributes]
NOTE: The datatypeRef productions are repeated here for easy reference.
Attribute declarations provide for:
<-- Category: other -->
<attribute
default = CDATA
fixed = CDATA
maxOccurs = "1"
minOccurs = "0" | "1"
name = NMTOKEN
schemaAbbrev = NMTOKEN
schemaName = CDATA
type = "string">
<-- Content: ((minExclusive | minInclusive) | (maxExclusive | maxInclusive) | (maxAbsoluteValue , minAbsoluteValue)? | encoding | enumeration | length | maxLength | pattern | period | precision | scale)* -->
</attribute>
Example
<attribute name='myAttribute'/>
<attribute name='anotherAttribute' type='integer' default='42'>
  <minExclusive>0</minExclusive>
</attribute>
<attribute name='yetAnotherAttribute' type='integer' minOccurs='1'/>
<attribute name='stillAnotherAttribute' type='string' fixed='Hello world!'/>
Four attributes are declared: one with no explicit SCs at all; two declared by reference to a built-in datatype, one with a default and a subrange qualification and one required to be present in instances; and one with a fixed value.
The maxOccurs attribute is FIXED at 1 for all attributes. Consistent with this, minOccurs can only be 0 or 1.
When attribute declarations are used in an archetype specification, each archetype provides its own symbol space for attribute names. E.g. an attribute named title within one archetype need not have the same datatypeRef as one declared within another archetype.
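A minimal sketch of the per-archetype symbol space, assuming two hypothetical archetypes named book and person, with the enclosing schema element shown without attributes: the two title attributes share a name but are constrained differently.

```xml
<schema>
  <!-- In book, title is an unconstrained string -->
  <archetype name='book' content='empty'>
    <attribute name='title' type='string'/>
  </archetype>

  <!-- In person, title is limited to two honorifics -->
  <archetype name='person' content='empty'>
    <attribute name='title' type='string'>
      <enumeration>
        <literal>Dr</literal>
        <literal>Prof</literal>
      </enumeration>
    </attribute>
  </archetype>
</schema>
```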
The attr-satisfy definition wrt schema-validity obtains.
Issue (default-attr-datatype): What is the default attribute datatypeSpec?
The satisfy-attrs definition wrt schema-validity obtains.
The Attribute Value Default (§6.2.3.3) Schema Information Set Contribution obtains.
Issue (namespace-declare): We've got a problem with namespace declarations: they're not attributes at the infoset level, so they can appear without compromising validity, except if there is a fixed or required declaration, and defaults should have the apparently desired effect. I.e., if a schema declares an attribute whose name is xmlns with a default or fixed value, does it change the infoset? Or, if we allow QNames as such to be declared, xmlns:foo.
XML Schema: Structures can name a group of attributes so that they may be incorporated as a whole into archetype definitions:
[Abstract syntax productions: Attribute groups]
Attribute group definitions provide a construct to replace some uses of parameter entities.
<-- Category: top-level -->
<attrGroup
export = "true" | "false"
name = NMTOKEN>
<-- Content: (attrGroupRef | attribute)+ -->
</attrGroup>
<-- Category: archetype -->
<attrGroupRef
name = NMTOKEN
schemaAbbrev = NMTOKEN
schemaName = CDATA />
Example
<attrGroup name='myAttrGroup'>
  <attribute .../>
  ...
</attrGroup>
<archetype name='myelement' content='empty'>
  <attrGroupRef name='myAttrGroup'/>
</archetype>
Define and refer to an attribute group. The effect is as if the attribute declarations in the group were present in the archetype definition.
Ed. Note: There needs to be a Constraint on Schema which constrains the attributes which appear with an attrGroupRef: the name is the same as one of the attributes in the group, datatype and defaulting preserves substitutability, etc.
Ed. Note: There needs to be some discussion of what happens in case of name conflict between attrs as a result of an attr group ref.
When content of elements is not constrained by reference to a datatype (Datatype Definition (§3.4.1)), it can have any, empty, element-only or mixed content. In the latter cases, the form of the content is specified in more detail.
[Abstract syntax productions: Content model]
A content model constrains the element content of an archetype specification: it says nothing about attributes.
Content models do not have names, but appear as a part of the definition of an archetype, which does have a name. Model groups can be named and used by name, see below.
The satisfy-cm definition wrt schema-validity obtains.
A content model identified with only an elemModel specifies child elements only. If the mixed qualifier is present, text may occur as well as elements. In either case the content model consists of a simple grammar governing the allowed types of child elements and the order in which they must appear.
[Abstract syntax productions: Rich content model]
Issue (namedTypeInModel): Symmetry suggests we might allow an archetypeDefn to appear in a content model, provided it's named.
The grammar for element-only content is built on model elements (modelElt above): elements, groups and archetype references. A model element provides for some number of occurrences in an instance of a single element (via elementRef or elementDecl), a group of elements (via anonModelGroup or modelGroupRef) or an archetype reference (via archetypeRef).
A group is two or more model elements plus a compositor. The compositor for a group specifies whether it provides for (1) a sequence of the model elements (the seq compositor); (2) a choice among the model elements (the choice compositor); (3) all of the model elements in any order (the all compositor, which is associated with the allGroup production); or (4) any number of the model elements in any order (the many compositor).
These options reconstruct the XML 1.0 "," connector, the XML 1.0 "|" connector, the SGML "&" connector and the repeated disjunction of XML 1.0's Mixed production respectively. In the first case (sequence) all the model elements must appear in the order given in the group; in the second case (choice), exactly one of the model elements must appear in the element content; in the third case (all), all the model elements, which are restricted in this case only to unqualified elements, must appear in the element content, but may appear in any order; in the fourth case (many), any number of the model elements may appear in any order. The all compositor may only appear as the top-level compositor of a content model.
The occurs specification governs how many times the instance material allowed by a modelElt may occur at that point in the grammar, but note that the components of a group whose compositor is (implicitly) 'all' may not be qualified, and therefore call for exactly one appearance of the element they identify.
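The four compositors can be compared side by side in a short sketch. The archetype and element names are hypothetical, the enclosing schema element is shown without attributes, and the compositor is given through the order attribute of archetype as in the synopsis earlier in this section.

```xml
<schema>
  <!-- seq: title, then author, in that order -->
  <archetype name='citation' content='elemOnly' order='seq'>
    <element name='title' type='string'/>
    <element name='author' type='string'/>
  </archetype>

  <!-- choice: exactly one of phone or email -->
  <archetype name='contact' content='elemOnly' order='choice'>
    <element name='phone' type='string'/>
    <element name='email' type='string'/>
  </archetype>

  <!-- all: both children, each exactly once, in either order -->
  <archetype name='point' content='elemOnly' order='all'>
    <element name='x' type='integer'/>
    <element name='y' type='integer'/>
  </archetype>

  <!-- many: any number of b and i children, in any order -->
  <archetype name='inline' content='elemOnly' order='many'>
    <element name='b' type='string'/>
    <element name='i' type='string'/>
  </archetype>
</schema>
```

Under these assumptions, an instance with y before x would satisfy point but an instance with author before title would not satisfy citation.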
See Element Declaration (§3.4.9) for further discussion and examples of the appearance of elementDecl as one of the two expansions of element above.
For the interpretation of archetypeRef in this context, see Archetype Refinement * (§3.5).
<-- Category: archetype -->
<group
collection = "no" | "list"
maxOccurs = CDATA
minOccurs = "1"
name = NMTOKEN
order = "choice" | "seq" | "all" | "many"
ref = NMTOKEN
schemaAbbrev = NMTOKEN
schemaName = CDATA>
<-- Content: (element | group)* -->
</group>
The satisfy-eo definition wrt schema-validity obtains.
The Element Consistency (§6.2.3.6) Constraint on Schemas obtains.
The Unambiguous Content Model (§6.2.3.6) Constraint on Schemas obtains.
Issue (still-unambig): Should this compatibility constraint be preserved?
A content model which allows mixed content provides for mixing elements with character data in document instances. The same elemModel mechanism is used for specifying the grammar of the allowed elements, with the one change that the default for compositor is changed to many for elements and groups at the top level of the content model, ensuring that the default behaviour is the same as that of XML.
Example
<archetype content='mixed'>
  <element ref='name1'/>
  <element ref='name2'/>
  <element ref='name3'/>
</archetype>
Allows character data mixed with any number of name1, name2 and name3 elements.
Issue (noEmptyReqd): We need to make the elemModel rhs optional, to allow for mixed with no elements specified == our minimum commitment model. This in turn would allow us if we chose to get rid of an explicit empty flag: just specify elemOnly and no model. We could then get rid of any as well, given other mechanisms for controlled openness we're contemplating.
The satisfy-mixed definition wrt schema-validity obtains.
This reconstructs another common use of parameter entities.
[Abstract syntax productions: Named model groups]
Groups defined with the allGroup option may only be referenced from a modelGroupRef which constitutes the only group at the top level of a content model.
<-- Category: top-level -->
<group
collection = "no" | "list"
maxOccurs = CDATA
minOccurs = "1"
name = NMTOKEN
order = "choice" | "seq" | "all" | "many"
ref = NMTOKEN
schemaAbbrev = NMTOKEN
schemaName = CDATA>
<-- Content: (element | group)* -->
</group>
Example
<group name='myModelGroup'>
  <element ref='myelement'/>
</group>
<element name='myelement'>
  <archetype>
    <group ref='myModelGroup'/>
    <attribute ...>. . .</attribute>
  </archetype>
</element>
<element name='anotherelement'>
  <archetype>
    <group order='choice'>
      <element ref='yetAnotherelement'/>
      <group ref='myModelGroup'/>
    </group>
    <attribute ...>. . .</attribute>
  </archetype>
</element>
A minimal model group is defined and used by reference, first as the whole content model, then as one alternative in a choice.
An [Definition:] element declaration associates an element name with a type, either by reference or by incorporation.
[Abstract syntax productions: Element declaration]
An element declaration associates a name with a specification. This name will appear in tags in instance documents; the specification provides SCs on the form of elements tagged with the given name. An element declaration whose elementSpec is an archetypeSpec is comparable to an <!ELEMENT ...> declaration in an XML 1.0 DTD.
elementSpec not only allows for element declarations to associate a name with an archetypeSpec (by reference or inclusion), but also allows the reference or specification to be for a datatype, with the implication that no attributes are allowed in instances and the text-only content will be constrained appropriately.
elementRef and elementName provide for top-level element declarations to be referenced by name from content models.
As noted above, element names are in a separate symbol space from the symbol spaces for the names of types, so there can be (but need not be) an archetype or datatype with the same name as a top-level element.
In the case of ambiguity of type reference, that is when the typeRef option is used and there are both a datatype and an archetype of the referenced name in the relevant schema, the ambiguity is resolved in favour of the archetype.
NOTE: See previous note on the ambiguity issue.
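The resolution rule for ambiguous type references can be sketched as follows. The name code is hypothetical and is deliberately used for both a datatype and an archetype; the enclosing schema element is shown without attributes.

```xml
<schema>
  <!-- A datatype named code -->
  <datatype name='code'>
    <basetype name='string'/>
    <maxLength>8</maxLength>
  </datatype>

  <!-- An archetype that is also named code -->
  <archetype name='code' content='empty'>
    <attribute name='value' type='string'/>
  </archetype>

  <!-- type='code' matches both; per the rule above it is taken
       to mean the archetype, not the datatype -->
  <element name='productCode' type='code'/>
</schema>
```

Under this reading, a productCode element would be empty and carry a value attribute, rather than containing a string of at most eight characters.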
The elt-fullname definition wrt schema-validity obtains.
An elementDecl may appear both at the top level of a schema and within a modelElt. See above (Rich Content Models (§3.4.6) and Mixed Content (§3.4.7)) for where this is allowed. This declares a locally-scoped association between an element name and a type. As with attribute names, locally-scoped element names reside in symbol spaces local to the archetype that defines them. Note however that archetype and datatype names are always top-level names within a schema, even when associated with locally-scoped element names.
NOTE: It is not yet clear whether a type defined implicitly by the appearance of an archetypeSpec or datatypeSpec directly within an elementSpec, or by the use of a typeRef which refers to a datatype, will have an implicit name, or if so what that name would be.
<-- Category: top-level -->
<element
archRef = NMTOKEN
default = CDATA
export = "true" | "false"
fixed = CDATA
maxOccurs = CDATA
minOccurs = "1"
name = NMTOKEN
ref = NMTOKEN
schemaAbbrev = NMTOKEN
schemaName = CDATA
type = NMTOKEN>
<-- Content: (archetype | datatype)? -->
</element>
Example
<element name='myelement' type='myDatatype'/>
<element name='et0' type='myType'/>
<element ref='et1'/>
<element name='et1'>
  <archetype order='all'>
    <element . . . />
    . . .
    <attribute ...>. . .</attribute>
  </archetype>
</element>
<element name='et2'>
  <archetype content='any'/>
</element>
<element name='et3'>
  <archetype content='empty'>
    <attribute ...>. . .</attribute>
  </archetype>
</element>
<element name='et4'>
  <archetype order='choice'>
    <element . . . />
    . . .
    <attribute ...>. . .</attribute>
  </archetype>
</element>
<element name='et5'>
  <archetype order='seq'>
    <element . . . />
    . . .
    <attribute ...>. . .</attribute>
  </archetype>
</element>
<element name='et6'>
  <archetype model='open' content='mixed'/>
</element>
A pretty complete set of alternatives. Note the last one is intended to be equivalent to the idea sometimes called WFXML, for Well-Formed XML: it allows any content at all, whether defined in the current schema or not, and any attributes.
<element name='contextOne'>
  <archetype order='seq'>
    <element name='myLocalelement' type='myFirstType'/>
    <element ref='globalelement'/>
  </archetype>
</element>
<element name='contextTwo'>
  <archetype order='seq'>
    <element name='myLocalelement' type='mySecondType'/>
    <element ref='globalelement'/>
  </archetype>
</element>
Instances of myLocalelement within contextOne will be constrained by myFirstType, while those within contextTwo will be constrained by mySecondType.
NOTE: The possibility that differing attribute declarations and/or content models would apply to elements with the same name in different contexts is an extension beyond the expressive power of a DTD in XML 1.0.
The Nested May Not Be Global (§6.2.3.7) Constraint on Schemas obtains.
The Cannot Shadow Global (§6.2.3.7) Constraint on Schemas obtains.
The satisfy-ed definition wrt schema-validity obtains.
The ind-valid definition wrt schema-validity obtains.
The satisfy-etr definition wrt schema-validity obtains.
NOTE: This chapter articulates what has only been hinted at above, namely a considerable increase in the power and expressiveness of schema declarations, by explaining what was provided for in the abstract syntax in the previous section, but not explained much if at all at that point.
We provide for the refinement of archetypes defined in a schema. An archetype definition may identify one or more other archetypes from which it specifies the creation of a (joint) refinement.
NOTE: The balance of this chapter has been withdrawn, pending further discussion in the WG. A Task Force created from within the WG has investigated a range of issues and options for implementing the desired functionality, as called for in the [XML Schema Requirements]. The Task Force has produced a report [Refinement TF Report], which will form the basis of a design to be filled in here.
[Definition:] an archetype AT1 is said to refine an archetype AT2 if and only if AT1 is declared to refine either AT2 or (recursively) some archetype that refines AT2. [Definition:] AT2 is then said to be an ancestor of AT1. [Definition:] The effective constraints are the union of the explicit and the acquired.
[Abstract syntax productions: Entities and notations]
Internal parsed entities are a feature of XML that enables reuse of text fragments by direct reference in an instance document.
In XML Schema: Structures documents, internal parsed entities are declared by using the textEntitySpec production.
<-- Category: top-level -->
<textEntity
export = "true" | "false"
name = NMTOKEN>
<-- Content: text -->
</textEntity>
Example
<textEntity name='flavor'>Fresh mint</textEntity>
flavor can now be used in an entity reference in instances of the containing schema.
See Schema Validity * (§6.1) for SCs covering entities and entity references.
External parsed entities are a feature of XML that offers a method for including well-formed XML document fragments, including text and markup, by direct reference to the storage object of the parsed entity.
In schemas, external parsed entities are declared by using the externalEntitySpec production.
<-- Category: top-level -->
<externalEntity
export = "true" | "false"
name = NMTOKEN
notation = "XML"
public = CDATA
system = CDATA />
Example
<externalEntity name='FrontMatter' system='FrontMatter.xml' />
<externalEntity name='Chapter1' system='chapter1.xml' />
<externalEntity name='Chapter2' system='Chapter2.xml' />
<externalEntity name='BackMatter' system='BackMatter.xml' />
These four external entities represent the supposed contents of a book:
<book>
  &FrontMatter;
  &Chapter1;
  &Chapter2;
  &BackMatter;
</book>
In an instance, the external entities take their familiar XML form. The processor expands the entities for their content.
Again, see Schema Validity * (§6.1) for SCs covering entities and entity references.
External unparsed entities are a feature of XML that offers a baroque method for including binary data by indirect reference to both the storage object and the notation type of the unparsed entity. In schemas, external unparsed entities may be declared by using the unparsedEntitySpec production.
<-- Category: top-level -->
<unparsedEntity
export = "true" | "false"
name = NMTOKEN
notation = NMTOKEN
public = CDATA
system = CDATA />
Example
<unparsedEntity name='SKU-5782-pic' system='http://www.vendor.com/SKU-5782.jpg' notation='JPEG' />
<picture location='SKU-5782-pic'/>
The picture element carries an attribute which is (presumably) governed by the unparsed entity declaration.
The Attribute is Entity (§6.2.5) Schema Validity Constraint obtains.
Issue (unparsed-entity-gaps): There are lots of gaps and little problems in this design for unparsed entities.
A notation may be declared by specifying a name and an identifier for the notation. A notation may be referenced by name in a schema as part of an external entity declaration.
<-- Category: top-level -->
<notation
export = "true" | "false"
name = NMTOKEN
public = CDATA
system = CDATA />
Example
<notation name='jpeg' public='image/jpeg' system='viewer.exe' />
<element name='picture'>
  <archetype>
    <attribute name='entity' type='NOTATION'/>
  </archetype>
</element>

<picture entity='SKU-5782-pic'/>
The notation need not ever be mentioned in the instance document.
Issue (unparsed-entity-attributes): We need to synchronise with XML Schemas: Datatypes regarding how we declare attributes as unparsed entities!
This chapter describes facilities to provide for validation of namespace-qualified instance document elements and attributes, and potentially (subject to enhancements to the Namespaces recommendation), entities and notations.
NOTE: 'Namespaces in XML' [XML-Namespaces] provides an enabling framework for modular composition of schemas. From that document:
"We envision applications of Extensible Markup Language (XML) where a single XML document may contain elements and attributes (here referred to as a 'markup vocabulary') that are defined for and used by multiple software modules. One motivation for this is modularity; if such a markup vocabulary exists which is well-understood and for which there is useful software available, it is better to re-use this markup rather than re-invent it."
"Such documents, containing multiple markup vocabularies, pose problems of recognition and collision. Software modules need to be able to recognize the tags and attributes which they are designed to process, even in the face of 'collisions' occurring when markup intended for some other software package uses the same element or attribute name."
"These considerations require that document constructs should have universal names, whose scope extends beyond their containing document. This specification describes a mechanism, XML namespaces, which accomplishes this."
XML Schema: Structures provides facilities to enable declaration and modular composition of schemas.
The balance of this section as it appeared in the previous public WD has been withdrawn, pending further discussion in the WG. A Task Force created from within the WG has investigated a range of issues and options for implementing the desired functionality. The Task Force has produced a report [Composition TF Report], which will form the basis of a design to be filled in here.
Ed. Note: I've chosen to include here a brief summary of the expected TF report, addressing the instance->schema connection issue, on the grounds that confusion is rampant in the rest of the world wrt this issue, and we will benefit both ourselves and others by signalling our thinking here as soon as possible.
The TF report recommends a layered approach, where the base layer simply addresses the process of schema validation where the element information item to be validated and the schema to validate it with are known. The second layer discusses mechanisms by which processors may locate schemas, but emphasises that this is always in the end a processor- and environment-dependent process. The following candidate mechanisms are identified:
Issue (schema-search-ordered): It is worth noting that the TF has not reached consensus on whether the above list should be understood as ordered, that is, in the case where more than one is viable, should we identify a precedence.
[Abstract syntax productions: Export]
[Abstract syntax productions: Import]
[Abstract syntax productions: Import restrictions]
[Abstract syntax productions: Include]
NOTE: Documentation facilities have purposely been left out of this draft of the XML Schema Definition Language specification. The editors chose to concentrate on other topics. It is anticipated that explanation elements will be provided for within any of the Schema elements. Their purpose is to encapsulate documentation for the elements within which they are contained. Elements for narrative exposition at the top level of a schema have also been proposed.
Proposals for XML Schema documentation include defining a custom set of elements, allowing any content at all, allowing all or part of [HTML-4], DocBook or TEI. There are good arguments for each of these proposals.
The Working Group must identify its requirements and constraints.
Issue (error-behavior): This draft includes extensive discussion of conformance and validity checking, but rules for dealing with errors are missing. In future, we must distinguish errors from fatal errors, and clarify rules for dealing with both.
NOTE: This section is not up to date or in sync with the rest of the document, but is included here to avoid breaking huge numbers of references.
We approach the definition of schema validity one step at a time. In the definitions below we deal primarily in terms of information sets, rather than the documents which give rise to them (see [XML-Infoset] for definitions of item, RUE and information set). Please note that the formal definitions below are explicitly not couched in processing terms: they describe properties of an information set, but do not tell you how to check an information set to see if it has those properties.
First we have to get to the schema(s) involved. This is slightly tricky, as not all namespace declarations will resolve to schemas, and not everything that purports to be a schema will be one.
[Definition:] A URI is said to nominate a schema if it resolves to an element item in the information set of a well-formed XML 1.0 document whose local name is schema and whose namespace item's URI identifies either
or
[Definition:] A URI is said to resolve successfully to a schema if it nominates a schema, and the element item it resolves to represents an XML schema, that is:
[Definition:] An element item is schema-ready if the URI of any of its namespace declaration items which nominates a schema resolves successfully to a schema.
Issue (namespace-declaration-items): Namespace items associated with namespace declarations have disappeared from the most recent version [XML-Infoset]. Several WGs need them, we expect they'll be back, otherwise we can reconstruct what we need from element and attribute namespace items alone with some effort.
[Definition:] A document is schema-ready if every element item anywhere in its information set is schema-ready.
Note that this means that documents with no namespace declarations, or only namespace declarations which do not nominate schemas are none-the-less schema-ready.
[Definition:] We say an element item is schema-governed if its name is in a namespace, and the URI of the information item for that namespace resolves successfully to a schema.
[Definition:] We use the name schema root for any element item which is schema-governed and which is either
or
The provision within XML Schema: Structures of a mechanism for defining parsed entities presents problems for the relationship between schema-validity and XML 1.0 well-formedness, since references to entities declared only in a schema are undefined from the XML 1.0 perspective. Strictly speaking, a well-formed XML document may contain references to undefined entities only if it is declared as standalone='no' and contains either an external subset or one or more references to external parameter entities in its internal subset. We get around this by [Definition:] defining a nearly well-formed XML document to be one which either is well-formed per XML 1.0, or which fails to be well-formed only because of undefined general entity references, but which would be well-formed if it were standalone='no' and identified an external subset. We consider this justified on the grounds that the use of a namespace declaration which refers to a schema functions rather as an external subset, and from the XML 1.0 perspective such a reference almost of necessity renders the document non-standalone when schema-validation is applied.
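A small instance document may make the definition of nearly well-formed concrete. This is only a sketch: the element names dessert and scoop and the namespace URI are hypothetical, and it assumes that the schema the URI resolves to contains a textEntity declaration for flavor.

```xml
<?xml version='1.0'?>
<!-- No DTD declares the entity flavor, so this document is not
     well-formed per XML 1.0. It fails only because of that undefined
     general entity reference, which is the case the definition of
     nearly well-formed is meant to cover. -->
<dessert xmlns='http://www.example.org/schemas/dessert.xml'>
  <scoop>&flavor;</scoop>
</dessert>
```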
[Definition:] We use the name string-infoset-in-context for the XML 1.0 information set items arising from the interpretation of a string in the context of a particular point in an XML 1.0 information set.
[Definition:] The effective element item of an element item (call this OEI) is an element item whose
The Expansions Schema-Ready (§6.2.8) Schema Validity Constraint obtains.
The Ungoverned RUE (§6.2.8) Schema Validity Constraint obtains.
The RUE Entity Declared (§6.2.8) Schema Validity Constraint obtains.
Note that the above constraints and definition mean that in error-free documents, all element items, even ones which are not schema-governed, have well-defined effective element items.
[Definition:] A document is schema-valid if and only if:
NOTE: The validity of all other schema-governed element items follows from (3) above by the recursive nature of the Schema-validity Constraint referenced there.
NOTE: It is intentional that the above definition labels as schema-valid a document with no namespace declarations or with only namespace declarations which do not nominate schemas.
Note that there is no requirement that the schema root mentioned above be the root of its document, or that schemas be the roots of their documents, or that schema and schema root be in different documents. Accordingly, it is possible for a single schema-valid document to contain both a schema and the material which it validates.
The interaction between XML 1.0 DTDs and XML Schemas is complex but clear:
NOTE: The above is silent on whether schema-valid documents must be Namespace-conforming.
[Definition:] The augmented information set of a schema-valid document is the information set rooted in the effective element item of its document element, augmented by all the information items described in any Schema Information Set Contributions which apply to any information items anywhere within it.
Constraint on Schemas: Unique Definition
The same NCName must not appear in
two definitions or declarations of the same type.
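As an illustration only, the Unique Definition constraint can be checked by counting (kind, NCName) pairs. The sketch below assumes that "type" in the constraint means the kind of component (archetype, element, and so on); the function name and the tuple representation are hypothetical and not part of this specification.

```python
from collections import Counter

def duplicate_names(components):
    """Return the sorted (kind, ncname) pairs that occur more than once.

    `components` is an iterable of (kind, ncname) pairs; the kind labels
    ("archetype", "element", ...) are an assumed representation.
    """
    counts = Counter(components)
    return sorted(pair for pair, n in counts.items() if n > 1)

components = [
    ("archetype", "Address"),
    ("element", "Address"),    # same NCName, different kind: permitted
    ("archetype", "Items"),
    ("archetype", "Address"),  # second archetype named Address: an error
]
violations = duplicate_names(components)
```

Under this reading, only the repeated archetype name is reported; the element that shares its NCName is not.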
Constraint on Schemas: Consistent Import
A schemaAbbrev or schemaName in a schemaRef
must be declared in a Schema Import (§4.2.2) of the current
schema, and the NCName qualified by that
schemaRef must be an import (Import Restrictions (§4.2.3)) of the appropriate type per that
declaration.
Constraint on Schemas: One Reference Only
The concrete syntax uses schemaAbbrev and schemaName
attributes to realise schemaName. It is an error for both these attributes
to appear on the same element in a schema.
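A rough sketch of how a tool might look for violations of One Reference Only follows. It assumes that the import element is exempt, since the DTD given later in this document makes both attributes required there; that exemption, the function name and the sample document are assumptions made for illustration. The sample is deliberately left without a namespace so that element tags compare as plain names.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; a real schema would be in the XML Schema namespace,
# in which case ElementTree tags would carry a "{uri}" prefix.
SAMPLE = """
<schema>
  <import schemaAbbrev='po' schemaName='http://example.org/po'/>
  <archetype name='A'>
    <refines name='Base' schemaAbbrev='po'/>
    <attrGroupRef name='common' schemaAbbrev='po'
                  schemaName='http://example.org/po'/>
  </archetype>
</schema>
"""

def doubly_referenced(schema_text, exempt=("import",)):
    """List (tag, name) for elements carrying both reference attributes."""
    root = ET.fromstring(schema_text.strip())
    return [
        (el.tag, el.get("name"))
        for el in root.iter()
        if el.tag not in exempt
        and "schemaAbbrev" in el.attrib
        and "schemaName" in el.attrib
    ]

offenders = doubly_referenced(SAMPLE)
```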
[Definition:] A ...Ref identifies a ...Spec provided there is a definition or declaration of that ...Spec in the appropriate schema whose NCName matches the NCName of the ...Ref's ...Name. If there is no schemaRef in the ...Name, the appropriate schema is the current schema or a schema it eventually includes; if there is a schemaRef, the URI contained in or abbreviated by it must resolve successfully to a schema, which is then the appropriate schema.
Constraint on Schemas: Avoid Built-ins
The NCName must not be the same as
the name of any of the built-in datatypes (see [XML Schemas: Datatypes]).
[Definition:] A string (possibly empty) dt-satisfies a datatypeSpec and an optional datatypeQual if
and
Schema Information Set Contribution: Datatype Info
When a string dt-satisfies a
datatypeRef and an optional
datatypeQual, the containing attribute or
element information item will be augmented to indicate the
datatypeSpec and the specialize (if any) which it satisfied.
Constraint on Schemas: AttrGroup Unique
The same attrGroupDefn must not be
referenced by two or more attrGroupRefs in the
same archetypeSpec.
Constraint on Schemas: AttrGroup Identified
Every attrGroupRef in an
archetypeSpec must identify an attrGroupSpec.
[Definition:] The attribute declaration set of an archetypeSpec consists of all its effective attributes together with all the attributes contained in the attrGroupSpecs identified by any attrGroupRefs it contains.
[Definition:] The full name of an attribute in an attribute declaration set is its NCName plus its schemaName, determined as follows: if the attribute appeared directly in the archetypeSpec, the schemaName is the empty string; if it was acquired by refinement or came from an attrGroupSpec, it is the schemaName from the schemaRef which identified the relevant archetypeSpec or attrGroupSpec respectively, if there was one, and otherwise the empty string.
Constraint on Schemas: Attribute Locally Unique
The same full name must not
appear more than once in any archetypeSpec's
attribute declaration set.
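The interplay between full names and Attribute Locally Unique can be sketched as follows. This is a simplified model under stated assumptions: attributes acquired by refinement are ignored, each identified attrGroupSpec is represented as a (schemaName, NCNames) pair, and both function names are hypothetical.

```python
def attribute_declaration_set(direct, groups):
    """Build the full names (ncname, schema_name) of a declaration set.

    direct: NCNames declared directly in the archetypeSpec.
    groups: (schema_name, [ncname, ...]) per identified attrGroupSpec,
            with schema_name '' when the attrGroupRef had no schemaRef.
    """
    full_names = [(name, "") for name in direct]
    for schema_name, names in groups:
        full_names.extend((name, schema_name) for name in names)
    return full_names

def repeated_full_names(full_names):
    """Return each full name that appears more than once, in first-seen order."""
    seen, repeated = set(), []
    for full_name in full_names:
        if full_name in seen and full_name not in repeated:
            repeated.append(full_name)
        seen.add(full_name)
    return repeated

decls = attribute_declaration_set(
    direct=["id", "lang"],
    groups=[("", ["lang"]), ("http://example.org/common", ["id"])],
)
clashes = repeated_full_names(decls)
```

Here the locally grouped lang clashes with the directly declared lang, while the id that arrives through a schemaRef has a different full name and does not clash.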
[Definition:] An element item a-satisfies an archetypeSpec if the element item's attribute items taken together as a set attrs-satisfy the archetypeSpec's attribute declaration set, and either
or
Issue (sic-elt-default): The above definitions do not provide for handling a default on an archetype's datatypeRef. Preferred solution: empty element items ipso facto satisfy datatypeRefs with defaults and are augmented with the default value. This would have the consequence that you cannot provide the empty string as the explicit value of an element item if it's governed by a datatypeRef with a default.
Schema Information Set Contribution: Archetype Info
When an element item a-satisfies an
archetypeSpec, that element information item
will be augmented to indicate the archetypeSpec
which it satisfied.
[Definition:] An attribute item attr-satisfies an attribute if
or
where the attribute item's value consists of only character information items and by its "value string" is meant the string formed by concatenating the characters of each of those character information item children, if any, or else the empty string.
[Definition:] The attribute items of an element item attrs-satisfy an attribute declaration set if
and
Schema Information Set Contribution: Attribute Value Default
For every attribute in the
attribute declaration set
not used to attr-satisfy
an attribute item in the context of (1a) above which has a
datatypeRef which has a default, an
attribute item with the default value is added to the parent element item.
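The effect of the Attribute Value Default contribution can be pictured with a small sketch. It assumes that an element's attribute items are modelled as a dictionary and that each declaration maps an attribute name to its default (or None when there is none); these representations and the function name are illustrative assumptions only.

```python
def apply_attribute_defaults(attributes, declarations):
    """Return a copy of `attributes` with declared defaults filled in.

    attributes:   {name: value} as found on the element item.
    declarations: {name: default or None} from the attribute declaration set.
    """
    augmented = dict(attributes)
    for name, default in declarations.items():
        # Only declarations not matched by an attribute item contribute.
        if name not in augmented and default is not None:
            augmented[name] = default
    return augmented

# Defaults loosely modelled on the schema element's own attributes.
declarations = {"targetNS": None, "version": None, "model": "closed"}
augmented = apply_attribute_defaults({"version": "0.6"}, declarations)
```

An explicitly supplied value is never overwritten, and declarations without a default add nothing.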
[Definition:] A sequence of character and element items (call this CESeq) model-satisfies an effective contentModel if
or
Constraint on Schemas: Element Unique in Mixed
A given NCName must not appear two or
more times among the elementDecls and
elementRefs with no schemaRefs; a given elementName must not appear two or more times
among the elementRefs.
[Definition:] An element item mixed-satisfies a mixed if
or
or
Issue (mixed-change-current-schema): There's an implicit change in current schema in the definition of mixed-satisfy above which should be made explicit.
[Definition:] A sequence of element items elemOnly-satisfies an effective elemOnly if
NOTE: The above definition of elemOnly-satisfy does not explicitly incorporate the modifications required when the containing archetype is open, as set out at the end of Archetype Refinement * (§3.5), but it should be understood as doing so.
Constraint on Schemas: Element Consistency
A given NCName must not appear both
among the elementDecls and among the
elementRefs with no schemaRefs, or more than once among the
elementDecls.
NOTE: Note that the above permits repeated use of the same elementRef, analogous to DTD usage.
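A minimal sketch of an Element Consistency check is given here, assuming that the NCNames of the elementDecls and of the elementRefs without schemaRefs in one content model have already been collected into two lists. The function name and message wording are hypothetical.

```python
from collections import Counter

def element_consistency_errors(decl_names, ref_names):
    """Report NCNames that break Element Consistency in one content model."""
    errors = []
    decl_counts = Counter(decl_names)
    referenced = set(ref_names)
    for name in sorted(decl_counts):
        if decl_counts[name] > 1:
            errors.append(f"{name}: declared more than once")
        if name in referenced:
            errors.append(f"{name}: both declared and referenced")
    return errors

# Repeating the same elementRef is allowed; only the 'name' entries clash.
ok = element_consistency_errors(["shipTo", "shipDate", "Items"], ["comment"])
bad = element_consistency_errors(["name", "name"], ["name", "comment", "comment"])
```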
NOTE: EDITORS: Add a COS for the checking of valid pairs of minOccurs and maxOccurs.
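Pending such a constraint, one possible reading is sketched below. It relies on two remarks from the schema and DTD given later in this document: maxOccurs may be '*', and maxOccurs "defaults to 1 or minOccurs, whichever is greater". The rule that an explicit maxOccurs must not be less than minOccurs is an assumption, as is the function name.

```python
def occurrence_range(min_occurs="1", max_occurs=None):
    """Return (lo, hi) for an occurrence pair; hi is None for unbounded.

    Raises ValueError for pairs this sketch treats as invalid.
    """
    lo = int(min_occurs)
    if lo < 0:
        raise ValueError("minOccurs must be non-negative")
    if max_occurs is None:
        hi = max(1, lo)          # default: 1 or minOccurs, whichever is greater
    elif max_occurs == "*":
        hi = None                # unbounded
    else:
        hi = int(max_occurs)
    if hi is not None and hi < lo:
        raise ValueError("maxOccurs must not be less than minOccurs")  # assumed rule
    return lo, hi
```

For instance, an omitted maxOccurs with minOccurs='3' yields the range (3, 3) under the stated default.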
Constraint on Schemas: Unambiguous Content Model
For compatibility, it is an error if a content model is such that there
exist element item sequences within which some item can match more than one
occurrence of an elementRef or
elementDecl in the content model.
[Definition:] The full name of a top-level elementDecl is its NCName plus its schemaName, determined as follows: if the elementDecl appeared directly in the current schema or an include, the schemaName is the empty string; if it was imported, it is the schemaName of that import, which must successfully resolve to its containing schema.
[Definition:] An element item e-satisfies an elementDecl if the elementDecl:
or
Constraint on Schemas: Nested May Not Be Global
An elementSpec in a nested
elementDecl must not be global.
Constraint on Schemas: Cannot Shadow Global
If a top-level elementSpec is
global, then the NCName of its
elementDecl must not be redeclared by any
nested elementDecl in the same schema or
any schema it eventually
includes.
[Definition:] An element item is independently valid if there is a top-level elementDecl whose NCName matches its name in the schema its namespace item resolves to (or a schema that schema includes, in which case see the definition of identify for details on which declaration is used if there is more than one), and the element item e-satisfies that elementDecl.
[Definition:] An element item ref-satisfies an elementRef if
or
NOTE: The last clause above is much too complex: it needs to be split apart and built up in stages. It is this clause which allows elements based on refining archetypes to appear in place of those based on their ancestors.
Constraint on Schemas: Allowed Refinements
An archetype must not refine one or more other archetypes unless all of the
latter have been declared with either open or refinable
(explicitly or by default: the default for model on any
archetype which does not explicitly specify one is provided by the
model of the schema itself,
which in turn defaults to closed for compatibility). The same archetype
must not be referenced more than once in the refinements list.
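The Allowed Refinements constraint, including the defaulting of model, might be checked along the following lines. The dictionary representation of archetypes is an assumption, every refined archetype is assumed to be defined in the same dictionary, and the function names are hypothetical.

```python
def effective_model(archetype, schema_model="closed"):
    """An archetype's own model, else the schema's (which defaults to closed)."""
    return archetype.get("model") or schema_model

def refinement_errors(archetypes, schema_model="closed"):
    """Report refinements of closed archetypes and repeated refinements."""
    errors = []
    for name, arch in archetypes.items():
        refines = arch.get("refines", [])
        for base in sorted(set(refines)):
            if refines.count(base) > 1:
                errors.append(f"{name}: {base} listed more than once")
            model = effective_model(archetypes[base], schema_model)
            if model not in ("open", "refinable"):
                errors.append(f"{name}: {base} is closed")
    return errors

archetypes = {
    "named": {"model": "refinable"},
    "toplevel": {"model": "refinable", "refines": ["named"]},
    "sealed": {},  # no explicit model, so the schema default applies
    "bad": {"refines": ["sealed", "named", "named"]},
}
errors = refinement_errors(archetypes)
```

With the schema default left at closed, only the archetype named bad is reported; declaring the schema itself refinable would remove the complaint about sealed.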
Schema-validity Constraint: Attribute is Entity
When an attribute value is interpreted as a reference to an unparsed entity
[How?!], the attribute value must identify an unparsedEntitySpec (note that no
schemaRef can be specified in this case); the
NCName of the notationRef of that unparsedEntitySpec must
identify a notationSpec; and the resource specified by the
systemID and publicID
attributes must be available.
Constraint on Schemas: Refer to Schema
The URI associated with a schemaRef in any of
the productions above must successfully
resolve to a schema.
Constraint on Schemas: Name Consistently Defined
The NCName in each of the above
productions must identify a declaration or definition of the corresponding
class (element, archetype, etc.)
Constraint on Schemas: Preorder Priority for Included Definitions
When using a ...Ref to identify
a ...Spec, if there is no appropriate matching declaration or definition
in the current schema, but there is more than one
eventually included schema
which contains an appropriate matching declaration or definition, the
...Spec whose declaration or definition occurs first in a preorder
traversal of the eventually
included schemas is the one identified.
[Definition:] A schema directly includes another schema if the first schema has an include and the URI contained in or abbreviated by the schemaRef of that include resolves successfully to the second schema.
[Definition:] A schema eventually includes another schema if the first schema directly includes the second, or if the first schema directly includes some other schema which itself eventually includes the second.
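The two inclusion definitions and the Preorder Priority constraint can be sketched together. The sketch assumes that each schema's direct includes are available as an ordered list and that each schema's defined names are available as a set; both mappings, the schema labels and the function names are hypothetical. The guard against revisiting a schema is also an assumption, added so that cyclic includes terminate, with the side effect that a schema is never reported as including itself.

```python
def preorder(schema, includes, seen=None):
    """Yield the schemas eventually included by `schema`, in preorder."""
    if seen is None:
        seen = {schema}
    for child in includes.get(schema, []):
        if child not in seen:
            seen.add(child)
            yield child
            yield from preorder(child, includes, seen)

def eventually_includes(first, second, includes):
    """True if `first` directly or transitively includes `second`."""
    return second in preorder(first, includes)

def identify(name, current, includes, definitions):
    """Return the schema whose definition of `name` is the one identified."""
    if name in definitions.get(current, ()):
        return current  # the current schema takes precedence
    for schema in preorder(current, includes):
        if name in definitions.get(schema, ()):
            return schema
    return None

includes = {"main": ["a", "b"], "a": ["c"], "b": [], "c": []}
definitions = {
    "main": {"Order"},
    "a": set(),
    "b": {"Address"},
    "c": {"Address", "Zip"},
}
```

In this arrangement the preorder traversal from main visits a, then c, then b, so a reference to Address from main identifies the definition in c rather than the one in b.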
Schema-validity Constraint: Expansions Schema-Ready
Any element item anywhere within the string-infoset-in-context replacing an RUE child per
the above must be schema-ready.
Schema-validity Constraint: Ungoverned RUE
RUEs must not appear in element items
which are not schema-governed,
that is in the values of attributes of or as children of such elements.
Schema-validity Constraint: RUE Entity Declared
For every RUE appearing in a
schema-governed element, there
must be a parsed entity declaration in the referenced schema whose name matches
the name of the RUE.
NOTE: This section has fallen out of alignment with the rest of the specification, but is included nonetheless to give a feeling for how this section will eventually look: the details should not be taken too seriously.
Each step in the following presupposes the successful outcome of the previous step.
A conforming XML Schema processor must:
NOTE: Note that the schema contribution to the information set above is meant to be suggestive only at this point, until we've articulated all the Schema Information Set Contributions in the preceding sections.
The XML Schema definition for XML Schema: Structures itself is presented here as a normative part of the specification, and as an illustrative example of XML Schema defining itself with the very constructs that it defines. The names of XML Schema language types, elements, attributes and groups defined here are evocative of their purpose, but are occasionally verbose.
There is some annotation in comments, but a fuller annotation will require the use of embedded documentation facilities or a hyperlinked external annotation for which tools are not yet readily available.
Since an XML Schema: Structures is an XML document, it has optional XML and doctype declarations that are provided here for completeness. The root schema element defines a new schema. Since this is a schema for XML Schema: Structures, the targetNS references the XML Schema namespace itself, and specifies that this is version "0.8".
In the following definition of the schema element, the preamble is realised with attributes corresponding to targetNamespace, schemaVersion and model, and a sequence of nested elements for import, export and include. The xmlns attribute corresponds to xmlSchemaRef. The schema's definitions and declarations are represented by datatype, archetype, element, attribute, attrGroup, group, textEntity, externalEntity, unparsedEntity and notation.
<?xml version='1.0'?> <!-- XML Schema schema for XML Schemas: Part 1: Structures --> <!-- Id: structures.xsd,v 1.7 1999/10/27 13:25:46 ht Exp --> <!DOCTYPE schema PUBLIC "-//W3C//DTD XMLSCHEMA 19991105//EN" "structures.dtd"> <schema xmlns='http://www.w3.org/1999/XMLSchema' targetNS='http://www.w3.org/1999/XMLSchema' version='0.6'> <!-- The datatype element and all of its members are defined in XML Schema: Part 2: Datatypes --> <include schemaName='http://www.w3.org/TR/1999/WD-xmlschema-2-19991105/datatypes.xsd'/> <!-- The NCName datatype is widely used for the names of components --> <datatype name='NCName'><basetype name='NMTOKEN'/> </datatype> <!-- The public datatype is used for entities and notations --> <datatype name='public'><basetype name='string'/> </datatype> <!-- schema element --> <element name='schema'> <archetype> <element ref='import' minOccurs='0' maxOccurs='*'/> <element ref='include' minOccurs='0' maxOccurs='*'/> <element ref='export' minOccurs='0'/> <group order='choice' minOccurs='0'> <element ref='comment'/> <element ref='datatype'/> <element ref='archetype'/> <element ref='element'/> <element ref='attrGroup'/> <element ref='group'/> <element ref='textEntity'/> <element ref='externalEntity'/> <element ref='unparsedEntity'/> <element ref='notation'/> </group> <attribute name='targetNS' type='uri'/> <attribute name='version' type='string'/> <attribute name='xmlns' type='uri' default='http://www.w3.org/1999/XMLSchema'/> <attribute name='model' type='NCName' default='closed'> <enumeration> <literal>open</literal> <literal>refinable</literal> <literal>closed</literal> </enumeration> </attribute> </archetype> </element> <!-- comment element --> <element name='comment' type='string'/> <!-- ################################################ --> <!-- Toplevel and named: Fundamental archetypes, used hereafter to build the schema. --> <!-- A toplevel specifies an export control in addition to a name. 
--> <!-- ################################################ --> <!-- named archetype --> <archetype name='named' model='refinable'> <attribute name='name' type='NCName'/> </archetype> <!-- toplevel archetype --> <archetype name='toplevel' model='refinable'> <refines name='named'/> <attribute name='export' type='boolean'/> </archetype> <!-- ################################################ --> <!-- ######### Toplevel elements #################### --> <!-- ################################################ --> <!-- The datatype definition element is defined in XML Schema Part 2: Datatypes. --> <!-- qualifiable archetype For references which may be to components of other schemas. reference and typeRef are sub-types of qualifiable --> <archetype name='qualifiable' model='refinable'> <attribute name='schemaName' type='uri'/> <attribute name='schemaAbbrev' type='NCName'/> </archetype> <!-- reference archetype refines and attrGroupRef are kinds of reference --> <archetype name='reference' model='refinable'> <refines name='qualifiable'/> <attribute name='name' type='NCName'/> </archetype> <!-- for types (group and element) which use ref instead of name --> <archetype name='ref' model='refinable'> <refines name='qualifiable'/> <attribute name='ref' type='NCName'/> </archetype> <!-- typeRef archetype --> <!-- 'element', 'archetype' and 'attribute' are all kinds of typeRef --> <archetype name='typeRef' model='refinable'> <refines name='qualifiable'/> <attribute name='type' type='NCName'/> <attribute name='default' type='string'/> <attribute name='fixed' type='string'/> </archetype> <!-- modelGroup archetype --> <!-- modelGroup, group and archetype are all kinds of modelGroup --> <archetype name='modelGroup' model='refinable'> <attribute name='order' default='seq'> <enumeration> <literal>choice</literal> <literal>seq</literal> <literal>all</literal> <literal>many</literal> </enumeration> </attribute> </archetype> <!-- modelElt archetype --> <!-- the abstract class of all model 
elements: groups, elements and modelGroupRefs --> <archetype name='modelElt' model='refinable'> <attribute name='minOccurs' type='non-negative-integer' default='1'/> <attribute name='maxOccurs' type='string'/> <!-- allows '*', so integer won't do --> </archetype> <!-- The archetype element refines the toplevel, typeRef and modelGroup archetypes. It may include a refines element that specifies the archetype(s) that is is based on, and either a datatypeQual or a model, followed by any number of attribute and attrGroupRef elements. --> <!-- archetype element --> <element name='archetype'> <archetype> <refines name='toplevel'/> <refines name='typeRef'/> <refines name='modelGroup'/> <element ref='refines' minOccurs='0' maxOccurs='*'/> <group order='choice'> <element archRef='modelElt' minOccurs='0' maxOccurs='*'/> <element ref='datatypeQual' minOccurs='0'/> </group> <group order='choice' minOccurs='0' maxOccurs='*'> <element ref='attribute'/> <element ref='attrGroupRef'/> </group> <attribute name='content' default='elemOnly'> <enumeration> <literal>elemOnly</literal> <literal>textOnly</literal> <literal>mixed</literal> <literal>empty</literal> <literal>any</literal> </enumeration> </attribute> <attribute name='model' type='NCName'> <!-- default comes from schema model --> <enumeration> <literal>open</literal> <literal>refinable</literal> <literal>closed</literal> </enumeration> </attribute> </archetype> </element> <!-- refines element --> <element name='refines'> <archetype content='empty'> <refines name='reference'/> </archetype> </element> <!-- The element element refines the toplevel and typeRef archetype. It can be used either at the toplevel to define an element-type binding globally, or within a content model to either reference a globally-defined element or archetype or declare an element-type binding locally. 
The ref/archRef forms are not allowed at the top level --> <!-- element element --> <element name='element'> <archetype order='choice'> <refines name='toplevel'/> <refines name='ref'/> <refines name='typeRef'/> <refines name='modelElt'/> <element ref='datatype'/> <element ref='archetype'/> </archetype> </element> <!-- The group element refines the toplevel, modelElt, reference and modelGroup archetypes. --> <!-- group element --> <element name='group'> <archetype> <refines name='toplevel'/> <refines name='ref'/> <refines name='modelElt'/> <refines name='modelGroup'/> <element archRef='modelElt' minOccurs='0' maxOccurs='*'/> </archetype> </element> <!-- The datatypeQual archetype provides for modifying datatypes referenced from attribute declarations and archetype definitions. The 'facets' group is defined in the datatype schema. It is realised by the datatypeQual element and refined by the attribute element --> <archetype name='datatypeQual' order='many'> <group ref='facets'/> </archetype> <!-- The datatypeQual element realises the datatypeQual archetype --> <element name='datatypeQual' type='datatypeQual'/> <!-- the attribute element declares attributes --> <element name='attribute'> <archetype> <refines name='datatypeQual'/> <refines name='typeRef'/> <refines name='named'/> <attribute name='minOccurs' type='non-negative-integer' default='0'> <enumeration> <literal>0</literal> <literal>1</literal> </enumeration> </attribute> <attribute name='maxOccurs' type='integer' fixed='1'/> </archetype> </element> <!-- attrGroup element --> <element name='attrGroup'> <archetype> <refines name='toplevel'/> <group order='choice' minOccurs='1' maxOccurs='*'> <element ref='attribute'/> <element ref='attrGroupRef'/> </group> </archetype> </element> <!-- The attrGroupRef element refines the reference archetype. 
--> <!-- attrGroupRef element --> <element name='attrGroupRef'> <archetype> <refines name='reference'/> </archetype> </element> <!-- The textEntity element refines the toplevel archetype. It provides for string content to specify the entity value. --> <!-- textEntity element --> <element name='textEntity'> <archetype type='string'> <refines name='toplevel'/> </archetype> </element> <!-- The externalRef archetype provides for specification of a uri, an optional public identifier, and a notation attribute. It refines the toplevel archetype --> <archetype name='externalRef' model='refinable' content='empty'> <refines name='toplevel'/> <attribute name='system' type='uri' minOccurs='1'/> <attribute name='public' type='public'/> </archetype> <!-- the typedExternalRef adds a required notation to an external ref --> <archetype name='typedExternalRef' model='refinable'> <refines name='externalRef'/> <attribute name='notation' type='NOTATION' minOccurs='1'/> </archetype> <!-- The externalEntity and unparsedEntity elements are based on the typedExternalRef archetype. --> <!-- externalEntity element --> <element name='externalEntity'> <archetype> <refines name='typedExternalRef'/> <attribute name='notation' fixed='XML'/> </archetype> </element> <!-- unparsedEntity element --> <element name='unparsedEntity'> <archetype> <refines name='typedExternalRef'/> </archetype> </element> <!-- The notation element refines the externalRef archetype. --> <element name='notation'> <archetype> <refines name='externalRef'/> </archetype> </element> <!-- ################################################ --> <!-- import, export and include --> <!-- The import, export and include elements all refine the restrictions archetype, whose attributes can be used to enable or disable import and export restrictions. Within import and include elements, references to the components of foreign schemas control their importation or inclusion, respectively. 
--> <!-- The import and include elements both refine external --> <!-- ################################################ --> <archetype name='restrictions' model='refinable'> <attribute name='datatypes' type='boolean' default='true'/> <attribute name='archetypes' type='boolean' default='true'/> <attribute name='elements' type='boolean' default='true'/> <attribute name='attrGroups' type='boolean' default='true'/> <attribute name='modelGroups' type='boolean' default='true'/> <attribute name='entities' type='boolean' default='true'/> <attribute name='notations' type='boolean' default='true'/> </archetype> <archetype name='external' order='choice' model='refinable'> <refines name='restrictions'/> <element ref='component' minOccurs='0' maxOccurs='*'/> <attribute name='schemaName' minOccurs='1' type='uri'/> </archetype> <!-- component element, used in external --> <element name='component'> <archetype content='empty'> <attribute name='name' type='NCName' minOccurs='1'/> <attribute name='type' minOccurs='1'> <enumeration> <literal>datatype</literal> <literal>archetype</literal> <literal>element</literal> <literal>attrGroup</literal> <literal>modelGroup</literal> <literal>entity</literal> <literal>notation</literal> </enumeration> </attribute> </archetype> </element> <!-- import element --> <element name='import'> <archetype> <refines name='external'/> <attribute name='schemaAbbrev' minOccurs='1' type='NCName'/> </archetype> </element> <!-- export element --> <element name='export'> <archetype content='empty'> <refines name='restrictions'/> </archetype> </element> <!-- include element --> <element name='include' type='external'/> <!-- ################################################ --> <!-- notations for use within XML Schema schemas --> <!-- ################################################ --> <notation name='XMLSchemaStructures' public='structures' system='http://www.w3.org/TR/1999/WD-xmlschema-1-19991105/structures.xsd'/> <notation name='XML' public='REC-xml-19980210' 
system='http://www.w3.org/TR/1998/REC-xml-19980210'/> </schema>
NOTE: And that is the end of the schema for XML Schema: Structures.
The DTD for XML Schema: Structures is given below. Note that there is no
implication here that schema
must be the root element of a document.
<!-- DTD for XML Schemas: Part 1: Structures --> <!-- Id: structures.dtd,v 1.9 1999/10/27 13:26:12 ht Exp --> <!ELEMENT schema ((import*, include*, export?, (comment | datatype | archetype | element | attrGroup | group | notation | textEntity | externalEntity | unparsedEntity)* ))> <!ATTLIST schema targetNS CDATA #IMPLIED version CDATA #IMPLIED xmlns CDATA 'http://www.w3.org/1999/XMLSchema' model (open|refinable|closed) 'closed' > <!ELEMENT import (component*) > <!ATTLIST import schemaAbbrev NMTOKEN #REQUIRED schemaName CDATA #REQUIRED datatypes (true|false) 'true' archetypes (true|false) 'true' elements (true|false) 'true' attrGroups (true|false) 'true' groups (true|false) 'true' entities (true|false) 'true' notations (true|false) 'true' > <!ELEMENT component EMPTY > <!ATTLIST component name NMTOKEN #REQUIRED type (datatype|archetype|element|attrGroup|group| entity|notation) #REQUIRED> <!ELEMENT export EMPTY > <!ATTLIST export datatypes (true|false) 'true' archetypes (true|false) 'true' elements (true|false) 'true' attrGroups (true|false) 'true' groups (true|false) 'true' entities (true|false) 'true' notations (true|false) 'true' > <!ELEMENT include (component*) > <!ATTLIST include schemaName CDATA #REQUIRED datatypes (true|false) 'true' archetypes (true|false) 'true' elements (true|false) 'true' attrGroups (true|false) 'true' groups (true|false) 'true' entities (true|false) 'true' notations (true|false) 'true' > <!-- --> <!-- comments contain text --> <!-- --> <!ELEMENT comment (#PCDATA) > <!-- The datatype element is defined in XML Schema: Part 2: Datatypes --> <!ENTITY % xs-datatypes PUBLIC "-//W3C//DTD XMLSCHEMA datatypes 19991105//EN" 'http://www.w3.org/TR/1999/WD-xmlschema-2-19991105/datatypes.dtd' > %xs-datatypes; <!-- --> <!-- an archetype is a named content type specification with attribute declarations--> <!-- --> <!ELEMENT archetype (refines*, ((element|group)*|datatypeQual?), (attribute|attrGroupRef)*)> <!-- note that datatypeQual only if type attr 
present --> <!ATTLIST archetype name NMTOKEN #IMPLIED content (textOnly|mixed|elemOnly|empty|any) 'elemOnly' model (open|refinable|closed) #IMPLIED order (choice|seq|all|many) #IMPLIED type NMTOKEN #IMPLIED default CDATA #IMPLIED fixed CDATA #IMPLIED schemaAbbrev NMTOKEN #IMPLIED schemaName CDATA #IMPLIED > <!-- Note that schemaAbbrev/Name, default|fixed only if type, type iff content='textOnly', in which case must name a datatype --> <!-- Note that if order is 'all', group/groupRef is not allowed --> <!-- If order is 'all', minOccurs==maxOccurs==1 on element --> <!-- Default for order is 'seq' unless content='mixed', in which case it's 'many' --> <!ELEMENT refines EMPTY> <!ATTLIST refines name NMTOKEN #REQUIRED schemaAbbrev NMTOKEN #IMPLIED schemaName CDATA #IMPLIED > <!-- --> <!-- an element is declared by either: a name and a type (either nested or referenced via the type attribute) or: a ref to an existing element declaration --> <!-- --> <!ELEMENT element ((archetype|datatype)?)> <!-- archetype or datatype only if no type|ref|archRef attribute --> <!-- ref|archRef not allowed at top level --> <!ATTLIST element name NMTOKEN #IMPLIED ref NMTOKEN #IMPLIED archRef NMTOKEN #IMPLIED type NMTOKEN #IMPLIED schemaAbbrev NMTOKEN #IMPLIED schemaName CDATA #IMPLIED minOccurs CDATA '1' maxOccurs CDATA #IMPLIED export (true|false) 'true' default CDATA #IMPLIED fixed CDATA #IMPLIED> <!-- type, ref and archRef are mutually exclusive. schemaName/Abbrev applies to whichever is there, not allowed if neither. 
If name is absent, ref is required --> <!-- maxOccurs defaults to 1 or minOccurs, whichever is greater --> <!ELEMENT group (element|group)*> <!ATTLIST group name NMTOKEN #IMPLIED export (true|false) 'true' collection (no|list) 'no' minOccurs CDATA '1' maxOccurs CDATA #IMPLIED order (choice|seq|all|many) 'seq' ref NMTOKEN #IMPLIED schemaAbbrev NMTOKEN #IMPLIED schemaName CDATA #IMPLIED> <!-- Three different functions: as a named group definition, as an anonymous grouping in a model and as a reference to a named group --> <!-- Name and export only at top level. Name and ref are mutually exclusive, as are ref and content --> <!-- Note that if order is 'all', group is not allowed inside. If order is 'all' THIS group must be or be referenced alone at the top level of a content model --> <!-- If order is 'all', minOccurs==maxOccurs==1 on element --> <!-- the entity reference below is discharged in datatypes.dtd --> <!ELEMENT datatypeQual (%facets;)*> <!-- --> <!-- an attribute declaration names an attribute specification --> <!-- --> <!ELEMENT attribute (%facets;)*> <!ATTLIST attribute name NMTOKEN #REQUIRED schemaAbbrev NMTOKEN #IMPLIED schemaName CDATA #IMPLIED type CDATA 'string' maxOccurs CDATA #FIXED '1' minOccurs (0|1) '0' default CDATA #IMPLIED fixed CDATA #IMPLIED> <!-- an attrGroup is a named collection of attribute decls --> <!ELEMENT attrGroup (attribute | attrGroupRef)+ > <!ATTLIST attrGroup name NMTOKEN #REQUIRED export (true|false) 'true' > <!ELEMENT attrGroupRef EMPTY > <!ATTLIST attrGroupRef name NMTOKEN #REQUIRED schemaAbbrev NMTOKEN #IMPLIED schemaName CDATA #IMPLIED > <!-- --> <!-- Entities and notations in XML Schema --> <!-- --> <!-- a textEntity can be referenced in documents of this type --> <!ELEMENT textEntity (#PCDATA) > <!ATTLIST textEntity name NMTOKEN #REQUIRED export (true|false) #FIXED 'true' > <!-- an externalEntity can be referenced in documents of this type --> <!ELEMENT externalEntity EMPTY > <!ATTLIST externalEntity name NMTOKEN 
#REQUIRED export (true|false) #FIXED 'true' public CDATA #IMPLIED system CDATA #REQUIRED notation NMTOKEN #FIXED 'XML'> <!-- declares notation to be a 1st class element or entity content types --> <!ELEMENT notation EMPTY > <!ATTLIST notation name NMTOKEN #REQUIRED export (true|false) #FIXED 'true' public CDATA #REQUIRED system CDATA #IMPLIED> <!-- an unparsedEntity can be referenced in documents of this type --> <!ELEMENT unparsedEntity EMPTY > <!ATTLIST unparsedEntity name NMTOKEN #REQUIRED export (true|false) #FIXED 'true' public CDATA #IMPLIED system CDATA #REQUIRED notation NMTOKEN #REQUIRED > <!NOTATION XMLSchemaStructures PUBLIC 'structures' 'http://www.w3.org/TR/1999/WD-xmlschema-1-19991105/structures.xsd' > <!NOTATION XML PUBLIC 'REC-xml-1998-0210' 'http://www.w3.org/TR/1998/REC-xml-19980210' >
Ed. Note: The Glossary has barely been started. An XSL macro will be used to collect definitions from throughout the spec and gather them here for easy reference.
The following have contributed material to this draft:
The editors acknowledge the members of the XML Schema Working Group, the members of other W3C Working Groups, and industry experts in other forums who have contributed directly or indirectly to the process or content of creating this document. The Working Group is particularly grateful to Lotus Development Corp. and IBM for providing teleconferencing facilities.
The current members of the XML Schema Working Group are:
Paula Angerstein, Vignette Corporation; David Beech, Oracle Corp.; Paul V. Biron, Health Level Seven; Allen Brown, Microsoft; Greg Bumgardner, Rogue Wave Software; Lee Buck, Extensibility; Dean Burson, Lotus Development Corporation; Peter Chen, Bootstrap Alliance and LSU; David Cleary, Progress Software; Dan Connolly, W3C (staff contact); Andrew Eisenberg, Progress Software; Rob Ellman, Calico Technology; David Ezell, Hewlett Packard Company; David Fallside, IBM; Matthew Fuchs, Commerce One; Paul Grosso, ArborText, Inc.; Dave Hollander, CommerceNet (co-chair); Mary Holstege, Calico Technology; Jane Hunter, Distributed Systems Technology Centre (DSTC Pty Ltd); Renato Iannella, Distributed Systems Technology Centre (DSTC Pty Ltd); Rick Jelliffe, Academia Sinica; Dianne Kennedy, Graphic Communications Association; Setrag Khoshafian, Technology Deployment International (TDI); Janet Koenig, Sun Microsystems; Ara Kullukian, Technology Deployment International (TDI); Andrew Layman, Microsoft; Dmitry Lenkov, Hewlett Packard Company; Eve Maler, ArborText, Inc.; Ashok Malhotra, IBM; Murray Maloney, Commerce One; John McCarthy, Lawrence Berkeley National Laboratory; Noah Mendelsohn, Lotus Development Corporation; Don Mullen, Extensibility; Murata Makoto, Xerox; Frank Olken, Lawrence Berkeley National Laboratory; Dave Peterson, Graphic Communications Association; Mark Reinhold, Sun Microsystems; Shriram Revankar, Xerox; Jonathan Robie, Software AG; Lew Shannon, NCR; C. M. Sperberg-McQueen, W3C (co-chair); Henry S. Thompson, University of Edinburgh; Matt Timmermans, Microstar; Jim Trezzo, Oracle Corp.; Steph Tryphonas, Microstar; Mark Tucker, Health Level Seven; Priscilla Walmsley, XMLSolutions; Aki Yoshida, SAP AG
The XML Schema Working Group has benefited in its work from the participation and contributions of a number of people not currently members of the Working Group, including in particular those named below. Affiliations given are those current at the time of their work with the WG.
Gabe Beged-Dov, Rogue Wave Software; George Feinberg, Object Design; Charles Frankston, Microsoft; Ernesto Guerrieri, Inso; Michael Hyman, Microsoft; Chris Olds, Wall Data; William Shea, Merrill Lynch; Ralph Swick, W3C; Tony Stewart, Rivcom
Example
An example of a full-blown schema for the PurchaseOrder example from Schemas, Types and Elements (§2.3):
<schema targetNS='http://www.myOrg.com/bob/PurchaseOrder'
        xmlns='http://www.w3.org/1999/XMLSchema'>

  <element name='PurchaseOrder' type='PurchaseOrderType'/>
  <element name='comment' type='string'/>

  <archetype name='PurchaseOrderType'>
    <element name='shipTo' type='Address'/>
    <element name='shipDate' type='date'/>
    <element ref='comment' minOccurs='0'/>
    <element name='Items' type='Items'/>
    <attribute name='orderDate' type='date'/>
  </archetype>

  <archetype name='Address'>
    <element name='name' type='string'/>
    <element name='street' type='string'/>
    <element name='city' type='string'/>
    <element name='state' type='string'/>
    <element name='zip' type='number'/>
    <attribute name='type' type='string'/>
  </archetype>

  <archetype name='Items'>
    <element name='Item' minOccurs='0' maxOccurs='*'>
      <archetype>
        <element name='productName' type='string'/>
        <element name='quantity'>
          <datatype>
            <basetype name='integer'/>
            <minExclusive>0</minExclusive>
          </datatype>
        </element>
        <element name='price' type='number'/>
        <element ref='comment' minOccurs='0'/>
      </archetype>
    </element>
  </archetype>

</schema>
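For orientation, an instance document that the schema above is presumably intended to describe might look like the following sketch. All data values are invented for illustration. The sketch assumes that child elements appear in the order in which they are declared within each archetype, that date values use an ISO 8601 style lexical form, and that instance elements live in the namespace named by targetNS. None of these points is settled by this appendix, and given the preliminary state of the design any of them may change.

```xml
<!-- Hypothetical instance; every value below is made up for illustration. -->
<!-- Assumption: elements are in the schema's targetNS namespace. -->
<PurchaseOrder xmlns='http://www.myOrg.com/bob/PurchaseOrder'
               orderDate='1999-11-05'>
  <!-- Assumption: children follow the declaration order of PurchaseOrderType. -->
  <shipTo type='business'>
    <name>Example Recipient</name>
    <street>1 Example Street</street>
    <city>Anytown</city>
    <state>CA</state>
    <zip>90210</zip>
  </shipTo>
  <!-- Assumption: the date datatype accepts this lexical form. -->
  <shipDate>1999-11-12</shipDate>
  <!-- comment is optional here (minOccurs='0'). -->
  <comment>Leave at the front desk.</comment>
  <Items>
    <!-- Item may occur zero or more times (minOccurs='0' maxOccurs='*'). -->
    <Item>
      <productName>Widget</productName>
      <!-- quantity is an integer greater than 0 (minExclusive 0). -->
      <quantity>2</quantity>
      <price>9.99</price>
    </Item>
  </Items>
</PurchaseOrder>
```

Under the same assumptions, an Item with a quantity of 0 would be rejected by the minExclusive facet, and both comment elements could be omitted without affecting validity.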
$Log: structures.xml,v $
Revision 1.15.1.11 1999/11/05 17:19:14 aqw
typo in stylesheet name
Revision 1.15.1.10 1999/11/05 15:41:53 aqw
fix some more dates -> entref, remove some more ../'s
Revision 1.15.1.9 1999/11/05 15:12:55 aqw
oops, move DTD down a level too
Revision 1.15.1.8 1999/11/05 15:05:28 aqw
just before PWD: fix some uses of ...base; entities, remove Id:, make resource pointers same level, not ../, to accommodate Hugo's release directory structure
Revision 1.15.1.7 1999/11/04 23:42:26 aqw
one last link fix
Revision 1.15.1.6 1999/11/04 22:03:50 aqw
more last-minute link fixes
Revision 1.15.1.5 1999/11/04 21:35:44 aqw
adjust URLs, membership, status
Revision 1.15.1.4 1999/11/03 21:32:52 aqw
example fixed per David Beech suggestion
Revision 1.15.1.3 1999/11/03 21:19:05 aqw
more on editors and acks fix some typos courtesy of David Beech
Revision 1.15.1.2 1999/11/03 20:21:03 aqw
editor emails
Revision 1.15.1.1 1999/11/03 19:52:45 aqw
editor and author fixes for PWD
Revision 1.15 1999/10/27 13:28:58 ht
Fix some (all?) syntax paradigms, examples Include bug-fixed .xsd and .dtd
Revision 1.14 1999/10/27 10:48:01 ht
Incorporate up-to-date schema and DTD, completing concrete syntax changes Parameterise paths/dates to facilitate release process
Revision 1.13 1999/10/09 10:49:40 ht
correct headline date
Revision 1.12 1999/10/05 09:56:19 ht
Preliminary implementation of A3 and A7 (ampConnector and richerMixed) votes. Moving towards a parallel syntax for elementDecl/Ref and groupDefn/Ref. Concrete syntax paradigms, examples, DTD and Schema NOT up-to-date
Revision 1.11 1999/09/27 16:31:07 ht
merge simple back to main branch
Revision 1.10.2.38 1999/09/27 16:29:02 ht
return to xmlschema-current as base
Revision 1.10.2.37 1999/09/24 16:40:22 ht
add comments archive pointer
Revision 1.10.2.36 1999/09/24 16:38:23 ht
link housekeeping, move TF reports bibliography to separate appendix
Revision 1.10.2.35 1999/09/24 13:44:27 ht
final (?) housekeeping before publication
Revision 1.10.2.34 1999/09/23 18:48:51 ht
changes to front matter in preparation for public WD ponter to Simple TF included
Revision 1.10.2.33 1999/09/23 13:32:15 ht
up-to-date pointer to refinement TF report
Revision 1.10.2.32 1999/09/23 13:00:22 ht
typo in db entity
Revision 1.10.2.31 1999/09/23 12:59:04 ht
per suggestions from Ashok, some rewording of summary of Composition TF, added issue regarding priority of instance->schema alternatives
Revision 1.10.2.30 1999/09/22 14:02:35 ht
typo in correction to 4.1
Revision 1.10.2.29 1999/09/22 13:58:39 ht
edits implementing Noah's comments
Revision 1.10.2.28 1999/09/22 08:07:07 ht
add verbatim change log at end
----------------------------
revision 1.10.2.27
date: 1999/09/21 16:26:11; author: ht; state: Exp; lines: +4 -4
added $Id: structures.xml,v 1.15.1.11 1999/11/05 17:19:14 aqw Exp $ to title for now
----------------------------
revision 1.10.2.26
date: 1999/09/21 16:06:08; author: ht; state: Exp; lines: +42 -244
replaced composition tf report with a summary and a pointer
----------------------------
revision 1.10.2.25
date: 1999/09/21 14:11:50; author: aqw; state: Exp; lines: +495 -111
some dates, up-to-date DTD and Schema for schemas
----------------------------
revision 1.10.2.24
date: 1999/09/21 10:50:37; author: ht; state: Exp; lines: +18 -3
supply missing content model for 'attribute' in concrete syntax paradigm
----------------------------
revision 1.10.2.23
date: 1999/09/21 10:37:51; author: aqw; state: Exp; lines: +21 -20
define/declare consistency pass
----------------------------
revision 1.10.2.22
date: 1999/09/20 13:08:36; author: aqw; state: Exp; lines: +47 -49
track datatype content model changes, minor wording
----------------------------
revision 1.10.2.21
date: 1999/09/16 14:55:17; author: ht; state: Exp; lines: +136 -14
header disclaimer, graveyards rescued to discharge references
----------------------------
revision 1.10.2.20
date: 1999/09/16 14:25:01; author: aqw; state: Exp; lines: +274 -1541
rip out all of 3.5, all of 4, install 'Draft Proposal' in 4
----------------------------
revision 1.10.2.19
date: 1999/09/16 12:08:59; author: aqw; state: Exp; lines: +107 -143
Clean up import/include/export, references in particular Add archetypeRef to content models, minimally New example of datatype+attr
----------------------------
revision 1.10.2.18
date: 1999/09/15 22:06:29; author: aqw; state: Exp; lines: +26 -3
Two clarifications following discussion with andrew 1) what it would take to remove the two symbol spaces problem 2) How <archetype> allows either datatypeRef or contentType
----------------------------
revision 1.10.2.17
date: 1999/09/15 20:30:49; author: aqw; state: Exp; lines: +114 -105
change date, incorporate edited dtd
----------------------------
revision 1.10.2.16
date: 1999/09/15 19:52:39; author: aqw; state: Exp; lines: +90 -76
Encorporate/respond to Eve Maler's suggested edits
----------------------------
revision 1.10.2.15
date: 1999/09/13 16:14:12; author: aqw; state: Exp; lines: +306 -335
Finish consistency pass through 3.4 Brutal 'element type' -> element
----------------------------
revision 1.10.2.14
date: 1999/09/09 14:22:29; author: aqw; state: Exp; lines: +53 -56
cleanup pass, down to 3.3
----------------------------
revision 1.10.2.13
date: 1999/09/08 18:23:47; author: ht; state: Exp; lines: +41 -41
more type back to archetype
----------------------------
revision 1.10.2.12
date: 1999/09/08 18:03:06; author: aqw; state: Exp; lines: +214 -216
put archetype back in, imperfectly, I expect
----------------------------
revision 1.10.2.11
date: 1999/09/07 21:50:36; author: bu; state: Exp; lines: +124 -63
fix paradigm contexts, extend example, consolidate example in appendix
----------------------------
revision 1.10.2.10
date: 1999/09/07 16:54:39; author: aqw; state: Exp; lines: +514 -521
syntax paradigms now properly distributed, I think
----------------------------
revision 1.10.2.9
date: 1999/09/07 15:53:06; author: ht; state: Exp; lines: +5 -8
fixed minor validity errors
----------------------------
revision 1.10.2.8
date: 1999/09/07 15:31:58; author: aqw; state: Exp; lines: +288 -285
working on integrating syntax paradigms
----------------------------
revision 1.10.2.7
date: 1999/09/07 09:44:57; author: aqw; state: Exp; lines: +630 -33
added ALL concrete syntax boxes at once
----------------------------
revision 1.10.2.6
date: 1999/09/06 14:55:04; author: ht; state: Exp; lines: +35 -2
added one e: syntax exposition
----------------------------
revision 1.10.2.5
date: 1999/09/02 15:28:27; author: ht; state: Exp; lines: +6 -6
fix URLs for self, a bit
----------------------------
revision 1.10.2.4
date: 1999/09/02 12:53:34; author: aqw; state: Exp; lines: +108 -95
Added not-status-quo marks, changed e.g. String to string
----------------------------
revision 1.10.2.3
date: 1999/09/01 17:02:14; author: aqw; state: Exp; lines: +587 -977
integration of 2.3 from simple more renaming
----------------------------
revision 1.10.2.2
date: 1999/08/23 15:32:16; author: aqw; state: Exp; lines: +730 -248
Modified simple integration to give preliminary consistency
----------------------------
revision 1.10.2.1
date: 1999/08/22 17:44:40; author: aqw; state: Exp; lines: +317 -260
Textual integration of Simple update of 1999-08-13
----------------------------
revision 1.10
date: 1999/07/20 19:47:27; author: ht; state: Exp; lines: +5 -5
branches: 1.10.2;
fixed dates, dangling reference
----------------------------
revision 1.9
date: 1999/07/19 09:31:26; author: ht; state: Exp; lines: +34 -38
David Beech: updated definition of "Schema" following WG and IG email discussion. Changed "Schemata" to "Schemas" except where directly quoted from Requirements doc. Clarified in 2.5 that elements and attributes have separate symbol spaces (public comment). Fixed assorted typos.
----------------------------
revision 1.8
date: 1999/06/23 10:00:31; author: aqw; state: Exp; lines: +1 -1
fix $Id: structures.xml,v 1.15.1.11 1999/11/05 17:19:14 aqw Exp $
----------------------------
revision 1.7
date: 1999/06/23 09:51:15; author: aqw; state: Exp; lines: +28 -28
Restrict content model of 'all' in schema and dtd, change entities for point releases
----------------------------
revision 1.6
date: 1999/06/23 09:10:01; author: aqw; state: Exp; lines: +147 -187
pushed & down to lowest level, fixed incoherent validity definition in 6.2.3.7 to agree with the note which follows. Wrapped validation text from 3.4 in appropriately named div4's
----------------------------
revision 1.5
date: 1999/06/21 16:31:59; author: aqw; state: Exp; lines: +569 -551
Really moved validity-oriented definitions to 6.3 (previous revision was just housekeeping)
----------------------------
revision 1.4
date: 1999/06/21 16:25:21; author: aqw; state: Exp; lines: +45 -36
moved validity-oriented definitions to 6.3
----------------------------
revision 1.3
date: 1999/06/21 12:25:21; author: aqw; state: Exp; lines: +3540 -3650
Low-level: Normalise line ends, quotes Editorial: Move all constraintnotes to new separate section
----------------------------
revision 1.2
date: 1999/05/27 14:13:54; author: aqw; state: Exp; lines: +2 -2
fix stylesheet and dtd urls to local versions
----------------------------
revision 1.1
date: 1999/05/23 16:51:11; author: ht; state: Exp;
branches: 1.1.1;
Initial revision