Copyright ©1999 W3C® (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
XML Schema: Structures is part 1 of a two-part draft of the specification for the XML Schema definition language. This document proposes facilities for describing the structure and constraining the contents of XML 1.0 documents. The schema language, which is itself represented in XML 1.0, provides a superset of the capabilities found in XML 1.0 document type definitions (DTDs).
This is a public working draft of XML Schema 1.0 for review by the public and by members of the World Wide Web Consortium.
It has been reviewed by the XML Schema Working Group, and the Working Group has agreed to its publication. The WG believes this draft to be 'feature-complete': the functionality included here is substantially complete and is expected to be stable. We do not expect to add major new functionality, or to make major changes to the functionality described in this draft. Some sections of the draft (in particular those on conformance), and some aspects of the design (in particular details of the transfer syntax for schemas), on the other hand, are still rough and are expected to be revised.
The WG expects to spend January, 2000, working out details, clarifying points of uncertainty that arise in the review of this draft, cleaning up inconsistencies, reviewing the design of the concrete transfer syntax, and making editorial improvements.
Following that period of review and polishing, it is the WG's intent to issue a Last Call for Review by other W3C working groups sometime during February, 2000, and to submit this specification in March, 2000, for publication as a Candidate Recommendation. This schedule may vary, depending on the comments of the public and of other W3C working groups on this draft. Such comments are instrumental in the WG's deliberations, and we encourage readers to review the draft and send comments to www-xml-schema-comments@w3.org (archive).
Although the Working Group does not anticipate further substantial changes to the functionality described here, this is still a working draft, subject to change based on experience and on comment by the public and other W3C working groups. The present version should be implemented only by those interested in providing a check on its design or by those preparing for an implementation of the Candidate Recommendation. The Schema WG will not allow early implementation to constrain its ability to make changes to this specification prior to final release.
A list of current W3C working drafts can be found at http://www.w3.org/TR/. They may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress".
This document sets out the structural part (XML Schema: Structures) of the XML Schema definition language.
Chapter 2 presents a Conceptual Framework (§2) for XML Schema: Structures, including an introduction to schema constraints, types, schema composition, and symbol spaces. The abstract and concrete syntax of XML Schema: Structures are introduced, along with other terminology used throughout the specification.
Chapter 3 Schema Definitions and Declarations (§3) reconstructs the core functionality of XML 1.0, plus a number of extensions, in line with our stated requirements [XML Schema Requirements]. This chapter discusses the declaration and use of simple and complex types, elements, content models, attributes, attribute groups, model groups and inheritance.
Chapter 4 presents Schema Access and Composition (§4), including the validation of namespace qualified instance documents, import and inclusion of declarations and definitions, access to schemas, and the foundations of schema-validity.
Chapter 5 describes provision for including documentation in the definition of a schema.
Chapter 6 discusses Conformance * (§6), including the rules by which instance documents are validated, and responsibilities of schema-aware processors.
The normative addenda include a (normative) DTD for Schemas (§B) and a (normative) Schema for Schemas (§A), which is an XML Schema schema for XML Schema: Structures, a Glossary (normative) * (§C) [not yet written] and References (normative) * (§D). Non-normative appendixes include a Sample Schema (non-normative) (§F) and Acknowledgments (non-normative) (§E).
This Working Draft document was produced using an [XML] DTD and an [XSLT] stylesheet.
The following highlighting is used to present technical material in this document:
[Definition:] A term is something we use a lot.
Example
A non-normative example illustrating use of the schema language, or a related instance.
<schema name="http://www.muzmo.com/XMLSchema/1.0/mySchema" >
And an explanation of the example.
The following highlighting is used for non-normative commentary in this document:
Issue (dummy): A recorded issue.
Ed. Note: Notes from the editors to themselves or the Working Group.
NOTE: General comments directed to all readers.
The purpose of XML Schema: Structures is to provide an inventory of XML markup constructs with which to write schemas.
The purpose of an XML Schema: Structures schema is to define and describe a class of XML documents by using these constructs to constrain and document the meaning, usage and relationships of their constituent parts: datatypes, elements and their content, attributes and their values. Schema constructs may also provide for the specification of additional information such as default values. Schemas are intended to document their own meaning, usage, and function through a common documentation vocabulary. Thus, XML Schema: Structures can be used to define, describe and catalogue XML vocabularies for classes of XML documents.
Any application that consumes well-formed XML can use the XML Schema: Structures formalism to express syntactic, structural and value constraints applicable to its document instances. The XML Schema: Structures formalism will allow a useful level of constraint checking to be described and validated for a wide spectrum of XML applications. However, the language defined by this specification does not attempt to provide all the facilities that might be needed by any application. Some applications may require constraint capabilities not expressible in this language, and so may need to perform their own additional validations.
The definition of XML Schema: Structures is a part of the W3C XML Activity. It is in various ways related to other ongoing parts of that Activity and other W3C WGs.
XML Schema: Structures defines its own Information Set Contributions which are compatible with [XML-Infoset] although not defined as such therein.
The terminology used to describe XML Schema: Structures is defined in the body of this specification. The terms defined in the following list are used in building those definitions and in describing the actions of XML Schema: Structures processors:
This specification uses a number of terms that are common to many of the fields of endeavor that have influenced the development of XML Schema. Unfortunately, it is often the case that these terms do not have the same definitions in all of those fields. This section attempts to provide definitions of terms as they are used to describe the conceptual framework, and the remainder of the specification.
Since XML schemas are themselves specified as XML documents or elements within documents, it is useful to clarify the relationships between certain kinds of XML documents and elements:
Note that it is possible to specify a schema to which schemas themselves must conform, and this is given in (normative) Schema for Schemas (§A). An XML 1.0 DTD to which schemas must conform is also provided in (normative) DTD for Schemas (§B).
The [XML] specification describes two kinds of constraints on XML documents: well-formedness and validity constraints. Informally, the well-formedness constraints are those imposed by the definition of XML itself (such as the rules for the use of the < and > characters and the rules for proper nesting of elements), while validity constraints are the further constraints on document structure provided by a particular DTD.
Three kinds of normative statements about the impact of XML Schema: Structures components on instances are distinguished in this specification:
NOTE: Schema Information Set Contributions are not as new as might at first appear: XML 1.0 validation augments the XML 1.0 information set in similar ways, e.g. by providing values for attributes not present in instances, and by implicitly exploiting type information for normalization or access, e.g. consider the effect of NMTOKENS on attribute whitespace, and the semantics of ID and IDREF. By including Schema Information Set Contributions, we are trying to make explicit something XML 1.0 left implicit.
XML Schema: Structures not only reconstructs the DTD constraints of XML 1.0 using XML instance syntax, it also adds the ability to define new kinds of constraints. For example, although the author of an XML 1.0 DTD may declare an element type as containing character data, elements, or mixed content, there is no mechanism with which to constrain the contents of elements to only character data of a particular form, such as only numeral sequences representing integers in a specified range.
This specification supports the expression of just such constraints by including in the mechanism for the declaration of elements the option of specifying that its contents must consist of a valid string expression of a particular datatype. A number of other mechanisms are added which improve the expressive power, usability and maintainability of schemas as a means to defining the structure of XML documents.
The purpose of a schema is to identify a set of components for use in XML documents and to provide the rules for their correct combination.
The schema language itself defines an XML form for itself in terms of elements and attributes. We will describe these, and show how they are used. But first, a quick example of an XML document.
Example
<?xml version='1.0'?>
<PurchaseOrder orderDate="1999-05-20" xmlns="http://www.myco.com/MYPO">
  <shipTo type="US">
    <name>Alice Smith</name>
    <street>123 Maple Street</street>
    <city>Mill Valley</city>
    <state>CA</state>
    <zip>90952</zip>
  </shipTo>
  <shipDate>1999-05-25</shipDate>
  <comment>Get these things to me in a hurry, my lawn is going wild!</comment>
  <Items>
    <Item pno="333-333">
      <productName>Lawnmower, model BUZZ-1</productName>
      <quantity>1</quantity>
      <price>148.95</price>
      <comment>Please confirm this is the electric model</comment>
    </Item>
    <Item pno="444-444">
      <productName>Baby Monitor, model SNOOZE-2</productName>
      <quantity>1</quantity>
      <price>39.98</price>
    </Item>
  </Items>
</PurchaseOrder>
The purchase order consists of a main element with several subordinate elements. Most of the subelements have simple atomic types such as string or date, drawn from the repertoire of built-in simple types defined in [XML Schemas: Datatypes], but some are complex. We use the type element when declaring elements which allow elements in their content and/or may carry attributes. For example, we can define a type called Address as follows:
Example
<type name="Address" >
  <element name="name" type="string" />
  <element name="street" type="string" />
  <element name="city" type="string" />
  <element name="state" type="string" />
  <element name="zip" type="integer" />
  <attribute name="type" type="string" />
</type>
The consequence of this definition is that an element whose type is declared to be Address must consist of five elements and may have one attribute. Though each has a distinct name, four of the elements and the attribute will simply contain a string in a document instance while one will contain an integer.
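As a rough illustration of what this definition demands of an instance, the following Python sketch checks a shipTo-like element against the five children of Address. It is not a schema processor. The function name `check_address` and the hard-coded child list are assumptions made for this example, as are the treatment of the children as an ordered sequence and the use of Python's `int` as a stand-in for the integer simple type; namespaces are omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Children of Address in declaration order, each paired with a Python
# callable that approximates its simple type (an assumption of this sketch).
ADDRESS_CHILDREN = [
    ("name", str), ("street", str), ("city", str),
    ("state", str), ("zip", int),
]

def check_address(elem):
    """Return a list of problems; an empty list means none were found."""
    problems = []
    found = [child.tag for child in elem]
    expected = [name for name, _ in ADDRESS_CHILDREN]
    # Assumption: the five children must appear exactly once, in order.
    if found != expected:
        problems.append("expected children %s, found %s" % (expected, found))
        return problems
    for child, (name, parse) in zip(elem, ADDRESS_CHILDREN):
        try:
            parse((child.text or "").strip())
        except ValueError:
            problems.append("%s: %r is not a valid %s"
                            % (name, child.text, parse.__name__))
    # The only declared attribute is 'type', and it is optional.
    for attr in elem.attrib:
        if attr != "type":
            problems.append("undeclared attribute %r" % attr)
    return problems

ship_to = ET.fromstring(
    '<shipTo type="US"><name>Alice Smith</name>'
    '<street>123 Maple Street</street><city>Mill Valley</city>'
    '<state>CA</state><zip>90952</zip></shipTo>'
)
print(check_address(ship_to))  # prints []
```

An element whose zip content is not an integer, or which lacks one of the five children, yields a non-empty list of problems.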
If we're going to use the same element in a number of places, we can declare it once and refer to it by name elsewhere:
Example
<element name="comment" type="string" />
This declaration restricts the comment element to text content and no attributes.
We can define a PurchaseOrderType for our PurchaseOrder element, referring to the definitions of Address and comment as above, as:
Example
<type name="PurchaseOrderType">
  <element name="shipTo" type="po:Address" />
  <element name="shipDate" type="date" />
  <element ref="po:comment" minOccurs="0" />
  <element name="Items" type="po:Items" />
  <attribute name="orderDate" type="date" />
</type>
The shipDate element daughter of PurchaseOrderType is declared above as having a simple type, as in the Address example above. The comment daughter is declared by reference to a global element declaration. Since this definition is in the namespace being defined, and apparently the default namespace is being used for the schema elements themselves (e.g. element, attribute), we use a prefix (po) on this reference which would have to be declared with the same URI as the target namespace URI for the containing schema. Similarly, the shipTo and Items daughters are declared as having complex types which must be defined elsewhere in the current schema. The comment daughter and the orderDate attribute are optional, the others are obligatory.
Issue (type-decl-syntax): Further integration of the concrete syntax for type definitions is desirable, e.g. by using 'type' for both simple and complex types, but the details of a consistent and clear way to do this have not yet been agreed.
Since an element declaration's type can identify either a simple or a complex type, and there are separate symbol spaces for these two, the possibility of ambiguity arises. This is resolved in favour of the complex type, e.g. even if a simple type called Address existed (either built-in or user-defined), the above declaration for shipTo would refer to the user-defined complex type of that name.
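The precedence rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function `resolve_type` and the two dictionaries standing in for the simple and complex type symbol spaces are hypothetical names, not part of this specification.

```python
def resolve_type(name, complex_types, simple_types):
    """Look up a type name, preferring the complex type symbol space."""
    if name in complex_types:
        return ("complex", complex_types[name])
    if name in simple_types:
        return ("simple", simple_types[name])
    raise KeyError("no type named %r" % name)

# Hypothetical contents of the two symbol spaces; the simple type called
# Address exists only to show the shadowing described above.
complex_types = {"Address": "user-defined complex type"}
simple_types = {
    "Address": "hypothetical simple type",
    "string": "built-in simple type",
}

kind, _ = resolve_type("Address", complex_types, simple_types)
print(kind)  # prints complex
```

A name found only in the simple type symbol space, such as string, still resolves to the simple type.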
Issue (note-two-sses): The separation of the simple and complex type name symbol spaces is primarily motivated by the decision to allow unqualified reference to the ab initio and built-in simple types. Should this decision be reversed, as was suggested in the report of the simplification Task Force, then the unification of the two symbol spaces could proceed with minimal negative impact. The potential for error which arises from unexpected shadowing of an old simple type by a new complex type would be removed.
[Definition:] A definition creates a new type; [Definition:] a declaration enables the appearance in a document instance of an element or attribute with a specific name and type. In the schema, we see both the definition of several types, and also several elements and attributes declared as usages of these types. For example, Address is defined to be a type, while within the definition of Address we see five declarations of elements and one attribute declaration. These declarations are not themselves types, but rather an association between a name and constraints which govern the appearance of that name in documents governed by the containing schema.
In the case of attribute declarations, the constraints are on the allowed value, always by reference to a simple type:
Example
<attribute name="orderDate" type="date" />
In the case of element declarations, the constraints are on the allowed content and attributes, by reference to a complex or a simple type (in which case no attributes are allowed):
Example
<element name="shipTo" type="po:Address" />
<element name="comment" type="string" />
Because Address is defined in the schema to have certain elements as its content and to allow a certain attribute, any shipTo element appearing in an instance must include those elements and may have that attribute, while any comment element may not have any attributes, but may have any text content.
As well as naming a type in an attribute or element declaration, we can embed the type definition immediately within the element declaration:
Example
<type name="Items">
  <element name="Item" minOccurs="0" maxOccurs="*">
    <type>
      <element name="productName" type="string" />
      <element name="quantity">
        <datatype source="integer">
          <minExclusive value="0"/>
        </datatype>
      </element>
      <element name="price" type="decimal" />
      <element ref="po:comment" minOccurs="0" />
      <attribute name="pno" type="string"/>
    </type>
  </element>
</type>
Here not only is the type of the Item element given in line, but also the simple type of its quantity daughter (the built-in simple type integer) is qualified inline by adding a subrange constraint.
Taken together the examples above constitute a complete schema for the initial PurchaseOrder example instance. They are drawn together in a single complete schema in Sample Schema (non-normative) (§F).
[Definition:] Schemas are composed of schema components: a set of type definitions, attribute group definitions, model group definitions, element declarations, and attribute declarations. Note that it is the abstract idea of a component we are talking about here, along with its name: an XML element such as <element> is a standardized representation for a component, not the component itself.
The next chapter Schema Definitions and Declarations (§3) sets out the XML Schema: Structures approach to schemas, with formal definitions of their component parts and presentations of standardized representations for each of them. Here we informally summarize the key constructs used in defining schemas. A 'Yes' in the 'Name appears in instances?' column indicates that the name will appear in instances -- other names are for schema use only.
XML Schema: Structures Feature | Purpose | Named? | Name appears in instances? |
---|---|---|---|
The Schema (§3.1) | A wrapper element containing all the definitions and declarations comprising a schema. | Yes | No |
Simple Type Definition (§3.4.1) | A simple atomic type (content constraint), such as 'integer', that applies to character data in an instance document, whether it appears as an attribute value or the contents of an element. The mechanisms for defining simple types are set out elsewhere, in XML Schemas: Datatypes. | Yes | No |
Complex Type Definition (§3.4.2) | A complete set of constraints for elements in instance documents, applying to both contents and attributes. | Yes | No |
Element Declaration (§3.4.9) | An association between a name for an element and a type. An element declaration for 'A' is comparable to a DTD declaration <!ELEMENT A .....>. | Yes (local or global) | Yes |
Attribute Declaration (§3.4.3) | An association between a name for an attribute and a simple type. The association is local to its surrounding type. | Yes (local) | Yes |
Content type | Either a simple type or a content model. A content type applies to the contents of elements in an instance document (but not their attribute values). It provides a unifying abstraction for the constraints which apply to the contents of elements, but introduces no additional features. | No | No |
Element Content Model (§3.4.5) | A constraint that applies to the contents of elements in an instance document. May include specifications of grouping and sequencing. | No | No |
Attribute Group Definition (§3.4.4) | An association between a name and a reusable collection of attribute declarations. | Yes | No |
Deriving Type Definitions (§3.6) | One type may be defined as based on another type, acquiring content type and/or attributes therefrom. | Yes | No |
References to schema components across namespaces (§4.2.2) | Integrates definitions and/or declarations from elsewhere into the schema being defined, as if they had been defined locally. | No | No |
Unique, key and key reference constraints (§3.7) | Provide more powerful uniqueness and intra-document reference mechanisms | Yes | No |
As indicated in the third column of the table above, most of the components listed have names, which provide for references within the schema, and sometimes from one schema to another. For example, an attribute declaration can refer to a named type, such as 'integer'. A content model can refer to an element, and so on.
If all such names were assigned from the same 'pool', then it would be impossible to have e.g. a type named 'integer' and an element with the name 'integer' in the same schema. [Definition:] Accordingly we introduce the idea of a symbol space (avoiding 'name space' to avoid confusion with the term defined in [XML-Namespaces]). A symbol space is similar to the non-normative concept of namespace partition introduced in [XML-Namespaces].
There is a single distinct symbol space within a given schema for each of the abstractions named above other than 'Attribute' and 'element': within a given symbol space, names are unique, but the same name may appear in more than one symbol space without conflict. In particular note that the same name can refer to both a type and an element, without conflict or necessary relation between the two.
Attributes and local element declarations are special, in that every type defines its own attribute symbol space and local element symbol space, which are distinct from each other. In addition, top-level elements (whose declarations are not contained within a type definition) reside in their own symbol space.
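One way to picture symbol spaces is as separate name tables, one per kind of component, plus one attribute table and one local element table per type. The Python sketch below is an assumed model for illustration only: the class `SymbolSpaces` and the keys used to label each space are hypothetical and are not defined by this specification.

```python
class SymbolSpaces:
    """A registry with one independent name table per symbol space."""

    def __init__(self):
        self._spaces = {}

    def declare(self, space, name, component):
        names = self._spaces.setdefault(space, {})
        # Names must be unique only within a single symbol space.
        if name in names:
            raise ValueError("%r already used in symbol space %r" % (name, space))
        names[name] = component

    def lookup(self, space, name):
        return self._spaces[space][name]

spaces = SymbolSpaces()
# The same name in two different symbol spaces does not conflict.
spaces.declare("simpleType", "integer", "the built-in simple type")
spaces.declare("element", "integer", "a top-level element declaration")
# Each type has its own attribute and local element symbol spaces,
# modelled here (by assumption) as tuple keys naming the containing type.
spaces.declare(("attribute", "Address"), "type", "attribute of Address")
spaces.declare(("localElement", "Address"), "name", "local element of Address")
spaces.declare(("localElement", "Items"), "name", "local element of Items")
```

Declaring integer a second time in the element space raises an error, while the declarations above coexist.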
The names of schema components such as type definitions and element declarations are not of type ID: as explained above, they are not unique within a schema, just within a symbol space. This means that simple fragment identifiers will not work to reference schema components. In the long run we expect to provide some mechanism suitable for referencing the semantic components of schemas as such. In the meantime, we observe that [XPointer] provides a mechanism which maps well onto our notion of symbol spaces. A fragment identifier of the form #xpointer(schema/element[@name="person"]) will uniquely identify the element declaration with name person, and similar fragment identifiers can obviously be constructed for the other top-level symbol spaces.
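The effect of such a fragment identifier can be approximated with the limited XPath subset in Python's `xml.etree.ElementTree`. This is a stand-in, not an XPointer implementation, and the small schema below is invented for the example. Because `findall` starts from the root element (here `schema`), the leading `schema/` step of the fragment identifier is omitted from the path.

```python
import xml.etree.ElementTree as ET

# A made-up schema in which a type and an element share the name 'person'.
schema = ET.fromstring(
    '<schema>'
    '<type name="person"/>'
    '<element name="person" type="person"/>'
    '<element name="comment" type="string"/>'
    '</schema>'
)

# Select top-level element declarations named 'person'. The type of the
# same name lives in a different symbol space and is not matched.
matches = schema.findall("element[@name='person']")
print(len(matches))  # prints 1
```

Replacing `element` with `type` in the path selects the type definition of the same name instead.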
Every element and attribute declaration is associated with a target namespace URI, or with no namespace. More specifically, each symbol space is associated directly (in the case of global declarations) or indirectly (in the case of local declarations) with a target namespace or with no namespace. So, the name of each global declaration is effectively qualified by the target namespace in which it is defined. Locally scoped element and attribute declarations are named in a symbol space defined by their containing type definition.
Global element and attribute declarations are used to validate instance document constructs in the namespace identified by the URI of the target namespace for the corresponding declaration. Declarations with a null target namespace validate non-namespace qualified instance document constructs.
The XML namespaces recommendation discusses only instance document syntax for elements and attributes; it therefore provides no direct framework for managing the names of types, attribute groups, and other facilities provided by XML schemas. Nevertheless, we apply the target namespace facility uniformly to all schema components. Specifically, the target namespace qualifies the symbol space for definitions as well as for declarations.
The above discussion requires that each global definition and declaration be associated with a target namespace. The standard XML format for schema definitions provides a "targetNamespace" attribute for the <schema> element.
<schema targetNamespace="someNSURI">
  ...every global declaration & def'n here...
  ...is in targetNamespace
</schema>
If specified, this supplies the same target namespace for all the definitions and declarations contained within that schema element. If absent, it indicates that all the definitions and declarations have a null target namespace.
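As an informal sketch, the following Python function pairs each named top-level component of a schema with the target namespace of the enclosing schema element, using `None` when the attribute is absent. The function name `global_names` and the sample schema text are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

def global_names(schema_text):
    """List (kind, target namespace, name) for each named top-level child."""
    root = ET.fromstring(schema_text)
    target = root.get("targetNamespace")  # None models the null namespace
    return [(child.tag, target, child.get("name"))
            for child in root if child.get("name") is not None]

with_ns = global_names(
    '<schema targetNamespace="someNSURI">'
    '<type name="Address"/>'
    '<element name="comment" type="string"/>'
    '</schema>'
)
without_ns = global_names('<schema><element name="comment" type="string"/></schema>')
print(with_ns[0])     # prints ('type', 'someNSURI', 'Address')
print(without_ns[0])  # prints ('element', None, 'comment')
```

Both definitions (the type) and declarations (the element) receive the same target namespace, in line with the uniform treatment described above.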
XML Schema: Structures is presented here primarily in the form of an [Definition:] abstract syntax, which provides a formal specification of the information provided for each declaration and definition in the schema language. The abstract syntax is presented using a simplified BNF. Defined terms are to the left. Their components are to the right, with a small amount of meta-syntax: ()s for grouping, | to separate alternatives, ? for optionality, * and + for iteration. Terms in italics are primitives, not expanded here, either because they are defined elsewhere (e.g. URI, defined by [URI]) or because they can only be grounded once a concrete syntax is decided on (e.g. choice).
An abstract syntax production prefixed with a number in brackets (e.g. [3]) is normative; other abstract syntax is either for purposes of explanation, or is a duplicate (for convenience) of a normative definition to be found elsewhere.
The abstract syntax illustrates the expressive power of the language, and the relationships among its component parts. The abstract syntax can be used to evaluate the expressive power of XML Schema: Structures, but not its look and feel. In particular, please note that neither ordering within or between productions nor choice of names is significant, and that any particular concrete syntax is not constrained by these.
The [Definition:] concrete syntax of XML Schema: Structures, the exact element and attribute names used in a schema, is a key feature of its proposed design. The concrete syntax is the form in which the schema language is used by schema authors. Though its elements and attributes are often different from the terms of the abstract syntax BNF, the features and expressive power of the two are congruent. The concrete syntax profoundly affects the convenience and usability of the schema language.
We include a preliminary concrete syntax in this draft, via examples, paradigms and in (normative) Schema for Schemas (§A) and (normative) DTD for Schemas (§B). Unlike the previous version, in which the intention was to stay quite close to the abstract syntax, in this version we have begun to take convenience and clarity into account.
Ed. Note: possible changes to the definition of what a schema is, to reflect our discussion of layering.
The principal purpose of XML Schema: Structures is to provide a means for defining schemas that constrain the contents of instances and augment the information sets thereof.
A schema contains some preamble information and a set of definitions and declarations.
[Table: Schema top level]
preamble consists of an xmlSchemaRef specifying the URI for XML Schema: Structures; the targetNamespace specifying the URI of the namespace which this schema is about; and a schemaVersion specification for private version documentation purposes and version management.
finalDefault and exactDefault provide defaults for final and exact respectively for type definitions and element declarations. The default for these properties is empty in both cases.
See Schema Access and Composition (§4) for discussion of schemas, instances and namespaces, and also for import and include.
Example
<!DOCTYPE schema PUBLIC '-//W3C//DTD XML Schema Version 1.0//EN'
  'http://www.w3.org/TR/1999/WD-xmlschema-1-19991217/structures.dtd' >
<schema targetNS="http://purl.org/metadata/dublin_core"
        version="M.n"
        xmlns="http://www.w3.org/1999/XMLSchema">
  ...
</schema>
Note that the abstract syntax xmlSchemaRef is realised via a default namespace declaration in the concrete syntax.
Although the schema above is a complete XML document, schema need not be the document element, but can appear within other documents. Indeed there is no requirement that a schema be derived from a (text) document at all: it could be built 'by hand' via e.g. a DOM-conformant API.
The schema's declarations and definitions, discussed in detail in Schema Definitions and Declarations (§3), provide for the creation of new schema components:
[Table: Summary of Definitions and Declarations]
Example
The following illustrates the basic model for declaring or defining all XML Schema: Structures components:
<schema ...>
  <datatype name="myDatatype"> ... </datatype>
  <type name="myType"> ... </type>
  <element name="myElement"> ... </element>
  <attributeGroup name="myAttrGroup"> ... </attributeGroup>
  <group name="myModelGroup"> ... </group>
  <notation name="myNotation" ... />
</schema>
When creating a component, we establish an association between its name and the specification for that component. Each new component therefore creates a new entry in the symbol space for that kind of component.
Ed. Note: make sure that discussion of targetNamespace is up-to-date.
The Unique Definition (§6.2.1) Constraint on Schemas obtains.
Issue (no-evolution): This draft does not deal with the requirement "for addressing the evolution of schemata" (see [XML Schema Requirements]).
NOTE: We have not so far seen any need to reconstruct the XML 1.0 notion of root. For the connection from document instances to schemas, see Layer 3: Web-interoperability (§4.3) and Schema Validity * (§6.1).
Uniform means are provided for reference to a broad variety of schema constructs, both within a single schema and to features imported (References to schema components across namespaces (§4.2.2)) from external schemas. The name used to reference any component of XML Schema: Structures from within a schema consists of a QName. In a few cases, some elaboration may be added to a reference: this is made clear as the individual reference forms are introduced below.
[Table: Component Names and References]
The abstract syntax above characterizes the reference mechanisms used in this specification.
Example
<element name="elem1" type="Address"/>
<element name="elem2" type="XHTML:BLOCKQUOTE"/>
<attribute name="attr1" type="xsl:quantity"/>
The first of these is a local reference; the other two refer to schemas elsewhere and assume that the prefixes used have been declared and their namespaces declared for import. See References to schema components across namespaces (§4.2.2) for a discussion of importing.
The Consistent Import (§6.2.2) Constraint on Schemas obtains.
The identify definition wrt schema-validity obtains.
The Preorder Priority for Included Definitions (§6.2.6) Constraint on Schemas also obtains.
Like XML 1.0 DTDs, XML Schema: Structures provides facilities for constraining the contents of elements and the values of attributes, and for augmenting the information set of instances, e.g. with defaulted values and type information. [Definition:] We call a set of SCs intended for use in this way a type definition.
[Definition:] We refer hereafter to the combination of schema constraints and information set contributions with the abbreviation SC. Compared to DTDs, XML Schema: Structures provides for a richer set of SCs, and improved capabilities for sharing SCs across sets of elements and attributes.
We start with the simple types whose expression in XML documents consists entirely of character data.
[Table: Simple Types]
XML Schema: Structures incorporates the simple type specification mechanisms defined by [XML Schemas: Datatypes] in order to express SCs on attribute values and the contents of elements consisting entirely of character data.
The production for facet above serves to indicate where this chapter connects with XML Schemas: Datatypes. The concrete syntax displayed below is copied from [XML Schemas: Datatypes]. All facets are optional and may appear in any order within the datatype element. The simpleTypeRef in the simpleTypeSpec identifies the simple type on which the one being defined is based: infinite regress is avoided because XML Schemas: Datatypes provides a set of built-in ab initio simple types.
The other productions provide for using simple types once they have been defined, see below under typeDefn and attribute.
As explained in References to Schema Constructs (§3.3), the use of QName allows for the referenced definition to be located in some other schema.
An abstract type cannot itself be used as the type of an attribute or element.
A simple type definition can rule itself out as the source of type derivations, by declaring itself final.
Example
<datatype name="posInt" source="integer">
  <minExclusive value="0"/>
</datatype>
<attribute name="foo" type="posInt"/>
<attribute name="baz" type="integer"/>
<attribute name="fontSize" type="xsl:quantity" fixed="12pt"/>
The first attribute example references the definition above it. The second references a datatype pre-defined by XML Schemas: Datatypes. The third references a datatype in an (imaginary) XSL schema and fixes its value.
NOTE: See previous note on the type definition issue.
The satisfy-dt definition wrt schema-validity obtains.
The Datatype Info (§6.2.3.1) Schema Information Set Contribution obtains.
We now move on to [Definition:] the complex types whose expression in XML documents consists of elements with attributes and/or element content.
[Productions: Types]
The complexTypeDefn production and its descendants provide for all the SCs which constitute a complex type definition; the last two productions provide for reference to complex types once defined. But note that the name of a type is not ipso facto the name of elements whose appearance in instances will be associated with the SCs which constitute that type. The connection between an element name and a type is made by an elementDecl, see below.
Alongside attributesSpec for permitted attributes, SCs for contents are specified by a contentType: for elements which may contain only character data, this is a simple type (via a simpleTypeRef) or, for other kinds of elements, a contentModel. An abstract type may not be used as the type in an elementDecl. (See Wildcards (§3.5) for a discussion of attrWildcard. See Deriving Type Definitions (§3.6) for the full details on contentType, of which the above is only a summary, as well as final, exact and more on abstract.)
Example
<type name="length1" type="decimal">
  <restrictions>
    <minInclusive value="0"/>
  </restrictions>
  <attribute name="unit" type="NMTOKEN"/>
</type>
<element name="width" type="length1"/>
<width unit="cm">2.54</width>
<type name="length2">
  <element name="size">
    <datatype source="decimal">
      <minInclusive value="0"/>
    </datatype>
  </element>
  <element name="unit" type="NMTOKEN"/>
</type>
<element name="depth" type="length2"/>
<depth> <size>2.54</size><unit>cm</unit> </depth>
Two approaches to defining a type for length: one with character data content (constrained by a qualified reference to a built-in datatype) and one attribute; the other using two elements.
Note that both the datatypeRef and the typeRef options in the abstract syntax are realised by the source attribute on the type element. source must refer to a simple type if content is textonly. The contents of the restrictions element will be quite different in the two cases, and if the source refers to a simple type, no content model is appropriate, so none of element, group or any are allowed.
The values other than textonly for content express choices recorded in the abstract syntax in the contentModel and richModel productions below.
Careful consideration of the above abstract and concrete syntax reveals that a type need consist of no more than a name, i.e. that
<type name="anything"/>
is allowed. See the discussion of the ur-type in Deriving Type Definitions (§3.6) for what such a type means.
NOTE: See previous note on the type definition issue.
The AttrGroup Unique (§6.2.3.2) Constraint on Schemas obtains.
The AttrGroup Identified (§6.2.3.2) Constraint on Schemas obtains.
The attr-decl-set definition wrt schema-validity obtains.
The attr-fullname definition wrt schema-validity obtains.
The Attribute Locally Unique (§6.2.3.2) Constraint on Schemas obtains.
The satisfy-as definition wrt schema-validity obtains.
The Type Info (§6.2.3.2) Schema Information Set Contribution obtains.
Attribute declarations associate a name (which will appear as an attribute in start tags in instances) with SCs for the presence and value thereof by referring to a (possibly restricted) simple type. These SCs in turn will be part of the SCs of one or more types. A default or fixed value may be supplied, as well as an indication of whether the attribute is optional or required.
[Productions: Attributes]
NOTE: A number of productions are repeated here for easy reference.
Attribute declarations provide for:
0 to maxOccurs, will be clarified in Deriving Type Definitions (§3.6);
Example
<attribute name="myAttribute"/>
<attribute name="yetAnotherAttribute" type="integer" minOccurs="1"/>
<attribute name="anotherAttribute" default="42">
  <datatype source="integer">
    <minExclusive value="0"/>
  </datatype>
</attribute>
<attribute name="stillAnotherAttribute" type="string" fixed="Hello world!"/>
Four attributes are declared: one with no explicit SCs at all; two declared by reference to the built-in simple datatype integer, one required to be present in instances and one with a default and a subrange qualification; and one with a fixed value.
The type attribute is used when the attribute can use a built-in or pre-declared datatype, i.e. if no facets are part of its datatypeSpec. Otherwise an anonymous datatype is used.
Wherever attribute declarations are used, the surrounding type definition provides its own symbol space for attribute names. E.g. an attribute named title within one type need not have the same datatypeRef as one declared within another type.
The attr-satisfy definition wrt schema-validity obtains.
The default when no datatypeRef is provided is the ur-type, which imposes no constraints at all.
The satisfy-attrs definition wrt schema-validity obtains.
The Attribute Value Default (§6.2.3.3) Schema Information Set Contribution obtains.
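The interplay of required, default and fixed attribute declarations can be sketched in Python. This is an illustration under stated assumptions, not the draft's algorithm: the function name `apply_attribute_decls` and the dictionary layout are hypothetical, the `required` key stands in for a declaration with minOccurs="1", and the sketch leaves open what a fixed declaration contributes when the attribute is absent.

```python
def apply_attribute_decls(attrs, decls):
    """Return the attributes of an instance element after applying declarations.

    attrs: attribute name -> value as found in the instance.
    decls: attribute name -> dict with optional keys
           'required' (bool), 'default' (str), 'fixed' (str).
    The layout of both dictionaries is an assumption made for this sketch.
    """
    result = dict(attrs)
    for name, decl in decls.items():
        if name in result:
            # A fixed declaration constrains the value when the attribute is present.
            if "fixed" in decl and result[name] != decl["fixed"]:
                raise ValueError(f"attribute {name!r} must equal {decl['fixed']!r}")
        elif decl.get("required"):
            raise ValueError(f"required attribute {name!r} is missing")
        elif "default" in decl:
            # The default augments the information set of the instance.
            result[name] = decl["default"]
    return result
```

With the declarations from the example above, an instance that supplies only yetAnotherAttribute gains anotherAttribute="42", while an instance that omits yetAnotherAttribute is rejected.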
Issue (namespace-declare): We've got a problem with namespace declarations: they're not attributes at the infoset level, so they can appear without compromising validity, except if there is a fixed or required declaration, and defaults should have the apparently desired effect. I.e., if a schema declares an attribute whose name is xmlns with a default or fixed value, does it change the infoset? Or if we allow QNames as such to be declared, xmlns:foo.
A schema can name a group of attributes so that they may be incorporated as a whole into type definitions:
[Productions: Attribute groups]
Attribute group definitions provide a construct to replace some uses of parameter entities. See Wildcards (§3.5) for a discussion of attrWildcard.
Example
<attributeGroup name="myAttrGroup">
  <attribute .../>
  ...
</attributeGroup>
<type name="myelement" content="empty">
  <attributeGroup ref="myAttrGroup"/>
</type>
Define and refer to an attribute group. The effect is as if the attribute declarations in the group were present in the type definition.
The concrete syntax above is the first example of a pattern which will recur: the same element, in this case attributeGroup, serves both to define and to incorporate by reference. In the first case the name attribute is required; in the second the ref attribute is required, and the element must be empty. These two are mutually exclusive, and also conditioned by context: the defining form, with a name, must occur at the top level of a schema, whereas the referring form, with a ref, must occur within a complex type definition or an attribute group definition.
Ed. Note: There needs to be some discussion of what happens in case of name conflict between attrs as a result of an attr group ref.
Issue (global-attrs): Somewhere in Chapter 3, we need to introduce a means for declaring global attributes.
When content of elements is not constrained by reference to a simple type (Simple Type Definition (§3.4.1)), it can be unconstrained, be constrained to have no content, or allow elements in its content, in which case the form of the content is specified in more detail.
[Productions: Content model]
A content model constrains the element content of a type specification: it says nothing about attributes.
Content models do not have names, but appear as a part of the definitions of types, which do have names.
The satisfy-cm definition wrt schema-validity obtains.
A content model consisting of an elemModel alone specifies child elements only. If the mixed qualifier is present, text may occur as well as elements. In either case the content model consists of a simple grammar governing the allowed types of child elements and the order in which they must appear.
[Productions: Rich content model]
[Definition:] The grammar for element-only content is built on content model particles (particle above): elements, groups and wildcards. A particle provides for some number of occurrences in an instance of a single element (via elementRef or elementDecl), a group of elements (via group) or an indirect specification of any of these (via modelGroupRef).
[Definition:] We say that a particle permits one or more elements or groups if its minOccurs is 0. [Definition:] We say that a particle requires one or more elements or groups if its minOccurs is greater than 0.
[Definition:] A group is two or more particles plus a compositor. The compositor for a group specifies for a given group whether it provides for
all compositor, which is associated with the allGroup production).
These options reconstruct the XML 1.0 , connector, the XML 1.0 | connector, the repeated disjunction of XML 1.0's Mixed production and the SGML & connector respectively. In the first case (sequence) all the elements permitted or required must appear in the order given in the group; in the second case (choice), exactly one of the permitted or required elements must appear; in the fourth case (all), all the required elements, which are restricted in this case only to unqualified elements with minOccurs=maxOccurs=1, must appear, but may appear in any order. The all compositor may only appear as the top-level compositor of a content model.
The occurs specification governs how many times the material permitted or required by a particle may occur, but note that the components of a group whose compositor is (implicitly) all may not be qualified, and therefore call for exactly one appearance of the element they identify.
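The behaviour of the sequence, choice and all compositors can be sketched as follows. This is a deliberately reduced illustration, not the draft's validation algorithm: the function name `matches` is hypothetical, every particle is assumed to be a plain element name with minOccurs = maxOccurs = 1, and groups are assumed not to nest.

```python
def matches(compositor, names, children):
    """Check a list of child element names against a flat group.

    compositor: 'sequence', 'choice' or 'all'.
    names:      element names listed in the group, in declaration order.
    children:   element names found in the instance, in document order.
    """
    if compositor == "sequence":
        # All listed elements must appear, in the order given in the group.
        return children == names
    if compositor == "choice":
        # Exactly one of the listed elements must appear.
        return len(children) == 1 and children[0] in names
    if compositor == "all":
        # All listed elements must appear exactly once, in any order.
        return sorted(children) == sorted(names)
    raise ValueError(f"unknown compositor: {compositor!r}")
```

For a group listing a and b, the children b, a satisfy all but not sequence, and a alone satisfies only choice.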
See Element Declaration (§3.4.9) for further discussion and examples of the appearance of elementDecl as one of the two expansions of element above.
For the interpretation of wildcard in this context, see Wildcards (§3.5).
The satisfy-eo definition wrt schema-validity obtains.
The Element Consistency (§6.2.3.6) Constraint on Schemas obtains.
Constraint on Schemas: Unambiguous Content Model
For compatibility, it is an error if a content model is such that there
exist element item sequences within which some item can match more than one
occurrence of an elementRef,
elementDecl or wildcard in the content model.
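One narrow case of this constraint can be checked mechanically: two alternatives of a single flat choice group that could both match the same element. The sketch below covers only that case and is an assumption-laden illustration; a full check would also have to consider sequences, optional particles and namespace-constrained wildcards. The marker `ANY` and the function name `competing_pairs` are hypothetical.

```python
from itertools import combinations

ANY = "*"  # hypothetical marker standing for an unconstrained wildcard particle


def competing_pairs(particles):
    """Return index pairs of particles in a flat choice that can match the same element.

    particles: list of element names, or ANY for an unconstrained wildcard.
    Two particles compete if they name the same element or if either is a wildcard.
    """
    pairs = []
    for (i, p), (j, q) in combinations(enumerate(particles), 2):
        if p == q or ANY in (p, q):
            pairs.append((i, j))
    return pairs
```

An empty result means this particular kind of ambiguity is absent; it does not establish that the whole content model is unambiguous.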
A content model which allows mixed content provides for mixing elements with character data in document instances. The same elemModel mechanism is used for specifying the grammar of the allowed elements, with the changes that the implicit top-level model group has the choice compositor, a minOccurs of 0 and a maxOccurs of '*', thus ensuring that the default behaviour is the same as that of XML.
Example
<type content="mixed">
  <element ref="name1"/>
  <element ref="name2"/>
  <element ref="name3"/>
</type>
Allows character data mixed with any number of name1, name2 and name3 elements.
Issue (noEmptyReqd): We need to make the elemModel rhs optional, to allow for mixed with no elements specified == our minimum commitment model. This in turn would allow us if we chose to get rid of an explicit empty flag: just specify elementOnly and no model. We could then get rid of any as well, given other mechanisms for controlled openness we're contemplating.
Note that most of this is actually realised in the current version, with the exception of the observation about empty.
The satisfy-mixed definition wrt schema-validity obtains.
This reconstructs another common use of parameter entities.
[Productions: Named model groups]
Groups defined with the allGroup option may only be referenced from a modelGroupRef which constitutes the only group at the top level of a content model.
Example
<group name="myModelGroup">
  <element ref="myelement"/>
</group>
<element name="myelement">
  <type>
    <group ref="myModelGroup"/>
    <attribute ...>. . .</attribute>
  </type>
</element>
<element name="anotherelement">
  <type>
    <group order="choice">
      <element ref="yetAnotherelement"/>
      <group ref="myModelGroup"/>
    </group>
    <attribute ...>. . .</attribute>
  </type>
</element>
A minimal model group is defined and used by reference, first as the whole content model, then as one alternative in a choice.
Issue (named-model-groups): In its vote on 1999-11-04, the WG agreed that this section was still open for discussion.
An [Definition:] element declaration associates an element name with a type, either by reference or by incorporation.
Issue (elt-default): The extension of defaulting to element content is tentative.
[Productions: Element declaration]
An element declaration associates a name with a type. This name will appear in tags in instance documents; the type provides SCs on the form of elements tagged with the given name. An element declaration whose elementSpec is a typeSpec is comparable to an <!ELEMENT ...> declaration in an XML 1.0 DTD.
elementSpec not only allows for element declarations to associate a name with a complex type (by reference or inclusion), but also allows the reference or specification to be for a simple type, with the implication that no attributes are allowed in instances and the text-only content will be constrained appropriately.
elementRef provides for top-level element declarations to be referenced by name from content models.
As noted above, element names are in a separate symbol space from the symbol spaces for the names of types, so there can be (but need not be) a complex type or simple type with the same name as a top-level element.
The elt-fullname definition wrt schema-validity obtains.
An elementDecl may appear both at the top level of a schema and within a modelElt. See above (Rich Content Models (§3.4.6) and Mixed Content (§3.4.7)) for where this is allowed. This declares a locally-scoped association between an element name and a type. As with attribute names, locally-scoped element names reside in symbol spaces local to the type that defines them. Note however that type and datatype names are always top-level names within a schema, even when associated with locally-scoped element names.
The use above of simpleTypeSpec and complexTypeSpec, which have no provision for names, is intentional: nested types are anonymous.
See below at Unique, key and key reference constraints (§3.7) for generalConstraints.
An element declared as nullable may appear in instances with an attribute whose name is null from the XML Schema instance namespace and value true to distinguish a null content from an empty content. It is an error for element information items marked xsi:null="true" to have any content.
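The rule that an element marked as null must have no content can be checked with Python's standard xml.etree.ElementTree module. This is a sketch only: the function name `null_violation` is hypothetical, the namespace URI bound to `XSI` is an assumption about the instance namespace, and whitespace-only text is treated here as content.

```python
import xml.etree.ElementTree as ET

# Assumed URI for the XML Schema instance namespace; adjust to the one in use.
XSI = "http://www.w3.org/1999/XMLSchema-instance"


def null_violation(elem):
    """Return True if elem is marked xsi:null="true" yet still has content."""
    is_null = elem.get(f"{{{XSI}}}null") == "true"
    # Content here means any child element or any character data.
    has_content = len(elem) > 0 or bool(elem.text)
    return is_null and has_content
```

An element such as <a xsi:null="true"/> passes this check, whereas <a xsi:null="true">x</a> is reported as an error.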
Issue (nullRequiresEmpty): Is it a precondition for being nullable that the element's contentType allow no content? If not, then more needs to be said above, if so, this needs to be spelled out.
Example
<element name="myelement" type="mySimpleType"/>
<element name="et0" type="myComplexType"/>
<element name="et1">
  <type>
    <element ref="et0"/>
    . . .
    <attribute ...>. . .</attribute>
  </type>
</element>
<element name="et2">
  <type content="empty">
    <attribute ...>. . .</attribute>
  </type>
</element>
The first two examples above declare elements by reference to a simple and a complex type respectively. The third and fourth use embedded anonymous complex types, the first of which in turn refers to one of the top-level elements in its content model.
<element name="contextOne">
  <type>
    <element name="myLocalelement" type="myFirstType"/>
    <element ref="globalelement"/>
  </type>
</element>
<element name="contextTwo">
  <type>
    <element name="myLocalelement" type="mySecondType"/>
    <element ref="globalelement"/>
  </type>
</element>
Instances of myLocalelement within contextOne will be constrained by myFirstType, while those within contextTwo will be constrained by mySecondType.
NOTE: The possibility that differing attribute declarations and/or content models would apply to elements with the same name in different contexts is an extension beyond the expressive power of a DTD in XML 1.0.
In the concrete syntax above, the type attribute is used to encode both the typeRef and datatypeRef options. In the case where there are both a simple type and a complex type of the referenced name in the relevant schema, the ambiguity is resolved in favour of the complex type.
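The resolution rule for the type attribute amounts to a lookup with a fixed precedence. A minimal sketch, assuming the two symbol spaces are held as dictionaries; the function name `resolve_type` and the data layout are hypothetical.

```python
def resolve_type(name, complex_types, simple_types):
    """Resolve a type attribute value against both symbol spaces.

    Returns a (kind, definition) pair. When the name is defined in both
    spaces, the complex type wins, as described above.
    """
    if name in complex_types:
        return ("complex", complex_types[name])
    if name in simple_types:
        return ("simple", simple_types[name])
    raise KeyError(f"no type named {name!r}")
```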
NOTE: See previous note on the ambiguity issue.
Ed. Note: existing section on element declaration should be updated to cover instance syntax.
The Nested May Not Be Global (§6.2.3.7) Constraint on Schemas obtains.
The satisfy-ed definition wrt schema-validity obtains.
The ind-valid definition wrt schema-validity obtains.
The satisfy-etr definition wrt schema-validity obtains.
In order to exploit the full potential for extensibility offered by XML plus namespaces, more provision is needed than DTDs allow for targeted flexibility in content models and attribute declarations. At a given point in a content model, in addition to what DTDs provide for we need particles that allow the following:
Of course, by qualifying one of these with a *, we allow for any amount of (localized) flexibility in validation.
Attributes need the same kind of flexibility: a good-citizen schema should probably allow any attributes from the xml: namespace, for instance.
[Productions: Wildcards]
The four alternatives for wildcard correspond to the four kinds of flexibility listed above.
All of the above are subject to the same ambiguity constraints (Unambiguous Content Model (§3.4.6)) as other content model particles: if an instance element could match both an explicit particle and a wildcard, or either of two wildcards, within the content model of a type, that model is in error.
Example
<any/>
<any namespace="##other"/>
<any namespace="http://www.w3.org/1999/Style/Transform/"/>
<any namespace="##targetNamespace"/>
<anyAttribute namespace="http://www.w3.org/XML/1998/namespace"/>
Concrete examples of the four cases listed above, plus one attribute case.
This section articulates what has only been hinted at above, namely a considerable increase in the power and expressiveness of schema declarations, by explaining what was provided for in the abstract syntax in the previous section, but not explained much if at all at that point: the potential for deriving new type definitions on the basis of old ones. [Definition:] We call such a new definition a derived type definition, and [Definition:] the old definition it is derived from the source type definition.
We provide two means for deriving type definitions from other type definitions, each of which implies a partial order over the types defined in a schema: A type definition may either restrict or extend another type definition.
A new complex type can be defined by adding additional content model particles at the end of the element-only content model of another complex type definition and/or by adding attribute declarations to any type definition. Members of a type whose definition is derived in this way, i.e. by extension, will always contain members of their source type within them as prefixes.
[Productions: Extension]
For the time being, the effective content model of a type definition derived by extension from another complex type is composed by appending its contentModel to that of the source definition. It follows from this that the source definition must be complex and element-only if the contentModel is not empty. If it is empty, there is no constraint on the nature of the source definition, which may be simple or complex (thus the simpleTypeRef above). In either case, attributes may be added.
NOTE: The restriction to appending in the case of content-model extension simplifies application processing in order to cast instances from derived to source type. We may liberalise this in future versions, requiring more complex transformations to effect casting.
Example
<type name="personName">
  <element name="title" minOccurs="0"/>
  <element name="forename" minOccurs="0" maxOccurs="*"/>
  <element name="surname"/>
</type>
<type name="extendedName" source="personName" derivedBy="extension">
  <element name="generation" minOccurs="0"/>
</type>
<element name="addressee" type="extendedName"/>
<addressee>
  <forename>Albert</forename>
  <forename>Arnold</forename>
  <surname>Gore</surname>
  <generation>Jr</generation>
</addressee>
A type definition for personal names, and a definition derived by extension which adds a single element; an element declaration referencing the derived definition, and a valid instance thereof.
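The appending rule for extension can be sketched in a few lines of Python. This is an illustration only: the dictionary layout and the function name `effective_content_model` are assumptions made for the example, particles are reduced to bare element names, and the derivation chain is assumed to be acyclic.

```python
def effective_content_model(type_name, types):
    """Compose the effective content model of a type derived by extension.

    types: type name -> (source type name or None, list of the type's own particles).
    The result is the source's effective model followed by the type's own
    particles, so every member of the source type appears as a prefix.
    """
    source, own = types[type_name]
    if source is None:
        return list(own)
    return effective_content_model(source, types) + list(own)
```

Applied to the example above, extendedName yields title, forename, surname, generation: the personName particles first, then the appended one.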
A new type can be defined by decreasing the possibilities made available by an existing type definition: narrowing ranges, removing alternatives, etc. Restriction is specified bottom up, via simpleRestrictions or complexRestrictions in a complexTypeSpec, in either case referring to its source definition with a complexTypeRef. Members of a type whose definition is derived in this way, i.e. by restriction, will always be members of their source type as well.
[Productions: Restriction]
A definition by restriction will restrict some of the permissions or obligations inherited from its source definition. This means that if the source definition has text-only content, the simpleRestrictions option must be used, and its facets must each narrow the corresponding facet of the source definition, e.g. by reducing a range or removing members of an enumeration.
If the source definition has element-only or mixed content, restricting a content model is also possible via the complexRestrictions option. In this case each particle within the restriction is matched one for one with the corresponding particle in the source definition, and in each case a restriction must be effected, e.g. by narrowing the range of occurs, by reducing the members of a disjunction or by replacing a wildcard with a more explicit particle.
In either case, attributes may be restricted by adding and/or fixing defaults, or by restricting the attribute's simple type definition.
The bare contentModel option reflects the fact that type definitions without a source type, consisting simply of a content model, are interpreted as restricting the ur-type.
Example
<type name="simpleName" source="personName" derivedBy="restriction">
  <restrictions>
    <element name="title" maxOccurs="0"/>
    <element name="forename" minOccurs="1" maxOccurs="1"/>
  </restrictions>
</type>
<element name="who" type="simpleName"/>
<who>
  <forename>Bill</forename>
  <surname>Clinton</surname>
</who>
A simplified type definition derived from the source type from the previous example by restriction, eliminating one optional daughter and fixing another to occur exactly once; an element declared by reference to it, and a valid instance thereof.
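Narrowing the range of occurs, one of the restrictions mentioned above, reduces to an interval comparison. The sketch below is an assumption-based illustration rather than the draft's rule: the function name `narrows` is hypothetical, ranges are (minOccurs, maxOccurs) pairs, and `None` stands for the unbounded value '*'. It accepts a range that is no wider than the source and does not insist that it be strictly narrower.

```python
def narrows(derived, source):
    """Return True if the derived occurrence range lies within the source range.

    derived, source: (min_occurs, max_occurs) tuples; max_occurs is None for '*'.
    """
    d_min, d_max = derived
    s_min, s_max = source
    if d_min < s_min:
        return False  # the derived range would permit fewer occurrences than the source
    if s_max is None:
        return True   # any upper bound fits under an unbounded source
    return d_max is not None and d_max <= s_max
```

In the example above, forename goes from (0, '*') to (1, 1), which lies within the source range; the reverse change would not.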
A type definition can control the extent to which other types may be derived from it, the ways it may appear in content models and the import of those appearances.
[Productions: Derivation control]
An abstract type definition may not be referenced from a particle, nor may it be used in an instance as the value of xsi:type (see below).
A type definition may prevent its use as a source for derived definitions by any or all means by declaring itself final for the prohibited means. In the absence of final the value of finalDefault from the containing schema is used.
A type definition may declare that it is exact: although not abstract, none-the-less types derived by any or all means from it may not appear in its place even when an element particle naming it would appear to allow this (see below). In the absence of exact the value of exactDefault from the containing schema is used.
In the light of type derivation, we propose to elaborate the significance of element when it serves as a particle. An element which specifies a type (either locally or by reference) which is the source of other types is understood as allowing that type or any of the types derived from it to govern its occurrence in instances.
This introduces an ambiguity if the source type is not abstract and/or has a derived type which is substitutable for another derived type. This is resolved by requiring instance elements allowed in this way which conform to types other than the source type to manifest their type using the type attribute from the XML Schema instance namespace, e.g. xsi:type.
If either the element particle or the referenced type definition is exact for some or all kinds of derivation, then this elaboration doesn't happen for those derivations, i.e. types derived in those ways may not appear.
This facility is intended to provide a good impedance match with the needs of database and object-oriented programming applications.
Example
<type name="WorldAddress" source="po:Address" derivedBy="extension">
  <element name="country" type="string"/>
</type>
<type name="GermanAddress" source="po:WorldAddress" derivedBy="extension">
  <element name="land" type="string"/>
</type>
<element name="person">
  <type>
    . . .
    <element name="address" type="po:Address"/>
  </type>
</element>
<person>
  ...
  <address> ... </address>
</person>
<person>
  <address xsi:type="GermanAddress">
    ...
    <country>Germany</country>
    <land>Saarland</land>
  </address>
</person>
Two types derived from the Address type defined in Sample Schema (non-normative) (§F) are defined, adding first a country and then a land element to its required content. Two schema-valid instances of an element declared with type Address are shown, one using that type itself, and therefore not requiring disambiguation, and one using the xsi:type attribute to indicate that it is using the GermanAddress type.
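Whether an instance may announce a given type through xsi:type can be sketched as a walk up the derivation chain. This is a simplified illustration under stated assumptions: the function names and the `sources` mapping are hypothetical, the chain is assumed to be acyclic, and `exact` is treated as a single flag covering all kinds of derivation rather than being tracked per kind.

```python
def derivation_chain(type_name, sources):
    """Return type_name followed by its source, its source's source, and so on."""
    chain = [type_name]
    while type_name in sources:
        type_name = sources[type_name]
        chain.append(type_name)
    return chain


def may_substitute(actual, declared, sources, exact=False):
    """Return True if an element declared with `declared` may carry type `actual`.

    sources: derived type name -> source type name.
    exact:   True if the particle or the declared type is exact for all derivations.
    """
    if actual == declared:
        return True   # the declared type itself needs no disambiguation
    if exact:
        return False  # derived types may not appear in place of an exact type
    return declared in derivation_chain(actual, sources)
```

With the types from the example above, GermanAddress may stand in for Address, but Address may not stand in for GermanAddress, and nothing but Address itself is accepted once the declaration is exact.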
We provide a mechanism to allow bottom-up specification of disjunctions over elements in content models. An element declared at the top level can declare itself to be a member of an equivalence class by reference to the exemplar of that class, itself an element declared at the top level.
[Productions: Element Equivalence Classes]
The type of every member of an equivalence class must be the same as or derived from the type of the exemplar.
In a way similar to the reinterpretation of the element particle discussed in the previous section, we further extend the significance thereof by saying that an element particle of the elementRef form which references the exemplar of an equivalence class is understood as allowing not only the referenced element but any member of its equivalence class in instances.
Vacuous type derivation is allowed, i.e. an equivalence class member may have the same type as the exemplar of its class. The concrete syntax makes this easy by allowing element declarations which specify an equivClassRef to specify no type at all, in which case they are taken to have the same type as their exemplar.
If an element is abstract, then although references to it can appear in content models, it cannot itself allow element information items with its name to appear in instances: only instances of elements declared with it as their exemplar, if any, may appear.
If an element is exact for some or all kinds of derivation, then this elaboration doesn't happen for those derivations, i.e. elements other than the exemplar may not appear if their type is derived from that of the exemplar in a proscribed way. The equivClass value for exact prevents equivalence class member substitution without preventing derived type substitution per Reinterpreting Content Models (§3.6.4).
If an element is final for some or all kinds of derivation, elements whose type is derived from their source by the proscribed means may not nominate this element as the exemplar of their equivClassRef.
Example
<type name="facet" source="annotated" derivedBy="extension">
  <attribute name="value" minOccurs="1"/>
</type>
<element name="facet" type="facet" abstract="true"/>
<element name="encoding" equivClass="facet">
  <type source="facet" derivedBy="restriction">
    <attribute name="value" type="encodings"/>
  </type>
</element>
<element name="period" equivClass="facet">
  <type source="facet" derivedBy="restriction">
    <attribute name="value" type="timeDuration"/>
  </type>
</element>
<type name="datatype">
  <element ref="facet" minOccurs="0" maxOccurs="*"/>
  <attribute name="name" type="NCName" minOccurs="0"/>
  . . .
</type>
An example from the schema for datatypes from XML Schemas: Datatypes. The facet type is defined and the facet element is declared to use it. The facet element is abstract (it's only defined to stand as the exemplar for a class). Two further elements are declared, each a member of the facet equivalence class. Finally a type is defined which refers to facet, thereby allowing either period or encoding (or any other member of the class).
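The set of element names that a reference to an exemplar admits can be computed directly from the declarations. A minimal sketch, assuming membership is recorded as a mapping from member to exemplar and is not transitive; the function name `allowed_elements` and the data layout are hypothetical, and the exact and final controls are ignored.

```python
def allowed_elements(exemplar, equiv_class, abstract):
    """Return the element names permitted where a particle references `exemplar`.

    equiv_class: member element name -> name of its exemplar.
    abstract:    set of element names declared abstract.
    """
    names = {member for member, ex in equiv_class.items() if ex == exemplar}
    if exemplar not in abstract:
        # A non-abstract exemplar may itself appear in instances.
        names.add(exemplar)
    return names
```

For the example above, a reference to the abstract facet element admits encoding and period but not facet itself.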
[Definition:] Any schema implicitly defines an ur-type, which is the source for all types, simple or complex, which do not identify an explicit source type, and the default type for elements and attributes which do not specify one. You can think of the ur-type as if it were defined as follows for complex types:
<type name="ur-type" content="mixed"> <any/> <anyAttribute/> </type>
The mixed content specification together with the unconstrained wildcard content model and attribute specification produce the defining property for the ur-type, namely that every type is a restriction of it: its permissions and requirements are the least restrictive possible.
There is no way to notate the ur-type from the perspective of simple types: its defining property is simply that all the ab initio types are derived from it by restriction.
[Definition:] a type AT1 is said to refine a type AT2 if and only if AT1 is declared to refine either AT2 or (recursively) some type that refines AT2. [Definition:] AT2 is then said to be an ancestor of AT1. [Definition:] The effective constraints are the union of the explicit and the acquired.
We supplement the simple uniqueness and reference mechanisms provided by ID and IDREF in XML 1.0 and SGML with three new kinds of constraint, for uniqueness, keys and key references.
The unique and key constraints provide for selected elements to be checked as having locally or globally unique identities. The keyref constraint provides for checking referential consistency with respect to a declared unique or key constraint.
These constraints are specified independently of the types of the attributes and elements involved, i.e. something declared as of type integer may also serve as a key, unlike ID and IDREF. Each constraint declaration has a name, which exists in a single symbol space for constraints.
Overall the augmentations to XML's ID/IDREF mechanism are:
[Productions: Uniqueness Constraints]
selector specifies an XPath expression [XPath] relative to instances of the element being declared, or to the root for generalConstraints declared at the top level. This must identify a node set of subelements (i.e. elements contained within the declared element) to which the constraint applies.
field specifies an XPath expression relative to each element selected by a selector. This must identify a single node (element or attribute, not necessarily within the selected element) whose content or value, which must be of a simple type, is used in the constraint. It is possible to specify an ordered list of fields, to cater to multi-field keys, keyrefs, and uniqueness constraints. A field must not evaluate to the same element or attribute as any other field in a given instance of a generalConstraint.
NOTE: Provision for multi-field keys etc. goes beyond what is supported by xsl:key.
Issue (restrictConstrXPaths): XPaths can take arbitrarily complicated forms and it is unnecessary to burden XML Schema implementations with supporting every feature of XPath. The XPaths in selector and field should be restricted to certain specified simple forms.
Issue (fieldOnlyKeyref): Should the selector be made optional in a keyref, with default the element it's contained within?
Issue (islandValidConstraint): What are the implications of constraints if we support island validation constraints?
The refer of a keyref must match the constraintName of a unique or a key. The number of fields of the keyref must be the same as the number of fields of the referenced unique or key. For every element which is identified in a given scope by the selector of a keyref defined for that scope (call this ref), there must be an element in that scope identified by the selector of the named unique or key (call this target) such that the values of the fields of the keyref evaluated with respect to ref, taken in order, match the values of the fields of the unique or key evaluated with respect to target.
NOTE: If reference to a key or unique defined in a scoping element which may occur more than once is envisaged, then the scoping elements themselves must have keys (typically with global scope), and the scoped keys must include the key of their scoping element among their fields.
Example
<element name="state">
  <type>
    <element name="stateCode" type="twoLetterCode"/>
    <element name="vehicle">
      <type>
        . . .
        <attribute name="regNo" type="integer"/>
      </type>
    </element>
    . . .
  </type>
</element>

<key name="regKey">
  <selector>.//vehicle[@regNo]</selector>
  <field>@regNo</field>
  <field>ancestor::state/stateCode</field>
  <!-- scope needs to be involved -->
</key>

<element name="person">
  <type>
    . . .
    <element name="car">
      <type model="empty">
        . . .
        <attribute name="regRef" type="integer"/>
        <attribute name="regState" type="twoLetterCode"/>
      </type>
    </element>
  </type>
  <keyref name="carRef" refer="regKey">
    <selector>.//car[@regRef]</selector>
    <field>@regRef</field>
    <field>../person/@regState</field>
  </keyref>
</element>

A state element is defined, which inter alia contains a stateCode descendant and some vehicle descendants. A vehicle in turn has a regNo attribute, which is an integer. The combination of stateCode and regNo is asserted to be a key for vehicle within state. Furthermore, a person element has inter alia an empty car element, with regRef and regState attributes, which are then asserted together to refer to vehicles via the regKey constraint.
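The matching rule for key and keyref can be sketched as follows. This is only an illustration under stated assumptions: the field values are taken to be already extracted as tuples (no XPath evaluation is attempted), and the function names `check_key` and `check_keyref` are hypothetical rather than part of this draft.

```python
def check_key(key_rows):
    """Collect the field-value tuples of a key, rejecting duplicates.

    key_rows holds one tuple of field values per element selected by the
    key's selector (assumed to be evaluated elsewhere).
    """
    seen = set()
    for row in key_rows:
        if row in seen:
            raise ValueError(f"duplicate key value: {row!r}")
        seen.add(row)
    return seen


def check_keyref(keyref_rows, key_values, n_key_fields):
    """Return the keyref tuples that match no tuple of the referenced key."""
    dangling = []
    for row in keyref_rows:
        # The keyref must have as many fields as the referenced key.
        if len(row) != n_key_fields:
            raise ValueError("keyref and key must have the same number of fields")
        # Fields are compared in order, which tuple equality does for us.
        if row not in key_values:
            dangling.append(row)
    return dangling


# (regNo, stateCode) for each vehicle; (regRef, regState) for each car.
vehicles = [(1234, "CA"), (1234, "NY"), (5678, "CA")]
cars = [(1234, "NY"), (9999, "CA")]
keys = check_key(vehicles)
dangling = check_keyref(cars, keys, 2)
```

In this sample data the same regNo occurs in two states without violating the key, because the state code is one of the key fields. The car referring to (9999, "CA") has no target and would be reported.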
Notations
A notation may be declared by specifying a name and an identifier for the notation.
Example
<notation name="jpeg" public="image/jpeg" system="viewer.exe" />

<element name="picture">
  <type source="binary" derivedBy="extension">
    <attribute name="pictype" type="NOTATION"/>
  </type>
</element>

<picture pictype="jpeg">...</picture>

The notation need not ever be mentioned in the instance document.
This chapter defines the mechanisms by which we establish the necessary precondition for establishing schema-validity, namely access to one or more schemas. This chapter also describes in detail related mechanisms for using, in one schema, definitions and declarations from another.
Chapter 6 provides a formal definition of schema-validation. Here we describe a 3-layer architecture which incorporates that formal definition and relates it to XML documents and WWW-situated processes. This layering is provided to maximize the range of environments in which this specification can be applied, and to minimize the need for modifications to this specification as new standards and conventions for Web interoperability are developed. The layers are:
Layer 1 specifies the manner in which a set of schema components can be applied to validate an instance element. Layer 2, which is primarily defined in Chapter 3, specifies the use of <schema> elements in XML documents as the standard XML representation for schema information in a broad range of computer systems and execution environments. To support interoperation over the World Wide Web in particular, layer 3 provides a set of conventions for schema reference on the Web.
We note that improved or alternative conventions for Web interoperability can be standardized in the future without reopening this recommendation. For example, the W3C is currently considering initiatives to standardize the packaging of resources relating to particular documents and/or namespaces: this would be an addition to the mechanisms described here for layer 3. This architecture also facilitates innovation at layer 2: for example, it would be possible in the future to define an additional standard for the representation of schema components which allowed e.g. type definitions to be specified piece by piece, rather than all at once.
Ed. Note: Some of this text will probably end up in Chapter 6
The fundamental purpose of the schema-validation core is to define schema-validity for a single element information item and its descendants with respect to a specified type definition. All processors are required to implement this core predicate in a manner which conforms exactly to this specification.
Schema-validity is defined with reference to [Definition:] a complete component set which consists of (at a minimum) the set of schema components (definitions and declarations) required for that validation. This is not a circular definition, but rather a post facto observation: no element information item can be schema-valid unless all the components required by any aspect of its (potentially recursive) validation are present in the complete component set.
As specified above, each schema component is associated directly or indirectly with a target namespace, or explicitly with no namespace. In the case of multi-namespace documents, components for more than one target namespace will co-exist in the complete component set.
Processors have the option to assemble (and perhaps to optimize or pre-compile) the entire complete component set prior to the start of a validation episode, or to gather the complete component set lazily as individual components are required. In all cases it is required that:
NOTE: the validation core is defined in terms of the schema components comprising a complete component set; no mention is made of the schema definition syntax (i.e. <schema>). Although many processors will acquire schemas in this format, others may operate on compiled representations, on a programmatic representation as exposed in some programming language, etc.
[Definition:] The fundamental schema-valid predicate applies to an element information item (EII) and a type definition against the background of a set of schema components comprising a complete component set.
The obligation of a schema-aware processor as far as the schema-validation core is concerned is to implement the definition of schema-valid. The choice of EII, as well as the determination of the type definition and complete component set, is not specified at layer 1.
The schema-valid predicate is defined recursively, and e.g. streaming processors may implement it in a way that augments the complete component set during processing in response to encountering new namespaces. The implication of the invariants expressed above is that schema-valid must be implemented so that the same validation outcome is given in such cases as would be given if the initial invocation of schema-valid were re-performed with the final complete component set replacing the one initially employed.
This is basically provided by chapter 3 of the current WD, which defines an XML syntax for defining types and declaring elements, specifying their target namespace and collecting them into schema definitions (i.e. XML documents). On the perspective argued for here, chapter 3 should be understood as doing two things: defining the requirements on an XML 1.0 document for it to qualify as a schema definition; specifying how the concrete syntax of that document (at the infoset level) maps on to type definitions and element declarations.
NOTE: The two following sections relate to assembling the complete component set for validation from multiple sources. They should not be understood as a form of text substitution, but rather as providing mechanisms for distributed definition of schema components, with appropriate schema-specific semantics.
Ed. Note: "schemaLocation" really belongs at layer 3 . . .
We provide the following mechanism for assembling a complete component set from several <schema> elements:
Include
A <schema> element may contain one or more <include> elements. The <include> element has a required attribute, "schemaLocation", consisting of one or more URI references, which must resolve to another <schema> element, whose "targetNamespace" attribute must be identical to the containing <schema> element's "targetNamespace". The schema derived from a <schema> then consists of all the components it contains and all the components contained by any schema it includes, recursively.
It is not an error for the same URI reference to be 'included' more than once, but only the first inclusion is effective.
It is an error for a component to be multiply defined -- see below.
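The include rules above (recursive collection, matching target namespaces, first inclusion effective, no multiply defined components) can be sketched as follows. The `documents` dictionary and its keys are a hypothetical stand-in for dereferencing URI references and parsing <schema> elements; nothing here is mandated by this draft.

```python
def assemble_includes(root_uri, documents):
    """Collect the components of a schema and everything it includes.

    documents maps a URI reference to a dict with the hypothetical keys
    'targetNamespace', 'components' (mapping (kind, name) to a definition)
    and 'includes' (a list of URI references).
    """
    components = {}
    seen = set()

    def visit(uri, expected_ns):
        if uri in seen:
            return  # not an error; only the first inclusion is effective
        seen.add(uri)
        doc = documents[uri]
        if doc["targetNamespace"] != expected_ns:
            raise ValueError(f"{uri}: targetNamespace differs from the including schema")
        for key, definition in doc["components"].items():
            if key in components:
                raise ValueError(f"component {key!r} is multiply defined")
            components[key] = definition
        for included in doc.get("includes", []):
            visit(included, expected_ns)

    visit(root_uri, documents[root_uri]["targetNamespace"])
    return components


NS = "http://example.org/ns"  # hypothetical target namespace
docs = {
    "a.xsd": {"targetNamespace": NS, "includes": ["b.xsd", "c.xsd"],
              "components": {("type", "A"): "defA"}},
    "b.xsd": {"targetNamespace": NS, "includes": ["c.xsd"],
              "components": {("type", "B"): "defB"}},
    "c.xsd": {"targetNamespace": NS, "includes": [],
              "components": {("element", "C"): "declC"}},
}
assembled = assemble_includes("a.xsd", docs)
```

Here c.xsd is reached twice, once through b.xsd and once directly, and contributes its components only once. Identity is judged by the URI reference string alone, which is the simplification that the editorial note below questions.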
NOTE: The "schemaLocation" attribute properly belongs at layer 3, and its use in this case is similar to its uses on other elements described in the next section.
Ed. Note: Basing identity of schema definition on identity of URI reference as above and in 4.2.2 below is clearly less than ideal. Do we want to try to include a clause allowing processors with good evidence that the same schema definition resource has been acquired by two different means (e.g. one absolute and one relative URI reference) to ignore one of them?
As described in section 2.2, every global schema component is associated with a target namespace (or, explicitly, with none). In this section we set out the exact mechanism and syntax in the XML form of schema definition by which a reference to a foreign component is made, that is, a component with a different target namespace from that of the referring component.
We require not only a means of addressing such foreign components but also a signal to schema-aware processors that a schema document contains such references and differentiates two subtly different cases:
We allow the "ref" attribute on <group>, <element> and <attributeGroup>, the "type" attribute on <element> and <attribute> and the "source" attribute on <type> and <datatype> to be of type QName. Use of prefixes in such attributes is governed by the normal rules for QNames, i.e. that there must be a namespace declaration for the prefix in scope;

We provide an <import> element, which must appear at the beginning of schema documents to identify namespaces used in external references, i.e. those whose prefix or lack of it identifies them as coming from a different namespace than the enclosing schema's target namespace. It has a required attribute "namespace" which indicates that the schema document contains one or more qualified references to schema components in that namespace (via one or more prefixes declared with namespace declarations in the normal way, see above), and an optional attribute "schemaLocation". When only the "namespace" attribute is present, the schema author is leaving the identification of the schema to the instance, via the mechanisms described below. When a "schemaLocation" attribute is present, it must contain a single URI reference which the schema author warrants will resolve to a schema document containing the component(s) referred to in the imported namespace.
NOTE: The "schemaLocation" attribute properly belongs at layer 3, and its use in this case is similar to its uses on other elements described in the next section.
NOTE: The treatment of references as QNames implies that since (with the exception of the schema for schemas) the target namespace and the XML Schema namespace differ, without massive redeclaration of the default namespace either internal references to the names being defined in a schema or the schema declaration and definition elements themselves must be explicitly qualified.
Import
Example
This design is a compromise: We've decided the pun of
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        xmlns:html="http://www.w3.org/1999/XHTML">
  <html:p>[Some documentation for my schema]</html:p>
  . . .
  <type name="myType">
    <element ref="html:p" minOccurs="0"/>
    . . .
  </type>
</schema>
is actually OK, provided you declare the html namespace as imported ('mentioned' as well as 'used', if you will):
<import namespace="http://www.w3.org/1999/XHTML"/>
It is an error if the <import>'s "namespace" and the imported schema's target namespace are not the same. [?]
It is not an error for the same URI reference to be 'imported' more than once, but only the first import is effective.
It is an error if a "schemaLocation" attribute in an instance associates a different URI reference from one associated with the same namespace in a schema.
It is an error to provide more than one definition/declaration for the same component type within the same target namespace: this is true regardless of whether the definitions/declarations occur in the same schema document, or in separate schema documents referenced via one or more parallel or nested <include>s or two or more nested <import>s.
Layers 1 and 2 provide a framework for validation and XML definition of schemas in a broad variety of environments. Over time, we expect that a range of standards and conventions will evolve to support interoperability of XML Schema implementations on the World Wide Web. Layer 3 defines the minimum level of function required of all conformant processors operating on the Web: it is intended that, over time, future standards (e.g. XML Packages) for interoperability on the Web and in other environments can be introduced without the need to republish this specification.
NOTE: The core validation architecture requires that the complete component set of appropriate declarations for each global element and attribute be available. This may involve resolving both instance->schema and schema->schema references. As observed above, we anticipate that the precise mechanisms for resolving such references will evolve over time. In support of such evolution, we have attempted to observe the design principle that references from one schema to another use mechanisms that directly parallel those used to reference a schema from an instance document.
For interoperability, schema definitions, like all other Web resources, are identified by URI and retrieved using the standard mechanisms of the Web (e.g. http, https, etc.). Schema definitions on the Web must be part of documents with the MIME type text/xml, and are represented in the standard XML schema definition form described by layer 2 (i.e. as <schema> elements).
NOTE: there will often be times when a schema definition will be a complete XML 1.0 document with a root element of <schema>. There will be other occasions in which <schema> elements will be contained in other documents, perhaps referenced using fragment and/or XPointer notation.
As described in section 4.1, processors are responsible for providing the schema components (definitions and declarations) needed for validation. This section introduces a set of normative conventions to facilitate interoperability for instance documents and schemas retrieved and validated from the Web.
NOTE: As discussed above in section 4.2, other non-Web mechanisms for delivering schemas for validation may exist, but are outside the scope of this recommendation.
The validation mechanisms of section 4.1 specify that schema-valid applies to an element information item to be validated, and the type to be used to validate it. Documents on the Web can be validated in their entirety, from the document root element, or else individual elements can be selectively validated.
Processors on the Web are free to attempt validation against arbitrary complete component sets (as defined in Section 4.1) and against any type. However, it is useful to have a common convention for determining a complete component set, and an initial type. For this purpose, we require that for general-purpose schema-aware processors (i.e. those not specialised to one or a fixed set of pre-determined schemas), unless directed otherwise by a user the element information item to be validated is either the document element of an information set, or the local root of a new namespace-governed region within an information set.
To determine the type declaration to be used for the validation in these cases, the processor is required to retrieve a schema document with a "targetNamespace" matching the namespace URI, if any, of the element information item to be validated. Again, unless directed otherwise, the processor must locate within that schema a global element declaration with an NCName matching that of the element information item to be validated. The processor must ensure that both the element declaration, and the type declaration used in that element declaration, are included in the complete component set for the validation. The type thus identified is the type used for validation of the element information item. The composition of the complete component set for validation is governed by section 4.2 above.
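The lookup of the initial type described above can be sketched as follows. The `schemas` dictionary, its keys and the sample names are hypothetical; a real processor would first have to retrieve and parse the schema document, which is not shown.

```python
def initial_type(namespace_uri, local_name, schemas):
    """Find the type used to validate an element information item.

    schemas is a hypothetical mapping from a targetNamespace (None for
    'no namespace') to a dict whose 'elements' entry maps the NCName of
    each global element declaration to a dict carrying its 'type'.
    """
    schema = schemas.get(namespace_uri)
    if schema is None:
        raise LookupError(f"no schema with targetNamespace {namespace_uri!r}")
    declaration = schema["elements"].get(local_name)
    if declaration is None:
        raise LookupError(f"no global element declaration named {local_name!r}")
    # Both the declaration and its type must end up in the complete
    # component set; here we simply return the type's name.
    return declaration["type"]


schemas = {
    "http://example.org/po": {  # hypothetical namespace and names
        "elements": {"purchaseOrder": {"type": "PurchaseOrderType"}},
    },
}
root_type = initial_type("http://example.org/po", "purchaseOrder", schemas)
```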
The means used to locate appropriate schema document(s) are processor and application dependent, subject to the following requirements:
NOTE: Experience suggests that it is not in general safe or desirable from a performance point of view to dereference NS URIs as a matter of course. User community and/or consumer/provider agreements may establish conventions which make this a sensible strategy: This recommendation allows but does not require this. Users are always free to supply namespace URIs as schema location information when dereferencing is desired: see below.
"schemaLocation" attribute (in the XML Schema namespace) to record this fact with pairs of URI references (one for the namespace URI, and one for the schema definition). Again, unless directed otherwise, general-purpose schema-aware processors must attempt to dereference each schema URI in the value of "schemaLocation" to obtain a schema, whose target namespace must match the namespace it appears in conjunction with, if any. Failure to dereference the URI, failure to locate a valid document of MIME type text/xml, failure to find a <schema> or mismatch of the targetNamespace URI is an error. "schemaLocation" attributes can occur on any element participating in a validation, and all such attributes must be processed as if they had occurred at the validation root. According to the rules of section 4.1, the complete component set can be lazily assembled, but is otherwise stable throughout a validation. Although schema location attributes can occur on any element, and can be processed incrementally as discovered, their effect is essentially global to the validation. Definitions and declarations remain in effect beyond the scope of the element on which the binding is declared.
NOTE: The issue of the value space for QNames, in particular in the case of prefixes with no declaration, needs to be carefully elucidated in the Datatype draft.
Multiple schema bindings can be declared using a single attribute. For example consider a stylesheet:
<stylesheet xmlns="http://www.w3.org/1999/Style/Transform"
            xmlns:html="http://www.w3.org/1999/XHTML"
            xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
            xsi:schemaLocation="http://www.w3.org/1999/Style/Transform
                                http://www.w3.org/1999/Style/Transform/xslt.xsd
                                http://www.w3.org/1999/XHTML
                                http://www.w3.org/1999/XHTML/xhtml.xsd"
NOTE: the namespace URIs used in "schemaLocation" can, but need not, match those actually qualifying the element within whose start tag it is found or its other attributes. For example, all schema location information can be declared on the document element of a document, if desired, regardless of where the namespaces are actually used.
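Reading a "schemaLocation" value as pairs of URI references might look like the following sketch. It assumes whitespace-separated tokens, as in the stylesheet example above; the function name `parse_schema_location` is hypothetical, and the 'no namespace' case is not handled.

```python
def parse_schema_location(value):
    """Split a schemaLocation value into (namespace URI, schema URI) pairs."""
    tokens = value.split()  # any run of whitespace separates URI references
    if len(tokens) % 2 != 0:
        raise ValueError("schemaLocation must contain pairs of URI references")
    # Even positions are namespace URIs, odd positions are schema locations.
    return list(zip(tokens[0::2], tokens[1::2]))


value = (
    "http://www.w3.org/1999/Style/Transform "
    "http://www.w3.org/1999/Style/Transform/xslt.xsd "
    "http://www.w3.org/1999/XHTML "
    "http://www.w3.org/1999/XHTML/xhtml.xsd"
)
pairs = parse_schema_location(value)
```

A processor following the rules above would then attempt to dereference the second member of each pair and compare the target namespace of the resulting schema with the first member.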
Ed. Note: I'm not sure about using spaces for separators -- more thought needed. But note by not using prefixes, we don't need to worry about a syntax for the default namespace case. But we do need to worry about the 'no namespace' case -- what do I put in "schemaLocation" if I don't declare any namespaces?
Ed. Note: 4.3.2 can still be improved by shifting more from processing language to abstract predicate language.
Ed. Note: Oops, another bug with QNames: In 4.2.2, we've left ourselves with no way in a component which does have a target namespace to refer to a component with none.
Annotation of schemas and schema components, with material for human or computer consumption, is provided for by allowing application information and human information at the beginning of most major schema elements, and anywhere at the top level of schemas.
Annotations
info is intended for human consumption, appinfo for automatic processing. In both cases, provision is made for an optional URI reference to supplement the local information. Schema validation does not involve dereferencing these URIs, when present. In the case of info, indication may be given as to the identity of the (human) language used in the contents, using the xml:lang attribute.
Issue (error-behavior): This draft includes extensive discussion of conformance and validity checking, but rules for dealing with errors are missing. In future, we must distinguish errors from fatal errors, and clarify rules for dealing with both.
NOTE: This section is in the process of being redrafted, and is not guaranteed to be coherent yet. The material after the "STALE BELOW HERE" editorial note is in many cases out of sync with the material above in sections 3 and 4. We use the terms schema and type definition and perhaps others in a more careful way than has been the case heretofore, reserving them for abstract datatypes with content as per the abstract syntax, as opposed to XML elements per the concrete syntax. This change will have to be reflected upwards in due course.
We approach the definition of schema validity one step at a time. In the definitions below we deal primarily in terms of information sets, rather than the documents which give rise to them: see [XML-Infoset] for definitions of XML information set and information item. Please note that the formal definitions below are explicitly not couched in processing terms: they describe properties of an information set, but do not tell you how to check an information set to see if it has those properties.
Schema-validity is first and foremost a property of element information items with respect to type definitions and schemas. This recommendation does not cover all aspects of how the type definitions and schemas are identified, but it does specify quite carefully what it means to be schema-valid once you've got them.
First we define our terms:
[Definition:] An EII is an element information item from an XML information set which conforms to [XML-Infoset] with Namespace processing.
[Definition:] A TNS (for Target Namespace Set) is a set, possibly empty, of namespace names, all denoting the same namespace.
[Definition:] A schema is a set of named type definitions and element declarations, each associated with a TNS.
[Definition:] A type definition is a TNS and either a complex type definition as defined by the typeDefn production or a simple type definition as defined by the datatypeDefn production.
[Definition:] An EII is schema-valid with respect to a type definition and a set of schemas if and only if:
or
[Definition:] The effective stype of a datatypeDefn with respect to a set of schemas is [TBFI].
[Definition:] A string is an instance of an effective stype if and only if [TBFI]
[Definition:] The effective ctype of a typeDefn with respect to a set of schemas is [TBFI]. [Definition:] We refer to the set of all attribute declarations in an effective ctype as its attribute set. [Definition:] We refer to the simple type or content model in an effective ctype as its content type
[Definition:] A sequence of element and information items is an instance of an effective ctype's content type with respect to a TNS and a set of schemas if and only if [TBFI]
Ed. Note: The formal nature of a complex type is slightly different to that of a simple type. A simple type is a 3-tuple of value space, lexical space and facets. A value space is a (possibly infinite) set of values, a lexical space is a (possibly infinite) set of strings. A complex type is a (possibly infinite) set of element information items. These items form equivalence classes as a result of the optionality of some properties of information items: a pair of information items whose required properties are equivalent are themselves equivalent. A complex type is defined by a set of constraints which must be satisfied by every member of that type.
Ed. Note: We can either think in terms of shallow validity, which only requires a type definition, and checks that all attribute II and daughter EII names are as they should be, and full (recursive) validity in which the types of those attribute IIs and daughter EIIs are also checked. Alas because of element references in content models the schema set parameterises BOTH of those, I was hoping shallow was just a matter for the type. . .
Ed. Note: ************ STALE BELOW HERE *************
First we have to get to the schema(s) involved. This is slightly tricky, as not all namespace declarations will resolve to schemas, and not everything that purports to be a schema will be one.
[Definition:] A URI is said to nominate a schema if it resolves to an element item in the information set of a well-formed XML 1.0 document whose local name is schema and whose namespace item's URI identifies either
or
[Definition:] A URI is said to resolve successfully to a schema if it nominates a schema, and the element item it resolves to represents an XML schema, that is:
[Definition:] An element item is schema-ready if the URI of any of its namespace declaration items which nominates a schema resolves successfully to a schema.
Issue (namespace-declaration-items): Namespace items associated with namespace declarations have disappeared from the most recent version [XML-Infoset]. Several WGs need them, we expect they'll be back, otherwise we can reconstruct what we need from element and attribute namespace items alone with some effort.
[Definition:] A document is schema-ready if every element item anywhere in its information set is schema-ready.
Note that this means that documents with no namespace declarations, or only namespace declarations which do not nominate schemas are none-the-less schema-ready.
[Definition:] We say an element item is schema-governed if its name is in a namespace, and the URI of the information item for that namespace resolves successfully to a schema.
[Definition:] We use the name schema root for any element item which is schema-governed and which is either
or
Ed. Note: All this has to go, now that we don't provide for entity definition or substitution.
The provision within XML Schema: Structures of a mechanism for defining parsed entities presents problems for the relationship between schema-validity and XML 1.0 well-formedness, since references to entities declared only in a schema are undefined from the XML 1.0 perspective. Strictly speaking, a well-formed XML document may contain references to undefined entities only if it is declared as standalone="no" and contains either an external subset or one or more references to external parameter entities in its internal subset. We get around this by [Definition:] defining a nearly well-formed XML document to be one which either is well-formed per XML 1.0, or which fails to be well-formed only because of undefined general entity references, but which would be well-formed if it were standalone="no" and identified an external subset. We consider this justified on the grounds that the use of a namespace declaration which refers to a schema functions rather as an external subset, and from the XML 1.0 perspective such a reference almost of necessity renders the document non-standalone when schema-validation is applied.
[Definition:] We use the name string-infoset-in-context for the XML 1.0 information set items arising from the interpretation of a string in the context of a particular point in an XML 1.0 information set.
[Definition:] The effective element item of an element item (call this OEI) is an element item whose
The Expansions Schema-Ready (§6.2.7) Schema Validity Constraint obtains.
The Ungoverned RUE (§6.2.7) Schema Validity Constraint obtains.
Note that the above constraints and definition mean that in error-free documents, all element items, even ones which are not schema-governed, have well-defined effective element items.
[Definition:] A document is schema-valid if and only if:
NOTE: The validity of all other schema-governed element items follows from (3) above by the recursive nature of the Schema-validity Constraint referenced there.
NOTE: It is intentional that the above definition labels as schema-valid a document with no namespace declarations or with only namespace declarations which do not nominate schemas.
Note that there is no requirement that the schema root mentioned above be the root of its document, or that schemas be the roots of their documents, or that schema and schema root be in different documents. Accordingly, it is possible for a single schema-valid document to contain both a schema and the material which it validates.
The interaction between XML 1.0 DTDs and XML Schemas is complex but clear:
NOTE: The above is silent on whether schema-valid documents must be Namespace-conforming.
[Definition:] The augmented information set of a schema-valid document is the information set rooted in the effective element item of its document element, augmented by all the information items described in any Schema Information Set Contributions which apply to any information items anywhere within it.
Constraint on Schemas: Unique Definition
The same NCName must not appear in two definitions or declarations of the same type.
Constraint on Schemas: Consistent Import
The URI for the namespace determined by a QName used to reference a schema component must either be the targetNamespace URI of the containing schema, or must be declared in an <import> (see References to schema components across namespaces (§4.2.2)) of the current schema.
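The Consistent Import constraint amounts to a membership test, sketched below. The function name and the sample namespace URIs are hypothetical; resolving QName prefixes to namespace URIs is assumed to have happened already.

```python
def undeclared_namespaces(target_namespace, imported_namespaces, referenced_namespaces):
    """Return referenced namespace URIs that violate Consistent Import.

    A reference is acceptable if its namespace is the containing schema's
    targetNamespace or one of the namespaces declared as imported.
    """
    allowed = {target_namespace} | set(imported_namespaces)
    violations = []
    for ns in referenced_namespaces:
        if ns not in allowed and ns not in violations:
            violations.append(ns)
    return violations


violations = undeclared_namespaces(
    "http://example.org/mine",                 # hypothetical target namespace
    ["http://www.w3.org/1999/XHTML"],          # declared as imported
    ["http://example.org/mine",
     "http://www.w3.org/1999/XHTML",
     "http://example.org/other"],              # used but never imported
)
```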
[Definition:] A ...Ref identifies a ...Spec provided there is a definition or declaration of that ...Spec in the appropriate schema whose NCName matches the NCName of the ...Ref's ...Name. If there is no in the ...Name, the appropriate schema is the current schema or a schema it eventually includes; if there is a , the URI contained in or abbreviated by it must resolve successfully to a schema, which is then the appropriate schema.
Constraint on Schemas: Avoid Built-ins
The NCName must not be the same as the name of any of the built-in datatypes (see [XML Schemas: Datatypes]).
[Definition:] A string (possibly empty) dt-satisfies a datatypeSpec and an optional datatypeRestriction if
and
Schema Information Set Contribution: Datatype Info
When a string dt-satisfies a datatypeRef and an optional datatypeRestriction, the containing attribute or element information item will be augmented to indicate the datatypeSpec and the facets (if any) which it satisfied.
Constraint on Schemas: AttrGroup Unique
The same attributeGroupDefn must not be referenced by two or more attributeGroupRefs in the same typeSpec.
Constraint on Schemas: AttrGroup Identified
Every attributeGroupRef in a typeSpec must identify an attributeGroupDefn.
[Definition:] The attribute declaration set of a typeSpec consists of all its effective attributes together with all the attributes contained in the attribute groups identified by any attributeGroupRefs it contains.
[Definition:] The full name of an attribute in an attribute declaration set is its NCName plus its , i.e. if it appeared directly in the typeSpec, the empty string, if it was inherited or if it came from an attribute group, then the which identified the relevant typeSpec or attribute group respectively, if any, otherwise the empty string.
Constraint on Schemas: Attribute Locally Unique
The same full name must not
appear more than once in any typeSpec's
attribute declaration set.
[Definition:] An element item a-satisfies a typeSpec if the element item's attribute items taken together as a set attrs-satisfy the typeSpec's attribute declaration set, and either
or
Issue (sic-elt-default): The above definitions do not provide for handling a default on a type's datatypeRef. Preferred solution: empty element items ipso facto satisfy datatypeRefs with defaults and are augmented with the default value. This would have the consequence that you cannot provide the empty string as the explicit value of an element item if it is governed by a datatypeRef with a default.
Schema Information Set Contribution: Type Info
When an element item a-satisfies a
typeSpec, that element information item
will be augmented to indicate the typeSpec
which it satisfied.
[Definition:] An attribute item attr-satisfies an attribute if
or
where the attribute item's value consists of only character information items and by its "value string" is meant the string formed by concatenating the characters of each of those character information item children, if any, or else the empty string.
[Definition:] The attribute items of an element item attrs-satisfy an attribute declaration set if
and
Schema Information Set Contribution: Attribute Value Default
For every attribute in the
attribute declaration set
not used to attr-satisfy
an attribute item in the context of (1a) above which has a
datatypeRef which has a default, an
attribute item with the default value is added to the parent element item.
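The contribution can be pictured with the sketch below. It is illustrative only: it treats "not used to attr-satisfy an attribute item" as "no attribute of that name is present on the element", and it models the attribute declaration set as a dictionary from attribute name to a declaration record. Both simplifications, and all names used, are assumptions made for the example.

```python
def apply_attribute_defaults(element_attrs, declaration_set):
    """Return a copy of `element_attrs` augmented with declared defaults.

    `element_attrs` maps attribute names to values found in the instance.
    `declaration_set` maps attribute names to dicts that may carry a
    "default" key (a hypothetical encoding of a datatypeRef default).
    """
    augmented = dict(element_attrs)
    for name, decl in declaration_set.items():
        default = decl.get("default")
        # Only absent attributes receive a default; explicit values win.
        if name not in augmented and default is not None:
            augmented[name] = default
    return augmented


declarations = {
    "type": {"default": "US"},
    "orderDate": {},  # no default declared
}
augmented = apply_attribute_defaults({"orderDate": "1999-12-17"}, declarations)
explicit = apply_attribute_defaults({"type": "UK"}, declarations)
```

The input mapping is copied rather than modified, so the unaugmented attribute set remains available to the caller.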
[Definition:] A sequence of character and element items (call this CESeq) model-satisfies an effective contentModel if
or
Constraint on Schemas: Element Unique in Mixed
A given NCName must not appear two or
more times among the elementDecls and
elementRefs with no a given elementName must not appear two or more times
among the elementRefs.
[Definition:] An element item mixed-satisfies a mixed if
or
or
Issue (mixed-change-current-schema): There's an implicit change in current schema in the definition of satisfy-mixed above which should be made explicit.
[Definition:] A sequence of element items elementOnly-satisfies an effective elementOnly if
NOTE: The above definition of elementOnly-satisfy does not explicitly incorporate the modifications required when the containing type is open, as set out at the end of Deriving Type Definitions (§3.6), but it should be understood as doing so.
Constraint on Schemas: Element Consistency
A given NCName must not appear both
among the elementDecls and among the
elementRefs with no s, or more than once among the
elementDecls.
NOTE: Note that the above permits repeated use of the same elementRef, analogous to DTD usage.
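A minimal sketch of the Element Consistency check follows, assuming the content model has already been reduced to two lists of NCNames: those of its elementDecls and those of its unqualified elementRefs. The function name and this input representation are hypothetical.

```python
from collections import Counter


def element_consistency_violations(decl_names, unqualified_ref_names):
    """Return (declared_twice, declared_and_referenced) as two sets.

    Repeated names among the references are deliberately not reported,
    since repeated use of the same elementRef is permitted.
    """
    decl_counts = Counter(decl_names)
    declared_twice = {name for name, n in decl_counts.items() if n > 1}
    declared_and_referenced = set(decl_counts) & set(unqualified_ref_names)
    return declared_twice, declared_and_referenced


ok = element_consistency_violations(
    ["shipTo", "shipDate", "Items"], ["comment", "comment"])
bad = element_consistency_violations(["a", "a", "b"], ["b"])
```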
NOTE: EDITORS: Add a COS for the checking of valid pairs of minOccurs and maxOccurs.
[Definition:] The full name of a top-level elementDecl is its NCName plus its , i.e. if it appeared directly in the current schema or an include, the empty string, if it was imported, then the of that import, which must successfully resolve to its containing schema.
[Definition:] An element item e-satisfies an elementDecl if the elementDecl:
or
Constraint on Schemas: Nested May Not Be Global
An elementSpec in a nested
elementDecl must not be global.
Constraint on Schemas: Cannot Shadow Global
If a top-level elementSpec is
global, then the NCName of its
elementDecl must not be redeclared by any
nested elementDecl in the same schema or
any schema it eventually
includes.
[Definition:] An element item is independently valid if there is a top-level elementDecl whose NCName matches its name in the schema its namespace item resolves to (or a schema that schema includes, in which case see the definition of identify for details on which declaration is used if there is more than one), and the element item e-satisfies that elementDecl.
[Definition:] An element item ref-satisfies an elementRef if
or
NOTE: The last clause above is much too complex; it needs to be split apart and built up in stages. It is this clause which allows elements based on refining types to appear in place of those based on their ancestors.
Constraint on Schemas: Refer to Schema
The URI associated with a in any of
the productions above must successfully
resolve to a schema.
Constraint on Schemas: Name Consistently Defined
The NCName in each of the above
productions must identify a declaration or definition of the corresponding
class (element, type, etc.)
Constraint on Schemas: Preorder Priority for Included Definitions
When using a ...Ref to identify
a ...Spec, if there is no appropriate matching declaration or definition
in the current schema, but there is more than one
eventually included schema
which contains an appropriate matching declaration or definition, the
...Spec whose declaration or definition occurs first in a preorder
traversal of the eventually
included schemas is the one identified.
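The lookup order can be sketched as follows. This is an illustration under stated assumptions: each schema is modelled as a record with an ordered list of the schemas it includes and a set of the names it defines, and cycles among includes are simply skipped. The data layout and the function name identify are hypothetical.

```python
def identify(ncname, current, schemas):
    """Return the name of the schema whose definition of `ncname` is identified.

    The current schema takes priority; otherwise the eventually included
    schemas are searched in preorder. Returns None if nothing matches.
    """
    if ncname in schemas[current]["defs"]:
        return current

    visited = {current}

    def preorder(schema):
        # Visit each included schema, then its own includes, before siblings.
        for child in schemas[schema]["includes"]:
            if child in visited:
                continue
            visited.add(child)
            yield child
            yield from preorder(child)

    for candidate in preorder(current):
        if ncname in schemas[candidate]["defs"]:
            return candidate
    return None


SCHEMAS = {
    "main": {"includes": ["a", "b"], "defs": {"PurchaseOrder"}},
    "a": {"includes": ["c"], "defs": set()},
    "b": {"includes": [], "defs": {"Address"}},
    "c": {"includes": ["main"], "defs": {"Address"}},  # cycle back to main
}
chosen = identify("Address", "main", SCHEMAS)
```

Here the preorder is a, c, b, so the definition in c is identified even though b also defines the name.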
[Definition:] A schema directly includes another schema if the first schema has an include and the URI contained in or abbreviated by the of that include resolves successfully to the second schema.
[Definition:] A schema eventually includes another schema if the first schema directly includes the second, or if the first schema directly includes some other schema which itself eventually includes the second.
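Read together, the two definitions make "eventually includes" the transitive closure of "directly includes". A minimal sketch, assuming the direct-inclusion relation is available as a mapping from each schema to the schemas it directly includes (a hypothetical representation):

```python
def eventually_includes(first, second, directly_includes):
    """Return True if `first` eventually includes `second`.

    `directly_includes` maps a schema identifier to the identifiers of the
    schemas it directly includes.
    """
    stack = list(directly_includes.get(first, ()))
    seen = set()
    while stack:
        schema = stack.pop()
        if schema == second:
            return True
        if schema in seen:
            continue  # guards against circular includes
        seen.add(schema)
        stack.extend(directly_includes.get(schema, ()))
    return False


GRAPH = {"main": ["a"], "a": ["b"], "b": ["a"]}
reaches_b = eventually_includes("main", "b", GRAPH)
reaches_main = eventually_includes("b", "main", GRAPH)
```

The `seen` set ensures termination when includes are circular, a case the definitions do not rule out.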
Schema-validity Constraint: Expansions Schema-Ready
Any element item anywhere within the string-infoset-in-context replacing an RUE child per
the above must be schema-ready.
Schema-validity Constraint: Ungoverned RUE
RUEs must not appear in element items
which are not schema-governed,
that is, in the values of attributes of, or as children of, such elements.
NOTE: This section has fallen out of alignment with the rest of the specification, but is included nonetheless to give a feeling for how this section will eventually look: the details should not be taken too seriously.
Each step in the following presupposes the successful outcome of the previous step.
A conforming XML Schema processor must:
NOTE: Note that the schema contribution to the information set above is meant to be suggestive only at this point, until we've articulated all the Schema Information Set Contributions in the preceding sections.
The XML Schema definition for XML Schema: Structures itself is presented here as a normative part of the specification, and as an illustrative example of XML Schema defining itself with the very constructs that it defines. The names of XML Schema language types, elements, attributes and groups defined here are evocative of their purpose, but are occasionally verbose.
There is some annotation in comments, but a fuller annotation will require the use of embedded documentation facilities or a hyperlinked external annotation for which tools are not yet readily available.
Since an XML Schema: Structures is an XML document, it has optional XML and doctype declarations that are provided here for completeness. The root schema element defines a new schema. Since this is a schema for XML Schema: Structures, the targetNS references the XML Schema namespace itself, and specifies that this is version "0.8".
In the following definition of the schema element, the preamble is realised with attributes corresponding to targetNamespace and schemaVersion. The xmlns attribute corresponds to xmlSchemaRef. The schema's definitions and declarations are represented by datatype, type, element, attribute, attributeGroup, group and notation.
<?xml version='1.0'?>
<!-- XML Schema schema for XML Schemas: Part 1: Structures -->
<!DOCTYPE schema PUBLIC "-//W3C//DTD XMLSCHEMA 19991216//EN" "structures.dtd" [
<!ATTLIST schema xmlns:x CDATA #IMPLIED> <!-- keep this schema XML1.0 valid -->
]>
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.w3.org/1999/XMLSchema"
        xmlns:x="http://www.w3.org/XML/1998/namespace"
        version="Id: structures.xsd,v 1.28 1999/12/16 09:43:47 aqw Exp ">

  <!-- get access to the xml: attribute groups for xml:lang -->
  <import namespace="http://www.w3.org/XML/1998/namespace"
          schemaLocation="http://www.w3.org/XML/1998/xml.xsd"/>

  <!-- The datatype element and all of its members are defined
       in XML Schema: Part 2: Datatypes -->
  <include schemaLocation="http://www.w3.org/TR/1999/WD-xmlschema-2-19991217/datatypes.xsd"/>

  <type name="annotated">
    <annotation>
      <info>This type is extended by all types which allow annotation
            other than &lt;schema&gt; itself</info>
    </annotation>
    <element ref="annotation" minOccurs="0"/>
  </type>

  <element name="schemaTop" abstract="true" type="annotated">
    <annotation>
      <info>This abstract element defines an equivalence class over the
            elements which occur freely at the top level of schemas.
            These are: datatype, type, element, attributeGroup, group, notation
            All of their types are based on the "annotated" type by extension.</info>
    </annotation>
  </element>

  <!-- schema element -->
  <element name="schema">
    <annotation>
      <info>The obnoxious duplication in the content model below is to avoid
            infringing the no-ambiguity constraint while still allowing
            annotation virtually anywhere.</info>
    </annotation>
    <type>
      <group order="choice" minOccurs="0" maxOccurs="*">
        <element ref="include"/>
        <element ref="import"/>
        <element ref="annotation"/>
      </group>
      <element ref="schemaTop"/>
      <group order="choice" minOccurs="0" maxOccurs="*">
        <element ref="annotation"/>
        <element ref="schemaTop"/>
      </group>
      <attribute name="targetNamespace" type="uri"/>
      <attribute name="version" type="string"/>
      <attribute name="finalDefault" type="derivationSet"/>
      <attribute name="exactDefault" type="exactSet"/>
    </type>
  </element>

  <!-- annotation element -->
  <element name="annotation">
    <type>
      <group order="choice" minOccurs="0" maxOccurs="*">
        <element name="appinfo">
          <type content="mixed">
            <any minOccurs="0" maxOccurs="*"/>
            <attribute name="source" type="uri"/>
          </type>
        </element>
        <element name="info">
          <type content="mixed">
            <any minOccurs="0" maxOccurs="*"/>
            <attribute name="source" type="uri"/>
            <attributeGroup ref="x:lang"/>
          </type>
        </element>
      </group>
    </type>
  </element>

  <!-- For references to a type -->
  <!-- 'element', 'attribute' and any all use this -->
  <attributeGroup name="typeRef">
    <attribute name="type" type="QName"/>
  </attributeGroup>

  <!-- For 'element' and 'attribute' -->
  <attributeGroup name="valueConstraint">
    <attribute name="default" type="string"/>
    <attribute name="fixed" type="string"/>
  </attributeGroup>

  <!-- for all particles -->
  <attributeGroup name="occurs">
    <attribute name="minOccurs" type="non-negative-integer" default="1"/>
    <attribute name="maxOccurs" type="string"/> <!-- allows '*', so integer won't do -->
  </attributeGroup>

  <!-- for element, group and attributeGroup, which both define and reference -->
  <attributeGroup name="defRef">
    <attribute name="name" type="NCName" minOccurs="0"/>
    <attribute name="ref" type="QName" minOccurs="0"/>
  </attributeGroup>

  <!-- 'element', 'group' and 'any' -->
  <group name="particle" order="choice">
    <element name="element" type="element"/>
    <element name="group" type="anonGroup"/>
    <element ref="any"/>
  </group>

  <group name="restrictionParticle" order="choice">
    <element name="sic"><type content="empty"/></element>
    <group ref="particle"/>
  </group>

  <group name="attrDecls">
    <group order="choice" minOccurs="0" maxOccurs="*">
      <element ref="attribute"/>
      <element ref="attributeGroup"/>
    </group>
    <element name="anyAttribute" type="namespaceList" minOccurs="0"/>
  </group>

  <!-- types for type -->
  <type name="type" source="annotated" derivedBy="extension" abstract="true">
    <group order="choice">
      <element ref="restrictions" minOccurs="0"/>
      <group>
        <group ref="particle" minOccurs="0" maxOccurs="*"/>
        <group ref="attrDecls"/>
      </group>
    </group>
    <attribute name="name" type="NCName" minOccurs="0">
      <annotation>
        <info>Will be restricted to required or forbidden</info>
      </annotation>
    </attribute>
    <attribute name="content">
      <datatype source="NMTOKEN">
        <enumeration value="elementOnly"/>
        <enumeration value="textOnly"/>
        <enumeration value="mixed"/>
        <enumeration value="empty"/>
      </datatype>
    </attribute>
    <attribute name="source" type="QName"/>
    <attribute name="derivedBy" type="derivationChoice"/>
    <attribute name="abstract" type="boolean" default="false"/>
    <attribute name="final" type="derivationSet"/>
    <attribute name="exact" type="derivationSet"/>
  </type>

  <type name="namedType" source="type" derivedBy="restriction">
    <annotation>
      <info>This is for the top-level type element, daughter of &lt;schema&gt;</info>
    </annotation>
    <attribute name="name" minOccurs="1">
      <annotation><info>Required at the top level</info></annotation>
    </attribute>
  </type>

  <type name="anonType" source="type" derivedBy="restriction">
    <annotation>
      <info>This is for the nested type element, daughter of &lt;element&gt;</info>
    </annotation>
    <attribute name="name" maxOccurs="0">
      <annotation><info>Forbidden when nested</info></annotation>
    </attribute>
  </type>

  <!-- Top level type element, daughter of schema -->
  <element name="type" equivClass="schemaTop" type="namedType"/>

  <key name="type">
    <selector>schema/type</selector>
    <field>@name</field>
  </key>

  <key name="element">
    <selector>schema/element</selector>
    <field>@name</field>
  </key>

  <keyref name="datatypeRef" refer="datatype">
    <selector>.//attribute[@type]</selector>
    <field>@type</field>
  </keyref>

  <datatype name="derivationChoice" source="NMTOKEN">
    <enumeration value="extension"/>
    <enumeration value="restriction"/>
  </datatype>

  <datatype name="exactSet" source="string">
    <annotation>
      <info>Should be a sequence drawn from the values of derivationChoice
            plus 'equivClass', or #all -- regexp is only an approximation</info>
    </annotation>
    <pattern value="#all?|(equivClass|extension|restriction| )*"/>
  </datatype>

  <datatype name="derivationSet" source="exactSet">
    <annotation>
      <info>Should be a sequence drawn from the values of derivationChoice,
            or #all -- regexp is only an approximation</info>
    </annotation>
    <pattern value="#all?|(extension|restriction| )*"/>
  </datatype>

  <!-- restrictions element -->
  <element name="restrictions">
    <type source="annotated" derivedBy="extension">
      <group order="choice">
        <element ref="facet" minOccurs="0" maxOccurs="*"/>
        <!-- max 1, min 0, for each facet except pattern, period-->
        <group ref="restrictionParticle" minOccurs="0" maxOccurs="*"/>
      </group>
      <group ref="attrDecls"/>
    </type>
  </element>

  <!-- The element element can be used either at the toplevel to define
       an element-type binding globally, or within a content model to
       either reference a globally-defined element or type or declare an
       element-type binding locally.
       The ref form is not allowed at the top level -->
  <type name="element" source="annotated" derivedBy="extension">
    <group order="choice" minOccurs="0">
      <element name="datatype" type="anonDatatype"/>
      <element name="type" type="anonType"/>
    </group>
    <group order="choice" minOccurs="0" maxOccurs="*">
      <element ref="unique"/>
      <element ref="key"/>
      <element ref="keyref"/>
    </group>
    <attributeGroup ref="defRef"/>
    <attributeGroup ref="typeRef"/>
    <attribute name="equivClass" type="QName"/>
    <attributeGroup ref="occurs"/>
    <attributeGroup ref="valueConstraint"/>
    <attribute name="nullable" type="boolean" default="false"/>
    <attribute name="abstract" type="boolean" default="false"/>
    <attribute name="final" type="boolean" default="false"/>
    <attribute name="exact" type="exactSet"/>
  </type>

  <type name="namedElement" source="element" derivedBy="restriction">
    <restrictions>
      <attribute name="name" minOccurs="1"/> <!-- required at top level -->
      <attribute name="ref" maxOccurs="0"/> <!-- forbidden at top level -->
    </restrictions>
  </type>

  <element name="element" type="namedElement" equivClass="schemaTop"/>

  <!-- group element for named top-level groups, group references and
       anonymous groups in content models -->
  <type name="group" source="annotated" derivedBy="extension" abstract="true">
    <group ref="particle" minOccurs="0" maxOccurs="*"/>
    <attributeGroup ref="defRef"/>
    <attributeGroup ref="occurs"/>
    <attribute name="order" default="seq">
      <datatype source="NMTOKEN">
        <enumeration value="choice"/>
        <enumeration value="seq"/>
        <enumeration value="all"/> <!-- allowed only at top level -->
      </datatype>
    </attribute>
  </type>

  <type name="namedGroup" source="group" derivedBy="restriction">
    <restrictions>
      <attribute name="name" minOccurs="1"/> <!-- required at top level -->
      <attribute name="ref" maxOccurs="0"/> <!-- forbidden at top level -->
    </restrictions>
  </type>

  <type name="anonGroup" source="group" derivedBy="restriction">
    <restrictions>
      <attribute name="name" maxOccurs="0"/> <!-- forbidden when nested -->
    </restrictions>
  </type>

  <element name="group" equivClass="schemaTop" type="namedGroup"/>

  <!-- The wildcard specifier in content models -->
  <element name="any">
    <type content="empty">
      <attribute name="namespace" type="namespaceList"/>
      <attributeGroup ref="occurs"/>
    </type>
  </element>

  <!-- simple type for the value of the 'namespace' attr of
       'any' and 'anyAttribute' -->
  <!-- Value is
         ##any      - - any non-conflicting WFXML/attribute at all
         ##other    - - any non-conflicting WFXML/attribute from
                        namespace other than targetNS
         one or     - - any non-conflicting WFXML/attribute from
         more URI       the listed namespaces
         references
         (space separated)
       ##targetNamespace may appear in the above list, to refer to the
       targetNamespace of the enclosing schema -->
  <datatype name="namespaceList" source="string"/>

  <!-- the attribute element declares attributes -->
  <element name="attribute">
    <type source="annotated" derivedBy="extension">
      <element name="datatype" minOccurs="0">
        <type source="datatype" derivedBy="restriction">
          <attribute name="name" maxOccurs="0">
            <annotation><info>must be nameless</info></annotation>
          </attribute>
        </type>
      </element>
      <attribute name="name" type="NCName" minOccurs="1"/>
      <attributeGroup ref="typeRef"/>
      <attribute name="minOccurs" default="0">
        <datatype source="non-negative-integer">
          <enumeration value="0"/>
          <enumeration value="1"/>
        </datatype>
      </attribute>
      <attribute name="maxOccurs" default="1">
        <datatype source="non-negative-integer">
          <enumeration value="0"/>
          <enumeration value="1"/>
        </datatype>
      </attribute>
      <attributeGroup ref="valueConstraint"/>
    </type>
  </element>

  <!-- attributeGroup element -->
  <type name="attributeGroup" source="annotated" derivedBy="extension" abstract="true">
    <group order="choice" minOccurs="0" maxOccurs="*">
      <element ref="attribute"/>
      <element name="attributeGroup" type="anonAttributeGroup"/>
    </group>
    <element name="anyAttribute" type="namespaceList" minOccurs="0"/>
    <attributeGroup ref="defRef"/>
  </type>

  <type name="namedAttributeGroup" source="attributeGroup" derivedBy="restriction">
    <restrictions>
      <attribute name="name" minOccurs="1"/> <!-- required at top level -->
      <attribute name="ref" maxOccurs="0"/> <!-- forbidden at top level -->
    </restrictions>
  </type>

  <type name="anonAttributeGroup" source="attributeGroup" derivedBy="restriction">
    <restrictions>
      <attribute name="ref" minOccurs="1"/> <!-- required when nested -->
      <attribute name="name" maxOccurs="0"/> <!-- forbidden when nested -->
    </restrictions>
  </type>

  <element name="attributeGroup" type="namedAttributeGroup" equivClass="schemaTop"/>

  <element name="include">
    <type content="empty">
      <attribute name="schemaLocation" type="uri" minOccurs="1"/>
    </type>
  </element>

  <element name="import">
    <type content="empty">
      <attribute name="namespace" type="uri" minOccurs="1"/>
      <attribute name="schemaLocation" type="uri"/>
    </type>
  </element>

  <!-- Better reference mechanisms -->
  <type name="keybase" source="annotated" derivedBy="extension">
    <element name="selector"/>
    <element name="field" minOccurs="1" maxOccurs="*"/>
    <attribute name="name" type="NCName" minOccurs="1"/>
  </type>

  <element name="unique" type="keybase" equivClass="schemaTop"/>
  <element name="key" type="keybase" equivClass="schemaTop"/>
  <element name="keyref" equivClass="schemaTop">
    <type source="keybase">
      <attribute name="refer" type="QName" minOccurs="1"/>
    </type>
  </element>

  <!-- notation element type -->
  <element name="notation" equivClass="schemaTop">
    <type source="annotated" derivedBy="extension">
      <attribute name="name" type="NCName" minOccurs="1"/>
      <attribute name="public" type="public" minOccurs="1"/>
      <attribute name="system" type="uri"/>
    </type>
  </element>

  <datatype name="public" source="string"/>

  <!-- notations for use within XML Schema schemas -->
  <notation name="XMLSchemaStructures" public="structures"
            system="http://www.w3.org/TR/1999/WD-xmlschema-1-19991217/structures.xsd"/>
  <notation name="XML" public="REC-xml-19980210"
            system="http://www.w3.org/TR/1998/REC-xml-19980210"/>
</schema>
NOTE: And that is the end of the schema for XML Schema: Structures.
The DTD for XML Schema: Structures is given below. Note that there is no
implication here that schema
must be the root element of a document.
<!-- DTD for XML Schemas: Part 1: Structures -->
<!-- Id: structures.dtd,v 1.27 1999/12/16 15:40:58 ht Exp -->
<!-- The datatype element and its components are defined in
     XML Schema: Part 2: Datatypes -->
<!-- Note %p is defined in datatypes.dtd -->
<!ENTITY % xs-datatypes PUBLIC 'datatypes'
                        '../WD-xmlschema-2-19991217/datatypes.dtd' >
%xs-datatypes;

<!ENTITY % s ''> <!-- if %p is defined (e.g. as foo:) then you must
                      also define %s as the suffix for the appropriate
                      namespace declaration (e.g. :foo) -->
<!ENTITY % nds 'xmlns%s;'>

<!-- Define all the element names, with optional prefix -->
<!ENTITY % schema "%p;schema">
<!ENTITY % type "%p;type">
<!ENTITY % restrictions "%p;restrictions">
<!ENTITY % element "%p;element">
<!ENTITY % unique "%p;unique">
<!ENTITY % key "%p;key">
<!ENTITY % keyref "%p;keyref">
<!ENTITY % selector "%p;selector">
<!ENTITY % field "%p;field">
<!ENTITY % group "%p;group">
<!ENTITY % any "%p;any">
<!ENTITY % anyAttribute "%p;anyAttribute">
<!ENTITY % sic "%p;sic">
<!ENTITY % attribute "%p;attribute">
<!ENTITY % attributeGroup "%p;attributeGroup">
<!ENTITY % include "%p;include">
<!ENTITY % import "%p;import">
<!ENTITY % notation "%p;notation">

<!-- the duplication below is to produce an unambiguous content model
     which allows annotation everywhere -->
<!-- This has the unfortunate consequence of disallowing a schema with
     only import/includes, this should be fixed -->
<!ELEMENT %schema; ((%include; | %import; | %annotation;)*,
                    (%datatype; | %type; | %element; | %attributeGroup; |
                     %group; | %notation; ),
                    (%annotation; | %datatype; | %type; | %element; |
                     %attributeGroup; | %group; | %notation; |
                     %unique; | %key; | %keyref; )* )>
<!ATTLIST %schema;
          targetNamespace %URI;           #IMPLIED
          version         CDATA           #IMPLIED
          %nds;           %URI;           #FIXED 'http://www.w3.org/1999/XMLSchema'
          finalDefault    %derivationSet; ''
          exactDefault    %exactSet;      ''>
<!-- Note the xmlns declaration is NOT in the Schema for Schemas,
     because at the Infoset level where schemas operate,
     xmlns(:prefix) is NOT an attribute! -->

<!-- a type is a named content type specification which allows attribute
     declarations-->
<!-- -->
<!ELEMENT %type; ((%annotation;)?,
                  (%restrictions; |
                   ((%element;| %group;| %any;)*,
                    (%attribute;| %attributeGroup;)*,
                    (%anyAttribute;)?)))>
<!ATTLIST %type;
          name      %NCName;                           #IMPLIED
          content   (textOnly|mixed|elementOnly|empty) #IMPLIED
          abstract  %boolean;                          'false'
          final     %derivationSet;                    ''
          exact     %derivationSet;                    ''
          derivedBy %derivationChoice;                 #IMPLIED
          source    %QName;                            #IMPLIED>
<!-- restrictions iff derivedBy='restriction' -->
<!-- (element|group|any) only if content=mixed or =elementOnly
     and NO derivedBy at all, i.e. a root type -->
<!-- content defaults to source's if there is a complex source,
     textonly if there's a simple source,
     'mixed' if no source (because that's the urType's content)
     and no content daughters,
     'elementOnly' otherwise -->
<!-- should we replace content='empty' with
     content='elementOnly' final='#all' plus no content? -->
<!-- If one top-level group, that IS the content model,
     otherwise an implicit group obtains.
     This is <group order='seq' minOccurs='1' maxOccurs='1'>
     unless content='mixed', in which case it's
     <group order='choice' minOccurs='0' maxOccurs='*'> -->
<!-- If anyAttribute appears in one or more referenced attributeGroups
     and/or explicitly, the intersection of the permissions is used -->
<!-- A text-only type with no attributes differs from a datatype
     with the same source qualified the same way in regard to the
     impact on attributes of anyAttribute -->

<!ELEMENT %restrictions; ((%annotation;)?,
                          ((%facet;)*|
                           (%element;| %group;| %any;| %sic;)*),
                          (%attribute;| %attributeGroup;)*,
                          (%anyAttribute;)?)>
<!-- this contains material for restricting components of inherited types -->
<!-- (element|group|any|sic) allowed only if source refers to an
     elementOnly or mixed type, the sequence and GI must match point
     for point with (an initial sub-sequence of) the content model of
     the basetype, restricting in each case, except that 'sic' is
     allowed to "copy through" a single particle. Only the top-level
     content model can be restricted, e.g. the content model of an
     anonymous embedded 'type' within an 'element' particle cannot be
     restricted piecemeal. -->
<!-- attributes to be restricted are identified by name, without order
     constraints. Attributes incorporated into sources via
     attributeGroups may be restricted by name. -->
<!-- If anyAttribute appears in one or more referenced attributeGroups
     and/or explicitly, the intersection of the permissions with the
     inherited permission (which must exist) is used -->
<!-- facets are allowed only if source refers to a textonly type -->

<!-- an element is declared by either:
     a name and a type (either nested or referenced via the type attribute)
     or:
     a ref to an existing element declaration -->
<!ELEMENT %element; ((%annotation;)?, (%type;| %datatype;)?,
                     (%unique; | %key; | %keyref;)*)>
<!-- type or datatype only if no type|ref attribute -->
<!-- ref not allowed at top level -->
<!ATTLIST %element;
          name       %NCName;               #IMPLIED
          ref        %QName;                #IMPLIED
          type       %QName;                #IMPLIED
          minOccurs  %non-negative-integer; '1'
          maxOccurs  CDATA                  #IMPLIED
          nullable   %boolean;              'false'
          equivClass %QName;                #IMPLIED
          abstract   %boolean;              'false'
          final      %boolean;              'false'
          exact      %exactSet;             ''
          default    CDATA                  #IMPLIED
          fixed      CDATA                  #IMPLIED>
<!-- type and ref are mutually exclusive.
     name and ref are mutually exclusive, one is required -->
<!-- In the absence of type AND ref, type defaults to type of
     equivClass, if any, else the ur-type, i.e. unconstrained -->
<!-- maxOccurs defaults to 1 or minOccurs, whichever is greater -->
<!-- default and fixed are mutually exclusive -->

<!ELEMENT %group; ((%annotation;)?, (%element;| %group;| %any;)*)>
<!ATTLIST %group;
          minOccurs %non-negative-integer; '1'
          maxOccurs CDATA                  #IMPLIED
          order     (choice|seq|all)       'seq'
          name      %NCName;               #IMPLIED
          ref       %QName;                #IMPLIED>
<!-- an anonymous grouping in a model, or
     a top-level named group definition, or a reference to same -->
<!-- Note that if order is 'all', group is not allowed inside.
     If order is 'all' THIS group must be alone (or referenced alone) at
     the top level of a content model -->
<!-- If order is 'all', minOccurs==maxOccurs==1 on element/any inside -->
<!-- Should allow minOccurs=0 inside order='all' . . . -->

<!ELEMENT %any; EMPTY>
<!ATTLIST %any;
          namespace CDATA                  '##any'
          minOccurs %non-negative-integer; '1'
          maxOccurs CDATA                  #IMPLIED>
<!-- namespace is interpreted as follows:
       ##any      - - any non-conflicting WFXML at all
       ##other    - - any non-conflicting WFXML from namespace other
                      than targetNamespace
       one or     - - any non-conflicting WFXML from
       more URI       the listed namespaces
       references
     ##targetNamespace may appear in the above list,
     with the obvious meaning -->

<!ELEMENT %anyAttribute; EMPTY>
<!ATTLIST %anyAttribute;
          namespace CDATA '##any'>
<!-- namespace is interpreted as for 'any' above -->

<!-- for use inside basetype to copy down corresponding content
     model particle from the basetype's content model -->
<!ELEMENT %sic; EMPTY>

<!ELEMENT %attribute; ((%annotation;)?, (%datatype;)?)>
<!ATTLIST %attribute;
          name      %NCName; #REQUIRED
          type      %QName;  #IMPLIED
          maxOccurs (0|1)    '1'
          minOccurs (0|1)    '0'
          default   CDATA    #IMPLIED
          fixed     CDATA    #IMPLIED>
<!-- default and fixed are mutually exclusive -->
<!-- type attr and datatype content are mutually exclusive -->

<!-- an attributeGroup is a named collection of attribute decls, or a
     reference thereto -->
<!ELEMENT %attributeGroup; ((%annotation;)?,
                            (%attribute; | %attributeGroup;)*,
                            (%anyAttribute;)?) >
<!ATTLIST %attributeGroup;
          name %NCName; #IMPLIED
          ref  %QName;  #IMPLIED>
<!-- ref iff no content, no name.
     ref iff not top level -->

<!-- better reference mechanisms -->
<!ELEMENT %unique; (%selector;, (%field;)+)>
<!ATTLIST %unique; name %NCName; #REQUIRED>

<!ELEMENT %key; (%selector;, (%field;)+)>
<!ATTLIST %key; name %NCName; #REQUIRED>

<!ELEMENT %keyref; (%selector;, (%field;)+)>
<!ATTLIST %keyref;
          name  %NCName; #REQUIRED
          refer %QName;  #REQUIRED>

<!ELEMENT %selector; (#PCDATA)>
<!ELEMENT %field; (#PCDATA)>

<!-- Schema combination mechanisms -->
<!ELEMENT %include; EMPTY>
<!ATTLIST %include; schemaLocation %URI; #REQUIRED>

<!ELEMENT %import; EMPTY>
<!ATTLIST %import;
          namespace      %URI; #REQUIRED
          schemaLocation %URI; #IMPLIED>

<!ELEMENT %notation; EMPTY>
<!ATTLIST %notation;
          name   %NCName; #REQUIRED
          public CDATA    #REQUIRED
          system %URI;    #IMPLIED>

<!NOTATION XMLSchemaStructures PUBLIC 'structures'
           'http://www.w3.org/TR/1999/WD-xmlschema-1-19991217/structures.xsd' >
<!NOTATION XML PUBLIC 'REC-xml-1998-0210'
           'http://www.w3.org/TR/1998/REC-xml-19980210' >
Ed. Note: The Glossary has barely been started. An XSL macro will be used to collect definitions from throughout the spec and gather them here for easy reference.
The following have contributed material to this draft:
The editors acknowledge the members of the XML Schema Working Group, the members of other W3C Working Groups, and industry experts in other forums who have contributed directly or indirectly to the process or content of creating this document. The Working Group is particularly grateful to Lotus Development Corp. and IBM for providing teleconferencing facilities.
The current members of the XML Schema Working Group are:
Paula Angerstein, Vignette Corporation; David Beech, Oracle Corp.; Paul V. Biron, Health Level Seven; Allen Brown, Microsoft; Greg Bumgardner, Rogue Wave Software; Lee Buck, Extensibility; Dean Burson, Lotus Development Corporation; Charles E. Campbell, Informix; Peter Chen, Bootstrap Alliance and LSU; David Cleary, Progress Software; Dan Connolly, W3C (staff contact); Andrew Eisenberg, Progress Software; Rob Ellman, Calico Commerce; David Ezell, Hewlett Packard Company; David Fallside, IBM; Matthew Fuchs, Commerce One; Paul Grosso, ArborText, Inc.; Dave Hollander, CommerceNet (co-chair); Mary Holstege, Calico Commerce; Jane Hunter, Distributed Systems Technology Centre (DSTC Pty Ltd); Renato Iannella, Distributed Systems Technology Centre (DSTC Pty Ltd); Rick Jelliffe, Academia Sinica; Dianne Kennedy, Graphic Communications Association; Setrag Khoshafian, Technology Deployment International (TDI); Janet Koenig, Sun Microsystems; Ara Kullukian, Technology Deployment International (TDI); Andrew Layman, Microsoft; Dmitry Lenkov, Hewlett Packard Company; Eve Maler, Sun Microsystems; Ashok Malhotra, IBM; Murray Maloney, Commerce One; John McCarthy, Lawrence Berkeley National Laboratory; Noah Mendelsohn, Lotus Development Corporation; Don Mullen, Extensibility; Murata Makoto, Xerox; Frank Olken, Lawrence Berkeley National Laboratory; Dave Peterson, Graphic Communications Association; Mark Reinhold, Sun Microsystems; Shriram Revankar, Xerox; Jonathan Robie, Software AG; Lew Shannon, NCR; C. M. Sperberg-McQueen, W3C (co-chair); Henry S. Thompson, University of Edinburgh; Matt Timmermans, Microstar; Jim Trezzo, Oracle Corp.; Steph Tryphonas, Microstar; Mark Tucker, Health Level Seven; Priscilla Walmsley, XMLSolutions; Norm Walsh, ArborText, Inc; Aki Yoshida, SAP AG
The XML Schema Working Group has benefited in its work from the participation and contributions of a number of people not currently members of the Working Group, including in particular those named below. Affiliations given are those current at the time of their work with the WG.
Gabe Beged-Dov, Rogue Wave Software; George Feinberg, Object Design; Charles Frankston, Microsoft; Ernesto Guerrieri, Inso; Michael Hyman, Microsoft; Chris Olds, Wall Data; William Shea, Merrill Lynch; Ralph Swick, W3C; Tony Stewart, Rivcom
Example
An example of a full-blown schema, for the PurchaseOrder example from Schemas, Types and Elements (§2.3):
<schema targetNamespace="http://www.myco.com/MYPO"
        xmlns="http://www.w3.org/TR/1999/WD-xmlschema-1-19991217"
        xmlns:po="http://www.myco.com/MYPO">
  <element name="PurchaseOrder" type="po:PurchaseOrderType"/>
  <element name="comment" type="string"/>
  <type name="PurchaseOrderType">
    <element name="shipTo" type="po:Address"/>
    <element name="shipDate" type="date"/>
    <element ref="po:comment" minOccurs="0"/>
    <element name="Items" type="po:Items"/>
    <attribute name="orderDate" type="date"/>
  </type>
  <type name="Address">
    <element name="name" type="string"/>
    <element name="street" type="string"/>
    <element name="city" type="string"/>
    <element name="state" type="string"/>
    <element name="zip" type="integer"/>
    <attribute name="type" type="string"/>
  </type>
  <type name="Items">
    <element name="Item" minOccurs="0" maxOccurs="*">
      <type>
        <element name="productName" type="string"/>
        <element name="quantity">
          <datatype source="integer">
            <minExclusive value="0"/>
          </datatype>
        </element>
        <element name="price" type="decimal"/>
        <element ref="po:comment" minOccurs="0"/>
      </type>
    </element>
  </type>
</schema>
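As an informal illustration that is not part of this specification, the following Python sketch parses an abridged copy of the schema above with the standard library and lists its top-level declarations. The helper name `top_level_declarations` and the abridgement are assumptions made for this illustration. The sketch checks only XML well-formedness and collects the names of top-level `element` and `type` declarations; it does not perform schema validation.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the PurchaseOrder schema (Address and Items types omitted).
SCHEMA = """\
<schema targetNamespace="http://www.myco.com/MYPO"
        xmlns="http://www.w3.org/TR/1999/WD-xmlschema-1-19991217"
        xmlns:po="http://www.myco.com/MYPO">
  <element name="PurchaseOrder" type="po:PurchaseOrderType"/>
  <element name="comment" type="string"/>
  <type name="PurchaseOrderType">
    <element name="shipTo" type="po:Address"/>
    <element name="shipDate" type="date"/>
    <element ref="po:comment" minOccurs="0"/>
    <element name="Items" type="po:Items"/>
    <attribute name="orderDate" type="date"/>
  </type>
</schema>
"""


def top_level_declarations(schema_text):
    """Hypothetical helper: names of top-level element and type declarations."""
    # fromstring raises ParseError if the text is not well-formed XML.
    root = ET.fromstring(schema_text)
    found = {"element": [], "type": []}
    for child in root:
        # ElementTree reports tags as "{namespace}local"; keep the local part.
        local = child.tag.split("}", 1)[-1]
        if local in found and child.get("name"):
            found[local].append(child.get("name"))
    return found


decls = top_level_declarations(SCHEMA)
print(decls)
# {'element': ['PurchaseOrder', 'comment'], 'type': ['PurchaseOrderType']}
```

Only direct children of `schema` are inspected, so locally declared elements such as `shipTo` are deliberately not reported.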
$Log: structures.xml,v $
Revision 1.45.1.6 1999/12/17 16:14:53 ht one more link
Revision 1.45.1.5 1999/12/17 15:35:47 ht reorder prevloc
Revision 1.45.1.4 1999/12/17 14:50:25 ht link fixes
Revision 1.45.1.3 1999/12/17 14:46:51 ht additional status prose
Revision 1.45.1.2 1999/12/17 13:33:11 ht PWD status prose and link fixes
Revision 1.45.1.1 1999/12/17 12:26:37 ht towards PWD
Revision 1.45 1999/12/17 12:06:05 ht eacute
Revision 1.44 1999/12/17 11:51:45 ht make all example quotes double some changes merged from pre-Nov-publ branch (1.15.1...)
Revision 1.43 1999/12/17 10:59:16 ht minor fix in schema wrt keys
Revision 1.42 1999/12/16 15:48:08 ht fix some refs, incorporate up-to-date schema and DTD
Revision 1.41 1999/12/16 15:40:05 aqw describe exact, final and abstract on elements
Revision 1.40 1999/12/16 09:44:11 aqw minor editorial
Revision 1.39 1999/12/14 16:22:46 aqw various QName fixes
Revision 1.38 1999/12/10 18:01:42 ht remove * where status has changed
Revision 1.37 1999/12/10 16:14:10 aqw renaming attrGroup, elemOnly, integrate DTD and schema
Revision 1.36 1999/12/10 16:08:05 aqw more on BRM, add an wildcard ambiguity issue
Revision 1.35 1999/12/10 10:07:35 ht merge in new-design branch
Revision 1.34 1999/12/10 09:38:42 ht minor orphan changes
Revision 1.33 1999/12/08 20:06:52 aqw added prose for BRM
Revision 1.32 1999/12/03 15:49:25 ht fix status text
Revision 1.31 1999/12/03 14:54:16 aqw ht version confusion
Revision 1.30 1999/12/03 14:48:41 aqw Outline inclusion of BRM proposal Implemented QName aspect of composition decision
Revision 1.29.1.6 1999/12/08 20:04:26 aqw describe impact of exact for equiv classes, remove substitutabiliity secton
Revision 1.29.1.5 1999/12/06 23:15:09 ht add stars, no standalone
Revision 1.29.1.4 1999/12/06 22:50:02 aqw add equiv classes, up-to-date DTD and schema
Revision 1.29.1.3 1999/12/03 19:37:27 ht Fill in 3.6.3, controlling derivation, including mod agreed with Allen wrt type tolerance.
Revision 1.29.1.2 1999/12/03 15:55:38 aqw cleanup new proposal stuff for preliminary release
Revision 1.29.1.1 1999/12/02 22:55:36 aqw begin serious work on integrating new type derivation proposal
Revision 1.29 1999/12/02 19:05:00 aqw resolve ! issue in <any namespace='...'> in favour of ## remove private exploratory hierarchy rewrite
Revision 1.28 1999/12/02 11:36:27 aqw merge in DTD and schema for internal point release
Revision 1.27 1999/12/02 11:09:36 aqw composition tf integrates, validates nearly ok
Revision 1.26 1999/12/01 15:43:23 aqw integrating composition tf . . .
Revision 1.25 1999/11/22 14:14:59 aqw integrate tentative type construction compromise
Revision 1.24 1999/11/12 17:00:41 ht Incorporate minimal null support
Revision 1.23 1999/11/11 11:19:11 ht add issues per 1999-11-04 telcon vote
Revision 1.22 1999/11/11 11:11:51 ht include dtd and schema
Revision 1.21.1.1 1999/11/22 14:03:55 aqw try out type construction compromise
Revision 1.21 1999/11/05 00:48:14 aqw note about status vis a vis forthcoming issues from minutes
Revision 1.20 1999/11/04 23:46:57 aqw remove element classes per WG vote
Revision 1.19 1999/11/03 21:39:47 aqw fix editors list and acks typoes spotted by DBeech
Revision 1.18 1999/11/03 15:49:26 aqw Implement chair's instructions wrt WG poll closed 1999-10-30: * re-integrate named model groups * change details of implicit openness * remove entities, not notations
Revision 1.17 1999/11/02 23:37:35 aqw merge refinement proposal into mainline preparatory to implementing poll results
Revision 1.16 1999/11/02 21:32:25 aqw remove entity definitions and related material (e.g. notations)
Revision 1.15 1999/10/27 13:28:58 ht Fix some (all?) syntax paradigms, examples Include bug-fixed .xsd and .dtd
Revision 1.14 1999/10/27 10:48:01 ht Incorporate up-to-date schema and DTD, completing concrete syntax changes Parameterise paths/dates to facilitate release process
Revision 1.13.1.15 1999/10/26 16:23:30 ht a bit more on validity
Revision 1.13.1.14 1999/10/25 14:37:31 ht Begin to try to fill in validity section on basis of schema-valid(EII,type,schemaSet) minimalist approach
Revision 1.13.1.13 1999/10/19 18:42:13 ht added xsd:type to text, changed status
Revision 1.13.1.12 1999/10/18 19:46:12 aqw Added concrete syntax, prose and examples to 3.6, new section on the hierarchy and restrictions
Revision 1.13.1.11 1999/10/18 15:28:16 aqw Included lots in 3.5 and 3.6, editting it in to shape
Revision 1.13.1.10 1999/10/18 13:55:23 aqw correct and corresponding DTD and Schema included text from new design included in 3.5 and 3.6 for editting
Revision 1.13.1.9 1999/10/18 11:39:01 aqw light pass over section 2, cleaning up 'type' terminology minor fixups in 3.4
Revision 1.13.1.8 1999/10/17 22:05:19 aqw first pass through 3.4 complete
Revision 1.13.1.7 1999/10/16 22:12:03 aqw element classes
Revision 1.13.1.6 1999/10/16 19:02:51 aqw attribute finished, also attrGroup
Revision 1.13.1.5 1999/10/16 17:53:28 aqw finished (?) with complex types, on to attributes
Revision 1.13.1.4 1999/10/16 11:52:07 aqw still working on complex type
Revision 1.13.1.3 1999/10/15 16:38:31 aqw more work on 'type'
Revision 1.13.1.1 1999/10/15 11:46:18 ht on the way to matching new refinement proposal
Revision 1.13 1999/10/09 10:49:40 ht correct headline date
Revision 1.12 1999/10/05 09:56:19 ht Preliminary implementation of A3 and A7 (ampConnector and richerMixed) votes. Moving towards a parallel syntax for elementDecl/Ref and groupDefn/Ref. Concrete syntax paradigms, examples, DTD and Schema NOT up-to-date
Revision 1.11 1999/09/27 16:31:07 ht merge simple back to main branch
Revision 1.10.2.38 1999/09/27 16:29:02 ht return to xmlschema-current as base
Revision 1.10.2.37 1999/09/24 16:40:22 ht add comments archive pointer
Revision 1.10.2.36 1999/09/24 16:38:23 ht link housekeeping, move TF reports bibliography to separate appendix
Revision 1.10.2.35 1999/09/24 13:44:27 ht final (?) housekeeping before publication
Revision 1.10.2.34 1999/09/23 18:48:51 ht changes to front matter in preparation for public WD ponter to Simple TF included
Revision 1.10.2.33 1999/09/23 13:32:15 ht up-to-date pointer to refinement TF report
Revision 1.10.2.32 1999/09/23 13:00:22 ht typo in db entity
Revision 1.10.2.31 1999/09/23 12:59:04 ht per suggestions from Ashok, some rewording of summary of Composition TF, added issue regarding priority of instance->schema alternatives
Revision 1.10.2.30 1999/09/22 14:02:35 ht typo in correction to 4.1
Revision 1.10.2.29 1999/09/22 13:58:39 ht edits implementing Noah's comments
Revision 1.10.2.28 1999/09/22 08:07:07 ht add verbatim change log at end
----------------------------
revision 1.10.2.27 date: 1999/09/21 16:26:11; author: ht; state: Exp; lines: +4 -4 added Id: to title for now
----------------------------
revision 1.10.2.26 date: 1999/09/21 16:06:08; author: ht; state: Exp; lines: +42 -244 replaced composition tf report with a summary and a pointer
----------------------------
revision 1.10.2.25 date: 1999/09/21 14:11:50; author: aqw; state: Exp; lines: +495 -111 some dates, up-to-date DTD and Schema for schemas
----------------------------
revision 1.10.2.24 date: 1999/09/21 10:50:37; author: ht; state: Exp; lines: +18 -3 supply missing content model for 'attribute' in concrete syntax paradigm
----------------------------
revision 1.10.2.23 date: 1999/09/21 10:37:51; author: aqw; state: Exp; lines: +21 -20 define/declare consistency pass
----------------------------
revision 1.10.2.22 date: 1999/09/20 13:08:36; author: aqw; state: Exp; lines: +47 -49 track datatype content model changes, minor wording
----------------------------
revision 1.10.2.21 date: 1999/09/16 14:55:17; author: ht; state: Exp; lines: +136 -14 header disclaimer, graveyards rescued to discharge references
----------------------------
revision 1.10.2.20 date: 1999/09/16 14:25:01; author: aqw; state: Exp; lines: +274 -1541 rip out all of 3.5, all of 4, install 'Draft Proposal' in 4
----------------------------
revision 1.10.2.19 date: 1999/09/16 12:08:59; author: aqw; state: Exp; lines: +107 -143 Clean up import/include/export, references in particular Add archetypeRef to content models, minimally New example of datatype+attr
----------------------------
revision 1.10.2.18 date: 1999/09/15 22:06:29; author: aqw; state: Exp; lines: +26 -3 Two clarifications following discussion with andrew 1) what it would take to remove the two symbol spaces problem 2) How <archetype> allows either datatypeRef or contentType
----------------------------
revision 1.10.2.17 date: 1999/09/15 20:30:49; author: aqw; state: Exp; lines: +114 -105 change date, incorporate edited dtd
----------------------------
revision 1.10.2.16 date: 1999/09/15 19:52:39; author: aqw; state: Exp; lines: +90 -76 Encorporate/respond to Eve Maler's suggested edits
----------------------------
revision 1.10.2.15 date: 1999/09/13 16:14:12; author: aqw; state: Exp; lines: +306 -335 Finish consistency pass through 3.4 Brutal 'element type' -> element
----------------------------
revision 1.10.2.14 date: 1999/09/09 14:22:29; author: aqw; state: Exp; lines: +53 -56 cleanup pass, down to 3.3
----------------------------
revision 1.10.2.13 date: 1999/09/08 18:23:47; author: ht; state: Exp; lines: +41 -41 more type back to archetype
----------------------------
revision 1.10.2.12 date: 1999/09/08 18:03:06; author: aqw; state: Exp; lines: +214 -216 put archetype back in, imperfectly, I expect
----------------------------
revision 1.10.2.11 date: 1999/09/07 21:50:36; author: bu; state: Exp; lines: +124 -63 fix paradigm contexts, extend example, consolidate example in appendix
----------------------------
revision 1.10.2.10 date: 1999/09/07 16:54:39; author: aqw; state: Exp; lines: +514 -521 syntax paradigms now properly distributed, I think
----------------------------
revision 1.10.2.9 date: 1999/09/07 15:53:06; author: ht; state: Exp; lines: +5 -8 fixed minor validity errors
----------------------------
revision 1.10.2.8 date: 1999/09/07 15:31:58; author: aqw; state: Exp; lines: +288 -285 working on integrating syntax paradigms
----------------------------
revision 1.10.2.7 date: 1999/09/07 09:44:57; author: aqw; state: Exp; lines: +630 -33 added ALL concrete syntax boxes at once
----------------------------
revision 1.10.2.6 date: 1999/09/06 14:55:04; author: ht; state: Exp; lines: +35 -2 added one e: syntax exposition
----------------------------
revision 1.10.2.5 date: 1999/09/02 15:28:27; author: ht; state: Exp; lines: +6 -6 fix URLs for self, a bit
----------------------------
revision 1.10.2.4 date: 1999/09/02 12:53:34; author: aqw; state: Exp; lines: +108 -95 Added not-status-quo marks, changed e.g. String to string
----------------------------
revision 1.10.2.3 date: 1999/09/01 17:02:14; author: aqw; state: Exp; lines: +587 -977 integration of 2.3 from simple more renaming
----------------------------
revision 1.10.2.2 date: 1999/08/23 15:32:16; author: aqw; state: Exp; lines: +730 -248 Modified simple integration to give preliminary consistency
----------------------------
revision 1.10.2.1 date: 1999/08/22 17:44:40; author: aqw; state: Exp; lines: +317 -260 Textual integration of Simple update of 1999-08-13
----------------------------
revision 1.10 date: 1999/07/20 19:47:27; author: ht; state: Exp; lines: +5 -5 branches: 1.10.2; fixed dates, dangling reference
----------------------------
revision 1.9 date: 1999/07/19 09:31:26; author: ht; state: Exp; lines: +34 -38 David Beech: updated definition of "Schema" following WG and IG email discussion. Changed "Schemata" to "Schemas" except where directly quoted from Requirements doc. Clarified in 2.5 that elements and attributes have separate symbol spaces (public comment). Fixed assorted typos.
----------------------------
revision 1.8 date: 1999/06/23 10:00:31; author: aqw; state: Exp; lines: +1 -1 fix Id:
----------------------------
revision 1.7 date: 1999/06/23 09:51:15; author: aqw; state: Exp; lines: +28 -28 Restrict content model of 'all' in schema and dtd, change entities for point releases
----------------------------
revision 1.6 date: 1999/06/23 09:10:01; author: aqw; state: Exp; lines: +147 -187 pushed & down to lowest level, fixed incoherent validity definition in 6.2.3.7 to agree with the note which follows. Wrapped validation text from 3.4 in appropriately named div4's
----------------------------
revision 1.5 date: 1999/06/21 16:31:59; author: aqw; state: Exp; lines: +569 -551 Really moved validity-oriented definitions to 6.3 (previous revision was just housekeeping)
----------------------------
revision 1.4 date: 1999/06/21 16:25:21; author: aqw; state: Exp; lines: +45 -36 moved validity-oriented definitions to 6.3
----------------------------
revision 1.3 date: 1999/06/21 12:25:21; author: aqw; state: Exp; lines: +3540 -3650 Low-level: Normalise line ends, quotes Editorial: Move all constraintnotes to new separate section
----------------------------
revision 1.2 date: 1999/05/27 14:13:54; author: aqw; state: Exp; lines: +2 -2 fix stylesheet and dtd urls to local versions
----------------------------
revision 1.1 date: 1999/05/23 16:51:11; author: ht; state: Exp; branches: 1.1.1; Initial revision