W3C

XML Inclusions (XInclude) Version 1.0

W3C Candidate Recommendation 21 February 2002

This version:
http://www.w3.org/TR/2002/CR-xinclude-20020221
(available in: HTML, XML)
Latest version:
http://www.w3.org/TR/xinclude/
Previous version:
http://www.w3.org/TR/2001/WD-xinclude-20010516/
Editors:
Jonathan Marsh, Microsoft <jmarsh@microsoft.com>
David Orchard, BEA Systems <dorchard@bea.com>

Abstract

This document specifies a processing model and syntax for general purpose inclusion. Inclusion is accomplished by merging a number of XML information sets into a single composite Infoset. Specification of the XML documents (infosets) to be merged and control over the merging process is expressed in XML-friendly syntax (elements, attributes, URI references).

Status of this Document

This document is a Candidate Recommendation of the World Wide Web Consortium. (For background on this work, please see the XML Activity Statement.) This specification is considered stable by the XML Core Working Group and is available for public review.

The Working Group invites implementation feedback on this specification. We expect that sufficient feedback to determine its future will have been received by 30 April 2002. Comments on this document should be sent to the public mailing list www-xml-xinclude-comments@w3.org (archive). While we welcome implementation experience reports, the XML Core Working Group will not allow early implementation to constrain its ability to make changes to this specification prior to final release.

XInclude has a dependency on [XPointer]. This adds significantly to the complexity of XInclude implementations. The XML Core Working Group specifically requests feedback on the use of XPointer in XInclude, including the following:

  1. Would a subset of XPointer simplify XInclude implementation? Which features should be available in this subset?

  2. Would a subset of XPointer assist in building streaming XInclude processors? Which features should be available in this subset?

In addition to the specific points above, any feedback on patterns of implementation and use of this specification would be very welcome. Comments on XPointer can also be reported against the XPointer specification.

The requirements used to guide the development of XInclude may be found in the [XML Inclusion Proposal] W3C Note of 23 November 1999.

The English version of this specification is the only normative version. However, for translations of this document, see http://www.w3.org/XML/#trans. A list of current W3C Recommendations and other technical documents can be found at http://www.w3.org/TR/. W3C publications may be updated, replaced, or obsoleted by other documents at any time. In particular it is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress".

Table of Contents

1 Introduction
    1.1 Relationship to XLink
    1.2 Relationship to XML External Entities
    1.3 Relationship to DTDs
    1.4 Relationship to XML Schemas
    1.5 Relationship to Grammar-Specific Inclusions
2 Terminology
3 Syntax
    3.1 xi:include Element
    3.2 xi:fallback Element
4 Processing Model
    4.1 The Include Location
        4.1.1 URI Escaping
    4.2 Included Items when parse="xml"
        4.2.1 Document Information Items
        4.2.2 Multiple Nodes
        4.2.3 Range Locations
        4.2.4 Point Locations
        4.2.5 Element, Comment, and Processing Instruction Information Items
        4.2.6 Attribute and Namespace Declaration Information Items
        4.2.7 Inclusion Loops
    4.3 Included Items when parse="text"
    4.4 Fallback Behavior
    4.5 Creating the Result Infoset
        4.5.1 Unparsed Entities
        4.5.2 Notations
        4.5.3 [references] property fixup
        4.5.4 Namespace Fixup
        4.5.5 Base URI
        4.5.6 Properties Preserved by the Infoset
5 Conformance
    5.1 Markup Conformance
    5.2 Application Conformance
    5.3 XML Information Set Conformance

Appendices

A References
B References (Non-Normative)
C Examples (Non-Normative)
    C.1 Basic Inclusion Example
    C.2 Textual Inclusion Example
    C.3 Textual Inclusion of XML Example
    C.4 Range Inclusion Example
    C.5 Fallback Example


1 Introduction

Many programming languages provide an inclusion mechanism to facilitate modularity. Markup languages also often have need of such a mechanism. This specification introduces a generic mechanism for merging XML documents (as represented by their information sets) for use by applications that need such a facility. The syntax leverages existing XML constructs - elements, attributes, and URI references.

1.1 Relationship to XLink

XInclude differs from the linking features described in the [XML Linking Language], specifically links with the attribute value show="embed". Such links provide a media-type independent syntax for indicating that a resource is to be embedded graphically within the display of the document. XLink does not specify a specific processing model, but simply facilitates the detection of links and recognition of associated metadata by a higher level application.

XInclude, on the other hand, specifies a media-type specific (XML into XML) transformation. It defines a specific processing model for merging information sets. XInclude processing occurs at a low level, often by a generic XInclude processor which makes the resulting information set available to higher level applications.

Simple information item inclusion as described in this specification differs from transclusion, which preserves contextual information such as style.

1.2 Relationship to XML External Entities

There are a number of differences between XInclude and [XML 1.0] external entities which make them complementary technologies.

Processing of external entities (as with the rest of DTDs) occurs at parse time. XInclude operates on information sets and thus is orthogonal to parsing.

Declaration of external entities requires a DTD or internal subset. This places a set of dependencies on inclusion, for instance, the syntax for the DOCTYPE declaration requires that the document element be named - orthogonal to inclusion in many cases. Validating parsers must have a complete content model defined. XInclude is orthogonal to validation and the name of the document element.

External entities provide a level of indirection - the external entity must be declared and named, and separately invoked. XInclude uses direct references. Applications which generate XML output incrementally can benefit from not having to pre-declare inclusions.

The syntax for an internal subset is cumbersome to many authors of simple well-formed XML documents. XInclude syntax is based on familiar XML constructs.

1.3 Relationship to DTDs

XInclude defines no relationship to DTD validation. XInclude describes an infoset-to-infoset transformation and not a change in XML 1.0 parsing behavior. XInclude does not define a mechanism for DTD validation of the resulting infoset.

1.4 Relationship to XML Schemas

XInclude defines no relationship to the augmented infosets produced by applying an XML schema. Such an augmented infoset can be supplied as the input infoset, or such augmentation may be applied to the infoset resulting from the inclusion.

1.5 Relationship to Grammar-Specific Inclusions

Special-purpose inclusion mechanisms have been introduced into specific XML grammars. XInclude provides a generic mechanism for recognizing and processing inclusions, and as such can offer a simpler overall authoring experience, greater performance, and less code redundancy.

2 Terminology

[Definition: The key words must, must not, required, shall, shall not, should, should not, recommended, may, and optional in this specification are to be interpreted as described in [IETF RFC 2119].]

[Definition: The term information set refers to the output of an XML processor, expressed as a collection of information items and properties as defined by the [XML Information Set] specification.] In this document the term infoset is used as a synonym for information set.

[Definition: The term fatal error refers to the presence of factors that prevent normal processing from continuing.] [Definition: The term resource error refers to a failure of an attempt to fetch a resource from a URL.] XInclude processors must stop processing when encountering errors other than resource errors, which must be handled as described in 4.4 Fallback Behavior.

3 Syntax

XInclude defines a namespace associated with the URI http://www.w3.org/2001/XInclude. The XInclude namespace contains two elements with the local names include and fallback. For convenience, within this specification these elements are referred to as xi:include and xi:fallback respectively.

The following (non-normative) XML schema [XML Schemas] illustrates the content model of the xi namespace:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:xi="http://www.w3.org/2001/XInclude"
           targetNamespace="http://www.w3.org/2001/XInclude">

  <xs:element name="include">
    <xs:complexType mixed="true">
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element ref="xi:fallback"/>
        <xs:any namespace="##other" processContents="lax"/>
        <xs:any namespace="##local" processContents="lax"/>
      </xs:choice>
      <xs:attribute name="href" type="xs:anyURI" use="required"/>
      <xs:attribute name="parse" use="optional" default="xml">
        <xs:simpleType>
          <xs:restriction base="xs:string">
            <xs:enumeration value="xml"/>
            <xs:enumeration value="text"/>
          </xs:restriction>
        </xs:simpleType>
      </xs:attribute>
      <xs:attribute name="encoding" type="xs:string" use="optional"/>
      <xs:anyAttribute namespace="##other" processContents="lax"/>
    </xs:complexType>
  </xs:element>

  <xs:element name="fallback">
    <xs:complexType mixed="true">
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element ref="xi:include"/>
        <xs:any namespace="##other" processContents="lax"/>
      </xs:choice>
      <xs:anyAttribute />
    </xs:complexType>
  </xs:element>

</xs:schema>

3.1 xi:include Element

The xi:include element has the following attributes:

href

A URI reference indicating the location of the resource to include. This attribute is required.

parse

An enumeration specifying whether to include the resource as parsed XML or as text. A value of "xml" indicates that the resource must be parsed as XML and the infosets merged. A value of "text" indicates that the resource must be included as the character information items. This attribute is optional. When omitted, the value of "xml" is implied (even in the absence of a default value declaration). Values other than "xml" and "text" are a fatal error.

encoding

When parse="text", it may be impossible to correctly detect the encoding of the text resource. The encoding attribute specifies how the resource is to be translated. The value of this attribute is an EncName as defined in the [XML 1.0] specification, section 4.3.3, rule [81]. The encoding attribute has no effect when parse="xml".

Attributes from other namespaces may be placed on the xi:include element. Unqualified attribute names are reserved for future versions of this specification, and must be ignored by XInclude 1.0 processors.

The content of the xi:include element may include an xi:fallback element. Other content is not constrained by this specification and is ignored by the XInclude processor.

The following (non-normative) DTD fragment illustrates a sample declaration for the xi:include element:

<!ELEMENT xi:include (xi:fallback?)>
<!ATTLIST xi:include
    xmlns:xi   CDATA       #FIXED    "http://www.w3.org/2001/XInclude"
    href       CDATA                                   #REQUIRED
    parse      (xml|text)                              "xml"
    encoding   CDATA                                   #IMPLIED
>

3.2 xi:fallback Element

The xi:fallback element appears as a child of an xi:include element. It provides a mechanism for recovering from missing resources. When a resource error is encountered, the xi:include element is replaced with the contents of the xi:fallback element. If the xi:fallback element is empty, the xi:include element is removed from the result. If the xi:fallback element is missing, a fatal error results.

The xi:fallback element can appear only as a child of an xi:include element. It is a fatal error for an xi:fallback element to appear in a document anywhere other than as the direct child of an xi:include element (before inclusion processing on the contents of the element).

The following (non-normative) DTD fragment illustrates a sample declaration for the xi:fallback element:

<!ELEMENT xi:fallback ANY>
<!ATTLIST xi:fallback
    xmlns:xi   CDATA   #FIXED   "http://www.w3.org/2001/XInclude"
>

4 Processing Model

Inclusion as defined in this document is a specific type of [XML Information Set] transformation.

[Definition: The input for the inclusion transformation consists of a source infoset]. [Definition: The output, called the result infoset, is a new infoset which merges the source infoset with the infosets of resources identified by URI references appearing in xi:include elements]. Thus a mechanism to resolve URIs and return the identified resources as infosets is assumed. Well-formed XML entities that do not have defined infosets (e.g. an external entity with multiple top-level elements) are outside the scope of this specification, either for use as a source infoset or the result infoset.

xi:include elements in the source infoset serve as inclusion transformation instructions. [Definition: The information items located by the xi:include element are called the top-level included items]. [Definition: The top-level included items, together with their attributes, namespaces, and descendants, are called the included items]. The result infoset is essentially a copy of the source infoset, with each xi:include element and its descendants replaced by its corresponding included items.

4.1 The Include Location

The value of the href attribute is interpreted as an IURI reference. [Definition: An internationalized URI reference, or IURI, is a URI reference that directly uses [Unicode] characters.] IURI references allow a superset of the characters of fully escaped URI references, but must have normal occurrences of the percent sign (%) escaped because it is the character used for escaping in URIs and IURIs. Also see [Internationalized URIs] (non-normative).

The base URI for relative IURIs is the base URI of the xi:include element as specified in [XML Base]. [Definition: The IURI resulting from resolution to absolute IURI form is called the include location.]

The set of characters allowed in an href attribute is the same as for XML, namely [Unicode]. However, some Unicode characters are disallowed from URI references. Thus the disallowed characters in URI references must ultimately be encoded and escaped by the XInclude or other processor when the URI is resolved.

4.1.1 URI Escaping

The disallowed characters include all non-ASCII characters, plus the excluded characters listed in Section 2.4 of [IETF RFC 2396], except for the number sign (#) and percent sign (%) characters and the square bracket characters re-allowed in [IETF RFC 2732]. Disallowed characters are escaped as follows:

  1. Each disallowed character is converted to UTF-8 [IETF RFC 2279] as one or more bytes.

  2. Any bytes corresponding to a disallowed character are escaped with the URI escaping mechanism (that is, converted to %HH, where HH is the hexadecimal notation of the byte value).

  3. The original character is replaced by the resulting character sequence.
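The three steps above can be sketched in Python as follows. This is a minimal illustration, not a normative algorithm: the names escape_href and EXCLUDED_ASCII are assumptions made for the example, and the set is transcribed from the excluded characters of Section 2.4 of [IETF RFC 2396] with the number sign, percent sign, and square brackets left out.

```python
# ASCII characters excluded by RFC 2396 section 2.4 (control characters,
# space, delimiters, "unwise" characters), minus "#", "%", "[" and "]".
EXCLUDED_ASCII = (
    {chr(c) for c in range(0x20)} | {"\x7f", " "} | set('<>"{}|\\^`')
)


def escape_href(href):
    """Escape the disallowed characters of an href value (illustrative)."""
    out = []
    for ch in href:
        if ord(ch) > 0x7F or ch in EXCLUDED_ASCII:
            # Step 1: convert the character to UTF-8 bytes.
            # Step 2: write each byte as %HH.
            # Step 3: the escaped sequence replaces the original character.
            out.append("".join("%{:02X}".format(b) for b in ch.encode("utf-8")))
        else:
            out.append(ch)
    return "".join(out)
```

Because the percent sign is not in the disallowed set, an href that already contains escapes such as %25 passes through unchanged.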

4.2 Included Items when parse="xml"

When parse="xml", the include location is dereferenced and the resource is fetched, coerced to text/xml, and an infoset is created.

Note:

The specifics of how an infoset is created are intentionally unspecified, to allow for flexibility by implementations and to avoid defining a particular processing model for components of the XML architecture. Particulars of whether DTD or XML schema validation are performed, for example, are not constrained by this specification.

Note:

The character encodings of the including and included resources can be different. This does not affect the resulting infoset, but may need to be taken into account during any subsequent serialization.

Resources that are unavailable for any reason (for example the resource doesn't exist, connection difficulties or security restrictions prevent it from being fetched, the URI scheme isn't a fetchable one, or the XPointer contains a syntax error) result in a resource error. Resources that contain non-well-formed XML result in a fatal error.

[Definition: xi:include elements in this infoset are recursively processed to create the acquired infoset.]

When the resource is coerced to text/xml, the fragment part of the URI reference is interpreted as an [XPointer], regardless of the media type of the resource. The XPointer indicates a subresource as the target for inclusion.

XPointer is not specified in terms of the [XML Information Set], but instead is based on the [XPath 1.0] Data Model, because the XML Information Set had not yet been developed. The mapping between XPath node locations and information items is straightforward. However, XPointer assumes that all entities have been expanded. Thus it is a fatal error to attempt to resolve an XPointer on a document that contains Unexpanded Entity Reference Information Items.

The set of top-level included items is derived from the acquired infoset as follows.

4.2.1 Document Information Items

An include location might identify the document information item (for instance, a URI reference without an XPointer, or an XPointer specifically locating the document root). In this case, the set of top-level included items is the [children] of the acquired infoset's document information item, except for the document type declaration information item child, if one exists.

Note:

The XML Information Set specification does not provide for preservation of white space outside the document element. XInclude makes no further provision to preserve this white space.

4.2.2 Multiple Nodes

An include location having an XPointer might identify a subresource that consists of more than a single node. In this case the set of top-level included items is the set of information items from the acquired infoset corresponding to the nodes referred to by the XPointer, in the order in which they appear in the acquired infoset.

If the document (top-level) element in the source infoset is an xi:include element, it is a fatal error to attempt to replace it with something other than a list of zero or more comments, zero or more processing instructions, and one element.

4.2.3 Range Locations

An include location having an XPointer might identify a location set that represents a range or a set of ranges.

Each range corresponds to a set of information items in the acquired infoset. [Definition: An information item is said to be selected by a range if it occurs after (in document order) the starting point of the range and before the ending point of the range.] [Definition: An information item is said to be partially selected by a range if it contains only the starting point of the range, or only the ending point of the range.] By definition, a character information item cannot be partially selected.

The set of top-level included items is the union, in document order with duplicates removed, of the information items either selected or partially selected by the range. The [children] property of selected information items is not modified. The [children] property of partially selected information items is the set of information items that are in turn either selected or partially selected, and so on.

4.2.4 Point Locations

An include location having an XPointer might identify a location set that represents a point. In this case the set of included items is empty.

4.2.5 Element, Comment, and Processing Instruction Information Items

An include location having an XPointer might identify an element node, a comment node, or a processing instruction node, respectively representing an element information item, a comment information item, or a processing instruction information item. In this case the set of top-level included items consists of the information item corresponding to the element, comment, or processing instruction node in the acquired infoset.

4.2.6 Attribute and Namespace Declaration Information Items

An include location having an XPointer might identify an attribute node or a namespace node. An include location identifying such a node is a fatal error.

4.2.7 Inclusion Loops

When recursively processing an xi:include element, it is a fatal error to process another xi:include element with an include location that has already been processed in the inclusion chain.

In other words, the following are all legal:

  • An xi:include element may reference the document containing the include element, when parse="text".

  • An xi:include element may identify a different part of the same local resource.

  • Two non-nested xi:include elements may identify a resource which itself contains an xi:include element.

The following are illegal:

  • An xi:include element pointing to itself or any ancestor thereof, when parse="xml".

  • An xi:include element pointing to any include element or ancestor thereof which has already been processed at a higher level.
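One way to picture the inclusion chain is as a stack of include locations that grows as processing recurses. The Python sketch below is a simplified model under stated assumptions: only parse="xml" inclusions are represented, each resource is reduced to the list of include locations it references, and the names expand, includes_of, and InclusionLoopError are hypothetical.

```python
class InclusionLoopError(Exception):
    """Stands in for the fatal error raised on an inclusion loop."""


def expand(location, includes_of, chain=()):
    """Return the include locations visited, in processing order.

    includes_of maps an include location to the include locations of the
    xi:include elements (parse="xml") found in that resource.
    """
    if location in chain:
        # The location is already being processed higher up the chain.
        raise InclusionLoopError(" -> ".join(chain + (location,)))
    chain = chain + (location,)
    visited = [location]
    for target in includes_of.get(location, ()):
        visited.extend(expand(target, includes_of, chain))
    return visited
```

Because the chain is passed down rather than shared, two non-nested xi:include elements that identify the same resource do not trigger the error; only a location that reappears among its own ancestors does.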

4.3 Included Items when parse="text"

When parse="text", the include location is dereferenced and the resource is fetched. Resources that are unavailable for any reason (for example the resource doesn't exist, connection difficulties or security restrictions prevent it from being fetched, the URI scheme isn't a fetchable one, or the fragment identifier contains a syntax error) result in a resource error.

The fetched resource is treated as plain text and converted to a set of character information items without attempting to parse the resource as XML. This feature facilitates the inclusion of working XML examples, as well as other text-based formats.

The encoding of such a resource is determined by:

  • external encoding information, if available, otherwise

  • if the media type of the resource is text/xml, application/xml, or matches the conventions text/*+xml or application/*+xml as described in XML Media Types [IETF RFC 3023], the encoding is recognized as specified in XML 1.0, otherwise

  • the value of the encoding attribute if one exists, otherwise

  • UTF-8.
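The precedence above can be sketched as a short Python function. The function and parameter names are assumptions made for illustration; in particular, detect_xml_encoding is a hypothetical caller-supplied callable standing in for encoding recognition as specified in XML 1.0, which is not implemented here.

```python
import re

# text/xml, application/xml, text/*+xml and application/*+xml.
_XML_MEDIA_TYPE = re.compile(
    r"^(?:text|application)/(?:xml|[^/;\s]+\+xml)$", re.IGNORECASE
)


def is_xml_media_type(media_type):
    """True if the media type follows the XML media type conventions."""
    if not media_type:
        return False
    # Ignore parameters such as "; charset=...".
    return bool(_XML_MEDIA_TYPE.match(media_type.split(";")[0].strip()))


def choose_text_encoding(external=None, media_type=None,
                         encoding_attr=None, detect_xml_encoding=None):
    """Pick the encoding for a parse="text" resource (illustrative)."""
    if external:
        return external
    if is_xml_media_type(media_type):
        if detect_xml_encoding is None:
            raise ValueError("an XML 1.0 encoding detector is required")
        return detect_xml_encoding()
    if encoding_attr:
        return encoding_attr
    return "UTF-8"
```

Note that the encoding attribute is consulted only after external information and the XML media type check, so it acts as a hint rather than an override.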

Byte sequences outside the range allowed by the encoding are a fatal error. Characters that are not permitted in XML documents also are a fatal error.

[Definition: A range of characters (the selected range) may be identified by a fragment identifier.] The syntax of the fragment identifier is interpreted using the syntax of the fragment identifier for the media type text/plain. In the absence of a fragment identifier, the selected range contains all the characters in the resource except the initial byte order mark (BOM) if one is present. A BOM is the character U+FEFF when it appears as the first character in a resource encoded in UTF-8, UTF-16 or UTF-32. UTF-16BE and UTF-16LE will not contain a BOM.

Note:

There is currently no standard defining fragment identifiers for the media type text/plain.

The set of characters in the selected range is converted to a set of top-level included items by creating a character information item with the [character code] set to the character code representing the character in ISO 10646 encoding, and the [element content whitespace] set to false.
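As a rough illustration of this conversion when no fragment identifier is present, the Python sketch below decodes a fetched resource and produces one dictionary per character information item. The dictionary representation and the name text_to_character_items are assumptions made for the example; the check for characters not permitted in XML documents is omitted for brevity.

```python
import codecs

# Encodings in which a leading U+FEFF is treated as a byte order mark.
BOM_ENCODINGS = {"utf-8", "utf-16", "utf-32"}


def text_to_character_items(data, encoding="UTF-8"):
    """Convert fetched bytes into character information items (illustrative)."""
    # Invalid byte sequences raise UnicodeDecodeError, standing in for
    # the fatal error described above.
    text = data.decode(encoding)
    # Python's utf-16 and utf-32 decoders already consume the BOM; the
    # utf-8 decoder leaves it in place, so strip it here if present.
    if codecs.lookup(encoding).name in BOM_ENCODINGS and text.startswith("\ufeff"):
        text = text[1:]
    return [
        {"character code": ord(ch), "element content whitespace": False}
        for ch in text
    ]
```

With an encoding of UTF-16LE or UTF-16BE, a leading U+FEFF is kept as an ordinary character, matching the statement that those encodings will not contain a BOM.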

[Character Model] describes the required treatment of non-normalized Unicode text.

4.4 Fallback Behavior

XInclude processors must perform fallback behavior in the event of a resource error, as follows:

If the [children] of the xi:include element information item in the source infoset contain exactly one xi:fallback element, the top-level included items consist of the information items corresponding to the result of performing XInclude processing on the [children] of the xi:fallback element. It is a fatal error if there is no xi:fallback element or more than one.
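The "exactly one xi:fallback" check can be sketched with Python's standard xml.etree.ElementTree module. This is an assumption-laden illustration: ElementTree is not an infoset implementation, and the names select_fallback and XIncludeFatalError are hypothetical. The sketch only locates the fallback element after a resource error; XInclude processing of its children is left to the caller.

```python
import xml.etree.ElementTree as ET

XI_NS = "http://www.w3.org/2001/XInclude"


class XIncludeFatalError(Exception):
    """Stands in for a fatal error as defined in 2 Terminology."""


def select_fallback(include_elem):
    """Return the single xi:fallback child of an xi:include element."""
    # findall with a plain tag name searches direct children only.
    fallbacks = include_elem.findall("{%s}fallback" % XI_NS)
    if len(fallbacks) != 1:
        raise XIncludeFatalError(
            "expected exactly one xi:fallback, found %d" % len(fallbacks)
        )
    return fallbacks[0]
```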

Note:

Fallback content is not dependent on the value of the parse attribute. The xi:fallback element can contain markup even when parse="text". Likewise, it can contain a simple string when parse="xml".

4.5 Creating the Result Infoset

The result infoset is a copy of the source infoset, with each xi:include element processed as follows:

The information item for the xi:include element is found. [Definition: The [parent] property of this item refers to an information item called the include parent.] The [children] property of the include parent is modified by replacing the xi:include element information item with the top-level included items. The [parent] property of each included item is set to the include parent.

Each top-level included item is assigned an extension property [included] with the boolean value "true". Information items which were not processed as top-level included items will have no value for the [included] property. This property may be used by applications which require knowledge of where inclusion has been performed.
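The replacement step can be pictured with information items modelled as plain Python dictionaries. This is a sketch under that assumption only; the function name replace_include and the dictionary keys are illustrative, and the [parent] property is set here on the top-level included items.

```python
def replace_include(include_parent, include_item, top_level_items):
    """Splice the top-level included items in place of an xi:include item."""
    children = include_parent["children"]
    # Locate the xi:include item by identity, not by equality.
    index = next(i for i, c in enumerate(children) if c is include_item)
    for item in top_level_items:
        item["parent"] = include_parent
        item["included"] = True  # extension property [included]
    # An empty list of items simply removes the xi:include element.
    children[index:index + 1] = top_level_items
```

Items that were not spliced in this way never receive the "included" key, mirroring the statement that they have no value for the [included] property.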

The included items will all appear in the result infoset. This includes Unexpanded Entity Reference Information Items if they are present.

Intra-document references within xi:include elements must be resolved against the source infoset. The effect of this is that the order in which xi:include elements are processed does not affect the result.

In the following example, the second include always points to the first xi:include element and not to itself, regardless of the order in which the includes are processed. Thus the result of this inclusion is two copies of something.xml, and does not produce an inclusion loop error.

<x xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="something.xml"/>
  <xi:include href="#xmlns(xi=http://www.w3.org/2001/XInclude)
                     xpointer(x/xi:include[1])"
              parse="xml"/>
</x>

4.5.1 Unparsed Entities

Any unparsed entity information item appearing in the [references] property of an attribute on the included items or any descendant thereof is added to the [unparsed entities] property of the result infoset's document information item, if it is not a duplicate of an existing member.

Unparsed entity items with the same [name], [system identifier], [public identifier], [declaration base URI], [notation name], and [notation] are considered to be duplicates. An application may also be able to detect that unparsed entities are duplicates through other means, for instance when the URI resulting from combining the system identifier and the declaration base URI is the same.

It is a fatal error to include unparsed entity items with the same name, but different [system identifier] or [public identifier] properties.

4.5.2 Notations

Any notation information item appearing in the [references] property of an attribute in the included items or any descendant thereof is added to the [notations] property of the result infoset's document information item, if it is not a duplicate of an existing member. Likewise, any notation referenced by an unparsed entity added as described in 4.5.1 Unparsed Entities is added unless it is a duplicate.

Notation items with the same [name], [system identifier], [public identifier], and [declaration base URI] are considered to be duplicates. An application may also be able to detect that notations are duplicates through other means, for instance when the URI resulting from combining the system identifier and the declaration base URI is the same.

It is a fatal error to include notation items with the same name, but different [system identifier] or [public identifier] properties.

4.5.3 [references] property fixup

During inclusion, an attribute information item whose [attribute type] property is IDREF or IDREFS has a [references] property with zero or more element values from the source or included infosets. These values must be adjusted to correspond to element values that occur in the result infoset. During this process, XInclude also corrects inconsistencies between the [references] property and the [attribute type] property, which may arise in the following circumstances:

  • A document fragment contains an IDREF pointing to an element in the included document but outside the part being included. In this case there is no element in the result infoset that corresponds to the element value in the original [references] property.

  • A document or document fragment is not self-contained. That is, it contains IDREFs which do not refer to an element within that document or document fragment, with the intention that these references will be realized after inclusion. In this case, the value of the [references] property is unknown or has no value.

  • The result infoset has ID clashes - that is, more than one attribute with [attribute type] ID with the same [normalized value]. In this case, attributes with [attribute type] IDREF or IDREFS with the same [normalized value] may have different values for their [references] properties.

In resolving these inconsistencies, XInclude takes the [attribute type] property as definitive. In the result infoset, the value of the [references] property of an attribute information item whose [attribute type] property is IDREF or IDREFS is adjusted as follows:

For each token in the [normalized value] property, the [references] property contains an element information item with the same properties as the first element information item in the result infoset with an attribute with [attribute type] ID and [normalized value] equal to the token. The order of the elements in the [references] property is the same as the order of the tokens appearing in the [normalized value]. If no element values are found, the [references] property has no value.

4.5.4 Namespace Fixup

The [in-scope namespaces] property ensures that namespace scope is preserved through inclusion. However, after inclusion, the [namespace attributes] property may not provide the full list of namespace declarations necessary to interpret qualified names in attribute or element content in the result. An XInclude processor must fix up both the [namespace attributes] and the [in-scope namespaces] to ensure that these two properties are complete and consistent.

Each [in-scope namespaces] in the included items is augmented with Namespace Information Items corresponding to the Namespace Information Items appearing on the include parent, if any. Items with the same [prefix] as an existing Namespace Information Item are omitted.

The [namespace attributes] property requires similar fixup. An Attribute Information Item is added to the [namespace attributes] property of the top-level included items, for each namespace the element has in scope which is not already declared on the element or on its ancestors in the result infoset.

For example, the following document:

<foo xmlns:x="uri1">
  <xi:include href="common.xml#xpointer(a/b)"
              xmlns:xi="http://www.w3.org/2001/XInclude"/>
</foo>

including an element from common.xml:

<a xmlns:x="uri2">
  <b>
    <x:a/>
  </b>
</a>

results in a document that could be serialized as:

<foo xmlns:x="uri1">
  <b xmlns:x="uri2">
    <x:a/>
  </b>
</foo>

4.5.5 Base URI

The [base URI] property of the acquired infoset is not changed by merging. Thus relative URI references in the included infoset resolve to the same URI despite being included into a document with a potentially different base URI in effect. xml:base attributes are added to the result infoset to indicate this fact.

Each Element Information Item in the top-level included items which has a different [base URI] than its include parent has an Attribute Information Item added to its [attributes] property. If an xml:base attribute information item is already present, it is replaced by the new attribute. This attribute has the following properties:

  1. A [namespace name] of http://www.w3.org/XML/1998/namespace.

  2. A [local name] of base.

  3. A [prefix] of xml.

  4. A [normalized value] equal to the [base URI] of the element.

  5. A [specified] flag indicating that this attribute was actually specified in the start-tag of its element.

  6. An [attribute type] of CDATA.

  7. A [references] property with no value.

  8. An [owner element] of the information item of the element.
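The attribute fixup can be approximated with the ElementTree module of the Python standard library. The following is a sketch under stated assumptions: the helper add_base_attribute is hypothetical, the base URIs are passed in explicitly because ElementTree does not track them, and ElementTree has no counterpart for the [specified], [attribute type] or [references] properties.

```python
import xml.etree.ElementTree as ET

# ElementTree writes namespaced attribute names in {namespace}local form.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def add_base_attribute(element, element_base_uri, parent_base_uri):
    """Record the base URI of a top-level included element when it differs
    from the base URI of its include parent."""
    if element_base_uri != parent_base_uri:
        # set() replaces an xml:base attribute that is already present.
        element.set(XML_BASE, element_base_uri)
    return element


disclaimer = ET.fromstring("<disclaimer><p>Sample text.</p></disclaimer>")
add_base_attribute(
    disclaimer,
    "http://www.example.org/disclaimer.xml",  # base URI of the included element
    "http://www.example.org/document.xml",    # base URI of the include parent
)
serialized = ET.tostring(disclaimer, encoding="unicode")
```

Because the two base URIs differ, the serialized element carries an xml:base attribute holding the URI of the included resource.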

Note:

The xml:lang and xml:space attributes are not treated specially by XInclude.

4.5.6 Properties Preserved by the Infoset

As an infoset transformation, XInclude operates on the logical structure of XML documents, not on their text serialization. All properties of an information item described in [XML Information Set] other than those specifically modified by this specification are preserved during inclusion. Extension properties such as [XML Schemas] Post Schema Validation Infoset (PSVI) properties are discarded by default. However, an XInclude processor may, at user option, preserve these properties in the resulting infoset if they are correct according to the specification describing the semantics of the extension properties.

For instance, the PSVI [validity] property describes the conditions of ancestors and descendants. Modification of ancestors and descendants during the XInclude process can render the value of this property inaccurate. By default, XInclude strips this property, but by user option the property could be recalculated to obtain a semantically accurate value. Precisely how this is accomplished is outside the scope of this specification.

5 Conformance

5.1 Markup Conformance

An element information item conforms to this specification if it meets the structural requirements for include elements defined in this specification. This specification imposes no particular constraints on DTDs or XML schemas; conformance applies only to elements and attributes.

5.2 Application Conformance

An application conforms to XInclude if it:

  • supports XML 1.0, XML namespaces, the XML Information Set, and XML Base;

  • stops processing when a fatal error is encountered;

  • observes the mandatory conditions (must) set forth in this specification and, for any optional conditions (should and may) it chooses to observe, observes them in the way prescribed; and

  • performs markup conformance testing according to all the conformance constraints appearing in this specification.

5.3 XML Information Set Conformance

This specification conforms to the [XML Information Set]. The following information items must be present in the input infosets to enable correct processing:

  • Document Information Items with [children] and [base URI] properties.

  • Element Information Items with [namespace name], [local name], [children], [attributes], [base URI] and [parent] properties.

  • Attribute Information Items with [namespace name], [local name] and [normalized value] properties.

Additionally, XInclude processing may generate the following kinds of information items in the result:

  • Character Information Items with [character code], [element content whitespace] and [parent] properties.

XInclude also extends the infoset with the boolean property [included], which may appear on the following types of information items in the result:

  • Element Information Items.

  • Processing Instruction Information Items.

  • Comment Information Items.

  • Character Information Items.

Appendices

A References

Character Model
Martin J. Dürst, François Yergeau, Misha Wolf, Asmus Freytag, Tex Texin. Character Model for the World Wide Web 1.0. World Wide Web Consortium, 2001. (See http://www.w3.org/TR/charmod/.)
IETF RFC 2119
RFC 2119: Key words for use in RFCs to Indicate Requirement Levels. Internet Engineering Task Force, 1997. (See http://www.ietf.org/rfc/rfc2119.txt.)
IETF RFC 2279
RFC 2279: UTF-8, a transformation format of ISO 10646. Internet Engineering Task Force, 1998. (See http://www.ietf.org/rfc/rfc2279.txt.)
IETF RFC 2396
RFC 2396: Uniform Resource Identifiers (URI): Generic Syntax. Internet Engineering Task Force, 1998. (See http://www.ietf.org/rfc/rfc2396.txt.)
IETF RFC 2732
RFC 2732: Format for Literal IPv6 Addresses in URL's. Internet Engineering Task Force, 1999. (See http://www.ietf.org/rfc/rfc2732.txt.)
IETF RFC 3023
RFC 3023: XML Media Types. Internet Engineering Task Force, 2001. (See http://www.ietf.org/rfc/rfc3023.txt.)
Unicode
The Unicode Consortium. The Unicode Standard. (See http://www.unicode.org/unicode/standard/standard.html.)
XML 1.0
Tim Bray, Jean Paoli, C.M. Sperberg-McQueen, and Eve Maler, editors. Extensible Markup Language (XML) 1.0 (Second Edition). World Wide Web Consortium, 2000. (See http://www.w3.org/TR/REC-xml.)
XML Base
Jonathan Marsh, editor. XML Base. World Wide Web Consortium, 1999. (See http://www.w3.org/TR/xmlbase/.)
XML Information Set
John Cowan and David Megginson, editors. XML Information Set. World Wide Web Consortium, 1999. (See http://www.w3.org/TR/xml-infoset/.)
XML Names
Tim Bray, Dave Hollander, and Andrew Layman, editors. Namespaces in XML. World Wide Web Consortium, 1999. (See http://www.w3.org/TR/REC-xml-names/.)
XPointer
Steve DeRose, Ron Daniel, Eve Maler, editors. XML Pointer Language (XPointer). World Wide Web Consortium, 1999. (See http://www.w3.org/TR/xptr/.)

B References (Non-Normative)

Internationalized URIs
Internationalized URIs. Internet Engineering Task Force, 2000. Expired Internet-Draft. (See http://www.w3.org/International/2000/03/draft-masinter-url-i18n-05.txt.)
XML Inclusion Proposal
Jonathan Marsh, David Orchard, editors. XML Inclusion Proposal (XInclude). World Wide Web Consortium, 1999. (See http://www.w3.org/TR/1999/NOTE-xinclude-19991123.)
XML Linking Language
Steve DeRose, Eve Maler, David Orchard, and Ben Trafford, editors. XML Linking Language (XLink). World Wide Web Consortium, 2000. (See http://www.w3.org/TR/xlink/.)
XML Schemas
Henry S. Thompson, David Beech, Murray Maloney, Noah Mendelsohn, editors. XML Schema Part 1: Structures. World Wide Web Consortium, 2001. (See http://www.w3.org/TR/xmlschema-1/.)
XPath 1.0
James Clark, Steve DeRose, editors. XML Path Language (XPath) Version 1.0. World Wide Web Consortium, 1999. (See http://www.w3.org/TR/xpath.)

C Examples (Non-Normative)

C.1 Basic Inclusion Example

The following XML document contains an xi:include element which points to an external document. Assume the base URI of this document is http://www.example.org/document.xml.

<?xml version='1.0'?>
<document xmlns:xi="http://www.w3.org/2001/XInclude">
  <p>120 MHz is adequate for an average home user.</p>
  <xi:include href="disclaimer.xml"/>
</document>

disclaimer.xml contains:

<?xml version='1.0'?>
<disclaimer>
  <p>The opinions represented herein represent those of the individual
  and should not be interpreted as official policy endorsed by this
  organization.</p>
</disclaimer>

The infoset resulting from resolving inclusions on this document is the same as that of the following document:

<?xml version='1.0'?>
<document xmlns:xi="http://www.w3.org/2001/XInclude">
  <p>120 MHz is adequate for an average home user.</p>
  <disclaimer xml:base="http://www.example.org/disclaimer.xml">
  <p>The opinions represented herein represent those of the individual
  and should not be interpreted as official policy endorsed by this
  organization.</p>
</disclaimer>
</document>
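A comparable inclusion can be tried with the xml.etree.ElementInclude module of the Python standard library, which implements part of XInclude. The sketch below is illustrative only: the in-memory RESOURCES table and the load function are hypothetical stand-ins for real resource retrieval, the document content is abbreviated, and the sketch does not rely on the module to add the xml:base attribute shown in the result above.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical in-memory stand-in for the resources that would be fetched.
RESOURCES = {
    "disclaimer.xml": (
        "<disclaimer>\n"
        "  <p>The opinions represented herein represent those of the individual\n"
        "  and should not be interpreted as official policy endorsed by this\n"
        "  organization.</p>\n"
        "</disclaimer>"
    ),
}


def load(href, parse, encoding=None):
    """Loader callback: an Element for parse="xml", a string for parse="text"."""
    source = RESOURCES[href]
    return ET.fromstring(source) if parse == "xml" else source


document = ET.fromstring(
    '<document xmlns:xi="http://www.w3.org/2001/XInclude">'
    "<p>Opening paragraph.</p>"
    '<xi:include href="disclaimer.xml"/>'
    "</document>"
)

# Replaces each xi:include element in place with the loaded content.
ElementInclude.include(document, loader=load)
child_tags = [child.tag for child in document]
```

After processing, the xi:include element has been replaced by the disclaimer element, so the children of document are p followed by disclaimer.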

C.2 Textual Inclusion Example

The following XML document includes the contents of the text resource count.txt as character data.

<?xml version='1.0'?>
<document xmlns:xi="http://www.w3.org/2001/XInclude">
  <p>This document has been accessed
  <xi:include href="count.txt" parse="text"/> times.</p>
</document>

where count.txt contains:

324387

The infoset resulting from resolving inclusions on this document is the same as that of the following document:

<?xml version='1.0'?>
<document xmlns:xi="http://www.w3.org/2001/XInclude">
  <p>This document has been accessed
  324387 times.</p>
</document>

C.3 Textual Inclusion of XML Example

The following XML document includes the source of an XML resource, data.xml, as text rather than as parsed markup.

<?xml version='1.0'?>
<document xmlns:xi="http://www.w3.org/2001/XInclude">
  <p>The following is the source of the "data.xml" resource:</p>
  <example><xi:include href="data.xml" parse="text"/></example>
</document>

data.xml contains:

<?xml version='1.0'?>
<data>
  <item><![CDATA[Brooks & Shields]]></item>
</data>

The infoset resulting from resolving inclusions on this document is the same as that of the following document:

<?xml version='1.0'?>
<document xmlns:xi="http://www.w3.org/2001/XInclude">
  <p>The following is the source of the "data.xml" resource:</p>
  <example>&lt;?xml version='1.0'?&gt;
&lt;data&gt;
  &lt;item&gt;&lt;![CDATA[Brooks &amp; Shields]]&gt;&lt;/item&gt;
&lt;/data&gt;</example>
</document>
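The escaping seen in the result follows from the included resource becoming character data rather than markup. A small sketch with the Python standard library illustrates this, assuming an abbreviated including document; the DATA_XML constant and the load function are hypothetical stand-ins for retrieving data.xml.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical stand-in for the content of data.xml.
DATA_XML = (
    "<?xml version='1.0'?>\n"
    "<data>\n"
    "  <item><![CDATA[Brooks & Shields]]></item>\n"
    "</data>"
)


def load(href, parse, encoding=None):
    """Loader callback: returns the raw source text for parse="text"."""
    if href == "data.xml" and parse == "text":
        return DATA_XML
    return None


document = ET.fromstring(
    '<document xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<example><xi:include href="data.xml" parse="text"/></example>'
    "</document>"
)
ElementInclude.include(document, loader=load)

example = document.find("example")
# The included source is plain character data; no child elements are created.
included_text = example.text
# Markup characters are escaped only when the result is serialized.
serialized = ET.tostring(example, encoding="unicode")
```

The included text is identical to the source of data.xml, and the serializer writes its markup characters as &lt;, &gt; and &amp;.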

C.4 Range Inclusion Example

The following illustrates the results of including a range specified by an XPointer. Assume the base URI of the document is http://www.example.com/document.xml.

<?xml version='1.0'?>
<document>
  <p>The relevant excerpt is:</p>
  <quotation>
    <include xmlns="http://www.w3.org/2001/XInclude"
       href="source.xml#xpointer(string-range(chapter/p[1],'Sentence 2')/
             range-to(string-range(chapter/p[2]/i,'3.',1,2)))"/>
  </quotation>
</document>

source.xml contains:

<chapter>
  <p>Sentence 1.  Sentence 2.</p>
  <p><i>Sentence 3.  Sentence 4.</i>  Sentence 5.</p>
</chapter>

The infoset resulting from resolving inclusions on this document is the same as that of the following document:

<?xml version='1.0'?>
<document>
  <p>The relevant excerpt is:</p>
  <quotation>
    <p xml:base="http://www.example.com/source.xml">Sentence 2.</p>
  <p xml:base="http://www.example.com/source.xml"><i>Sentence 3.</i></p>
  </quotation>
</document>

C.5 Fallback Example

The following XML document relies on the fallback mechanism to succeed in the event that the resources example.txt and fallback-example.txt are not available.

<?xml version='1.0'?>
<div>
  <xi:include href="example.txt" parse="text"
              xmlns:xi="http://www.w3.org/2001/XInclude">
    <xi:fallback>
      <xi:include href="fallback-example.txt" parse="text">
        <xi:fallback><a href="mailto:bob@example.org">Report error</a></xi:fallback>
      </xi:include>
    </xi:fallback>
  </xi:include>
</div>

If neither example.txt nor fallback-example.txt is available, the result would be:

<?xml version='1.0'?>
<div>
  <a href="mailto:bob@example.org">Report error</a>
</div>
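The nested fallback behaviour can be sketched as a short recursive function. This is a simplified illustration, not a conforming processor: the function replacement_for and the dictionary of available resources are hypothetical, only parse="text" inclusions are handled, the namespace declaration is placed on the div element for convenience, and character data inside xi:fallback is ignored.

```python
import xml.etree.ElementTree as ET

XI = "{http://www.w3.org/2001/XInclude}"

DOCUMENT = """<div xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="example.txt" parse="text">
    <xi:fallback>
      <xi:include href="fallback-example.txt" parse="text">
        <xi:fallback><a href="mailto:bob@example.org">Report error</a></xi:fallback>
      </xi:include>
    </xi:fallback>
  </xi:include>
</div>"""


def replacement_for(include, available):
    """Return the items that replace one xi:include element.

    available maps href values to text content (a hypothetical stand-in
    for resource retrieval). The result is a list of strings and elements.
    """
    href = include.get("href")
    if href in available:
        return [available[href]]

    fallback = include.find(XI + "fallback")
    if fallback is None:
        raise LookupError(f"cannot include {href!r} and no fallback is present")

    items = []
    for child in fallback:
        if child.tag == XI + "include":
            # Inclusions inside a fallback are processed in turn.
            items.extend(replacement_for(child, available))
        else:
            items.append(child)
    return items


outer = ET.fromstring(DOCUMENT).find(XI + "include")
nothing_available = replacement_for(outer, {})
second_available = replacement_for(outer, {"fallback-example.txt": "fallback text"})
first_available = replacement_for(outer, {"example.txt": "primary text"})
```

When no resource is available the replacement is the a element from the innermost fallback; when only fallback-example.txt is available its text is used; when example.txt is available the fallbacks are never consulted.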