Copyright ©1998, 1999 W3C (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
The XML Fragment Working Group, with this 1999 March 3 first working draft, invites comment on our specification for XML Fragment Interchange. For background on this work, please see the XML Activity Statement.
The W3C Membership and other interested parties are invited to review the specification and report implementation experience. Please send comments to www-xml-fragment-comments@w3.org (archive). Comments received by 1999 March 26 will be considered for a revision soon after. While we welcome implementation experience reports, the XML Fragment Working Group will not allow early implementation to constrain its ability to make changes to this specification prior to final release.
In the current document, open issues are so titled and appear in orange, and records of major WG decisions are so marked and appear in red. Before this specification is submitted as a Proposed Recommendation, all open issues will be resolved and all decision records will be moved to another document so that no occurrences of either open issues or decision record notes will appear in the final Recommendation.
This is a W3C Working Draft for review by W3C members and other interested parties. It is a draft document and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress". A list of current W3C working drafts can be found at http://www.w3.org/TR .
The XML standard supports logical documents composed of possibly several entities. It may be desirable to view or edit one or more of the entities or parts of entities while having no interest, need, or ability to view or edit the entire document. The problem, then, is how to provide to a recipient of such a fragment the appropriate information about the context that fragment had in the larger document that is not available to the recipient. The XML Fragment WG is chartered with defining a way to send fragments of an XML document--regardless of whether the fragments are predetermined entities or not--without having to send all of the containing document up to the part in question. This document defines Version 1.0 of the [eventual] W3C Recommendation that addresses this issue.
In the case of many XML documents, it is suboptimal to have to receive and parse the entire document when only a fragment of it is desired. If the user asked to look at chapter 20, one shouldn't need to parse 19 whole chapters before getting to the part of interest. The goal of this activity is to define a way to enable processing of small parts of an XML document without having to process everything up to the part in question. This can be done regardless of whether the parts are entities or not, and the parts can either be viewed immediately or accumulated for later use, assembly, or other processing.
Conceptually, the holder of the complete source document considers a fragment of that document and, using the notation to be defined by this activity, constructs a fragment context specification. The object representing the fragment removed from its source document is called the fragment body. The fragment context specification and the fragment body are transmitted to the recipient. The storage object in which the fragment body is transmitted is called the fragment entity. (In some packaging schemes, the fragment context specification may also be embedded in the fragment entity.) The recipient processes the fragment context specification to determine the proper parser state for the context at the beginning of the fragment and uses that information to enable the XML parser to parse the fragment body. (The terms "sender," "recipient," and "transmit" are used throughout this document to describe the process of fragment interchange. It should be noted, however, that there are many feasible and useful scenarios for fragment interchange, and in some cases, the "sender" and "recipient" may be on the same machine, node, system, or network, and may even be the same tool in different guises.)
The challenge is that an isolated element from an XML document may not contain quite enough information to be parsed correctly. The goal of this activity is to enable senders to provide the remaining information required so that systems can interchange any XML elements they choose, from books or chapters all the way down to paragraphs, tables, footnotes, book titles, and so on, without having to manage each as a separate entity or having to risk incorrect parsing due to loss of context.
To accomplish these ends, this Recommendation defines:
This Recommendation enables interchanging portions of XML documents while retaining the ability to parse them correctly (that is, as they would be parsed in their originating document context), and, as far as practical, to be formatted, edited, and otherwise processed in useful ways.
The goal of this activity is to define a way to send fragments of an XML document--regardless of whether the fragments are predetermined entities or not--without having to send all of the containing document up to the part in question. The delivered parts can either be viewed or edited immediately or accumulated for later use, assembly, or other processing; what the receiving application does with the information--and issues involved with the possible "return" of such a fragment to the original sender--is beyond the scope of this activity. While implementations of this Recommendation may serve as part of a larger system that allows for "fragment reuse," the many important issues about reuse of XML text and "concurrent multiple author environments" are beyond the scope of this Recommendation.
The point of the fragment context information is to provide information that is not available in the fragment body itself but that would be available from the complete XML document. Specifically, any information not available from the XML document (which may include an external subset) as a whole (plus knowledge of the location of the fragment body within the document) is out of scope for inclusion in the fragment context information. Such information may well be useful and important metadata in a variety of applications, but there are (or need to be) other mechanisms for handling this information.
This Recommendation considers fragments of XML as defined by XML 1.0 and XML Namespaces . It is explicitly noted that this version of this Recommendation does not take into account work such as that taking place in the XML Schema Working Group; insofar as such work by other currently active working groups places new requirements on a fragment interchange solution, those requirements would be input to a new version of the fragment interchange specification that may become a chartered activity at a later date.
It is also explicitly noted that this Recommendation does not consider interchange of information that is not well-formed XML; in particular, issues specific to the interchange of fragments of SGML (including HTML)--other than such SGML that is, in fact, also well-formed XML--are not within scope of this Recommendation.
This list is sorted "logically" as opposed to alphabetically. In an entry, phrases in parentheses are "optional" modifiers; whether they are used explicitly or not, we are still talking about the same thing for the purposes of this Recommendation.
In this section, numbers in brackets refer to productions in XML 1.0. The following information shall constitute the complete fragment context information (fci) set:
From the above list, the following items affect proper (validating) parsing of the fragment:
The following items, while they cannot affect proper parsing, are usually considered part of the basic, structural XML parse tree:
The following items, while not usually considered part of the basic, structural XML parse tree, are clearly definable pieces of information known or computable by any XML processor that is processing the parent document:
WG consensus decision: (1998/12/09): The WG decided that information such as copyright information and a pointer to the parent document's stylesheet was general metadata that shouldn't be included within the FCI set.
WG consensus decision: (1998/12/09): The WG decided not to allow for a "commenting" feature within the FCS as it was felt this was too subject to potential misuse.
WG consensus decision: (1999/01/06): The WG decided not to allow for an extension mechanism within the FCS since our packaging mechanism can be extended to allow the inclusion of other metadata, but the specification of that is outside the scope of this Recommendation.
WG consensus decision: (1999/01/06): Especially because the XML 1.0 syntax for declarations is difficult to embed within an XML instance, the WG decided not to allow for inline inclusion of internal subset information within the FCS; internal subset information can only be included in the FCS via a reference to an "externalized copy" of the internal subset. Inline internal subset information may be more feasible after the XML Schema WG defines instance syntax for declarations, but this would not make it into version 1.0 of this Fragment Interchange Recommendation.
The previous section defined the logical set of information possible in a fragment context. This section describes the notation in which to express a specific fragment context specification. All information would be optional; how much gets included in any particular fragment context specification is up to the sender and recipient, and how this gets determined is outside of the scope of this Recommendation.
Note
A given fragment context specification need not necessarily provide the ability to specify the complete set of fragment context information described in the previous section. In particular, because the XML 1.0 syntax for declarations is difficult to embed within an XML instance, the specific fragment context specification notation defined by this Recommendation does not allow for inline inclusion of internal subset information within the FCS. Internal subset information can only be included in the FCS via a reference to an "externalized copy" of the internal subset. Inline internal subset information may be more feasible once an instance syntax for declarations is defined, and such may be considered in future versions of the Fragment Interchange specification.
The syntax used is XML itself. In particular, a fragment context specification (fcs) is written as a single root XML element allowing up to four attributes and containing a subtree of other elements (possibly with attributes). The root element (and the element serving as the placeholder for the fragment body) comes from the Fragment Interchange namespace, a specific namespace defined by this Recommendation; the contained subtree of elements comes from the namespace of the document from which this fragment comes. When an fcs is packaged into a fragment package (see the following section on packaging), the appropriate namespace declarations must be present. For the purposes of exposition in this section, we assume namespace declarations such as the following are in force:
xmlns:f="http://www.w3.org/XML/Fragment/1.0" xmlns="http://www.oasis-open.org/docbook/DocbookSchema"
That is, within this example, f is the local prefix referring to the Fragment Interchange namespace defined by this Recommendation for fragment-interchange related components, and the default namespace is one pointing to the parent document from which the fragment comes.
The element type for the single root element for the fcs shall be f:fcs (where f is whatever namespace prefix is mapped to the Fragment Interchange namespace). It allows up to four attributes, each of whose value shall be a URI (possibly with an XPointer [XPointer WD] part). The attribute names and the meaning of their values are as follows:
The content of the f:fcs element shall be a subtree of elements (possibly with attribute value assignments) from the parent document's namespace. This subtree shall provide all the structural context for the fragment body including various information about ancestor and sibling elements and attributes by mimicking the (relevant) context within this parent document; the special empty element f:fragbody shall be used to indicate the placement of the fragment body within the specified context. No data characters (mixed content) are allowed within the f:fcs element.
For example, consider a fragment body that consists of listitems 2 and 3 of an orderedlist (indicated to be enumerated with arabic numbers by the numeration attribute on the orderedlist element) within the second sect1 within the first chapter within the first part within the body of a book. Assume that the external subset (aka "DTD") is in the file Docbook.dtd on the OASIS Open web server, the parent document is in mybook.xml on Acme's web server, and that there need be no internal subset given as part of the fcs. Then the fcs for this fragment body might look like:
<f:fcs extref="http://www.oasis-open.org/docbook/docbook/3.0/docbook.dtd"
       parentref="http://www.acme.com/~me/mydocs/mybook.xml">
  <book>
    <part>
      <chapter>
        <sect1/>
        <sect1>
          <orderedlist numeration="arabic">
            <listitem/>
            <f:fragbody/>
          </orderedlist>
        </sect1>
      </chapter>
    </part>
  </book>
</f:fcs>
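The construction of an fcs such as the one above can be sketched programmatically. The following is a minimal, non-normative sketch in Python using the standard xml.etree.ElementTree module. The helper name build_fcs and the layout of its arguments are assumptions made purely for illustration; this Recommendation does not define any programming interface.

```python
import xml.etree.ElementTree as ET

F_NS = "http://www.w3.org/XML/Fragment/1.0"
DOC_NS = "http://www.oasis-open.org/docbook/DocbookSchema"

# Serialize with the same prefixes as the example: f for Fragment
# Interchange, and the parent document's namespace as the default.
ET.register_namespace("f", F_NS)
ET.register_namespace("", DOC_NS)


def build_fcs(steps, body_preceding, refs):
    """Hypothetical helper: build an f:fcs element.

    steps          -- (name, attributes, preceding sibling names) for each
                      ancestor of the fragment body, outermost first
    body_preceding -- names of the siblings that precede the fragment body
    refs           -- reference attributes for the f:fcs start-tag
    """
    fcs = ET.Element(f"{{{F_NS}}}fcs", refs)
    parent = fcs
    for name, attributes, preceding in steps:
        # Preceding siblings are recorded as empty placeholder elements.
        for sibling in preceding:
            ET.SubElement(parent, f"{{{DOC_NS}}}{sibling}")
        parent = ET.SubElement(parent, f"{{{DOC_NS}}}{name}", attributes)
    for sibling in body_preceding:
        ET.SubElement(parent, f"{{{DOC_NS}}}{sibling}")
    # The empty f:fragbody element marks where the fragment body belongs.
    ET.SubElement(parent, f"{{{F_NS}}}fragbody")
    return fcs


fcs = build_fcs(
    steps=[
        ("book", {}, []),
        ("part", {}, []),
        ("chapter", {}, []),
        ("sect1", {}, ["sect1"]),  # the second sect1 of the chapter
        ("orderedlist", {"numeration": "arabic"}, []),
    ],
    body_preceding=["listitem"],  # the body starts at the second listitem
    refs={
        "extref": "http://www.oasis-open.org/docbook/docbook/3.0/docbook.dtd",
        "parentref": "http://www.acme.com/~me/mydocs/mybook.xml",
    },
)
xml_text = ET.tostring(fcs, encoding="unicode")
print(xml_text)
```

The sketch records only element names and attributes for the context, which matches the rule that no data characters appear within the f:fcs element.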
A formal notation for the fcs element used in the examples of the previous section follows. Therein, the following terms are defined in either the "Extensible Markup Language (XML) 1.0" (XML 1.0) or "Namespaces in XML" (XML Namespaces) Recommendations: NCName, AttValue, Eq, S, Attribute, STag, ETag, EmptyElemTag, CharData, Reference, CDSect, PI, and Comment.
[2] FCSstag ::= '<' NCName ':fcs' ((S 'extref' Eq AttValue) | (S 'intref' Eq AttValue) | (S 'parentref' Eq AttValue) | (S 'fragbodyref' Eq AttValue) | (S Attribute))* S? '>'
[3] FCSelement ::= EmptyElemTag | STag FCScontent ETag | FCSfragbody
[4] FCSfragbody ::= '<' NCName ':fragbody' (S Attribute)* S? '/>'
[5] FCSetag ::= '</' NCName ':fcs' S? '>'
[6] FCScontent ::= (FCSelement | CharData | Reference | CDSect | PI | Comment)*
fragbody (FCSfragbody) element in the fcs.
The Fragment Interchange namespace shall be associated with the following URI: http://www.w3.org/XML/Fragment/1.0.
Open Issue
The exact URLs for the various namespaces defined by this W3C specification are still an open issue. This issue has been raised to the XML Coordination Group (issue 1999-0201-07 Standardizing W3C namespace URIs) for general coordination and resolution.
In the production for FCSstag, there can be any number of other attribute assignments, all of which are ignored by the fragment context specification processor. Per XML 1.0 compliance, there can be at most one assignment to any given attribute including the specifically mentioned attributes. (Since there is no "and" connector in EBNF, this restriction is difficult to show directly in the EBNF, hence this restriction in prose; however, this prose restriction is normative.)
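The handling of attributes on the start-tag of the fcs element can be illustrated with a short, non-normative sketch in Python using the standard xml.etree.ElementTree module. The helper names read_refs and has_duplicate_attribute_error, and the reviewer attribute in the sample, are hypothetical and exist only for this illustration. The four attribute names are those given in the production for the fcs start-tag.

```python
import xml.etree.ElementTree as ET

F_NS = "http://www.w3.org/XML/Fragment/1.0"
# The four attribute names that appear in the production for the start-tag.
REF_ATTRIBUTES = ("extref", "intref", "parentref", "fragbodyref")


def read_refs(fcs):
    """Return the recognized reference attributes; ignore all others."""
    if fcs.tag != f"{{{F_NS}}}fcs":
        raise ValueError("not an fcs element in the Fragment Interchange namespace")
    return {name: fcs.attrib[name] for name in REF_ATTRIBUTES if name in fcs.attrib}


def has_duplicate_attribute_error(text):
    """Report whether the XML parser rejects the text as not well-formed."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return True
    return False


SAMPLE = (
    '<f:fcs xmlns:f="http://www.w3.org/XML/Fragment/1.0" '
    'parentref="http://www.acme.com/~me/mydocs/mybook.xml" '
    'reviewer="hypothetical extra attribute"><f:fragbody/></f:fcs>'
)
refs = read_refs(ET.fromstring(SAMPLE))
print(refs)
```

Because XML 1.0 itself forbids two assignments to the same attribute in one tag, the "at most one assignment" restriction is enforced by the XML parser before the fcs is ever examined; the sketch relies on that rather than checking it separately.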
In the production for FCScontent, the fragment processor can optionally expand any References that it can expand. Then all CDSects, PIs, Comments, remaining References, and CharData (including whitespace, S ) are ignored by the FCS processor.
Note
If a Reference in FCScontent is expanded and the expansion includes element structure, that element structure is considered part of the fcs as it would if it had been included originally in its expanded form in the fcs. However, since expansion of any Reference in FCScontent is optional on the part of the fragment context specification processor, any sender for which such expansion is important should do the expansion when creating the fragment package.
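As a non-normative illustration of this processing model, the following Python sketch walks only the element structure of an fcs and reports the ancestors and preceding siblings of the fragment body placeholder. The function name find_context and the shape of its return value are assumptions made for this example. The sketch relies on the default behavior of xml.etree.ElementTree, which discards comments and processing instructions while parsing, and it simply never looks at character data. It does not attempt the optional expansion of References.

```python
import xml.etree.ElementTree as ET

F_NS = "http://www.w3.org/XML/Fragment/1.0"
FRAGBODY = f"{{{F_NS}}}fragbody"


def find_context(elem, trail=()):
    """Locate the fragbody placeholder below elem.

    Returns (ancestors, preceding), where ancestors is a list of
    (tag, attributes) pairs from the outermost context element down to the
    parent of the fragment body, and preceding lists the tags of the
    siblings that come before the fragment body. Returns None when there
    is no placeholder below elem.
    """
    preceding = []
    for child in elem:  # iterates over child elements only, never text
        if child.tag == FRAGBODY:
            return list(trail), preceding
        found = find_context(child, trail + ((child.tag, dict(child.attrib)),))
        if found is not None:
            return found
        preceding.append(child.tag)
    return None


SAMPLE = """<f:fcs xmlns:f="http://www.w3.org/XML/Fragment/1.0">
  <!-- a comment, ignored -->
  <book>
    <part>
      <?hypothetical-pi ignored?>
      <chapter type="intro"/>
      <chapter/>
      <f:fragbody/>
    </part>
  </book>
</f:fcs>"""

ancestors, preceding = find_context(ET.fromstring(SAMPLE))
print(ancestors)
print(preceding)
```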
The fcs is packaged along with the fragment body by combining them into a single well-formed XML document. For the purposes of fragment interchange packaging, this Recommendation defines a simple "document type" consisting of a "head" part containing the fcs (and, potentially, other) metadata followed by a "body" part containing the fragment body itself. (The XML Fragment WG notes that such a packaging mechanism could be extended to work with complete documents as well as fragment bodies and to include other metadata such as pointers to stylesheets, copyright information, and so on. This Recommendation defines the minimum necessary to support fragment interchange in the hopes that any other Recommendation that addresses packaging of XML documents would be able to be upward compatible with what has been done here.)
In the following example, p is the local prefix referring to the namespace defined (currently, by this Recommendation, but perhaps later by another "packaging" specification) for the packaging structure, and f, as in the previous section, is the local prefix referring to the namespace defined by this Recommendation for fragment-interchange related components.
The format of a complete fragment package is outlined as follows:
<p:package xmlns:p="http://www.w3.org/XML/Package/1.0"
           xmlns:f="http://www.w3.org/XML/Fragment/1.0"
           xmlns="{the namespace of the parent document}">
  <f:fcs {the ref attributes on the fcs tag}>
    {the content of the fcs with no namespace prefixes necessary except that on the <f:fragbody/> element}
  </f:fcs>
  <p:body>
    {the fragment body with no namespace prefixes necessary}
  </p:body>
</p:package>
For example, the complete fragment package for the two listitems for the Docbook book mentioned in the previous section might look like:
<p:package xmlns:p="http://www.w3.org/XML/Package/1.0"
           xmlns:f="http://www.w3.org/XML/Fragment/1.0"
           xmlns="http://www.oasis-open.org/docbook/DocbookSchema">

  <f:fcs extref="http://www.oasis-open.org/docbook/docbook/3.0/docbook.dtd"
         parentref="http://www.acme.com/~me/mydocs/mybook.xml">
    <book>
      <part>
        <chapter>
          <sect1/>
          <sect1>
            <orderedlist numeration="arabic">
              <listitem/>
              <f:fragbody/>
            </orderedlist>
          </sect1>
        </chapter>
      </part>
    </book>
  </f:fcs>

  <p:body>
    <listitem><para>This is the second listitem within the second sect1 of the first chapter within the first part of a Docbook <quote>book</quote> document.</para></listitem>
    <listitem><para>And this is the next listitem.</para></listitem>
  </p:body>

</p:package>
Note
The above example includes indentation and blank lines to help display the overall structure of the package. However, all whitespace within the p:body element is significant and is therefore part of the fragment body. Therefore, the packaging process can introduce no whitespace (including record ends immediately following <p:body> and immediately preceding </p:body>) within the p:body element.
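The whitespace rule in the preceding note can be made concrete with a small, non-normative Python sketch. The helper name make_package and the placeholder fragment text are assumptions for illustration only. The sketch assembles a package by string concatenation so that nothing is inserted between the p:body tags and the fragment body, and then checks with xml.etree.ElementTree that no stray character data surrounds the body's elements.

```python
import xml.etree.ElementTree as ET

P_NS = "http://www.w3.org/XML/Package/1.0"
F_NS = "http://www.w3.org/XML/Fragment/1.0"


def make_package(fcs_xml, body_xml, doc_ns):
    """Wrap an fcs and a fragment body in a package.

    Line breaks are added only outside p:body; the body text is placed
    directly between <p:body> and </p:body> with nothing added.
    """
    return (
        f'<p:package xmlns:p="{P_NS}" xmlns:f="{F_NS}" xmlns="{doc_ns}">\n'
        f"{fcs_xml}\n"
        f"<p:body>{body_xml}</p:body>\n"
        "</p:package>"
    )


# Placeholder content, abbreviated for the example.
FCS = "<f:fcs><orderedlist><listitem/><f:fragbody/></orderedlist></f:fcs>"
BODY = (
    "<listitem><para>Second.</para></listitem>"
    "<listitem><para>Third.</para></listitem>"
)

package = make_package(FCS, BODY, "http://www.oasis-open.org/docbook/DocbookSchema")
body = ET.fromstring(package).find(f"{{{P_NS}}}body")

# No character data before the first element or after the last one.
print(repr(body.text), repr(body[-1].tail))
```

A pretty-printing serializer would violate the rule by adding record ends and indentation inside p:body, which is why the sketch avoids one.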
A fragment conforms to this XML Fragment Interchange Recommendation if it adheres to all syntactic requirements defined in this Recommendation.
Application software conforms to the XML Fragment Interchange Recommendation if it interprets all conforming XML fragments (as defined above) according to all required semantics prescribed by this Recommendation, and, for any optional semantics it chooses to support, supports them in the way prescribed.
World Wide Web Consortium. Extensible Markup Language (XML) 1.0. W3C Recommendation. See http://www.w3.org/TR/REC-xml
World Wide Web Consortium. Namespaces in XML. W3C Proposed Recommendation. See http://www.w3.org/TR/PR-xml-names
World Wide Web Consortium. XML Pointer Language (XPointer). W3C Working Draft. See http://www.w3.org/TR/WD-xptr
World Wide Web Consortium. Associating stylesheets with XML documents. W3C Working Draft. See http://www.w3.org/TR/WD-xml-stylesheet
OASIS (formerly SGML Open) Fragment Interchange -- SGML Open Technical Resolution 9601:1996. OASIS (SGML Open) Technical Resolution. See http://www.oasis-open.org/html/techpubs.htm#fragment for a non-normative HTML version (that, as of 1998 Dec 02, had a few glitches)
World Wide Web Consortium. XML Fragment Interchange Requirements. W3C Note. See http://www.w3.org/TR/NOTE-XML-FRAG-REQ
IETF RFC 2045: Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies. See http://www.imc.org/rfc2045
The following examples are designed in general to address the potential reference scenarios described in the XML Fragment Interchange Requirements document.
The user has an XML document that represents a customer's set of purchases at a bookstore, and the part of that document that represents the purchase of a particular book needs to be represented as a fragment.
Here is the original XML document for the transaction:
<?xml version="1.0"?>
<transaction TID="19990207-1234">
  <purchase>
    <book>
      <Author>Frank Herbert</Author>
      <Title>Dune</Title>
      <Edition>Hardcover Reissue edition (April 1984)</Edition>
      <ISBN>0399128964</ISBN>
      <Price currency="USD">18.87</Price>
      <Quantity>1</Quantity>
    </book>
    <book>
      <Author>J. R. R. Tolkien</Author>
      <Title>The Book of Lost Tales (The History of Middle-Earth)</Title>
      <Edition>Mass Market Paperback Reprint edition (June 1992)</Edition>
      <ISBN>0345375211</ISBN>
      <Price currency="USD">4.79</Price>
      <Quantity>1</Quantity>
    </book>
  </purchase>
  <refund RID="19990115-2">
    <reason TID="19981220-3214">Late delivery</reason>
    <value currency="USD">5.00</value>
  </refund>
  <payment>
    <client CID="123421"/>
    <value currency="USD">18.66</value>
    <creditcard type="MasterCard">
      <bank>BankBoston</bank>
      <owner>Joe J. Bill</owner>
      <serial>1234567890</serial>
      <expires>5/99</expires>
    </creditcard>
    <status>Waiting for approval</status>
  </payment>
</transaction>
Here is a fragment representing the second book element from the above document (the fragbodyref attribute on the f:fcs element is optional and is shown merely as an example):
<?xml version="1.0"?>
<p:package xmlns:p="http://www.w3.org/XML/Package/1.0"
           xmlns:f="http://www.w3.org/XML/Fragment/1.0"
           xmlns="">
  <f:fcs fragbodyref="http://sales.acme.com/trans/19990207-1234#root().child(1,purchase).child(2,book)">
    <transaction>
      <purchase>
        <book/>
        <f:fragbody/>
      </purchase>
    </transaction>
  </f:fcs>
  <p:body>
    <book>
      <Author>J. R. R. Tolkien</Author>
      <Title>The Book of Lost Tales (The History of Middle-Earth)</Title>
      <Edition>Mass Market Paperback Reprint edition (June 1992)</Edition>
      <ISBN>0345375211</ISBN>
      <Price currency="USD">4.79</Price>
      <Quantity>1</Quantity>
    </book>
  </p:body>
</p:package>
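One way a recipient might use such a package is to graft the fragment body into the context skeleton in place of the placeholder, giving a small document in which the body sits under its original ancestors. The following Python sketch is one possible approach under stated assumptions, not a processing model prescribed by this Recommendation. The function name graft is hypothetical, the sample package is abbreviated, and any character data directly inside p:body (as opposed to inside its child elements) is not carried over.

```python
import xml.etree.ElementTree as ET

P_NS = "http://www.w3.org/XML/Package/1.0"
F_NS = "http://www.w3.org/XML/Fragment/1.0"
FRAGBODY = f"{{{F_NS}}}fragbody"


def graft(package):
    """Replace the fragbody placeholder with the elements of the body.

    Returns the outermost context element with the body grafted in.
    """
    fcs = package.find(f"{{{F_NS}}}fcs")
    body = package.find(f"{{{P_NS}}}body")
    for parent in fcs.iter():
        for index, child in enumerate(list(parent)):
            if child.tag == FRAGBODY:
                parent.remove(child)
                # Insert the body's elements where the placeholder was.
                for offset, node in enumerate(list(body)):
                    parent.insert(index + offset, node)
                return fcs[0]
    raise ValueError("no fragbody placeholder found")


# Abbreviated sample in the shape of the transaction example.
PACKAGE = """<p:package xmlns:p="http://www.w3.org/XML/Package/1.0"
    xmlns:f="http://www.w3.org/XML/Fragment/1.0">
  <f:fcs>
    <transaction><purchase><book/><f:fragbody/></purchase></transaction>
  </f:fcs>
  <p:body><book><Title>The Book of Lost Tales</Title></book></p:body>
</p:package>"""

root = graft(ET.fromstring(PACKAGE))
purchase = root.find("purchase")
print(root.tag, [child.tag for child in purchase])
```

The empty book element that precedes the grafted body is only a positional placeholder from the fcs; it tells the recipient that the fragment was the second book, not what the first one contained.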
A user has an XML document that includes several external entities, and she wants to be able to interchange a fragment that includes a reference to the entities using MIME packaging.
Here is the original document:
<?xml version="1.0"?>
<!DOCTYPE book SYSTEM "http://www.oasis-open.org/docbook/docbook/3.0/docbook.dtd" [
  <!ENTITY title "My Book">
  <!ENTITY author "me">
  <!ENTITY try SYSTEM "try.cgm" NDATA CGM-BINARY>
]>
<book>
  <part>
    <title>&title;</title>
    <introduction>This is my book ...</introduction>
    <author>&author;</author>
    <chapter type="intro">
      <sect1>The introduction ...</sect1>
    </chapter>
    <chapter>...</chapter>
    <chapter>
      <p>This is a paragraph within the third chapter within the first part of a Docbook <quote>book</quote> document.</p>
      <p>And this is a succeeding paragraph.</p>
      <p>And an internal text entity reference &author;.</p>
      <p>And a reference to an unparsed entity (a CGM graphic): <graphic entityref="try"></graphic></p>
    </chapter>
    <chapter>...</chapter>
  </part>
</book>
Note that the DocBook DTD includes the following (which is therefore not included in the internal subset of this document):
<!NOTATION CGM-BINARY PUBLIC "ISO 8632/3//NOTATION Binary Encoding//EN">
Here is a fragment that represents the contents of the third chapter:
<?xml version="1.0"?>
<p:package xmlns:p="http://www.w3.org/XML/Package/1.0"
           xmlns:f="http://www.w3.org/XML/Fragment/1.0"
           xmlns="http://www.oasis-open.org/docbook/docbook/3.0/docbook.dtd">
  <f:fcs extref="http://www.oasis-open.org/docbook/docbook/3.0/docbook.dtd"
         intref="mybook.decls">
    <book>
      <part>
        <chapter type="intro"/>
        <chapter/>
        <chapter>
          <f:fragbody/>
        </chapter>
      </part>
    </book>
  </f:fcs>
  <p:body>
    <p>This is a paragraph within the third chapter within the first part of a Docbook <quote>book</quote> document.</p>
    <p>And this is a succeeding paragraph.</p>
    <p>And an internal text entity reference &author;.</p>
    <p>And a reference to an unparsed entity (a CGM graphic): <graphic entityref="try"></graphic></p>
  </p:body>
</p:package>
Here is the associated internal subset:
<!ENTITY title "My Book">
<!ENTITY author "me">
<!ENTITY try SYSTEM "try.cgm" NDATA CGM-BINARY>
Here is the external entity (represented in Base 64 encoding, since this is really a binary entity):
ACEAABAiAAEQXwBEQyJTb3VyY2U6IEhTSSAvV01GLXRvLUNHTSBmaWx0ZXIg LyBWZXJzaW9uIDEuMzUgIiAiRGF0ZTogMTk5OS0wMS0xNyIRZgAB//8AARBi AAAQpgAAAAkAFxFGAAAA////EYQwIgAQEYogyAAAAAB//3//AAARvwC3C1RJ TUVTX1JPTUFODFRJTUVTX0lUQUxJQwpUSU1FU19CT0xEEVRJTUVTX0JPTERf SVRBTElDCUhFTFZFVElDQRFIRUxWRVRJQ0FfT0JMSVFVRQ5IRUxWRVRJQ0Ff Qk9MRBZIRUxWRVRJQ0FfQk9MRF9PQkxJUVVFB0NPVVJJRVIOQ09VUklFUl9J VEFMSUMMQ09VUklFUl9CT0xEE0NPVVJJRVJfQk9MRF9JVEFMSUMGU1lNQk9M ABHOAAABQgABAUEABAMqLToR4gABAGEAACAmAAE9NJ9IIEIAASBiAAAgggAA IKIAACDI95D0wAhqCzoAAACAQWj5cAa5/TEJikGGAogCUQGQUGIACEAo+dD/ +v7g+TpRYgACUkwAAQAEAAAAAAAAAABRgBxUggAAABkAGQAAFKCAAJAkAEg/ MoAAQlTb21lIFRleHQAoABA
And here is an example of MIME packaging used to transmit the fragment package and the internal subset within a single stream such as a mail message:
Content-Type: multipart/mixed; boundary="/04w6evG8XlLl3ft"

--/04w6evG8XlLl3ft
Content-Type: text/xml; charset=us-ascii
Content-Disposition: attachment; filename="mybook.decls"

<!ENTITY title "My Book">
<!ENTITY author "me">
<!ENTITY try SYSTEM "try.cgm" NDATA CGM-BINARY>

--/04w6evG8XlLl3ft
Content-Type: image/cgm
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="try.cgm"

ACEAABAiAAEQXwBEQyJTb3VyY2U6IEhTSSAvV01GLXRvLUNHTSBmaWx0ZXIg
LyBWZXJzaW9uIDEuMzUgIiAiRGF0ZTogMTk5OS0wMS0xNyIRZgAB//8AARBi
AAAQpgAAAAkAFxFGAAAA////EYQwIgAQEYogyAAAAAB//3//AAARvwC3C1RJ
TUVTX1JPTUFODFRJTUVTX0lUQUxJQwpUSU1FU19CT0xEEVRJTUVTX0JPTERf
SVRBTElDCUhFTFZFVElDQRFIRUxWRVRJQ0FfT0JMSVFVRQ5IRUxWRVRJQ0Ff
Qk9MRBZIRUxWRVRJQ0FfQk9MRF9PQkxJUVVFB0NPVVJJRVIOQ09VUklFUl9J
VEFMSUMMQ09VUklFUl9CT0xEE0NPVVJJRVJfQk9MRF9JVEFMSUMGU1lNQk9M
ABHOAAABQgABAUEABAMqLToR4gABAGEAACAmAAE9NJ9IIEIAASBiAAAgggAA
IKIAACDI95D0wAhqCzoAAACAQWj5cAa5/TEJikGGAogCUQGQUGIACEAo+dD/
+v7g+TpRYgACUkwAAQAEAAAAAAAAAABRgBxUggAAABkAGQAAFKCAAJAkAEg/
MoAAQlTb21lIFRleHQAoABA

--/04w6evG8XlLl3ft
Content-Type: text/xml; charset=us-ascii

<?xml version="1.0"?>
<p:package xmlns:p="http://www.w3.org/XML/Package/1.0"
           xmlns:f="http://www.w3.org/XML/Fragment/1.0"
           xmlns="http://www.oasis-open.org/docbook/docbook/3.0/docbook.dtd">
  <f:fcs extref="http://www.oasis-open.org/docbook/docbook/3.0/docbook.dtd"
         intref="mybook.decls">
    <book>
      <part>
        <chapter type="intro"/>
        <chapter/>
        <chapter>
          <f:fragbody/>
        </chapter>
      </part>
    </book>
  </f:fcs>
  <p:body>
    <p>This is a paragraph within the third chapter within the first part of a Docbook <quote>book</quote> document.</p>
    <p>And this is a succeeding paragraph.</p>
    <p>And an internal text entity reference &author;.</p>
    <p>And a reference to an unparsed entity (a CGM graphic): <graphic entityref="try"></graphic></p>
  </p:body>
</p:package>

--/04w6evG8XlLl3ft--
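On the receiving side, a multipart message of this shape can be taken apart with any MIME library. The following non-normative Python sketch uses the standard email package. The function name split_parts, the key "package" used for the part that carries no filename, and the shortened payloads are all assumptions made for this illustration.

```python
from email import message_from_string


def split_parts(raw):
    """Return a dict mapping each part's filename to its decoded bytes.

    A part without a filename is stored under the hypothetical key
    "package", on the assumption that it is the fragment package.
    """
    message = message_from_string(raw)
    parts = {}
    for part in message.walk():
        if part.is_multipart():
            continue  # skip the multipart container itself
        name = part.get_filename() or "package"
        # decode=True undoes the Content-Transfer-Encoding (e.g. base64).
        parts[name] = part.get_payload(decode=True)
    return parts


# A shortened message with the same structure as the example.
RAW = "\n".join([
    "MIME-Version: 1.0",
    'Content-Type: multipart/mixed; boundary="/04w6evG8XlLl3ft"',
    "",
    "--/04w6evG8XlLl3ft",
    "Content-Type: text/xml; charset=us-ascii",
    'Content-Disposition: attachment; filename="mybook.decls"',
    "",
    '<!ENTITY author "me">',
    "--/04w6evG8XlLl3ft",
    "Content-Type: image/cgm",
    "Content-Transfer-Encoding: base64",
    'Content-Disposition: attachment; filename="try.cgm"',
    "",
    "ACEAABAi",
    "--/04w6evG8XlLl3ft",
    "Content-Type: text/xml; charset=us-ascii",
    "",
    '<p:package xmlns:p="http://www.w3.org/XML/Package/1.0"/>',
    "--/04w6evG8XlLl3ft--",
    "",
])

parts = split_parts(RAW)
print(sorted(parts))
```

With the parts separated, the recipient can resolve the intref value mybook.decls and the unparsed entity try.cgm against the filenames carried in the same message.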
The user has very large XML documents, possibly a gigabyte or more in size, and wishes to be able to view portions of the document without parsing the whole document. In order to do this the user creates an "index" for each document portion (fragment) that they wish to so address. The "index" consists of a fragment context specification in combination with a packaging mechanism designed for quick access to the fragment body. This approach can be used to view and browse documents with a flat structure, like HTML, on devices where only a part of the document can be parsed or rendered.
<?xml version="1.0"?> <p:package xmlns:p="http://www.w3.org/XML/Package/1.0" xmlns:f="http://www.w3.org/XML/Fragment/1.0" xmlns=""> <f:fcs fragbodyref="http://www.w3.org/TR/REC-xml.html#sec-xml-and-sgml" extref="http://www.w3.org/TR/REC-html40-971218/loose.dtd"> <html> <head> <link rel='STYLESHEET' type='text/css' href='/StyleSheets/TR/rec.css'/> </head> <body> <h1>Extensible Markup Language (XML) 1.0</h1> <h2 ID='sec-intro'>1. Introduction</h2> <h3 ID='sec-origin-goals'>1.1 Origin and Goals</h3> <h3 ID='sec-terminology'>1.2 Terminology</h3> <h2 ID='sec-documents'>2. Documents</h2> <h3 ID='sec-well-formed'>2.1 Well-Formed XML Documents</h3> <h3 ID='charsets'>2.2 Characters</h3> <h3 ID='sec-common-syn'>2.3 Common Syntactic Constructs</h3> <h3 ID='syntax'>2.4 Character Data and Markup</h3> <h3 ID='sec-comments'>2.5 Comments</h3> <h3 ID='sec-pi'>2.6 Processing Instructions</h3> <h3 ID='sec-cdata-sect'>2.7 CDATA Sections</h3> <h3 ID='sec-prolog-dtd'>2.8 Prolog and Document Type Declaration</h3> <h3 ID='sec-rmd'>2.9 Standalone Document Declaration</h3> <h3 ID='sec-white-space'>2.10 White Space Handling</h3> <h3 ID='sec-line-ends'>2.11 End-of-Line Handling</h3> <h3 ID='sec-lang-tag'>2.12 Language Identification</h3> <h2 ID='sec-logical-struct'>3. Logical Structures</h2> <h3 ID='sec-starttags'>3.1 Start-Tags, End-Tags, and Empty-Element Tags</h3> <h3 ID='elemdecls'>3.2 Element Type Declarations</h3> <h4 ID='sec-element-content'>3.2.1 Element Content</h4> <h4 ID='sec-mixed-content'>3.2.2 Mixed Content</h4> <h3 ID='attdecls'>3.3 Attribute-List Declarations</h3> <h4 ID='sec-attribute-types'>3.3.1 Attribute Types</h4> <h4 ID='sec-attr-defaults'>3.3.2 Attribute Defaults</h4> <h4 ID='AVNormalize'>3.3.3 Attribute-Value Normalization</h4> <h3 ID='sec-condition-sect'>3.4 Conditional Sections</h3> <h2 ID='sec-physical-struct'>4. 
Physical Structures</h2> <h3 ID='sec-references'>4.1 Character and Entity References</h3> <h3 ID='sec-entity-decl'>4.2 Entity Declarations</h3> <h4 ID='sec-internal-ent'>4.2.1 Internal Entities</h4> <h4 ID='sec-external-ent'>4.2.2 External Entities</h4> <h3 ID='TextEntities'>4.3 Parsed Entities</h3> <h4 ID='sec-TextDecl'>4.3.1 The Text Declaration</h4> <h4 ID='wf-entities'>4.3.2 Well-Formed Parsed Entities</h4> <h4 ID='charencoding'>4.3.3 Character Encoding in Entities</h4> <h3 ID='entproc'>4.4 XML Processor Treatment of Entities and References</h3> <h4 ID='not-recognized'>4.4.1 Not Recognized</h4> <h4 ID='included'>4.4.2 Included</h4> <h4 ID='include-if-valid'>4.4.3 Included If Validating</h4> <h4 ID='forbidden'>4.4.4 Forbidden</h4> <h4 ID='inliteral'>4.4.5 Included in Literal</h4> <h4 ID='notify'>4.4.6 Notify</h4> <h4 ID='bypass'>4.4.7 Bypassed</h4> <h4 ID='as-PE'>4.4.8 Included as PE</h4> <h3 ID='intern-replacement'>4.5 Construction of Internal Entity Replacement Text</h3> <h3 ID='sec-predefined-ent'>4.6 Predefined Entities</h3> <h3 ID='Notations'>4.7 Notation Declarations</h3> <h3 ID='sec-doc-entity'>4.8 Document Entity</h3> <h2 ID='sec-conformance'>5. Conformance</h2> <h3 ID='proc-types'>5.1 Validating and Non-Validating Processors</h3> <h3 ID='safe-behavior'>5.2 Using XML Processors</h3> <h2 ID='sec-notation'>6. Notation</h2> <h3>Appendices</h3>A. <A ID='sec-bibliography'>References</A> <h3 ID='sec-existing-stds'>A.1 Normative References</h3> <h3 ID='null'>A.2 Other References</h3> <h2 ID='CharClasses'>B. Character Classes</h2> <f:fragbody/> <h2 ID='sec-entexpand'>D. Expansion of Entity and Character References (Non-Normative)</h2> <h2 ID='determinism'>E. Deterministic Content Models (Non-Normative)</h2> <h2 ID='sec-guessing'>F. Autodetection of Character Encodings (Non-Normative)</h2> <h2 ID='sec-xml-wg'>G. W3C XML Working Group (Non-Normative)</h2> </body> </html> </f:fcs> <p:body> <h2 ID='sec-xml-and-sgml'>C. 
XML and SGML (Non-Normative)</h2> <p>XML is designed to be a subset of SGML, in that every <a IDREF='#dt-valid'>valid</a> XML document should also be a conformant SGML document. For a detailed comparison of the additional restrictions that XML places on documents beyond those of SGML, see <a IDREF='#Clark'>[Clark]</a>. </p> </p:body> </p:package>
In the design of any language, trade-offs in the solution space are necessary. To aid in making these trade-offs the following design principles were used (the order of these principles is not necessarily significant):
The following participated in the XML Fragment WG during the authoring of this Recommendation: