3. Conformance Definition
This section is normative.
In order to ensure that XHTML-family documents are maximally
portable among XHTML-family user agents, this specification rigidly
defines conformance requirements for both documents and user agents, and for
XHTML-family document types. While the conformance definitions can
be found in this section, they necessarily reference normative text
within this document, within the base XHTML specification [XHTML1], and within other related
specifications. It is only possible to fully comprehend the
conformance requirements of XHTML through a complete reading of all
normative references.
The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
3.1. XHTML Host Language Document Type Conformance
It is possible to modify existing document types and define
wholly new document types using both modules defined in this
specification and other modules. Such a document type is "XHTML
Host Language Conforming" when it meets the following criteria:
- The document type must be defined using one of the
implementation methods defined by the W3C. Currently this is
limited to XML DTDs, but XML Schema will be available soon. The
rest of this section refers to "DTDs" although other
implementations are possible.
- The DTD which defines the document type must have a unique
identifier as defined in Naming Rules that
uses the string "XHTML" in its first token of the public text
description.
- The DTD which defines the document type must include, at a
minimum, the Structure, Hypertext, Text, and List modules defined
in this specification.
- For each of the W3C-defined modules that are included, all of
the elements, attributes, types of attributes (including any
required enumerated value lists), and any required minimal content
models must be included (and optionally extended) in the document
type's content model. When content models are extended, all of the
elements and attributes (along with their types or any required
enumerated value lists) required in the original content model must
continue to be required.
- The DTD which defines the document type may define additional
elements and attributes. However, these must be in their own XML
namespace [XMLNAMES].
3.2. XHTML Integration Set Document Type Conformance
It is also possible to define document types that are based upon
XHTML, but do not adhere to its structure. Such a document type is
"XHTML Integration Set Conforming" when it meets the following
criteria:
- The document type must be defined using one of the
implementation methods defined by the W3C. Currently this is
limited to XML DTDs, but XML Schema will be available soon. The
rest of this section refers to "DTDs" although other
implementations are possible.
- The DTD which defines the document type must have a unique
identifier as defined in Naming Rules that
contains the string "XHTML", but not in the first token of its public text
description.
- The DTD which defines the document type must include, at a
minimum, the Hypertext, Text, and List modules defined in this
specification.
- For each of the W3C-defined modules that are included, all of
the elements, attributes, types of attributes (including any
required enumerated value lists), and any required minimal content models
must be included (and optionally extended) in the document type's
content model. When content models are extended, all of the
elements and attributes (along with their types or any required
enumerated value lists) required in the original content model must
continue to be required.
- The DTD which defines the document type may define additional
elements and attributes. However, these must be in their own XML
namespace [XMLNAMES].
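The minimum-module criteria of the two document type definitions above differ only in the Structure module. A minimal sketch of that check follows, assuming a document type can be summarized as a collection of module names; the function name `missing_modules` and both constants are hypothetical and not part of this specification.

```python
# Hypothetical helper: the names below are illustrative, not defined by the specification.

# Modules a Host Language conforming document type must include at a minimum.
HOST_LANGUAGE_REQUIRED = frozenset({"Structure", "Hypertext", "Text", "List"})

# Modules an Integration Set conforming document type must include at a minimum.
INTEGRATION_SET_REQUIRED = frozenset({"Hypertext", "Text", "List"})


def missing_modules(included, required):
    """Return the required module names absent from `included`, sorted by name."""
    return sorted(required - set(included))


if __name__ == "__main__":
    included = ["Text", "List", "Hypertext", "Forms"]
    print(missing_modules(included, HOST_LANGUAGE_REQUIRED))
    print(missing_modules(included, INTEGRATION_SET_REQUIRED))
```

This covers only the module-inclusion criterion. The naming, content model, and namespace criteria would each need their own checks.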
3.3. XHTML Family Module Conformance
This specification defines a method for defining
XHTML-conforming modules. A module conforms to this specification
when it meets all of the following criteria:
- The module must be defined using one of the
implementation methods defined by the W3C. Currently this is
limited to XML DTDs, but XML Schema will be available soon. The
rest of this section refers to "DTDs" although other
implementations are possible.
- The DTD which defines the module must have a unique identifier
as defined in Naming Rules.
- When the module is defined using an XML DTD, the module must
insulate its parameter entity names through the use of unique
prefixes or other, similar methods.
- The module definition must have a prose definition that
describes the syntactic and semantic requirements of the elements,
attributes, and/or content models that it declares.
- The module definition must not reuse any element names that are
defined in other W3C-defined modules, except when the content model
and semantics of those elements are either identical to the
original or an extension of the original, or when the reused
element names are within their own namespace (see below).
- The module definition's elements and attributes must be part of
an XML namespace [XMLNAMES]. If the module is defined by an organization other
than the W3C, this namespace must NOT be the same as the namespace
in which other W3C modules are defined.
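One way to approach the parameter entity criterion above is to scan a DTD module for parameter entity declarations whose names lack the module's chosen prefix. The sketch below is illustrative only: the function name, the regular expression, and the `MyML.` prefix are assumptions, and the pattern handles only simple declarations, not every form that [XML] permits.

```python
import re

# Matches the name in a parameter entity declaration such as:
#   <!ENTITY % MyML.para.qname "myml:para" >
# General entities (declared without '%') are deliberately not matched.
_PARAM_ENTITY = re.compile(r"<!ENTITY\s+%\s+(\S+)\s")


def unprefixed_parameter_entities(dtd_text, prefix):
    """Return the declared parameter entity names that do not start with `prefix`."""
    names = _PARAM_ENTITY.findall(dtd_text)
    return [name for name in names if not name.startswith(prefix)]


if __name__ == "__main__":
    sample = """
    <!ENTITY % MyML.para.qname "myml:para" >
    <!ENTITY % para.content "(#PCDATA)" >
    <!ENTITY copy "&#169;" >
    """
    # Lists the parameter entities that would need renaming under this convention.
    print(unprefixed_parameter_entities(sample, "MyML."))
```

A unique prefix is only one of the insulation methods the criterion allows, so an empty result here is neither necessary nor sufficient for conformance.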
3.4. XHTML Family Document Conformance
A conforming XHTML family document is a valid instance of an
XHTML Host Language Conforming Document Type.
3.5. XHTML Family User Agent Conformance
A conforming user agent must meet all of the following criteria
(as defined in [XHTML1]):
- In order to be consistent with the XML 1.0 Recommendation [XML], the user agent must parse
and evaluate an XHTML document for well-formedness. If the user
agent claims to be a validating user agent, it must also validate
documents against their referenced DTDs according to [XML].
- When the user agent claims to support facilities defined within
this specification or required by this specification through
normative reference, it must do so in ways consistent with the
facilities' definition.
- When a user agent processes an XHTML document as generic [XML], it shall only recognize
attributes of type ID (e.g., the id attribute on most XHTML elements) as fragment identifiers.
- If a user agent encounters an element it does not recognize, it
must continue to process the children of that element. If the
content is text, the text must be presented to the user.
- If a user agent encounters an attribute it does not recognize,
it must ignore the entire attribute specification (i.e., the
attribute and its value).
- If a user agent encounters an attribute value it doesn't
recognize, it must use the default attribute value.
- If a user agent encounters an entity reference (other than one of the
predefined entities) for which it has processed no
declaration (which could happen if the declaration is in the
external subset that the user agent hasn't read), the entity
reference should be rendered as the characters (starting with the
ampersand and ending with the semicolon) that make up the entity
reference.
- When rendering content, user agents that encounter characters
or character entity references that are recognized but not
renderable should display the document in such a way that it is
obvious to the user that normal rendering has not taken place.
- The user agent must process whitespace characters according to
the following rules. The following characters are defined in [XML] as whitespace characters:
- Space (&#x0020;)
- Tab (&#x0009;)
- Carriage return (&#x000D;)
- Line feed (&#x000A;)
The XML processor normalizes different systems' line end codes
into a single line feed character, which is passed up to the
application. In addition, the XHTML user agent must treat the
following characters as whitespace:
- Zero-width space (&#x200B;)
Whitespace is handled according to the following rules:
- All whitespace surrounding block elements should be
removed.
- Comments are removed entirely and do not affect whitespace
handling. One whitespace character on either side of a comment is
treated as two whitespace characters.
- Leading and trailing whitespace inside a block element must be
removed.
- Line feed characters within a block element must be converted
into a space (except when the 'xml:space' attribute is set to
'preserve').
- A sequence of whitespace characters must be reduced to a single
space character (except when the 'xml:space' attribute is set to
'preserve').
- With regard to rendition, the user agent should render the
content in a manner appropriate to the language in which the
content is written. In languages whose primary script is Latinate,
the ASCII space character is typically used to encode both
grammatical word boundaries and typographic whitespace; in
languages whose script is related to Nagari (e.g., Sanskrit, Thai),
grammatical boundaries may be encoded using the ZW 'space'
character, but will not typically be represented by typographic
whitespace in rendered output; languages using Arabiform scripts
may encode typographic whitespace using a space character, but may
also use the ZW space character to delimit 'internal' grammatical
boundaries (what look like words in Arabic to an English eye
frequently encode several words, e.g., 'kitAbuhum' = 'kitAbu-hum' =
'book them' == their book); and languages in the Chinese script
tradition typically neither encode such delimiters nor use
typographic whitespace in this way.
Whitespace in attribute values is processed according to [XML].
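The whitespace rules for text inside a block element can be sketched as a small normalization routine. This is one possible reading rather than a reference implementation: the function name and the `xml_space` parameter are hypothetical, comment handling and whitespace surrounding block elements are out of scope, and the sketch assumes that leading and trailing whitespace is removed even when 'xml:space' is 'preserve', because the rule above states no exception for that case.

```python
import re

# The four XML whitespace characters plus the zero-width space (U+200B).
XHTML_WHITESPACE = " \t\r\n\u200b"
_WHITESPACE_RUN = re.compile("[ \t\r\n\u200b]+")


def normalize_block_text(text, xml_space="default"):
    """Apply the block-level whitespace rules to the text content of one block element."""
    # Line end normalization, normally done by the XML processor before the
    # user agent sees the text: CR LF and lone CR both become a single LF.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Leading and trailing whitespace inside a block element is removed.
    text = text.strip(XHTML_WHITESPACE)
    if xml_space == "preserve":
        # Line feed conversion and run collapsing are skipped under 'preserve'.
        return text
    # Line feeds become spaces, then each whitespace run collapses to one space.
    text = text.replace("\n", " ")
    return _WHITESPACE_RUN.sub(" ", text)


if __name__ == "__main__":
    print(repr(normalize_block_text("  Hello\n   world\u200b \t")))
    print(repr(normalize_block_text("  a\r\n  b  ", xml_space="preserve")))
```

The rendition guidance for different scripts is not modeled here. A user agent would apply it after normalization, when deciding how to display the remaining space and zero-width space characters.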
3.6. Naming Rules
XHTML Host Language document types must adhere to strict naming
conventions so that it is possible for software and users to
readily determine the relationship of document types to XHTML. The
names for document types implemented as XML Document Type
Definitions are defined through Formal Public Identifiers (FPIs).
Within FPIs, fields are separated by double slash character
sequences (//). The various fields must be composed as
follows:
- The leading field must be "-" to indicate a privately defined
resource.
- The second field must contain the name of the organization
responsible for maintaining the named item. There is no formal
registry for these organization names. Each organization should
define a name that is unique. The name used by the W3C is, for
example, W3C.
- The third field contains two constructs: the public text class
followed by the public text description. The first token in the
third field is the public text class which should adhere to ISO
8879 Clause 10.2.2.1 Public Text Class. Only XHTML Host Language
conforming documents should begin the public text description with
the token XHTML. The public text description should contain the
string XHTML if the document type is Integration Set conforming.
The field must also contain an organization-defined unique
identifier (e.g., MyML 1.0). This identifier should be composed of
a unique name and a version identifier that can be updated as the
document type evolves.
- The fourth field defines the language in which the item is
developed (e.g., EN).
Using these rules, the name for an XHTML Host Language
conforming document type might be -//MyCompany//DTD XHTML MyML 1.0//EN.
The name for an XHTML family conforming module
might be -//MyCompany//ELEMENTS XHTML MyElements 1.0//EN.
The name for an XHTML Integration Set conforming
document type might be -//MyCompany//DTD Special Markup with XHTML//EN.
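To illustrate how software might determine the relationship of a document type to XHTML from its name, the following sketch splits an FPI into the four fields described above and inspects the public text description. The names `FPI`, `parse_fpi`, and `classify` are hypothetical, and the classification is a simplified reading of the naming rules that checks only the position of the XHTML token.

```python
from typing import NamedTuple


class FPI(NamedTuple):
    owner_prefix: str   # leading field, "-" for a privately defined resource
    organization: str   # second field, e.g. "W3C"
    text_class: str     # first token of the third field, e.g. "DTD"
    description: str    # remainder of the third field (public text description)
    language: str       # fourth field, e.g. "EN"


def parse_fpi(fpi):
    """Split an FPI on '//' and separate the public text class from the description."""
    fields = [field.strip() for field in fpi.split("//")]
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields separated by '//', got {len(fields)}")
    text_class, _, description = fields[2].partition(" ")
    return FPI(fields[0], fields[1], text_class, description.strip(), fields[3])


def classify(fpi):
    """Classify a DTD name by where the XHTML token sits in its description."""
    tokens = fpi.description.split()
    if fpi.text_class != "DTD":
        return "other"  # e.g. an ELEMENTS module, which is not a document type
    if tokens and tokens[0] == "XHTML":
        return "host-language"
    if "XHTML" in tokens:
        return "integration-set"
    return "other"


if __name__ == "__main__":
    for name in (
        "-//MyCompany//DTD XHTML MyML 1.0//EN",
        "-//MyCompany//DTD Special Markup with XHTML//EN",
        "-//MyCompany//ELEMENTS XHTML MyElements 1.0//EN",
    ):
        print(name, "->", classify(parse_fpi(name)))
```

The sketch does not validate the public text class against ISO 8879, and it cannot confirm that a document type actually meets the conformance criteria its name implies.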
3.7. XHTML Module Evolution
Each module defined in this specification is given a unique
identifier that adheres to the naming rules in the previous
section. Over time, a module may evolve. A logical ramification of
such evolution may be that some aspects of the module are no longer
compatible with its previous definition. To help ensure that
document types defined against modules defined in this
specification continue to operate, the identifiers associated with
a module that changes will be updated. Specifically, the Formal
Public Identifier and System Identifier of the module will be
changed by modifying the version identifier included in each.
Document types that wish to incorporate the updated functionality
will need to be similarly updated.
In addition, the earlier version(s) of the module will continue
to be available via its earlier, unique identifier(s). In this way,
document types developed using XHTML modules will continue to
function seamlessly using their original definitions even as the
collection expands and evolves. Similarly, document instances
written against such document types will continue to validate using
the earlier module definitions.
Other XHTML Family Module and Document Type authors are
encouraged to adopt a similar strategy to ensure the continued
functioning of document types based upon those modules and document
instances based upon those document types.