3. Document Object Model (XML) Level 1


3.4 Descriptions of objects related to the Document Type Definition

This section describes the objects that are used to represent the DTD of a document. The objects are not XML specific, though some attributes are specific to HTML DTDs. Such cases are clearly marked.

Interface DocumentType

Each document has a (possibly null) attribute that contains a reference to a DocumentType object. The DocumentType class provides an interface to access all of the entity declarations, notation declarations, and element type declarations. (ED: There is currently no way of accessing the list of entities declared within a DTD. This will be added once discussion about entity representation is completed.)

IDL Definition

interface DocumentType {
  attribute wstring        name;
  attribute Node           externalSubset;
  attribute Node           internalSubset;
  attribute Node           generalEntities;
  attribute Node           parameterEntities;
  attribute Node           notations;
  attribute Node           elementTypes;
};

Attribute name

The name attribute is a wstring that holds the name of the DTD, i.e. the name immediately following the DOCTYPE keyword.

Attribute externalSubset

The externalSubset attribute's children reference the list of nodes (definitions) that occurred in the external subset of a document. In this example:

<!DOCTYPE ex SYSTEM "ex.dtd">
<ex/>
it would iterate over all of the declarations that occurred within the ex.dtd external entity. Note: An iterator interface is used so as not to constrain implementations.

Attribute internalSubset

The internalSubset attribute's children constitute all the definitions that occurred within the internal subset of a document (the part that appears within the document instance). For example:

<!DOCTYPE ex SYSTEM "ex.dtd" [
<!ENTITY ex "example">
]>
<ex/>
it would iterate over a single node: the definition of the ex entity. Note: An iterator interface is used so as not to constrain implementations.

Attribute generalEntities

This is a Node whose children constitute the set of general entities that were defined within the external and the internal subset. For example, in:

<!DOCTYPE ex SYSTEM "ex.dtd" [
<!ENTITY foo "foo">
<!ENTITY bar "bar">
<!ENTITY % baz "baz">
]>
<ex/>
the interface would provide access to foo and bar but not baz. All objects supporting the Node interface that are accessed through this attribute will also support the Entity interface (defined below).

Attribute parameterEntities

This is a Node whose children constitute the set of parameter entities that were defined within the external and the internal subset. In the example above, the interface would provide access to baz but not foo or bar. All objects supporting the Node interface that are accessed through this attribute will also support the Entity interface (defined below).
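The split between generalEntities and parameterEntities can be illustrated with a minimal Python sketch. It is not part of the DOM interface: the function name split_entities and the regular expression are assumptions made for illustration, and only internal entities with double-quoted values are handled.

```python
import re

# Hypothetical helper pattern: matches <!ENTITY name "value"> and
# <!ENTITY % name "value"> (internal, double-quoted entities only).
ENTITY_DECL = re.compile(r'<!ENTITY\s+(%\s+)?([^\s%]+)\s+"([^"]*)"\s*>')

def split_entities(subset):
    """Return (general, parameter) dicts mapping entity name to its text."""
    general, parameter = {}, {}
    for percent, name, value in ENTITY_DECL.findall(subset):
        # A leading "%" marks a parameter entity declaration.
        target = parameter if percent else general
        # XML keeps the first declaration when a name is declared twice.
        target.setdefault(name, value)
    return general, parameter

subset = '''
<!ENTITY foo "foo">
<!ENTITY bar "bar">
<!ENTITY % baz "baz">
'''
general, parameter = split_entities(subset)
print(sorted(general))    # ['bar', 'foo']
print(sorted(parameter))  # ['baz']
```

A real implementation would obtain these declarations from its parser rather than from a regular expression; the sketch only shows which declarations land in which set.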

Attribute notations

This is a Node whose children constitute the set of notations that were defined within the external and the internal subset. All objects supporting the Node interface that are accessed through this attribute will also support the Notation interface (defined below).

Attribute elementTypes

This is a Node whose children constitute the set of element types that were defined within the external and the internal subset. All objects supporting the Node interface that are accessed through this attribute will also support the ElementDefinition interface (defined below).

Interface ElementDefinition

The definition of each element defined within the external or internal subset (provided it is parsed) will be available through the elementTypes attribute of the DocumentType object. The name, attribute list, and content model are all available for inspection.

IDL Definition

interface ElementDefinition : Node {
  // ContentType
  const int            EMPTY                = 1;
  const int            ANY                  = 2;
  const int            PCDATA               = 3;
  const int            MODEL_GROUP          = 4;

  attribute wstring        name;
  attribute int            contentType;
  attribute ModelGroup     contentModel;
  attribute Node           attributeDefinitions;
  attribute Node           inclusions;
  attribute Node           exceptions;
};

Definition group ContentType

(ED: TBD)
Defined Constants
EMPTY

The element is an empty element, and cannot have content.

ANY

The element may have character data, or any of the other elements defined within the DTD as content, in any order and sequence.

PCDATA

The element can have only PCDATA (Parsed Character Data) as content.

MODEL_GROUP

The element has a specific content model associated with it. The model is accessible through the contentModel attribute (below).

Attribute name

This is the name of the type of element being defined.

Attribute contentType

This attribute specifies the type of content of the element.

Attribute contentModel

If the contentType is MODEL_GROUP, then this will provide access to a ModelGroup (below) object that is the root of the content model object hierarchy for this element. For other content types, this will be null.
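The relationship between contentType and contentModel can be sketched in Python as follows. The function names are hypothetical, and treating "(#PCDATA)*" the same as "(#PCDATA)" is an assumption of this sketch, not something the interface states.

```python
# Constants mirror the ContentType group of ElementDefinition.
EMPTY, ANY, PCDATA, MODEL_GROUP = 1, 2, 3, 4

def content_type(content_spec):
    """Classify the content specification of an <!ELEMENT ...> declaration."""
    spec = "".join(content_spec.split())  # remove all whitespace
    if spec == "EMPTY":
        return EMPTY
    if spec == "ANY":
        return ANY
    # Assumption: "(#PCDATA)" with or without a trailing "*" counts as PCDATA.
    if spec in ("(#PCDATA)", "(#PCDATA)*"):
        return PCDATA
    return MODEL_GROUP

def has_content_model(content_spec):
    """contentModel is non-null only when contentType is MODEL_GROUP."""
    return content_type(content_spec) == MODEL_GROUP

print(content_type("EMPTY"))           # 1
print(content_type("( #PCDATA )"))     # 3
print(content_type("(title, para+)"))  # 4
```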

Attribute attributeDefinitions

The children of this Node consist of the attributes that were defined to be on an ElementDefinition. Each object supporting the Node interface that is accessed through this attribute will also support the AttributeDefinition interface.

Attribute inclusions

The children of this node define a list of element type names that are included in the content model of this element by the SGML inclusion/exception mechanism (not available from XML, but used in HTML).

Attribute exceptions

The children of this node define a list of element type names that are excluded from the content model of this element by the SGML inclusion/exception mechanism (not available from XML, but used in HTML).

Interface PCDATAToken

Token type for the string #PCDATA.

IDL Definition

interface PCDATAToken : Node {
};

Interface ElementToken

Token for an element declaration.

IDL Definition

interface ElementToken : Node {
  // OccurrenceType
  const int            OPT                  = 1;
  const int            PLUS                 = 2;
  const int            REP                  = 3;

  attribute wstring        name;
  attribute int            occurrence;
};

Definition group OccurrenceType

(ED: TBD)
Defined Constants
OPT

The ? occurrence indicator.

PLUS

The + occurrence indicator.

REP

The * occurrence indicator.

Attribute name

The element type name.

Attribute occurrence

The number of times this element can occur, expressed as one of the OccurrenceType constants.

Interface ModelGroup

The ModelGroup object represents the content model of an ElementDefinition. The content model is represented as a tree, where each node specifies how its children are connected, and the number of times that it can occur within its parent. Leaf nodes in the tree are either PCDATAToken or ElementToken.

IDL Definition

interface ModelGroup : Node {
  // OccurrenceType
  const int            OPT                  = 1;
  const int            PLUS                 = 2;
  const int            REP                  = 3;

  // ConnectionType
  const int            OR                   = 1;
  const int            SEQ                  = 2;
  const int            AND                  = 3;

  attribute int            occurrence;
  attribute int            connector;
  attribute Node           tokens;
};

Definition group OccurrenceType

(ED: TBD)
Defined Constants
OPT

The ? occurrence indicator.

PLUS

The + occurrence indicator.

REP

The * occurrence indicator.

Definition group ConnectionType

(ED: TBD)
Defined Constants
OR

The | connection indicator.

SEQ

The , connection indicator.

AND

The & connection indicator.

Attribute occurrence

The number of times this model group can occur, expressed as one of the OccurrenceType constants.

Attribute connector

Describes how the tokens are connected together.

Attribute tokens

The children of this node define the list of tokens in this model group.
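The tree structure of a content model can be illustrated with a small Python sketch. The classes below only mimic the ModelGroup, ElementToken, and PCDATAToken interfaces. They do not derive from Node, the to_dtd helper is hypothetical, and using None for "exactly once" is an assumption, since the IDL defines no constant for that case.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union

# OccurrenceType and ConnectionType constants, as in the IDL.
OPT, PLUS, REP = 1, 2, 3
OR, SEQ, AND = 1, 2, 3

# Assumption: None stands for "exactly once" (no occurrence indicator).
OCCURRENCE = {None: "", OPT: "?", PLUS: "+", REP: "*"}
# SGML writes the AND connector as "&".
CONNECTOR = {OR: "|", SEQ: ",", AND: "&"}

@dataclass
class PCDATAToken:
    pass

@dataclass
class ElementToken:
    name: str
    occurrence: Optional[int] = None

@dataclass
class ModelGroup:
    connector: int
    tokens: List[Union["ModelGroup", ElementToken, PCDATAToken]] = field(default_factory=list)
    occurrence: Optional[int] = None

def to_dtd(node):
    """Serialize a content model tree back to DTD content model syntax."""
    if isinstance(node, PCDATAToken):
        return "#PCDATA"
    if isinstance(node, ElementToken):
        return node.name + OCCURRENCE[node.occurrence]
    inner = CONNECTOR[node.connector].join(to_dtd(t) for t in node.tokens)
    return "(" + inner + ")" + OCCURRENCE[node.occurrence]

# Content model for: <!ELEMENT doc (title,(para|list)*)>
model = ModelGroup(SEQ, [
    ElementToken("title"),
    ModelGroup(OR, [ElementToken("para"), ElementToken("list")], REP),
])
print(to_dtd(model))  # (title,(para|list)*)
```

Inner nodes carry the connector and occurrence, while leaves are tokens, which matches the description of ModelGroup given earlier in this section.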

Interface AttributeDefinition

The AttributeDefinition interface is used to access information about a particular attribute definition on a given element. Objects supporting this interface are available from the ElementDefinition object through the attributeDefinitions attribute.

IDL Definition

interface AttributeDefinition : Node {
  // DeclaredValueType
  const int            CDATA                = 1;
  const int            ID                   = 2;
  const int            IDREF                = 3;
  const int            IDREFS               = 4;
  const int            ENTITY               = 5;
  const int            ENTITIES             = 6;
  const int            NMTOKEN              = 7;
  const int            NMTOKENS             = 8;
  const int            NOTATION             = 9;
  const int            NAME_TOKEN_GROUP     = 10;

  // DefaultValueType
  const int            FIXED                = 1;
  const int            REQUIRED             = 2;
  const int            IMPLIED              = 3;

  attribute wstring        name;
  attribute StringList     allowedTokens;
  attribute int            declaredType;
  attribute int            defaultType;
  attribute Node           defaultValue;
};

Definition group DeclaredValueType

(ED: TBD)
Defined Constants
CDATA

(ED: TBD)

ID

(ED: TBD)

IDREF

(ED: TBD)

IDREFS

(ED: TBD)

ENTITY

(ED: TBD)

ENTITIES

(ED: TBD)

NMTOKEN

(ED: TBD)

NMTOKENS

(ED: TBD)

NOTATION

(ED: TBD)

NAME_TOKEN_GROUP

(ED: TBD)

Definition group DefaultValueType

(ED: TBD)
Defined Constants
FIXED

(ED: TBD)

REQUIRED

(ED: TBD)

IMPLIED

(ED: TBD)

Attribute name

The name of the attribute.

Attribute allowedTokens

The list of tokens that are allowed as values. For example, in

<!DOCTYPE ex [
<!ELEMENT ex (#PCDATA) >
<!ATTLIST ex test (FOO|BAR) "FOO" >
]>
<ex></ex>
this would hold FOO and BAR.
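As a rough illustration of where the allowedTokens list comes from, the Python sketch below extracts the name token group from an attribute-list declaration. The function name and the regular expression are assumptions. The pattern handles only a single enumerated attribute with a double-quoted default, as in the example above.

```python
import re

# Hypothetical pattern: <!ATTLIST element attribute (A|B|...) "default">
ATTLIST = re.compile(
    r'<!ATTLIST\s+(\S+)\s+(\S+)\s+\(([^)]*)\)\s+"([^"]*)"\s*>'
)

def allowed_tokens(decl):
    """Return (element, attribute, tokens, default), or None if no match."""
    match = ATTLIST.search(decl)
    if match is None:
        return None
    element, attribute, group, default = match.groups()
    # Tokens are separated by "|" and may be surrounded by whitespace.
    tokens = [token.strip() for token in group.split("|")]
    return element, attribute, tokens, default

result = allowed_tokens('<!ATTLIST ex test (FOO|BAR) "FOO" >')
print(result)  # ('ex', 'test', ['FOO', 'BAR'], 'FOO')
```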

Attribute declaredType

This attribute indicates the type of values the attribute may contain.

Attribute defaultType

This specifies whether the attribute must be specified in the instance and, if it need not be, how its value is determined when it is not provided.

Attribute defaultValue

This provides an interface to a Node whose children make up the default value for an attribute. This value is used if the attribute was not given an explicit value in the document instance.
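How defaultType and defaultValue interact can be sketched in Python under a few stated assumptions: values are flattened to plain strings rather than Nodes, None stands for "not specified", and a defaultType of None stands for a plain default value, since the IDL defines no constant for that case. The function name effective_value is hypothetical.

```python
# DefaultValueType constants, as in the IDL.
FIXED, REQUIRED, IMPLIED = 1, 2, 3

def effective_value(specified, default_type, default_value=None):
    """Return the attribute value an application would see.

    specified is the value given in the document instance, or None.
    """
    if default_type == REQUIRED:
        if specified is None:
            raise ValueError("required attribute is missing")
        return specified
    if default_type == IMPLIED:
        # No default is supplied; the value may legitimately be absent.
        return specified
    if default_type == FIXED:
        if specified is not None and specified != default_value:
            raise ValueError("fixed attribute must equal its default")
        return default_value
    # Assumption: any other default_type means a plain default value.
    return default_value if specified is None else specified

print(effective_value(None, None, "FOO"))   # FOO
print(effective_value("BAR", None, "FOO"))  # BAR
print(effective_value(None, IMPLIED))       # None
```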

Interface Notation

The Notation object is used to represent the definition of a notation within a DTD.

IDL Definition

interface Notation : Node {
  attribute wstring        name;
  attribute boolean        isPublic;
  attribute string         publicIdentifier;
  attribute string         systemIdentifier;
};

Attribute name

This is the name of the notation.

Attribute isPublic

If a public identifier was specified in the notation declaration, this will be TRUE, and the publicIdentifier attribute will contain the string for the public identifier.

Attribute publicIdentifier

If a public identifier was specified in the notation declaration, this will hold the public identifier string, otherwise it will be null.

Attribute systemIdentifier

If a system identifier was specified in the notation declaration, this will hold the system identifier string, otherwise it will be null.
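The relationship between isPublic, publicIdentifier, and systemIdentifier can be illustrated with a short Python sketch. The class only mimics the Notation interface. The parse_notation helper, the regular expression, and the identifiers in the usage lines are hypothetical, and only double-quoted literals are handled.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical pattern for <!NOTATION name SYSTEM "sys"> and
# <!NOTATION name PUBLIC "pub" ["sys"]>.
NOTATION = re.compile(
    r'<!NOTATION\s+(\S+)\s+'
    r'(?:SYSTEM\s+"([^"]*)"|PUBLIC\s+"([^"]*)"(?:\s+"([^"]*)")?)\s*>'
)

@dataclass
class Notation:
    name: str
    public_identifier: Optional[str]
    system_identifier: Optional[str]

    @property
    def is_public(self):
        # True exactly when a public identifier was declared.
        return self.public_identifier is not None

def parse_notation(decl):
    """Build a Notation from a declaration string, or return None."""
    match = NOTATION.search(decl)
    if match is None:
        return None
    name, system_only, public, system_after_public = match.groups()
    system = system_only if system_only is not None else system_after_public
    return Notation(name, public, system)

gif = parse_notation('<!NOTATION gif SYSTEM "viewer.exe">')
png = parse_notation('<!NOTATION png PUBLIC "-//EX//NOTATION PNG//EN">')
print(gif.is_public, gif.system_identifier)  # False viewer.exe
print(png.is_public, png.system_identifier)  # True None
```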