Mathematical Markup Language (MathML) Version 2.0
7 The MathML Interface
7.1 Embedding MathML in HTML
7.1.1 The Top-Level math Element
7.1.2 Requirements for a MathML Browser Interface
7.1.3 Invoking Embedded Objects as Renderers
7.1.4 Invoking Other Applications
7.1.5 Mixing and Linking MathML and HTML
7.2 Generating, Processing and Rendering MathML
7.2.1 MathML Compliance
7.2.2 Handling of Errors
7.2.3 Attribute for unspecified data
7.3 Future Extensions
7.3.1 Macros and Style Sheets
7.3.2 XML Extensions to MathML
To be effective, MathML must work well with a wide variety of renderers, processors, translators and editors. This chapter addresses some of the interface issues involved in generating and rendering MathML. Since MathML exists primarily to encode mathematics in Web documents, perhaps the most important interface issues are related to embedding MathML in HTML.
There are three kinds of interface issues that arise in embedding MathML in HTML. First, MathML must be semantically integrated into HTML. Browsers must recognize MathML markup as embedded XML content, and not as an HTML syntax error. This is primarily a question of managing namespaces in XML.
Second, MathML rendering must be integrated into browser software. Some browsers already implement MathML rendering natively, and one can expect more browsers will do so in the future. At the same time, other browsers have developed infrastructure to facilitate the rendering of MathML and other embedded XML content by embedded elements. While substantial progress has been made, further improvement in coordination between browsers and embedded elements will be necessary. For example, better support for coordinating initialization and size negotiation is needed, as is better support for high-resolution printing.
Third, other tools for generating and processing MathML must be able to intercommunicate. A number of MathML tools have been or are being developed, including editors, translators, computer algebra systems, and other scientific software. However, since MathML expressions tend to be lengthy, and prone to error when entered by hand, special emphasis must be given to ensuring that MathML can be easily generated by user-friendly conversion and authoring tools, and that these tools work together in a dependable, platform- and vendor-independent way.
The W3C Math working group is committed to providing support to software vendors developing all kinds of MathML tools. The working group monitors the public mailing list www-math@w3.org, and will attempt to answer questions about the MathML specification. The working group also intends to stimulate the formation of MathML developer and user groups. For current information about MathML tools, applications and user support activities, consult the home page of the W3C Math Working Group.
MathML specifies a single top-level math element, which encapsulates each instance of MathML markup within an HTML page. As such, the math element provides an attachment point for information that affects a MathML expression as a whole.
In practice, the math element also serves as the interface for embedding MathML in HTML. In this capacity, the math element simultaneously signals the semantic inclusion of MathML (XML) content in HTML, and provides the necessary machinery for rendering its content in a browser either by invoking an embedded element, or by specifying parameters for a native renderer in the browser. Both semantic inclusion and rendering present a number of issues that extend beyond the scope of this specification.
In order to produce a complete and self-contained description of MathML, this document only specifies the attributes and usage of the math element as a top-level element for MathML, and not as an interface element. The W3C Math working group will continue working closely with other W3C activities to ensure that emerging standards for embedding XML in HTML accommodate seamless integration of MathML in HTML. In section 7.1.2 [Requirements for a MathML Browser Interface] we list requirements an interface element for MathML would have to meet in order to fully integrate MathML into HTML. However, it is important to note that the MathML specification is independent of embedding mechanisms.
7.1.1 The Top-Level math Element
As stated above, MathML specifies a single top-level math element. All other MathML content must be contained in a math element; equivalently, every valid, complete MathML expression must be contained in <math> tags. The math element must always be the outermost element in a MathML expression; it is an error for one math element to contain another.
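The nesting rule can be checked mechanically. The following is a minimal sketch, assuming the markup is parsed with Python's standard xml.etree.ElementTree module and that elements carry no namespace prefix, as in the examples of this chapter; the function name has_nested_math is hypothetical and not part of MathML.

```python
import xml.etree.ElementTree as ET


def has_nested_math(root: ET.Element) -> bool:
    """Return True if any math element contains another math element."""
    for math in root.iter("math"):
        # iter() yields the element itself first, so skip that one.
        for descendant in math.iter("math"):
            if descendant is not math:
                return True
    return False


good = ET.fromstring("<math><msup><mi>x</mi><mn>2</mn></msup></math>")
bad = ET.fromstring("<math><mrow><math><mi>x</mi></math></mrow></math>")

print(has_nested_math(good))
print(has_nested_math(bad))
```

A processor could run such a check before rendering and report the inner math element as an error.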
Applications that return sub-expressions of other MathML expressions, for example as the result of a cut-and-paste operation, should always wrap them in <math> tags. The presence of enclosing <math> tags should be a reasonable heuristic test for MathML content. Similarly, applications which insert MathML expressions in other MathML expressions must take care to remove the <math> tags from the inner expressions.
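A cut-and-paste implementation might handle both directions roughly as follows. This is only a sketch under stated assumptions: the helper names wrap_in_math and insert_expression are hypothetical, elements are assumed to be un-namespaced, and grouping several pasted children in an explicit mrow is a design choice of this example rather than a requirement of the specification.

```python
import copy
import xml.etree.ElementTree as ET


def wrap_in_math(fragment: ET.Element) -> ET.Element:
    """Return the fragment enclosed in math tags (copy operation)."""
    if fragment.tag == "math":
        return fragment
    math = ET.Element("math")
    math.append(copy.deepcopy(fragment))
    return math


def insert_expression(parent: ET.Element, expr: ET.Element) -> None:
    """Append expr to parent, removing an outer math wrapper (paste operation)."""
    if expr.tag != "math":
        parent.append(copy.deepcopy(expr))
        return
    children = [copy.deepcopy(child) for child in expr]
    if len(children) == 1:
        parent.append(children[0])
    else:
        # Keep the pasted children together as a single argument.
        row = ET.SubElement(parent, "mrow")
        row.extend(children)


clipboard = wrap_in_math(ET.fromstring("<mi>x</mi>"))
target = ET.fromstring("<math><mfrac><mn>1</mn></mfrac></math>")
insert_expression(target[0], clipboard)
print(ET.tostring(target, encoding="unicode"))
```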
The math element can contain an arbitrary number of child schemata. The child schemata render by default as if they were contained in an mrow element.
The attributes of the math element are:
The macros attribute is provided to make possible future development of more streamlined, MathML-specific macro mechanisms. The value of this attribute is a sequence of URLs or URIs, separated by whitespace.
The mode attribute specifies whether the enclosed MathML expression should be rendered in a display style or an in-line style. Allowed values are display and inline (default). This attribute is deprecated in favor of the standard CSS2 `display' property with the analogous block and inline values.
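The correspondence between the deprecated mode attribute and the CSS2 `display' property can be sketched as a simple lookup. The function name css_display_for is hypothetical, and rejecting other values with an exception is an assumption of this example; the specification only lists the two allowed values.

```python
import xml.etree.ElementTree as ET

# Allowed mode values and the analogous CSS2 display values.
MODE_TO_CSS_DISPLAY = {"display": "block", "inline": "inline"}


def css_display_for(math: ET.Element) -> str:
    """Map the mode attribute of a math element to a CSS2 display value."""
    mode = math.get("mode", "inline")  # inline is the default
    if mode not in MODE_TO_CSS_DISPLAY:
        raise ValueError(f"mode must be 'display' or 'inline', got {mode!r}")
    return MODE_TO_CSS_DISPLAY[mode]


displayed = ET.fromstring('<math mode="display"><mi>x</mi></math>')
default = ET.fromstring("<math><mi>x</mi></math>")
print(css_display_for(displayed), css_display_for(default))
```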
The top-level math element described in the preceding section is concerned with encapsulating MathML content and defining attributes that affect the entire enclosed expression. It is, in a sense, `inward looking'. However, to render MathML properly in a browser, and to integrate it properly into an HTML document, an `outward looking' interface element is also required. This interface element must be aware of its surrounding environment, and provide a mechanism for passing information between the browser and the MathML renderer.
As noted above, the MathML interface element and the MathML top-level element are in practice one and the same. The math element must serve both to encapsulate MathML content, and admit additional attributes for controlling how a MathML renderer should interact with the surrounding context, typically a browser.
While general mechanisms for embedding XML in HTML are beginning to be deployed, wide variations in strategy and level of implementation remain between vendors. Consequently, the remainder of this section describes attributes and functionality that would be highly desirable in a MathML interface element. In the near term, implementors attempting to provide interim solutions for rendering MathML in browsers should try to give authors some way of passing the following interface attributes to the renderer:
Attributes that apply to the MathML interface element necessarily take effect when the document is first loaded, and therefore suffer the limitation that they cannot change in response to reader interaction unless they are exposed in the Document Object Model (http://www.w3.org/TR/WD-DOM-Level-2) and subject to programmatic control. The height and width attributes are good examples; if the reader changes the current font size, the height and width of the embedded mathematical fragments also need to change.
At present, browser support for the DOM, and embedded element access to the DOM, is too limited to provide acceptable rendering for MathML. The W3C Math working group is working closely with the W3C DOM working group in an effort to provide better communication between embedded MathML renderers and browsers (see appendix E [Document Object Model for MathML]).
The basic requirements for communication between an embedded MathML renderer and a browser include:
In browsers where MathML is not natively supported, we anticipate that MathML rendering will be carried out via embedded objects such as plug-ins, applets, or helper applications. In the near term, the W3C Math working group advocates the use of MIME types to bind embedded MathML to renderers. Mechanisms for assigning MIME types already exist in HTML, and mechanisms for registering and automatically invoking embedded elements such as plug-ins based on MIME type already exist in Web browsers.
The type attribute, described in the previous section as a requirement for the MathML interface element, is intended to associate a MIME type with its content. The HTML element META is proposed as a means of specifying document-wide default MIME types for an element.
We propose a simple naming convention for MIME types that is flexible enough to accommodate several common situations:
We propose that generic MathML be assigned the MIME type text/mathml, and for browser registry, we suggest the standard file extension .mml be used. To invoke specific renderers, we suggest assigning a MIME type of the following format:
text/mathml-renderer
A user downloads and installs renderer A, and registers it with the browser for the text/mathml MIME type to process generic MathML. However, renderer A also accepts TeX as an input syntax, and therefore during the installation process, it requests to be registered for application/x-tex as well. Later, the user discovers renderer B provides additional features, such as the capability to cut and paste. Therefore, the user downloads, installs and registers renderer B for the MIME type text/mathml-rendererB.
An author then creates a document that contains the following line in the document header:
<META Content-math-Type="text/mathml">
Later, the document contains the following expressions:
<math>
  <msup><mi>x</mi><mn>2</mn></msup>
</math>
<math type="text/mathml-rendererB">
  <mi>α</mi><mo>=</mo><mn>0.4</mn>
</math>
When our hypothetical reader views this document, renderer A is invoked to process the first expression, while renderer B is invoked for the second. When the same reader later views a document with MIME type application/x-tex, renderer A is again invoked, this time in TeX processing mode.
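The binding described in this scenario amounts to a lookup from MIME type to registered renderer. The sketch below is illustrative only: the REGISTRY dictionary, the choose_renderer function and the renderer labels are hypothetical stand-ins for whatever registration mechanism a browser provides, and returning None for an unregistered type is an assumption of this example.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical registry, filled in as the user installs renderers A and B.
REGISTRY = {
    "text/mathml": "renderer A",
    "application/x-tex": "renderer A",
    "text/mathml-rendererB": "renderer B",
}


def choose_renderer(math: ET.Element, document_default: str) -> Optional[str]:
    """Pick a renderer from the type attribute, else the document-wide default."""
    mime_type = math.get("type", document_default)
    return REGISTRY.get(mime_type)


# Document-wide default, as declared by the META line in the document header.
default_type = "text/mathml"

first = ET.fromstring("<math><msup><mi>x</mi><mn>2</mn></msup></math>")
second = ET.fromstring(
    '<math type="text/mathml-rendererB"><mi>α</mi><mo>=</mo><mn>0.4</mn></math>'
)

print(choose_renderer(first, default_type))
print(choose_renderer(second, default_type))
```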
Although rendering MathML expressions typically occurs in place in a Web browser, other MathML processing functions take place more naturally in other applications. Particularly common tasks include opening a MathML expression in an equation editor or computer algebra system.
At present, there is no standard way of specifying that embedded content should be rendered with one application, edited in another, and evaluated by a third. As work progresses on coordination between browsers and embedded elements and the Document Object Model (DOM), providing this kind of functionality should be a priority. Both authors and readers should be able to indicate a preference about what MathML application to use in a given context. For example, one might imagine that some mouse gesture over a MathML expression causes a browser to present the reader with a pop-up menu, showing the various kinds of MathML processing available on the system, and the MathML processors recommended by the author.
Since MathML will probably be widely generated by authoring tools, it is particularly important that opening a MathML expression in an editor should be easy to do and to implement. In many cases, it will be desirable for an authoring tool to record some information about its internal state along with a MathML expression, so that an author can pick up editing where he or she left off. The MathML specification does not explicitly contain provisions for recording information about the authoring tool. In some circumstances, it may be possible to include authoring tool information that applies to an entire document in the form of metadata; interested readers are encouraged to consult the W3C Metadata Activity for current information about metadata and resource definition. For encoding authoring tool state information that applies to a particular MathML instance, readers are referred to the possible use of the semantics element for this purpose.
In order to be fully integrated into HTML, it should be possible not only to embed MathML in HTML, but also to embed HTML in MathML. However, the problem of supporting HTML in MathML presents many difficulties. Moreover, the problems are not specific to MathML; they are problems for XML applications in HTML generally. Therefore, at present, the MathML specification does not permit any HTML elements within a MathML expression, although this may be subject to change in a future revision of MathML, when mechanisms for embedding XML in HTML have been further developed.
In most cases, HTML elements either do not apply in mathematical contexts (headings, paragraphs, lists, etcetera), or MathML already provides equivalent or better functionality specifically tailored to mathematical content (tables, style changes, etcetera). However, there are two notable exceptions.
MathML has no element that corresponds to the HTML anchor element a. In HTML, anchors are used both to make links, and to provide locations to link to. MathML, as an XML application, defines links by the use of the XLink mechanism [XLink]. However, MathML at present does not provide a way for other documents to make links into a MathML expression. One reason for this omission is that linking into embedded XML content is better addressed as part of a general mechanism for embedding XML in HTML. Moreover, until browsers either natively implement MathML rendering, or substantially better coordination between embedded elements and browsers becomes possible, there is no reasonable way of implementing links into MathML expressions.
MathML linking elements are generic XML linking elements as described in [XLink]. The reader is cautioned that this is at present still a working draft, and is therefore subject to future revision. Since the MathML linking mechanism is defined in terms of the XML linking specification, the same proviso holds for it as well.
A MathML element is designated as a link by the presence of the attribute xlink:href. To use the attribute xlink:href, it is also necessary to declare the appropriate namespace. Thus, a typical MathML link might look like:
<mrow xmlns:xlink="http://www.w3.org/XML/XLink/0.9" xlink:href="sample.xml"> ... </mrow>
MathML designates that almost all elements can be used as an XML linking element. The only elements that cannot serve as linking elements are those such as the sep element, which exist primarily to disambiguate other MathML constructs and in general do not correspond to any part of a typical visual rendering. The full list of exceptional elements that cannot be used as linking elements is given in the table below.
mprescripts
none
sep
malignmark
maligngroup
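A processor can locate link elements, and flag links placed on one of the excluded elements, with a short traversal. This is a sketch under assumptions: it uses the namespace URI from the working-draft example in this section, assumes un-namespaced MathML element names, and the function names find_links and misplaced_links are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the link example in this section (a working draft).
XLINK_NS = "http://www.w3.org/XML/XLink/0.9"
HREF = f"{{{XLINK_NS}}}href"

# Elements that cannot be used as linking elements.
NON_LINKING = {"mprescripts", "none", "sep", "malignmark", "maligngroup"}


def find_links(root: ET.Element) -> list:
    """Return (element name, target) pairs for every element carrying xlink:href."""
    return [(el.tag, el.get(HREF)) for el in root.iter() if HREF in el.attrib]


def misplaced_links(root: ET.Element) -> list:
    """Return the names of excluded elements that nevertheless carry xlink:href."""
    return [tag for tag, _ in find_links(root) if tag in NON_LINKING]


doc = ET.fromstring(
    '<math xmlns:xlink="http://www.w3.org/XML/XLink/0.9">'
    '<mrow xlink:href="sample.xml"><mi>x</mi></mrow>'
    '<none xlink:href="other.xml"/>'
    "</math>"
)
print(find_links(doc))
print(misplaced_links(doc))
```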
The IMG element has no MathML equivalent. The decision to omit a general mechanism for image inclusion from MathML was based on several factors. First, a simple mechanism for including images in MathML along the lines of the IMG element would not be more closely tied to mathematical content or notation than the HTML IMG element itself. Therefore, such an element would likely be superseded by the IMG element if it becomes possible to mix XML and HTML generally.
Another reason for not providing an image facility is that MathML takes great pains to make the notational structure and mathematical content it encodes easily available to processors, whereas information contained in images is only available to a human reader looking at a visual representation. Thus, for example, in the MathML paradigm, it would be preferable to introduce new glyphs by the creation of special symbol fonts, rather than simply including them as images.
Finally, apart from the introduction of new glyphs, many of the situations where one might be inclined to use an image amount to some sort of labeled diagram. For example, knot diagrams, Venn diagrams, Dynkin diagrams, Feynman diagrams and complicated commutative diagrams all fall into this category. As such, their content would be better encoded via some combination of structured graphics and MathML markup. Because of the generality of the `labeled diagram' construction, the definition of a markup language to encode such constructions extends beyond the scope of the current W3C Math activity. (See http://www.w3.org/Graphics for further W3C activity in this area.)
Information is increasingly generated, processed and rendered by software tools. The exponential growth of the Web is fueling the development of advanced systems for automatically searching, categorizing, and interconnecting information. Thus, although MathML can be written by hand and read by humans, the future of MathML is also tied to the ability to process it with software tools.
There are many different kinds of MathML editors, translators, processors and renderers. What it means to support MathML varies widely between applications. For example, the issues that arise with a MathML-compliant validating parser are very different from those for a MathML-compliant equation editor.
In this section, guidelines are given for describing different types of MathML support, and for quantifying the extent of MathML support in a given application. Developers, users and reviewers are encouraged to use these guidelines in characterizing products. The intention behind these guidelines is to facilitate reuse and interoperability between MathML applications by accurately characterizing their capabilities in quantifiable terms.
A valid MathML expression is an XML construct determined by the MathML DTD together with the additional requirements given in the specifications of the MathML document.
We define a `MathML processor' to mean any application that can accept, produce, or `round-trip' a valid MathML expression. An example of an application that round-trips a MathML expression is an editor that writes a new file even though no modifications are made.
We specify three forms of MathML compliance:
Beyond the above definitions, the MathML specification makes no demands of individual processors. In order to guide developers, the MathML specification includes advisory material; for example, there are suggested rendering rules included in chapter 3 [Presentation Markup]. However, in general, developers are given wide latitude in interpreting what kind of MathML implementation is meaningful for their own particular application.
To clarify the difference between compliance and interpretation of what is meaningful, consider some examples:
As the previous examples show, to be useful, the concept of MathML compliance frequently involves a judgment about what parts of the language are meaningfully implemented, as opposed to parts that are merely processed in a technically correct way with respect to the definitions of compliance. This requires some mechanism for giving a quantitative statement about which parts of MathML are meaningfully implemented by a given application. To this end, the W3C Math working group has provided a test suite of MathML expressions at http://www.w3.org/Math/testsuite.
The test suite consists of a large number of MathML expressions categorized by markup category and dominant MathML element being tested. The existence of this test suite makes it possible, for example, to characterize quantitatively the hypothetical computer algebra interface mentioned above by saying that it is a MathML-input-compliant processor which meaningfully implements MathML content markup, including all of the expressions given under http://www.w3.org/Math/testsuite/tests/4.
Developers who choose not to implement parts of the MathML specification in a meaningful way are encouraged to itemize the parts they leave out by referring to specific categories in the test suite.
For MathML-output-compliant processors, there is also a MathML validator online at http://www.w3.org/Math/validator. Developers of MathML-output-compliant processors are encouraged to verify their output using this validator.
Customers of MathML applications who wish to verify claims as to which parts of the MathML specification are implemented by an application are encouraged to use the test suites as a part of their decision processes.
If a MathML-input-compliant application receives input containing one or more elements with an illegal number or type of attributes or child schemata, it should nonetheless attempt to render all the input in an intelligible way, i.e. to render normally those parts of the input that were valid, and to render error messages (rendered as if enclosed in an merror element) in place of invalid expressions.
MathML-output-compliant applications such as editors and translators may choose to generate merror expressions to signal errors in their input. This is usually preferable to generating valid, but possibly erroneous, MathML.
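The recommended behavior, rendering valid parts normally and substituting an error message for each invalid expression, might be sketched as follows. The REQUIRED_CHILDREN table is a small illustrative subset of the argument-count rules, not a full validity check, and the function name mark_errors and the wording of the message are assumptions of this example.

```python
import xml.etree.ElementTree as ET

# Illustrative subset: elements that require an exact number of child schemata.
REQUIRED_CHILDREN = {"mfrac": 2, "msup": 2, "msub": 2, "msubsup": 3}


def mark_errors(element: ET.Element) -> None:
    """Replace, in place, each invalid sub-expression with an merror element."""
    for index, child in enumerate(list(element)):
        expected = REQUIRED_CHILDREN.get(child.tag)
        if expected is not None and len(child) != expected:
            merror = ET.Element("merror")
            mtext = ET.SubElement(merror, "mtext")
            mtext.text = (
                f"{child.tag}: expected {expected} children, found {len(child)}"
            )
            merror.tail = child.tail
            element[index] = merror
        else:
            # The child itself is acceptable; check its own children.
            mark_errors(child)


doc = ET.fromstring(
    "<math><mrow>"
    "<msup><mi>x</mi></msup><mo>+</mo><mfrac><mn>1</mn><mn>2</mn></mfrac>"
    "</mrow></math>"
)
mark_errors(doc)
print(ET.tostring(doc, encoding="unicode"))
```

Only the malformed msup is replaced; the mo and mfrac elements are left untouched and would render normally.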
The MathML attributes described in the MathML specification are necessary for presentation and content markup. Ideally, the MathML attributes should be an open-ended list so that users can add specific attributes for specific renderers. However, this cannot be done within the confines of a single XML DTD. Although it can be done using extensions of the standard DTD, some authors will wish to use non-standard attributes while remaining strictly in compliance with the standard DTD.
To allow this, this specification also allows the attribute other for all elements, for use as a hook to pass on renderer-specific information. In particular, it can be used as a hook for passing information to audio renderers, computer algebra systems, and for pattern matching in any future macro/extension mechanism. This idea is used in other languages. For example, PostScript comments are widely used to pass information that is not part of PostScript.
At the same time, the intent of the other attribute is not to encourage software developers to use this as a loop-hole for circumventing the core conventions for MathML markup. We trust both authors and applications will use the other attribute judiciously.
The value of the other attribute should be a string containing an attribute list in valid XML format, e.g.
attr1="val1" attr2="val2"
or
attr1='val1' attr2='val2'
(with appropriate escaping of quotes appearing inside the attribute values).
Renderers that accept non-standard attributes directly should also accept them when they occur within the string value of the other attribute. This is not required for attributes specifically documented by the MathML standard.
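Because the value of the other attribute is itself an attribute list in XML format, a renderer can recover the individual name/value pairs by handing the string back to an XML parser. The sketch below does this by placing the list on a throwaway element; the function name parse_other and the attribute names voice and rate are hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_other(value: str) -> dict:
    """Parse the string value of an other attribute into a name/value mapping."""
    # Attach the attribute list to a dummy element and let the parser split it.
    return dict(ET.fromstring(f"<dummy {value}/>").attrib)


# In the document source, double quotes inside the value are escaped as &quot;.
mi = ET.fromstring('<mi other="voice=&quot;low&quot; rate=\'slow\'">x</mi>')

# After parsing, the attribute value reads: voice="low" rate='slow'
attributes = parse_other(mi.get("other", ""))
print(attributes)
```

An attribute list that is not well-formed makes the parser raise xml.etree.ElementTree.ParseError, which a renderer could report using the error handling described earlier.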
MathML is in its infancy; it is to be expected that MathML will need to be extended and revised in various ways. Some of these extensions can be easily foreseen; as noted repeatedly in this chapter, the mechanisms for fully integrating MathML into HTML are not yet developed, and these mechanisms may have a significant impact on some aspects of MathML.
Similarly, there are several kinds of functionality that are fairly obvious candidates for future MathML extensions. These include macros, style sheets, and perhaps a general facility for `labeled diagrams'. However, there will no doubt be other desirable extensions to MathML that will only emerge as MathML is widely used. For these extensions, the W3C Math working group relies on the extensible architecture of XML, and the common sense of the larger Web community.
The development of style-sheet mechanisms for XML is part of the ongoing XML activity of the World Wide Web Consortium. Both XSL and CSS are working to incorporate greater support for mathematics. Further, XSL can be used to provide a basic macro capability as well.
Macros, however, play a very important and useful role in encoding mathematical content and meaning. Moreover, it is difficult to devise a coherent, general macro system for MathML, because there are so many distinct applications for MathML macros. Therefore, a good direction for further work is the definition of a macro mechanism specifically tailored to MathML, in addition to participating in general ongoing activities in the areas of XML style sheets and macro facilities.
Some of the possible uses of MathML macros include:
<msubsup> element as `second derivative with respect to x of f'.
The set of elements and attributes specified in the MathML specification are necessary for rendering common mathematical expressions. It is recognized that not all mathematical notation is covered by this set of elements, that new notations are continually invented, and that sub-communities within mathematics often have specialized notations; and furthermore that the explicit extension of a standard is a necessarily slow and conservative process. This implies that the MathML standard could never explicitly cover all the presentational forms used by every sub-community of authors and readers of mathematics, much less encode all mathematical content.
In order to facilitate the use of MathML by the widest possible audience, and to enable its smooth evolution to encompass more notational forms and more mathematical content (perhaps eventually covered by explicit extensions to the standard), the set of tags and attributes is open-ended, in the sense described in this section.
MathML is described by an XML DTD, which necessarily limits the elements and attributes to those occurring in the DTD. Renderers desiring to accept non-standard elements or attributes, and authors desiring to include these in documents, should accept or produce documents that conform to an appropriately extended XML DTD that has the standard MathML DTD as a subset.
MathML-compliant renderers are allowed, but not required, to accept non-standard elements and attributes, and to render them in any way. If a renderer does not accept some or all non-standard tags, it is encouraged either to handle them as errors as described above for elements with the wrong number of arguments, or to render their arguments as if they were arguments to an mrow, in either case rendering all standard parts of the input in the normal way.
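The second option, rendering the arguments of an unrecognized element as if they were arguments to an mrow, can be sketched as a tree rewrite performed before rendering. The STANDARD set below is a deliberately small illustrative subset of element names, the element xbox is an invented non-standard tag, and discarding the attributes of the rewritten element is a choice made for this example only.

```python
import xml.etree.ElementTree as ET

# Illustrative subset of standard element names; a real renderer would list all.
STANDARD = {
    "math", "mrow", "mi", "mn", "mo", "mtext",
    "mfrac", "msup", "msub", "msubsup", "msqrt", "merror",
}


def demote_unknown(element: ET.Element) -> None:
    """Rewrite, in place, every non-standard element as a plain mrow."""
    for child in element:
        demote_unknown(child)
    if element.tag not in STANDARD:
        element.tag = "mrow"
        element.attrib.clear()  # assumption: drop attributes the renderer ignores


doc = ET.fromstring(
    '<math><xbox color="red"><mi>x</mi><mo>+</mo><mn>1</mn></xbox></math>'
)
demote_unknown(doc)
print(ET.tostring(doc, encoding="unicode"))
```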