MathML

Mathematical Markup Language
Abbreviation: MathML
Native name: Mathematical Markup Language; ISO/IEC 40314 [1]
Status: W3C Recommendation [2]
First published: April 1998
Latest version: 3.0, April 10, 2014 [2]
Organization: W3C, ISO, IEC [1]
Editors: David Carlisle [2], Patrick Ion [2], Robert Miner [2], Frédéric Wang [3]
Principal authors: Ron Ausbrooks, Stephen Buswell, David Carlisle, Giorgi Chavchanidze, Stéphane Dalmas, Stan Devitt, Angel Diaz, Sam Dooley, Roger Hunter, Patrick Ion, Michael Kohlhase, Azzeddine Lazrek, Paul Libbrecht, Bruce Miller, Robert Miner, Chris Rowley, Murray Sargent, Bruce Smith, Neil Soiffer, Robert Sutor, Stephen Watt [2]
Base standards: XML
Related standards: OpenMath, Office Open XML, OMDoc

Mathematical Markup Language (MathML) is an application of XML for describing mathematical notation and capturing both its structure and its content; it is one of several mathematical markup languages. Its aim is to integrate mathematical formulae natively into World Wide Web pages and other documents. It is part of HTML5 and has been standardised by ISO/IEC since 2015. [1]

History

Following some experiments in the Arena browser based on proposals for mathematical markup in HTML, [4] MathML 1 was released as a W3C recommendation in April 1998 as the first XML language to be recommended by the W3C. Version 1.01 of the format was released in July 1999 and version 2.0 appeared in February 2001. Implementations of the specification appeared in Amaya 1.1, Mozilla 1.0 and Opera 9.5. [5] [6] In October 2003, the second edition of MathML Version 2.0 was published as the final release by the W3C Math Working Group.

MathML was originally designed before the finalization of XML namespaces. However, it was assigned a namespace immediately after the Namespace Recommendation was completed, and for XML use, the elements should be in the namespace with namespace URL http://www.w3.org/1998/Math/MathML. When MathML is used in HTML (as opposed to XML) this namespace is automatically inferred by the HTML parser and need not be specified in the document. [7]
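The effect of the namespace on XML processing can be illustrated with a minimal Python sketch. It assumes the standard library's `xml.etree.ElementTree` parser; the sample formula and the prefix `m` are chosen here only for illustration.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# A small MathML fragment that declares the MathML namespace as its default.
source = f'<math xmlns="{MATHML_NS}"><mi>x</mi><mo>+</mo><mn>1</mn></math>'
root = ET.fromstring(source)

# ElementTree records the namespace in every tag as "{namespace}localname".
tags = [element.tag for element in root.iter()]

# Namespace-aware lookup with a prefix map; the prefix "m" is arbitrary.
identifiers = [mi.text for mi in root.findall("m:mi", {"m": MATHML_NS})]

print(tags[0])
print(identifiers)
```

A lookup for the bare name `mi` without the namespace would find nothing here, which is why XML tooling needs the namespace even though an HTML parser infers it.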

MathML version 3

Version 3 of the MathML specification was released as a W3C recommendation on 20 October 2010. The recommendation "A MathML for CSS Profile" followed on 7 June 2011; [8] it defines a subset of MathML suitable for CSS formatting. Another subset, Strict Content MathML, gives Content MathML a uniform structure and is designed to be compatible with OpenMath. Other content elements are defined in terms of a transformation to the strict subset. New content elements include <bind>, which associates bound variables (<bvar>) with expressions, for example a summation index. The new <share> element allows structure sharing. [9]

The development of MathML 3.0 went through a number of stages. In June 2006, the W3C rechartered the MathML Working Group to produce a MathML 3 Recommendation by February 2008, and in November 2008 it extended the charter to April 2010. A sixth Working Draft of the MathML 3 revision was published in June 2009. On 10 August 2010 version 3 graduated from draft to "Proposed Recommendation". [9] An implementation of MathML 2 landed in WebKit around the same time, [10] with a Chromium implementation following a couple of years later, [11] although that implementation was removed from Chromium after less than a year. [12]

The Second Edition of MathML 3.0 was published as a W3C Recommendation on 10 April 2014. [2] The specification was approved as an ISO/IEC international standard 40314:2015 on 23 June 2015. [13] Also in 2015, the MathML Association was founded to support the adoption of the MathML standard. [14] At that time, according to a member of the MathJax team, none of the major browser makers paid any of their developers for MathML-rendering work; whatever support existed was overwhelmingly the result of unpaid volunteer time. [15]

MathML Core

In August 2021, a new specification called MathML Core was published, described as the “core subset of Mathematical Markup Language, or MathML, that is suitable for browser implementation.” [16] MathML Core set itself apart from MathML 3.0 by including detailed rendering rules and integration with CSS, automated browser support testing resources, and focusing on a fundamental subset of MathML. An implementation was added to Chromium at the beginning of 2023. [17]

Presentation and semantics

Generic MathML
Filename extension: .mml [18] [19]
Internet media type: application/mathml+xml [18]
Type code: MML
Uniform Type Identifier (UTI): public.mathml
UTI conformation: public.xml
Developed by: World Wide Web Consortium
Type of format: Mathematical markup language
Extended from: XML
Open format?: Yes

MathML deals not only with the presentation but also with the meaning of formula components (the latter part of MathML is known as "Content MathML"). Because the meaning of the equation is preserved separately from the presentation, how the content is communicated can be left up to the user. For example, web pages with embedded MathML can be viewed as normal web pages in many browsers, but visually impaired users can also have the same MathML read to them by a screen reader (e.g. VoiceOver in Safari). JAWS from version 16 onward supports MathML voicing as well as braille output. [20]

The quality of MathML rendering in a browser depends on the installed fonts. The STIX Fonts project has released a comprehensive set of mathematical fonts under an open license. The Cambria Math font supplied with Microsoft Windows had slightly more limited support. [21]

A valid MathML document typically consists of the XML declaration, DOCTYPE declaration, and document element. The document body then contains MathML expressions, which appear in <math> elements as needed in the document. Often, MathML will be embedded in more general documents, such as HTML, DocBook, or other XML-based formats.

Presentation MathML

Internet media type: application/mathml-presentation+xml [18]
Type code: MMLp
Uniform Type Identifier (UTI): public.mathml.presentation
UTI conformation: public.mathml
Extended from: Generic MathML

Presentation MathML focuses on the display of an equation, and has about 30 elements. The elements' names all begin with m. A Presentation MathML expression is built up out of tokens that are combined using higher-level elements, which control their layout. Finer details of presentation are affected by close to 50 attributes.

Token elements generally contain only characters (not other elements). They include <mi> for identifiers, <mo> for operators, <mn> for numbers, and <mtext> for text.

Note, however, that these token elements may be used as extension points, allowing markup in host languages. MathML in HTML5 allows most inline HTML markup in mtext, and <mtext><b>non</b>zero</mtext> is conforming, with the HTML markup being used within the MathML to mark up the embedded text (making the first word bold in this example).

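Tokens are combined using layout elements, which generally contain only other elements. They include <mrow> for a horizontal row, <msup> for superscripts, <msub> for subscripts, <mfrac> for fractions, and <msqrt> for square roots.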

As usual in HTML and XML, many entities are available for specifying special symbols by name, such as &pi; and &RightArrow;. Entities also exist for operators that are normally invisible, such as &InvisibleTimes; (or the shorthand &it;) for implicit multiplication. Others include &ApplyFunction; (&af;) for function application and &InvisibleComma; (&ic;) for separating indices.

The full specification of MathML entities [22] is closely coordinated with the corresponding specifications for use with HTML and XML in general. [23]
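Because the entity names are shared with HTML, they can be resolved to code points with ordinary HTML tooling. The following minimal Python sketch assumes the standard library's `html.unescape`, which uses the HTML5 named character references; it is only an illustration, not part of MathML itself.

```python
import html

# Entity names mentioned above; &it; is the short form of &InvisibleTimes;.
names = ["&pi;", "&RightArrow;", "&InvisibleTimes;", "&it;"]

# Map each entity to the Unicode code point it stands for.
code_points = {name: ord(html.unescape(name)) for name in names}

for name, code_point in code_points.items():
    print(f"{name:18} U+{code_point:04X}")
```

The two spellings of the invisible multiplication operator resolve to the same character, U+2062, which has no visible glyph of its own.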

For example, the expression ax² + bx + c requires two layout elements: one to create the overall horizontal row and one for the superscripted exponent. However, the individual tokens also have to be identified as identifiers (<mi>), operators (<mo>), or numbers (<mn>). Adding the token markup, the full form ends up as

<mrow>
  <mi>a</mi>
  <mo>&InvisibleTimes;</mo>
  <msup>
    <mi>x</mi>
    <mn>2</mn>
  </msup>
  <mo>+</mo>
  <mi>b</mi>
  <mo>&InvisibleTimes;</mo>
  <mi>x</mi>
  <mo>+</mo>
  <mi>c</mi>
</mrow>

A complete document that consists of just the MathML example above is shown here:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE math PUBLIC "-//W3C//DTD MathML 2.0//EN"
  "http://www.w3.org/Math/DTD/mathml2/mathml2.dtd">
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <mrow>
    <mi>a</mi>
    <mo>&InvisibleTimes;</mo>
    <msup>
      <mi>x</mi>
      <mn>2</mn>
    </msup>
    <mo>+</mo>
    <mi>b</mi>
    <mo>&InvisibleTimes;</mo>
    <mi>x</mi>
    <mo>+</mo>
    <mi>c</mi>
  </mrow>
</math>
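Since MathML is usually generated by software rather than typed by hand, the same token-and-layout tree can be assembled programmatically. The following minimal Python sketch uses the standard library's `xml.etree.ElementTree`; the helper name `token` is hypothetical, and writing `xmlns` as a plain attribute is a simplification chosen to keep the example short.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
INVISIBLE_TIMES = "\u2062"  # the character named by &InvisibleTimes;


def token(parent, tag, text):
    """Append a token element such as <mi>, <mo> or <mn> holding `text`."""
    child = ET.SubElement(parent, tag)
    child.text = text
    return child


# Simplification: the namespace is written as an ordinary attribute.
math = ET.Element("math", {"xmlns": MATHML_NS})
row = ET.SubElement(math, "mrow")      # layout element: horizontal row

token(row, "mi", "a")
token(row, "mo", INVISIBLE_TIMES)
power = ET.SubElement(row, "msup")     # layout element: base, then exponent
token(power, "mi", "x")
token(power, "mn", "2")
token(row, "mo", "+")
token(row, "mi", "b")
token(row, "mo", INVISIBLE_TIMES)
token(row, "mi", "x")
token(row, "mo", "+")
token(row, "mi", "c")

markup = ET.tostring(math, encoding="unicode")
print(markup)
```

The serialized output contains the invisible multiplication operator as the literal character U+2062 instead of a named entity, so it can be parsed without the MathML DTD.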

Content MathML

Internet media type: application/mathml-content+xml
Type code: MMLc
Uniform Type Identifier (UTI): public.mathml.content
UTI conformation: public.mathml
Extended from: Generic MathML

Content MathML focuses on the semantics, or meaning, of the expression rather than its layout. Central to Content MathML is the <apply> element that represents function application. The function being applied is the first child element under <apply>, and its operands or parameters are the remaining child elements. Content MathML uses only a few attributes.

Tokens such as identifiers and numbers are individually marked up, much as for Presentation MathML, but with elements such as <ci> and <cn>. Rather than being merely another type of token, operators are represented by specific elements, whose mathematical semantics are known to MathML: <times>, <power>, etc. There are over a hundred different elements for different functions and operators. [24]

For example, <apply><sin/><ci>x</ci></apply> represents sin(x) and <apply><plus/><ci>x</ci><cn>5</cn></apply> represents x + 5. The elements representing operators and functions are empty elements, because their operands are the other elements under the containing <apply>.
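Because <apply> encodes function application explicitly, a small Content MathML tree can be evaluated by recursion. The sketch below is a minimal Python illustration that assumes un-namespaced input and supports only four operators; the table `OPERATORS` and the function `evaluate` are hypothetical names, not part of any MathML library.

```python
import math
import operator
import xml.etree.ElementTree as ET
from functools import reduce

# Hypothetical operator table covering only a handful of Content MathML elements.
OPERATORS = {
    "plus": lambda *args: sum(args),
    "times": lambda *args: reduce(operator.mul, args, 1),
    "power": operator.pow,
    "sin": math.sin,
}


def evaluate(node, variables):
    """Evaluate a small Content MathML tree against a dict of variable values."""
    if node.tag == "cn":                      # number token
        return float(node.text.strip())
    if node.tag == "ci":                      # identifier token
        return variables[node.text.strip()]
    if node.tag == "apply":                   # first child is the function
        function, *operands = list(node)
        arguments = [evaluate(operand, variables) for operand in operands]
        return OPERATORS[function.tag](*arguments)
    raise ValueError(f"unsupported element: {node.tag}")


expression = ET.fromstring("<apply><plus/><ci>x</ci><cn>5</cn></apply>")
result = evaluate(expression, {"x": 2.0})
print(result)
```

The evaluator never needs to parse operator precedence, because the nesting of <apply> elements already fixes the order of operations.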

The expression ax² + bx + c could be represented as

<math>
  <apply>
    <plus/>
    <apply>
      <times/>
      <ci>a</ci>
      <apply>
        <power/>
        <ci>x</ci>
        <cn>2</cn>
      </apply>
    </apply>
    <apply>
      <times/>
      <ci>b</ci>
      <ci>x</ci>
    </apply>
    <ci>c</ci>
  </apply>
</math>

Content MathML is nearly isomorphic to expressions in a functional language such as Scheme and other dialects of Lisp. <apply>...</apply> amounts to Scheme's (...), and the many operator and function elements amount to Scheme functions. With this trivial literal transformation, plus un-tagging the individual tokens, the example above becomes:

(plus (times a (power x 2)) (times b x) c)

This reflects the long-known close relationship between XML element structures and Lisp or Scheme S-expressions. [25] [26]
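The literal transformation described above is short enough to sketch in Python. This is a minimal illustration that assumes un-namespaced Content MathML restricted to <apply>, <ci>, <cn>, and empty operator elements; the function name `to_sexpr` is hypothetical.

```python
import xml.etree.ElementTree as ET

CONTENT_MATHML = """
<math>
  <apply><plus/>
    <apply><times/><ci>a</ci><apply><power/><ci>x</ci><cn>2</cn></apply></apply>
    <apply><times/><ci>b</ci><ci>x</ci></apply>
    <ci>c</ci>
  </apply>
</math>
"""


def to_sexpr(node):
    """Translate a Content MathML element into a Scheme-style S-expression."""
    if node.tag == "apply":
        # <apply>...</apply> becomes a parenthesized list of its children.
        return "(" + " ".join(to_sexpr(child) for child in node) + ")"
    if node.tag in ("ci", "cn"):
        # Tokens are un-tagged: only their text survives.
        return node.text.strip()
    # Empty operator elements such as <plus/> become bare symbols.
    return node.tag


root = ET.fromstring(CONTENT_MATHML)
sexpr = to_sexpr(root[0])
print(sexpr)  # (plus (times a (power x 2)) (times b x) c)
```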

Wikidata annotation in Content MathML

According to the OM Society, [27] OpenMath Content Dictionaries can be employed as collections of symbols and identifiers with declarations of their semantics: names, descriptions and rules. A 2018 paper presented at the SIGIR conference [28] proposed that the semantic knowledge base Wikidata could be used as an OpenMath Content Dictionary to link semantic elements of a mathematical formula to unique and language-independent Wikidata items.

Example

The well-known quadratic formula could be represented in Presentation MathML as an expression tree made up from layout elements like <mfrac> or <msqrt>:

<math mode="display" xmlns="http://www.w3.org/1998/Math/MathML">
  <semantics>
    <mrow>
      <mi>x</mi>
      <mo>=</mo>
      <mfrac>
        <mrow>
          <mo form="prefix">&minus;</mo>
          <mi>b</mi>
          <mo>&pm;</mo>
          <msqrt>
            <msup><mi>b</mi><mn>2</mn></msup>
            <mo>&minus;</mo>
            <mn>4</mn>
            <mo>&it;</mo>
            <mi>a</mi>
            <mo>&it;</mo>
            <mi>c</mi>
          </msqrt>
        </mrow>
        <mrow>
          <mn>2</mn>
          <mo>&it;</mo>
          <mi>a</mi>
        </mrow>
      </mfrac>
    </mrow>
    <annotation encoding="application/x-tex"><!-- TeX -->x=\frac{-b\pm\sqrt{b^2-4ac}}{2a}</annotation>
    <annotation encoding="StarMath 5.0">x={-b plusminus sqrt{b^2-4ac}} over {2a}</annotation>
    <!-- More annotations can be written: application/x-troff-eqn for eqn, application/x-asciimath for AsciiMath... -->
    <!-- Semantic MathML goes under <annotation-xml encoding="MathML-Content">. -->
  </semantics>
</math>

This example uses the <annotation> element, which can be used to embed a semantic annotation in a non-XML format, for example to store the formula in the format used by an equation editor such as StarMath, or as markup in LaTeX syntax. The encoding attribute is usually a MIME type, although most of the equation encodings do not have such a registration; freeform text may be used in such cases.
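A consumer that prefers one of the annotated formats can look it up by its encoding value. The following minimal Python sketch assumes the standard library's `xml.etree.ElementTree` and uses a deliberately small formula (a over b) whose markup contains no named entities, so that it parses without a DTD; the sample data is illustrative only.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Illustrative input: a fraction a/b with two non-XML annotations.
SOURCE = r"""
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <semantics>
    <mfrac><mi>a</mi><mi>b</mi></mfrac>
    <annotation encoding="application/x-tex">\frac{a}{b}</annotation>
    <annotation encoding="StarMath 5.0">a over b</annotation>
  </semantics>
</math>
"""

root = ET.fromstring(SOURCE)

# Collect every <annotation>, keyed by the value of its encoding attribute.
annotations = {
    annotation.get("encoding"): annotation.text.strip()
    for annotation in root.iter(f"{{{MATHML_NS}}}annotation")
}

tex = annotations.get("application/x-tex")
print(tex)
```

Using `dict.get` keeps the lookup safe when a document carries no annotation in the requested encoding, in which case the result is `None`.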

Although less compact than other formats, the XML structuring of MathML makes its content widely usable and accessible, allows near-instant display in applications such as web browsers, and facilitates an interpretation of its meaning in mathematical software products. MathML is not intended to be written or edited directly by humans. [29]

Embedding MathML in HTML/XHTML files

MathML, being XML, can be embedded inside other XML files such as XHTML files using XML namespaces.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1 plus MathML 2.0//EN"
  "http://www.w3.org/Math/DTD/mathml2/xhtml-math11-f.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
  <head>
    <title>Example of MathML embedded in an XHTML file</title>
    <meta name="description" content="Example of MathML embedded in an XHTML file"/>
  </head>
  <body>
    <h1>Example of MathML embedded in an XHTML file</h1>
    <p>
      The area of a circle is
      <math xmlns="http://www.w3.org/1998/Math/MathML">
        <mi>&#x03C0;<!-- π --></mi>
        <mo>&#x2062;<!-- &InvisibleTimes; --></mo>
        <msup>
          <mi>r</mi>
          <mn>2</mn>
        </msup>
      </math>.
    </p>
  </body>
</html>

Figure: A rendering of the formula for the area of a circle in MathML+XHTML using Firefox 22 on Mac OS X.

Inline MathML is also supported in HTML5 files. There is no need to specify namespaces as there was in XHTML.

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Example of MathML embedded in an HTML5 file</title>
  </head>
  <body>
    <h1>Example of MathML embedded in an HTML5 file</h1>
    <p>
      The area of a circle is
      <math>
        <mi>&pi;</mi>
        <mo>&InvisibleTimes;</mo>
        <msup>
          <mi>r</mi>
          <mn>2</mn>
        </msup>
      </math>.
    </p>
  </body>
</html>

Other standards

Another standard, OpenMath, was designed more specifically for storing formulae semantically (largely by the same people who devised Content MathML) and can be used to complement MathML. OpenMath data can be embedded in MathML using the <annotation-xml encoding="OpenMath"> element. OpenMath content dictionaries can be used to define the meaning of <csymbol> elements. The following would define P₁(x) to be the first Legendre polynomial:

<apply>
  <csymbol encoding="OpenMath" definitionURL="http://www.openmath.org/cd/contrib/cd/orthpoly1.xhtml#legendreP">
    <msub>
      <mi>P</mi>
      <mn>1</mn>
    </msub>
  </csymbol>
  <ci>x</ci>
</apply>

The OMDoc format was created for marking up mathematical structures larger than formulae, from statements like definitions, theorems, proofs, and examples, to complete theories and even entire textbooks. Formulae in OMDoc documents can be written either in Content MathML or in OpenMath; for presentation, they are converted to Presentation MathML.

The ISO/IEC standard Office Open XML (OOXML) defines a different XML math syntax, derived from Microsoft Office products. However, it is partially compatible with MathML [30] through XSL Transformations.

References

  1. "ISO - ISO/IEC 40314:2016 - Information technology — Mathematical Markup Language (MathML) Version 3.0 2nd Edition". ISO. 2016. Retrieved 6 April 2021.
  2. Carlisle, David; Ion, Patrick; Miner, Robert, eds. (10 April 2014). "Mathematical Markup Language (MathML) Version 3.0 2nd Edition". W3C. Retrieved 6 April 2021.
  3. Carlisle, David; Wang, Frédéric, eds. (4 May 2022). "MathML Core". W3C. Retrieved 3 March 2023.
  4. "12 - Mathematical Equations". 8 November 1993.
  5. "Mozilla 1.0 Released!". 5 June 2002. Retrieved 3 March 2023.
  6. McCathieNevile, Charles (27 September 2007). "Can Kestrels do Math? MathML support in Opera Kestrel". Opera.
  7. "HTML Living Standard". Retrieved 3 March 2023.
  8. "A MathML for CSS Profile". W3C. 7 June 2011. Retrieved 25 July 2013.
  9. "Mathematical Markup Language Version 3.0 W3C Recommendation". W3.org. Retrieved 9 May 2012.
  10. Dakin, Beth (17 August 2010). "Announcing…MathML!". Retrieved 3 March 2023.
  11. "A web developer's guide to the latest Chrome Beta". 8 November 2012. Retrieved 3 March 2023.
  12. "Comment 32 on Issue 152430: Enabling support for MathML". 5 February 2013. Retrieved 3 March 2023.
  13. "W3C MathML 3.0 Approved as ISO/IEC International Standard". W3.org. 23 June 2015. Retrieved 12 June 2015.
  14. Ginev, Deyan; Kohlhase, Michael; Schubotz, Moritz; Silva, Raniere; Wang, Frédéric. "Mondial Association for Tools Handling MathML". Retrieved 20 June 2016.
  15. Krautzberger, Peter (1 November 2013). "MathML forges on". oreilly.com. Retrieved 22 November 2014.
  16. "MathML Core". 4 May 2022. Retrieved 3 March 2023.
  17. "Igalia Brings MathML Back to Chromium". Igalia News. 10 January 2023. Retrieved 10 January 2023.
  18. Libbrecht, Paul (1 September 2023). "MathML Media-type Declarations". W3C. Retrieved 2 September 2023.
  19. "The MathML Interface". W3C. 21 October 2003. Retrieved 2 September 2023. The W3C Math Working Group recommends the standard file extension .mml used for browser registry.
  20. "JAWS Version 16". Retrieved 7 September 2023.
  21. Vismor, Timothy. "Viewing Mathematics on the Internet". Retrieved 13 April 2011.
  22. "Characters, Entities and Fonts". W3.org.
  23. "XML Entity Definitions for Characters (2nd Edition)". W3.org.
  24. "Content Markup". W3.org.
  25. DeRose, Steven. The SGML FAQ Book: Understanding the Relationship of SGML and XML. Kluwer Academic Publishers, 1997. ISBN 978-0-7923-9943-8.
  26. Canonical S-expressions#cite note-0
  27. "OpenMath Home · OpenMath". www.openmath.org.
  28. Schubotz, Moritz; Scharpf, Philipp; Gipp, Bela (2018). "Representing Mathematical Formulae in Content MathML using Wikidata" (PDF). BIRNDL@SIGIR.
  29. Buswell, Steven; Devitt, Stan; Diaz, Angel; et al. (7 July 1999). "Mathematical Markup Language (MathML) 1.01 Specification (Abstract)". Retrieved 26 September 2006. While MathML is human-readable it is anticipated that, in all but the simplest cases, authors will use equation editors, conversion programs, and other specialized software tools to generate MathML.
  30. Carlisle, David (10 April 2007). "XHTML and MathML from Office 2007". Blogspot. Retrieved 20 September 2007.
