6129
INFORMATIONAL

The 'application/tei+xml' Media Type

Authors: L. Romary, S. Lundberg
Date: February 2011
Working Group: NON WORKING GROUP
Stream: IETF

Abstract

This document defines the 'application/tei+xml' media type for markup languages defined in accordance with the Text Encoding and Interchange guidelines. This document is not an Internet Standards Track specification; it is published for informational purposes.


Internet Engineering Task Force (IETF)                         L. Romary
Request for Comments: 6129                      TEI Consortium and INRIA
Category: Informational                                      S. Lundberg
ISSN: 2070-1721                            The Royal Library, Copenhagen
                                                           February 2011


                  <span class="h1">The 'application/tei+xml' Media Type</span>

Abstract

   This document defines the 'application/tei+xml' media type for markup
   languages defined in accordance with the Text Encoding and
   Interchange guidelines.

Status of This Memo

   This document is not an Internet Standards Track specification; it is
   published for informational purposes.

   This document is a product of the Internet Engineering Task Force
   (IETF).  It represents the consensus of the IETF community.  It has
   received public review and has been approved for publication by the
   Internet Engineering Steering Group (IESG).  Not all documents
   approved by the IESG are a candidate for any level of Internet
   Standard; see <a href="./rfc5741#section-2">Section 2 of RFC 5741</a>.

   Information about the current status of this document, any errata,
   and how to provide feedback on it may be obtained at
   <a href="https://www.rfc-editor.org/info/rfc6129">http://www.rfc-editor.org/info/rfc6129</a>.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to <a href="https://www.rfc-editor.org/bcp/bcp78">BCP 78</a> and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (<a href="http://trustee.ietf.org/license-info">http://trustee.ietf.org/license-info</a>) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.






<span class="grey">Romary & Lundberg             Informational                     [Page 1]</span>

<span id="page-2" ></span>
<span class="grey"><a href="./rfc6129">RFC 6129</a>          The 'application/tei+xml' Media Type     February 2011</span>


Table of Contents

   <a href="#section-1">1</a>.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . . . <a href="#page-2">2</a>
   <a href="#section-2">2</a>.  Recognizing TEI Files . . . . . . . . . . . . . . . . . . . . . <a href="#page-2">2</a>
   <a href="#section-3">3</a>.  Fragment Identifier . . . . . . . . . . . . . . . . . . . . . . <a href="#page-4">4</a>
   <a href="#section-4">4</a>.  Security Considerations . . . . . . . . . . . . . . . . . . . . <a href="#page-4">4</a>
     <a href="#section-4.1">4.1</a>.  Harmful Content . . . . . . . . . . . . . . . . . . . . . . <a href="#page-4">4</a>
     <a href="#section-4.2">4.2</a>.  Intellectual Property Rights  . . . . . . . . . . . . . . . <a href="#page-4">4</a>
     <a href="#section-4.3">4.3</a>.  Authenticity and confidentiality  . . . . . . . . . . . . . <a href="#page-5">5</a>
   <a href="#section-5">5</a>.  IANA Considerations . . . . . . . . . . . . . . . . . . . . . . <a href="#page-5">5</a>
     <a href="#section-5.1">5.1</a>.  Registration of MIME Type 'application/tei+xml' . . . . . . <a href="#page-5">5</a>
   <a href="#section-6">6</a>.  References  . . . . . . . . . . . . . . . . . . . . . . . . . . <a href="#page-6">6</a>
     <a href="#section-6.1">6.1</a>.  Normative References  . . . . . . . . . . . . . . . . . . . <a href="#page-6">6</a>
     <a href="#section-6.2">6.2</a>.  Informative References  . . . . . . . . . . . . . . . . . . <a href="#page-7">7</a>

<span class="h2"><a class="selflink" id="section-1" href="#section-1">1</a>.  Introduction</span>

   Text Encoding and Interchange (TEI) is an international and
   interdisciplinary standard that is widely used by libraries, museums,
   publishers, and individual scholars to represent all kinds of textual
   material for online research and teaching [<a href="#ref-TEI" title=""TEI Guidelines"">TEI</a>].

   This document defines the 'application/tei+xml' media type in
   accordance with [<a href="./rfc3023" title=""XML Media Types"">RFC3023</a>] in order to enable generic processing of
   such documents on the Internet using eXtensible Markup Language (XML)
   [<a href="#ref-W3C.REC-xml-20081126">W3C.REC-xml-20081126</a>] technologies.

<span class="h2"><a class="selflink" id="section-2" href="#section-2">2</a>.  Recognizing TEI Files</span>

   TEI files are XML documents or fragments having the root element (as
   defined in [<a href="#ref-W3C.REC-xml-20081126">W3C.REC-xml-20081126</a>]) in a TEI namespace.  Each TEI namespace
   name is a Uniform Resource Identifier (URI) [<a href="./rfc3986" title=""Uniform Resource Identifier (URI): Generic Syntax"">RFC3986</a>]
   in accordance with [<a href="#ref-W3C.REC-xml-names-20091208">W3C.REC-xml-names-20091208</a>] and begins with
   <a href="http://www.tei-c.org/ns/">http://www.tei-c.org/ns/</a> followed by the version number of the
   namespace.  The current namespace is <a href="http://www.tei-c.org/ns/1.0">http://www.tei-c.org/ns/1.0</a>.

   The most common root element names for TEI documents are

      <TEI>

      <teiCorpus>










<span id="page-3" ></span>


   The teiCorpus documents provide the ability to bundle multiple
   documents into a single file.

   Examples:

      A document having <TEI> root element

               <?xml version="1.0" encoding="UTF-8" ?>
               <TEI xmlns="http://www.tei-c.org/ns/1.0">
                  <teiHeader>
                  ...
                  </teiHeader>
                  <text>
                  ...
                  </text>
               </TEI>

      A document having <teiCorpus> root element

               <?xml version="1.0" encoding="UTF-8" ?>
               <teiCorpus xmlns="http://www.tei-c.org/ns/1.0">
                  <teiHeader>
                  ...
                  </teiHeader>
                  <TEI>
                     <teiHeader>
                     ...
                     </teiHeader>
                     <text>
                     ...
                     </text>
                  </TEI>
                  <TEI>
                  ... second document ...
                  </TEI>
                  <TEI>
                  ... third document  ...
                  </TEI>
               </teiCorpus>

   TEI and teiCorpus files are often given the extensions .tei and
   .teiCorpus, respectively.  A third type of file is often given the
   suffix .odd.  An ODD ("One Document Does it All") file is a TEI
   XML document that includes schema fragments, prose documentation, and
   reference documentation.  It is used for the definition and
   documentation of XML-based languages, and primarily for the TEI
   Guidelines [<a href="#ref-ODD" title=""Getting Started with P5 ODDs"">ODD</a>].  In other words, ODD files do not differ from other
   TEI files in syntax, only in function.
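
   The recognition rule in this section can be sketched in Python as
   follows.  This is a minimal illustration, not part of the
   registration: the function name tei_root and the constant
   TEI_NS_PREFIX are hypothetical, and the sketch assumes the whole
   document is available in memory as bytes.  It inspects only the root
   element and tests whether its namespace name begins with
   http://www.tei-c.org/ns/.

```python
import io
import xml.etree.ElementTree as ET

# Prefix shared by all TEI namespace names, as described in Section 2.
TEI_NS_PREFIX = "http://www.tei-c.org/ns/"


def tei_root(data: bytes):
    """Return (namespace, local_name) of the root element if it is in a
    TEI namespace; return None otherwise or if the XML cannot be parsed."""
    try:
        # The first "start" event reported by iterparse is the root element.
        _event, root = next(ET.iterparse(io.BytesIO(data), events=("start",)))
    except (ET.ParseError, StopIteration):
        return None
    # ElementTree writes namespaced tags as "{namespace}local".
    if not root.tag.startswith("{"):
        return None
    namespace, _, local = root.tag[1:].partition("}")
    if namespace.startswith(TEI_NS_PREFIX):
        return namespace, local
    return None
```

   A caller might treat any non-None result as a TEI document and then
   branch on the local name (for example, TEI or teiCorpus).  File
   extensions are deliberately ignored here, since ODD files and files
   named '.xml' are distinguished by function or convention rather than
   by syntax.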



<span id="page-4" ></span>


<span class="h2"><a class="selflink" id="section-3" href="#section-3">3</a>.  Fragment Identifier</span>

   Documents having the media type 'application/tei+xml' use the
   fragment identifier notation as specified in [<a href="./rfc3023" title=""XML Media Types"">RFC3023</a>] for the media
   type 'application/xml'.

<span class="h2"><a class="selflink" id="section-4" href="#section-4">4</a>.  Security Considerations</span>

   An XML resource does not in itself compromise data security.  When
   such a resource is available on a network simply through the
   dereferencing of an Internationalized Resource Identifier (IRI)
   [<a href="./rfc3987" title=""Internationalized Resource Identifiers (IRIs)"">RFC3987</a>] or a URI, care must be taken to properly interpret the
   data to prevent unintended access.  Hence the security issues of
   <a href="./rfc3986#section-7">[RFC3986], Section 7</a>, apply.
   In addition, as this media type uses the "+xml" convention, it shares
   the same security considerations as described in <a href="./rfc3023">RFC 3023</a> <a href="./rfc3023#section-10">[RFC3023],
   Section 10</a>.  In general, security issues related to the use of XML in
   IETF protocols are treated in <a href="./rfc3470">RFC 3470</a> <a href="./rfc3470#section-7">[RFC3470], Section 7</a>.  This
   document does not duplicate that material, but reviews some aspects
   that are important for document-centric XML as applied to text encoding.

<span class="h3"><a class="selflink" id="section-4.1" href="#section-4.1">4.1</a>.  Harmful Content</span>

   Any application that accepts submitted TEI XML, or retrieves TEI XML,
   for processing has to be aware of the risks connected with the
   injection of harmful scripts and executable XML.  XML inclusion
   [<a href="#ref-W3C.REC-xinclude-20061115">W3C.REC-xinclude-20061115</a>] and the use of external entities are
   vulnerable to various forms of spoofing, and can also reveal aspects
   of a service in a way that may compromise its security.
   Vulnerabilities of these kinds are, however, application specific.
   The TEI namespaces do not contain such elements.
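
   One conservative way for an application to limit its exposure to
   the risks above is to refuse input that uses these mechanisms at
   all.  The Python sketch below is an assumption about one possible
   policy, not a requirement of this document: the names
   check_tei_is_inert and UnsafeTEIError are hypothetical.  It rejects
   any document carrying a DOCTYPE declaration (the only place where
   entities can be declared) or an XInclude include element.

```python
import xml.parsers.expat

# Namespace name of XInclude 1.0.
XINCLUDE_NS = "http://www.w3.org/2001/XInclude"


class UnsafeTEIError(ValueError):
    """Raised when a document uses a construct this policy refuses."""


def check_tei_is_inert(data: bytes) -> None:
    """Raise UnsafeTEIError if data declares a DOCTYPE or uses xi:include."""
    # With a namespace separator, expat reports element names as
    # "<namespace-URI> <local-name>".
    parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Entities (internal or external) can only be declared in a DOCTYPE.
        raise UnsafeTEIError("DOCTYPE declarations are refused")

    def on_start(name, attrs):
        if name == XINCLUDE_NS + " include":
            raise UnsafeTEIError("xi:include elements are refused")

    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = on_start
    parser.Parse(data, True)
```

   This policy is stricter than many deployments need.  An application
   that must accept DOCTYPE declarations or inclusions would instead
   have to control how, and from where, such references are resolved.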

<span class="h3"><a class="selflink" id="section-4.2" href="#section-4.2">4.2</a>.  Intellectual Property Rights</span>

   TEI documents often arise in digitization of cultural heritage
   materials.  Texts made accessible in TEI format may be unrestricted
   in the sense that their distribution may be unlimited by Digital
   Rights Management [<a href="#ref-DRM" title=""Digital rights management"">DRM</a>] or Intellectual Property Rights [<a href="#ref-IPR" title=""Intellectual property"">IPR</a>]
   constraints.  However, TEI documents are heterogeneous.  Some parts
   of a document may be unrestricted, whereas others, such as editorial
   text and annotations, may be subject to DRM restrictions.

   The TEI format provides means for highly granular attribution, down
   to the content of individual XML elements.  Software agents
   participating in the exchange or processing of TEI may be required to
   honour markup of this kind.  Even when there are no IPR constraints,
   intellectual property attribution alone requires that document users
   be able to tell the difference between content from different
   sources.



<span id="page-5" ></span>


<span class="h3"><a class="selflink" id="section-4.3" href="#section-4.3">4.3</a>.  Authenticity and confidentiality</span>

   Historical archival records are often encoded in TEI, and legal
   documents may be binding centuries after they were written.
   Digitization and encoding of legal texts may require technologies for
   assuring authenticity, such as cryptographic checksums and electronic
   signatures.
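
   As a minimal illustration of a cryptographic checksum, the Python
   sketch below computes and verifies a SHA-256 digest over the bytes
   of a TEI file.  The choice of SHA-256 and the function names
   tei_checksum and matches_checksum are assumptions made for this
   example; this document does not prescribe any particular algorithm.

```python
import hashlib
import hmac


def tei_checksum(data: bytes) -> str:
    """Return the SHA-256 digest of data as lowercase hexadecimal."""
    return hashlib.sha256(data).hexdigest()


def matches_checksum(data: bytes, expected_hex: str) -> bool:
    """Return True if data hashes to expected_hex (case-insensitive)."""
    # compare_digest avoids leaking, via timing, where the strings differ.
    return hmac.compare_digest(tei_checksum(data), expected_hex.lower())
```

   A checksum of this kind only detects alteration if the reference
   digest is itself stored or transmitted in a trustworthy way.
   Establishing who produced a text requires an electronic signature,
   which is outside the scope of this sketch.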

   Similarly, historical documents may in part or in their entirety be
   confidential.  This may be required by law or by terms and
   conditions, as in the case of text donated or deposited by
   private sources.  A text archive may need content filtering or
   cryptographic technologies to meet such requirements.

<span class="h2"><a class="selflink" id="section-5" href="#section-5">5</a>.  IANA Considerations</span>

<span class="h3"><a class="selflink" id="section-5.1" href="#section-5.1">5.1</a>.  Registration of MIME Type 'application/tei+xml'</span>

      MIME media type name: application

      MIME subtype name: tei+xml

      Required parameters: None

      Optional parameters: charset

         the parameter has identical semantics to the charset parameter
         of the "application/xml" media type as specified in <a href="./rfc3023">RFC 3023</a>
         [<a href="./rfc3023" title=""XML Media Types"">RFC3023</a>].

      Encoding considerations:

         Identical to those for 'application/xml'.  See <a href="./rfc3023">RFC 3023</a>
         <a href="./rfc3023#section-3.2">[RFC3023], Section 3.2</a>.

      Security considerations:

         See Security Considerations (<a href="#section-4">Section 4</a>) in this specification.

      Interoperability considerations:

         TEI documents are often given the extension '.xml', which is
         not uncommon for other XML document formats.

      Published specification:

         This media type registration is for TEI documents [<a href="#ref-TEI" title=""TEI Guidelines"">TEI</a>] as
         described here.  TEI syntax is defined in a schema [<a href="#ref-TEIschema">TEIschema</a>].



<span id="page-6" ></span>


      Applications which use this media type:

         There are currently no known applications using the media type
         'application/tei+xml'.

      Additional information:

         Magic number(s):

            There is no single initial octet sequence that is always
            present in TEI documents.

         File extension(s):

            Common extensions are '.tei', '.teiCorpus' and '.odd'.  See
            Recognizing TEI files (<a href="#section-2">Section 2</a>) in this specification.

         Macintosh File Type Code(s):

            TEXT

         Object Identifier(s) or OID(s):

            Not applicable
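
   A server or archive might apply the registration above when
   labelling files.  The Python sketch below is illustrative only: the
   function name content_type_for is hypothetical, the default charset
   of utf-8 is an assumption, and matching extensions without regard to
   case is a choice made for this example.  Files named '.xml' are
   deliberately left unlabelled, because that extension is shared with
   other XML formats and the content would have to be inspected
   instead.

```python
import os

TEI_MEDIA_TYPE = "application/tei+xml"

# Common extensions listed in the registration, stored in lowercase.
TEI_EXTENSIONS = {".tei", ".teicorpus", ".odd"}


def content_type_for(filename: str, charset: str = "utf-8"):
    """Return a Content-Type value for a TEI file name, or None if the
    extension does not identify the file as TEI."""
    extension = os.path.splitext(filename)[1].lower()
    if extension in TEI_EXTENSIONS:
        # charset is the only (optional) parameter of the media type.
        return f"{TEI_MEDIA_TYPE}; charset={charset}"
    return None
```

   For example, content_type_for("letters.teiCorpus") yields
   "application/tei+xml; charset=utf-8", while
   content_type_for("letters.xml") yields None.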

<span class="h2"><a class="selflink" id="section-6" href="#section-6">6</a>.  References</span>

<span class="h3"><a class="selflink" id="section-6.1" href="#section-6.1">6.1</a>.  Normative References</span>

   [<a id="ref-RFC3023">RFC3023</a>]  Murata, M., St. Laurent, S., and D. Kohn, "XML Media
              Types", <a href="./rfc3023">RFC 3023</a>, January 2001.

   [<a id="ref-RFC3470">RFC3470</a>]  Hollenbeck, S., Rose, M., and L. Masinter, "Guidelines for
              the Use of Extensible Markup Language (XML)
              within IETF Protocols", <a href="https://www.rfc-editor.org/bcp/bcp70">BCP 70</a>, <a href="./rfc3470">RFC 3470</a>, January 2003.

   [<a id="ref-RFC3986">RFC3986</a>]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifier (URI): Generic Syntax", STD 66,
              <a href="./rfc3986">RFC 3986</a>, January 2005.

   [<a id="ref-RFC3987">RFC3987</a>]  Duerst, M. and M. Suignard, "Internationalized Resource
              Identifiers (IRIs)", <a href="./rfc3987">RFC 3987</a>, January 2005.

   [<a id="ref-TEI">TEI</a>]      "TEI Guidelines", <<a href="http://www.tei-c.org/Vault/P5/1.8.0/doc/tei-p5-doc/en/html/">http://www.tei-c.org/Vault/P5/1.8.0/</a>
              <a href="http://www.tei-c.org/Vault/P5/1.8.0/doc/tei-p5-doc/en/html/">doc/tei-p5-doc/en/html/</a>>.






<span id="page-7" ></span>


   [<a id="ref-TEIschema">TEIschema</a>]
              "Schema generated from ODD source", <<a href="http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng">http://www.tei-c.org/</a>
              <a href="http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng">release/xml/tei/custom/schema/relaxng/tei_all.rng</a>>.

   [<a id="ref-W3C.REC-xml-20081126">W3C.REC-xml-20081126</a>]
              Paoli, J., Yergeau, F., Sperberg-McQueen, C., Maler, E.,
              and T. Bray, "Extensible Markup Language (XML) 1.0 (Fifth
              Edition)", World Wide Web Consortium Recommendation REC-
              xml-20081126, November 2008,
              <<a href="http://www.w3.org/TR/2008/REC-xml-20081126">http://www.w3.org/TR/2008/REC-xml-20081126</a>>.

   [<a id="ref-W3C.REC-xml-names-20091208">W3C.REC-xml-names-20091208</a>]
              Bray, T., Hollander, D., Layman, A., Tobin, R., and H.
              Thompson, "Namespaces in XML 1.0 (Third Edition)", World
              Wide Web Consortium Recommendation REC-xml-names-20091208,
              December 2009,
              <<a href="http://www.w3.org/TR/2009/REC-xml-names-20091208">http://www.w3.org/TR/2009/REC-xml-names-20091208</a>>.

<span class="h3"><a class="selflink" id="section-6.2" href="#section-6.2">6.2</a>.  Informative References</span>

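   [<a id="ref-DRM">DRM</a>]      "Digital rights management", <<a href="https://en.wikipedia.org/w/index.php?title=Digital_rights_management&oldid=412653591">http://en.wikipedia.org/w/</a>
              <a href="https://en.wikipedia.org/w/index.php?title=Digital_rights_management&oldid=412653591">index.php?title=Digital_rights_management&oldid=412653591</a>>.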

   [<a id="ref-IPR">IPR</a>]      "Intellectual property", <<a href="https://en.wikipedia.org/w/index.php?title=Intellectual_property&oldid=411690322">http://en.wikipedia.org/w/</a>
              <a href="https://en.wikipedia.org/w/index.php?title=Intellectual_property&oldid=411690322">index.php?title=Intellectual_property&oldid=411690322</a>>.

   [<a id="ref-ODD">ODD</a>]      "Getting Started with P5 ODDs",
              <<a href="http://www.tei-c.org/Guidelines/Customization/odds.xml">http://www.tei-c.org/Guidelines/Customization/odds.xml</a>>.

   [<a id="ref-W3C.REC-xinclude-20061115">W3C.REC-xinclude-20061115</a>]
              Marsh, J., Orchard, D., and D. Veillard, "XML Inclusions
              (XInclude) Version 1.0 (Second Edition)", World Wide Web
              Consortium Recommendation REC-xinclude-20061115,
              November 2006,
              <<a href="http://www.w3.org/TR/2006/REC-xinclude-20061115">http://www.w3.org/TR/2006/REC-xinclude-20061115</a>>.















<span id="page-8" ></span>


Authors' Addresses

   Laurent Romary
   TEI Consortium and INRIA

   EMail: [email protected]
   URI:   <a href="http://www.tei-c.org/">http://www.tei-c.org/</a>


   Sigfrid Lundberg
   The Royal Library, Copenhagen
   Postbox 2149
   1016 Koebenhavn K
   Denmark

   EMail: [email protected]
   URI:   <a href="http://sigfrid-lundberg.se/">http://sigfrid-lundberg.se/</a>


































