Misplaced Pages

EPUB

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.

EPUB is an e-book file format that uses the ".epub" file extension. The term is short for electronic publication and is sometimes stylized as ePUB. EPUB is supported by many e-readers, and compatible software is available for most smartphones, tablets, and computers. EPUB is a technical standard published by the International Digital Publishing Forum (IDPF). It became an official standard of the IDPF in September 2007, superseding the older Open eBook (OEB) standard.


66-481: The Book Industry Study Group endorses EPUB 3 as the format of choice for packaging content and has stated that the global book publishing industry should rally around a single standard. Technically, a file in the EPUB format is a ZIP archive file consisting of XHTML files carrying the content, along with images and other supporting files. EPUB is the most widely supported vendor-independent XML -based e-book format; it

132-510: A few publishers and manufacturers met informally with representatives of several trade associations to discuss the urgent need to improve the industry's research capability. Once begun, this small group invited others to join in sponsoring a seminal study of book industry information needs on which a future program could be based. BISG was incorporated as a not-for-profit corporation in February 1976 and its Report on Book Industry Information Needs

198-519: A file called rights.xml within the META-INF directory at the root level of the ZIP container. EPUB is widely used on software readers such as Google Play Books on Android and Apple Books on iOS and macOS, and on Amazon Kindle's e-readers, but not by associated apps for other platforms. iBooks also supports the proprietary iBook format, which is based on the EPUB format but depends upon code from

264-427: A format for DRM. The EPUB specification does not enforce or suggest a particular DRM scheme. This could affect the level of support for various DRM systems on devices and the portability of purchased e-books. Consequently, such DRM incompatibility may segment the EPUB format along the lines of DRM systems, undermining the advantages of a single standard format and confusing the consumer. DRMed EPUB files must contain

330-416: A language tag that does not necessarily serve to identify a language. One use for extensions is to encode locale information, such as calendar and currency. Extension subtags are composed of multiple hyphen-separated character strings, starting with a single character (other than x ), called a singleton . Each extension is described in its own IETF RFC , which identifies a Registration Authority to manage

396-531: A language tag. For example, es is preferred over es-Latn , as Spanish is fully expected to be written in the Latin script; ja is preferred over ja-JP , as Japanese as used in Japan does not differ markedly from Japanese as used elsewhere. Not all linguistic regions can be represented with a valid region subtag: the subnational regional dialects of a primary language are registered as variant subtags. For example,

462-532: A maintenance update (2.0.1) intended to clarify and correct errata in the specifications being approved in September 2010. EPUB version 2.0.1 consists of three specifications: EPUB internally uses XHTML or DTBook (an XML standard provided by the DAISY Consortium) to represent the text and structure of the content document, and a subset of CSS to provide layout and formatting. XML is used to create

528-567: A monthly basis. BISG Committees: Metadata and Identification Committee, Rights Committee, Subject Codes Committee, Supply Chain Committee, Workflow Committee. BISAC Working Groups: Rights Taxonomy Working Group. IETF language tag. An IETF BCP 47 language tag is a standardized code that is used to identify human languages on the Internet. The tag structure has been standardized by

594-427: A more structured format for language tags, added the use of ISO 15924 four-letter script codes and UN M.49 three-digit geographical region codes, and replaced the old registry of tags with a new registry of subtags. The small number of previously defined tags that did not conform to the new structure were grandfathered in order to maintain compatibility with RFC 3066. The current version of the specification, RFC 5646,

660-508: A primary language subtag for a collection may be ambiguous as to whether the collection is intended to be inclusive or exclusive. ISO 639-5 does not define precisely which languages are members of these collections; only the hierarchical classification of collections is defined, using the inclusive definition of these collections. Because of this, RFC 5646 does not recommend the use of subtags for language collections for most applications, although they are still preferred over subtags whose meaning

726-403: A property named "Suppress-Script" which indicates the cases where a single script can usually be assumed by default for the language, even if it can be written with another script. When this is the case, it is preferable to omit the script subtag, to improve the likelihood of successful matching. A different script subtag can still be appended to make the distinction when necessary. For example, yi


792-473: A root element package and four child elements: metadata, manifest, spine, and guide. Furthermore, the package node must have the unique-identifier attribute. The .opf file's mimetype is application/oebps-package+xml. The metadata element contains all the metadata information for a particular EPUB file. Three metadata tags are required (though many more are available): title, language, and identifier. title contains

858-597: A script subtag instead of a region subtag; in this example, zh-Hans and zh-Hant should be used instead of zh-CN/zh-SG/zh-MY and zh-TW/zh-HK/zh-MO . When a distinct language subtag exists for a language that could be considered a regional variety, it is often preferable to use the more specific subtag instead of a language-region combination. For example, ar-DZ ( Arabic as used in Algeria ) may be better expressed as arq for Algerian Spoken Arabic . Disagreements about language identification may extend to BCP 47 and to

924-480: A security warning: Authors need to be aware that scripting in an EPUB Publication can create security considerations that are different from scripting within a Web browser. For example, typical same-origin policies are not applicable to content that has been downloaded to a user's local system. Therefore, it is strongly encouraged that scripting be limited to container constrained contexts. Book Industry Study Group The Book Industry Study Group, Inc. ( BISG )

990-514: A set of four specifications: The EPUB 3.0 format was intended to address the following criticisms: On June 26, 2014, the IDPF published EPUB 3.0.1 as a final Recommended Specification. In November 2014, EPUB 3.0 was published by the ISO / IEC as ISO/IEC TS 30135 (parts 1–7). In January 2020, EPUB 3.0.1 was published by the ISO / IEC as ISO/IEC 23736 (parts 1–6). EPUB 3.2 was announced in 2018, and

1056-719: A subtag derived from a code assigned by ISO 639 , ISO 15924 , ISO 3166 , or UN M49 remains a valid (though deprecated) subtag even if the code is withdrawn from the corresponding core standard. If the standard later assigns a new meaning to the withdrawn code, the corresponding subtag will still retain its old meaning. This stability was introduced in RFC 4646. RFC 4646 defined the concept of an "extended language subtag" (sometimes referred to as extlang ), although no such subtags were registered at that time. RFC 5645 and RFC 5646 added primary language subtags corresponding to ISO 639-3 codes for all languages that did not already exist in

1122-433: Is application/xhtml+xml. Styling and layout are performed using a subset of CSS 2.0, referred to as OPS Style Sheets. This specialized syntax requires that reading systems support only a portion of CSS properties and adds a few custom properties. Custom properties include oeb-page-head, oeb-page-foot, and oeb-column-number. Font embedding can be accomplished using the @font-face rule, as well as including

1188-485: Is 3.2, effective May 8, 2019. The text of the format specification was reorganized and cleaned up; the format supports remotely hosted resources and new font formats (WOFF 2.0 and SFNT) and relies more directly on standard HTML and CSS. In May 2016, IDPF members approved a merger with the World Wide Web Consortium (W3C), "to fully align the publishing industry and core Web technology". EPUB 2.0 was approved in October 2007, with

1254-584: Is a U.S. trade association for policy , technical standards and research related to books and similar products. The mission of BISG is to simplify logistics for publishers, manufacturers, suppliers, wholesalers, retailers, librarians and others engaged in the business of print and electronic media. The Book Industry Study Group, Inc. (BISG) began at the annual conference of the Book Manufacturers Institute in November 1975. Here,

1320-531: Is a list of some of the more commonly used primary language subtags. The list represents only a small subset (less than 2 percent) of primary language subtags; for full information, the Language Subtag Registry should be consulted directly. Although some types of subtags are derived from ISO or UN core standards, they do not follow these standards absolutely, as this could lead to the meaning of language tags changing over time. In particular,

1386-518: Is called "extlang form" and is new in RFC 5646. Whole tags that were registered prior to RFC 4646 and are now classified as "grandfathered" or "redundant" (depending on whether they fit the new syntax) are deprecated in favor of the corresponding ISO 639-3–based language subtag, if one exists. To list a few examples, nan is preferred over zh-min-nan for Min Nan Chinese; hak is preferred over i-hak and zh-hakka for Hakka Chinese ; and ase


1452-570: Is even less specific, such as "Multiple languages" and "Undetermined". In contrast, the classification of individual languages within their macrolanguage is standardized, in both ISO 639-3 and the Language Subtag Registry. Script subtags were first added to the Language Subtag Registry when RFC 4646 was published, from the list of codes defined in ISO 15924 . They are encoded in the language tag after primary and extended language subtags, but before other types of subtag, including region and variant subtags. Some primary language subtags are defined with

1518-478: Is maintained because it is significant. ISO 15924 includes some codes for script variants (for example, Hans and Hant for simplified and traditional forms of Chinese characters) that are unified within Unicode and ISO/IEC 10646 . These script variants are most often encoded for bibliographic purposes, but are not always significant from a linguistic point of view (for example, Latf and Latg script codes for

1584-460: Is named "Afro-Asiatic languages" and includes all such languages. ISO 639-2 changed the exclusive names in 2009 to match the inclusive ISO 639-5 names. To avoid breaking implementations that may still depend on the older (exclusive) definition of these collections, ISO 639-5 defines a grouping type attribute for all collections that were already encoded in ISO 639-2 (such grouping type is not defined for

1650-400: Is preferred over sgn-US for American Sign Language . Windows Vista and later versions of Microsoft Windows have RFC 4646 support. ISO 639-5 defines language collections with alpha-3 codes in a different way than they were initially encoded in ISO 639-2 (including one code already present in ISO 639-1, Bihari coded inclusively as bh in ISO 639-1 and bih in ISO 639-2). Specifically,

1716-577: Is preferred over yi-Hebr in most contexts, because the Hebrew script subtag is assumed for the Yiddish language. As another example, zh-Hans-SG may be considered equivalent to zh-Hans , because the region code is probably not significant; the written form of Chinese used in Singapore uses the same simplified Chinese characters as in other countries where Chinese is written. However, the script subtag
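The Suppress-Script preference described in this article can be illustrated with a short Python sketch. The `SUPPRESS_SCRIPT` table and the function name `drop_default_script` are hypothetical and hold only the two languages the article mentions (Yiddish and Spanish); a real implementation would read the Suppress-Script field from the IANA Language Subtag Registry instead.

```python
# Hypothetical mini-table: language subtag -> script assumed by default.
# Only the two examples from the article are included.
SUPPRESS_SCRIPT = {"yi": "Hebr", "es": "Latn"}


def drop_default_script(tag: str, table: dict = SUPPRESS_SCRIPT) -> str:
    """Remove the script subtag when it equals the language's default script."""
    parts = tag.split("-")
    default = table.get(parts[0].lower())
    if default is None:
        return tag  # no default script known for this language
    for i in range(1, len(parts)):
        sub = parts[i]
        if len(sub) == 1:
            break  # singleton: an extension or private-use section starts here
        if len(sub) == 4 and sub.isalpha():
            # First four-letter alphabetic subtag is the script subtag.
            if sub.lower() == default.lower():
                return "-".join(parts[:i] + parts[i + 1:])
            break  # a different script is significant, so keep it
    return tag


print(drop_default_script("yi-Hebr"))     # yi
print(drop_default_script("es-Latn-MX"))  # es-MX
print(drop_default_script("yi-Latn"))     # yi-Latn
```

The sketch only shortens a tag. It does not validate the subtags or decide whether a region subtag is significant.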

1782-654: Is supported by almost all hardware readers and many software readers and mobile apps . A successor to the Open eBook Publication Structure , EPUB 2.0 was approved in October 2007, with a maintenance update (2.0.1) approved in September 2010. The EPUB 3.0 specification became effective in October 2011, superseded by a minor maintenance update (3.0.1) in June 2014. New major features include support for precise layout or specialized formatting (Fixed Layout Documents), such as for comic books, and MathML support. The current version of EPUB

1848-541: Is the text that appears in the table of contents generated by reading systems that use the .ncx. navPoint 's content element points to a content document listed in the manifest and can also include an element identifier (e.g. #section1 ). A description of certain exceptions to the NCX specification as used in EPUB is in Section 2.4.1 of the specification. The complete specification for NCX can be found in Section 8 of

1914-403: Is to "[define] the mechanism by which the various components of an OPS publication are tied together and provides additional structure and semantics to the electronic publication". This is accomplished by two XML files with the extensions .opf and .ncx . The OPF file, traditionally named content.opf , houses the EPUB book's metadata, file manifest, and linear reading order. This file has

1980-461: Is used by computing standards such as HTTP, HTML, XML and PNG. IETF language tags were first defined in RFC 1766, edited by Harald Tveit Alvestrand, published in March 1995. The tags used ISO 639 two-letter language codes and ISO 3166 two-letter country codes, and allowed registration of whole tags that included variant or script subtags of three to eight letters. In January 2001, this

2046-590: The HTML5, JavaScript, CSS, and SVG formats, making EPUB readers use the same technology as web browsers. Such formats are associated with various types of security issues and privacy-breaching behaviors (e.g. web beacons, CSRF, XSHM) due to their complexity and flexibility. Such vulnerabilities can be used to implement web tracking and cross-device tracking on EPUB files. Security researchers also identified attacks leading to local files and other user data being uploaded. The "EPUB 3.1 Overview" document provides


2112-1319: The Internet Engineering Task Force (IETF) in Best Current Practice (BCP) 47 ; the subtags are maintained by the IANA Language Subtag Registry . To distinguish language variants for countries, regions , or writing systems (scripts), IETF language tags combine subtags from other standards such as ISO 639 , ISO 15924 , ISO 3166-1 and UN M.49 . For example, the tag en stands for English ; es-419 for Latin American Spanish ; rm-sursilv for Romansh Sursilvan ; sr-Cyrl for Serbian written in Cyrillic script; nan-Hant-TW for Min Nan Chinese using traditional Han characters , as spoken in Taiwan ; yue-Hant-HK for Cantonese using traditional Han characters , as spoken in Hong Kong ; and gsw-u-sd-chzh for Zürich German . It

2178-689: The Specifications for the Digital Talking Book . An example .ncx file: An EPUB file is a group of files that conform to the OPS/OPF standards and are wrapped in a ZIP file. The OCF specifies how to organize these files in the ZIP, and defines two additional files that must be included. The mimetype file must be a text document in ASCII that contains the string application/epub+zip . It must also be uncompressed, unencrypted, and

2244-500: The WebP and Opus media formats. The format and many readers support the following: An EPUB file can optionally contain DRM as an additional layer, but it is not required by the specifications. In addition, the specification does not name any particular DRM system to use, so publishers can choose a DRM scheme to their liking. However, future versions of EPUB (specifically OCF) may specify

2310-905: The valencia variant subtag for the Valencian variant of Catalan is registered in the Language Subtag Registry with the prefix ca. As this dialect is spoken almost exclusively in Spain, the region subtag ES can normally be omitted. Furthermore, there are script subtags that do not refer to traditional scripts such as Latin, or even to scripts at all, and these usually begin with a Z. For example, Zsye refers to emojis, Zmth to mathematical notation, Zxxx to unwritten documents and Zyyy to undetermined scripts. IETF language tags have been used as locale identifiers in many applications. It may be necessary for these applications to establish their own strategy for defining, encoding and matching locales if

2376-636: The BISAC Subject Headings, which are a mainstay in the industry and required for participation in many databases. BISAC Subject Headings are also making inroads into library classification. Brian O'Leary began his tenure as executive director of BISG on October 3, 2016. In 2020, the International Green Book Supply Chain Alliance was formed as a partnership between BISG in the U.S., BookNet Canada, and Book Industry Communication (BIC) in

2442-485: The EPUB specification. The NCX file has a mimetype of application/x-dtbncx+xml . Of note here is that the values for the docTitle , docAuthor , and meta name="dtb:uid" elements should match their analogs in the OPF file. Also, the meta name="dtb:depth" element is set equal to the depth of the navMap element. navPoint elements can be nested to create a hierarchical table of contents. navLabel 's content
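As a rough illustration of the NCX elements named in this article (navMap, nested navPoint, navLabel, content, and the dtb:depth meta element), the following Python sketch flattens a small table of contents and checks the declared depth. The sample document, its file names, and the function names are made up for the example. The namespace URI is the one used by the DAISY NCX format and is not quoted in the text above.

```python
import xml.etree.ElementTree as ET

NCX_NS = "http://www.daisy.org/z3986/2005/ncx/"

# Invented sample: two chapters, the first with one nested section.
SAMPLE_NCX = """<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <head>
    <meta name="dtb:uid" content="urn:isbn:0000000000000"/>
    <meta name="dtb:depth" content="2"/>
  </head>
  <docTitle><text>Example Book</text></docTitle>
  <navMap>
    <navPoint id="np1" playOrder="1">
      <navLabel><text>Chapter 1</text></navLabel>
      <content src="chapter1.xhtml"/>
      <navPoint id="np2" playOrder="2">
        <navLabel><text>Section 1.1</text></navLabel>
        <content src="chapter1.xhtml#section1"/>
      </navPoint>
    </navPoint>
    <navPoint id="np3" playOrder="3">
      <navLabel><text>Chapter 2</text></navLabel>
      <content src="chapter2.xhtml"/>
    </navPoint>
  </navMap>
</ncx>
"""


def q(tag: str) -> str:
    """Qualify a tag name with the NCX namespace."""
    return f"{{{NCX_NS}}}{tag}"


def walk(parent, level=1):
    """Yield (level, label, src) for each navPoint, depth first."""
    for nav_point in parent.findall(q("navPoint")):
        label = nav_point.findtext(f"{q('navLabel')}/{q('text')}", default="").strip()
        src = nav_point.find(q("content")).get("src")
        yield level, label, src
        yield from walk(nav_point, level + 1)


def table_of_contents(ncx_text: str):
    """Return the flattened entries, checking dtb:depth against the navMap depth."""
    root = ET.fromstring(ncx_text)
    entries = list(walk(root.find(q("navMap"))))
    declared = None
    for meta in root.iter(q("meta")):
        if meta.get("name") == "dtb:depth":
            declared = int(meta.get("content"))
    actual = max((level for level, _, _ in entries), default=0)
    if declared != actual:
        raise ValueError(f"dtb:depth is {declared}, navMap depth is {actual}")
    return entries


for level, label, src in table_of_contents(SAMPLE_NCX):
    print("  " * (level - 1) + f"{label} -> {src}")
```

A reading system would presumably also compare docTitle and dtb:uid with the OPF metadata; that cross-check is left out here.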

2508-588: The Fraktur and Gaelic variants of the Latin script, which are mostly encoded with regular Latin letters in Unicode and ISO/IEC 10646). They may occasionally be useful in language tags to expose orthographic or semantic differences, with different analysis of letters, diacritics, and digraphs/trigraphs as default grapheme clusters, or differences in letter casing rules. Two-letter region subtags are based on codes assigned, or "exceptionally reserved", in ISO 3166-1 . If

2574-485: The ISO 3166 Maintenance Agency were to reassign a code that had previously been assigned to a different country, the existing BCP 47 subtag corresponding to that code would retain its meaning, and a new region subtag based on UN M.49 would be registered for the new country. UN M.49 is also the source for numeric region subtags for geographical regions, such as 005 for South America. The UN M.49 codes for economic regions are not allowed. Region subtags are used to specify

2640-682: The Independent Book Publishers Association and the Evangelical Christian Publishers Association have all been long standing active members of BISG. Over the years BISG has published many research reports in response to the needs of its members. Among these are studies of paper availability, book distribution, elementary/high school adoptions, printing capacity cycles, book sales through non-traditional book markets, consumer book buying habits and an informational guide to

2706-490: The Registry. In addition, codes for languages encompassed by certain macrolanguages were registered as extended language subtags. Sign languages were also registered as extlangs, with the prefix sgn . These languages may be represented either with the subtag for the encompassed language alone ( cmn for Mandarin) or with a language-extlang combination ( zh-cmn ). The first option is preferred for most purposes. The second option


2772-556: The United Kingdom. These three associations for the book publishing supply chain have formed the consortium in order to address the impact of the industry on the environment. A range of committees and working groups provide BISG members with an ongoing platform for the identification, assessment, and resolution of book industry issues, at times through the development of standards and best practices. These committees and working groups are actively managed by BISG members and meet on

2838-552: The core standards that inform it. For example, some speakers of Punjabi believe that the ISO 639-3 distinction between [pan] "Panjabi" and [pnb] "Western Panjabi" is spurious (i.e. they feel the two are the same language ); that sub-varieties of the Arabic script should be encoded separately in ISO 15924 (as, for example, the Fraktur and Gaelic styles of the Latin script are); and that BCP 47 should reflect these views and/or overrule

2904-426: The core standards with regard to them. BCP 47 delegates this type of judgment to the core standards, and does not attempt to overrule or supersede them. Variant subtags and (theoretically) primary language subtags may be registered individually, but not in a way that contradicts the core standards. Extension subtags (not to be confused with extended language subtags ) allow additional information to be attached to

2970-487: The data for that extension. IANA is responsible for allocating singletons. Two extensions have been assigned as of January 2014. Extension T allows a language tag to include information on how the tagged data was transliterated, transcribed, or otherwise transformed. For example, the tag en-t-ja could be used for content in English that was translated from the original Japanese. Additional substrings could indicate that

3036-433: The document manifest, table of contents , and EPUB metadata . Finally, the files are bundled in a zip file as a packaging format. An EPUB file uses XHTML 1.1 (or DTBook) to construct the content of a book as of version 2.0.1. This is different from previous versions ( OEBPS 1.2 and earlier), which used a subset of XHTML. There are, however, a few restrictions on certain elements. The mimetype for XHTML documents in EPUB

3102-414: The final specification was released in 2019. A notable change is the removal of a specialized subset of CSS, enabling the use of non-epub-prefixed properties. The references to HTML and SVG standards are also updated to "newest version available", as opposed to a fixed version in time. The W3C announced version 3.3 on May 25, 2023. Changes included stricter security and privacy standards; and the adoption of

3168-478: The first file in the ZIP archive. This file provides a more reliable way for applications to identify the mimetype of the file than just the .epub extension. Also, there must be a folder named META-INF , which contains the required file container.xml . This XML file points to the file defining the contents of the book. This is the OPF file, though additional alternative rootfile elements are allowed. Apart from mimetype and META-INF/container.xml ,
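The container rules described in this article (a stored, first-in-archive mimetype file and a META-INF/container.xml that points to the OPF file) can be sketched in Python with the standard zipfile module. This is a minimal illustration, not a validator: the function names are invented, the archive contains only the two required files, and the container namespace URI comes from the OCF specification rather than from the text above.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container"

CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def build_minimal_container() -> bytes:
    """Build an in-memory ZIP holding only mimetype and container.xml."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # mimetype is written first and stored without compression.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", CONTAINER_XML,
                    compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


def inspect_container(data: bytes) -> list:
    """Check the mimetype entry and return the rootfile paths."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        first = zf.infolist()[0]
        if first.filename != "mimetype":
            raise ValueError("mimetype must be the first entry in the archive")
        if first.compress_type != zipfile.ZIP_STORED:
            raise ValueError("mimetype must be stored uncompressed")
        if zf.read("mimetype") != b"application/epub+zip":
            raise ValueError("unexpected mimetype content")
        root = ET.fromstring(zf.read("META-INF/container.xml"))
    return [rf.get("full-path") for rf in root.iter(f"{{{CONTAINER_NS}}}rootfile")]


print(inspect_container(build_minimal_container()))  # ['OEBPS/content.opf']
```

The path OEBPS/content.opf follows the traditional naming mentioned in the article; nothing requires that exact name.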

3234-424: The font file in the OPF's manifest (see below). The mimetype for CSS documents in EPUB is text/css. EPUB also requires that PNG, JPEG, GIF, and SVG images be supported using the mimetypes image/png, image/jpeg, image/gif, and image/svg+xml. Other media types are allowed, but creators must include alternative renditions using supported types. For a table of all required mimetypes, see Section 1.3.7 of

3300-511: The iBooks app to function. EPUB is a popular format for electronic data interchange because it can be an open format and is based on HTML, as opposed to Amazon's proprietary format for Kindle readers. Popular EPUB producers of public domain and open licensed content include Project Gutenberg , Standard Ebooks , PubMed Central , SciELO and others. In 2022, Amazon 's Send to Kindle service removed support for its own Kindle File Format in favor of EPUB. EPUB requires readers to support

3366-502: The language collections are now all defined in ISO 639-5 as inclusive, rather than some of them being defined exclusively. This means that language collections have a broader scope than before, in some cases where they could encompass languages that were already encoded separately within ISO 639-2. For example, the ISO 639-2 code afa was previously associated with the name "Afro-Asiatic (Other)", excluding languages such as Arabic that already had their own code. In ISO 639-5, this collection


3432-474: The manifest, and are allowed to have an element identifier (e.g. #figures in the example). An example OPF file: The NCX file (Navigation Control file for XML), traditionally named toc.ncx, contains the hierarchical table of contents for the EPUB file. The specification for NCX was developed for Digital Talking Book (DTB), is maintained by the DAISY Consortium, and is not a part of

3498-432: The new collections added only in ISO 639-5). BCP 47 defines a "Scope" property to identify subtags for language collections. However, it does not define any given collection as inclusive or exclusive, and does not use the ISO 639-5 grouping type attribute, although the description fields in the Language Subtag Registry for these subtags match the ISO 639-5 (inclusive) names. As a consequence, BCP 47 language tags that include

3564-497: The old Language Tag Registry), subtags occur in the following order: Subtags are not case-sensitive , but the specification recommends using the same case as in the Language Subtag Registry, where region subtags are UPPERCASE , script subtags are Title Case , and all other subtags are lowercase . This capitalization follows the recommendations of the underlying ISO standards. Optional script and region subtags are preferred to be omitted when they add no distinguishing information to
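The recommended capitalization can be sketched as a small Python helper. It relies on the positional rule from RFC 5646 (a two-letter subtag after the first position is treated as a region, a four-letter alphabetic one as a script, and everything after a singleton stays lowercase). The function name is made up, and the helper only adjusts case without checking the tag against the registry.

```python
def format_language_tag(tag: str) -> str:
    """Apply the recommended case conventions to a hyphen-separated tag."""
    formatted = []
    in_extension = False  # becomes True once a singleton such as "u" or "x" is seen
    for index, subtag in enumerate(tag.split("-")):
        if index == 0 or in_extension:
            formatted.append(subtag.lower())
        elif len(subtag) == 2 and subtag.isalpha():
            formatted.append(subtag.upper())       # region subtag, e.g. US
        elif len(subtag) == 4 and subtag.isalpha():
            formatted.append(subtag.capitalize())  # script subtag, e.g. Latn
        else:
            formatted.append(subtag.lower())
        if len(subtag) == 1:
            in_extension = True
    return "-".join(formatted)


print(format_language_tag("ZH-hans-sg"))     # zh-Hans-SG
print(format_language_tag("GSW-U-SD-CHZH"))  # gsw-u-sd-chzh
print(format_language_tag("es-419"))         # es-419
```

Because subtags are not case-sensitive, this normalization is only cosmetic. Matching code should compare tags case-insensitively either way.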

3630-426: The other files (OPF, NCX, XHTML, CSS and images files) are traditionally put in a directory named OEBPS . An example file structure: An example container.xml, given the above file structure: The EPUB 3.0 Recommended Specification was approved on 11 October 2011. On June 26, 2014, EPUB 3.0.1 was approved as a minor maintenance update to EPUB 3.0. EPUB 3.0 supersedes the previous release 2.0.1. EPUB 3 consists of

3696-557: The package. Each file is represented by an item element, and has the attributes id , href , media-type . All XHTML (content documents), stylesheets, images or other media, embedded fonts, and the NCX file should be listed here. Only the .opf file itself, the container.xml , and the mimetype files should not be included. The spine element lists all the XHTML content documents in their linear reading order. Also, any content document that can be reached through linking or

3762-482: The specification. Unicode is required, and content producers must use either UTF-8 or UTF-16 encoding. This is to support international and multilingual books. However, reading systems are not required to provide the fonts necessary to display every Unicode character, though they are required to display at least a placeholder for characters that cannot be displayed fully. An example skeleton of an XHTML file for EPUB looks like this: The OPF specification's purpose

3828-457: The strategy described in RFC 4647 is not adequate. The use, interpretation and matching of IETF language tags is currently defined in RFC 5646 and RFC 4647. The Language Subtag Registry lists all currently valid public subtags. Private-use subtags are not included in the Registry as they are implementation-dependent and subject to private agreements between third parties using them. These private agreements are out of scope of BCP 47. The following

3894-487: The table of contents must be listed as well. The toc attribute of spine must contain the id of the NCX file listed in the manifest. Each itemref element's idref is set to the id of its respective content document. The guide element is an optional element for the purpose of identifying fundamental structural components of the book. Each reference element has the attributes type , title , href . Files referenced in href must be listed in
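To make the relationship between manifest and spine concrete, here is a minimal Python sketch that resolves each itemref in the spine to the href of its manifest item and looks up the NCX file through the spine's toc attribute. The sample package, its file names, and the function name are invented for illustration. The OPF namespace URI is taken from the OPF 2.0 specification and does not appear in the text above.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"

# Invented sample: the manifest lists chapter 2 first, the spine fixes the order.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="BookId">
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
    <item id="chap2" href="chapter2.xhtml" media-type="application/xhtml+xml"/>
    <item id="chap1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
    <item id="css" href="style.css" media-type="text/css"/>
  </manifest>
  <spine toc="ncx">
    <itemref idref="chap1"/>
    <itemref idref="chap2"/>
  </spine>
</package>
"""


def reading_order(opf_text: str):
    """Return (ncx_href, [content hrefs in linear reading order])."""
    root = ET.fromstring(opf_text)
    manifest = {item.get("id"): item.get("href")
                for item in root.iter(f"{{{OPF_NS}}}item")}
    spine = root.find(f"{{{OPF_NS}}}spine")
    if spine is None:
        raise ValueError("package has no spine element")
    toc_id = spine.get("toc")
    if toc_id not in manifest:
        raise ValueError("spine toc attribute does not match a manifest item")
    order = []
    for itemref in spine.findall(f"{{{OPF_NS}}}itemref"):
        idref = itemref.get("idref")
        if idref not in manifest:
            raise ValueError(f"itemref {idref!r} is not listed in the manifest")
        order.append(manifest[idref])
    return manifest[toc_id], order


print(reading_order(SAMPLE_OPF))
# ('toc.ncx', ['chapter1.xhtml', 'chapter2.xhtml'])
```

The manifest order is deliberately different from the spine order in the sample, to show that only the spine determines the reading sequence.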

3960-469: The title of the book, language contains the language of the book's contents in RFC 3066 format or its successors, such as the newer RFC 4646 and identifier contains a unique identifier for the book, such as its ISBN or a URL . The identifier 's id attribute should equal the unique-identifier attribute from the package element. The manifest element lists all the files contained in
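The metadata requirements described in this article can be checked with a few lines of Python. This is a sketch under stated assumptions: the sample package and the function name are hypothetical, the ISBN is a placeholder, and the Dublin Core and OPF namespace URIs come from the OPF 2.0 specification rather than from the text above.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical sample package with the three required metadata elements.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="BookId">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Example Book</dc:title>
    <dc:language>en</dc:language>
    <dc:identifier id="BookId">urn:isbn:0000000000000</dc:identifier>
  </metadata>
  <manifest/>
  <spine/>
</package>
"""


def check_metadata(opf_text: str) -> dict:
    """Return the required metadata values, raising ValueError if any is missing."""
    root = ET.fromstring(opf_text)
    unique_id = root.get("unique-identifier")
    if unique_id is None:
        raise ValueError("package element lacks the unique-identifier attribute")
    metadata = root.find(f"{{{OPF_NS}}}metadata")
    if metadata is None:
        raise ValueError("package has no metadata element")
    found = {}
    for name in ("title", "language", "identifier"):
        element = metadata.find(f"{{{DC_NS}}}{name}")
        if element is None or not (element.text or "").strip():
            raise ValueError(f"required metadata element {name!r} is missing or empty")
        found[name] = element.text.strip()
    # Some identifier must carry the id named by the package's unique-identifier.
    ids = [el.get("id") for el in metadata.findall(f"{{{DC_NS}}}identifier")]
    if unique_id not in ids:
        raise ValueError("no identifier has an id equal to unique-identifier")
    return found


print(check_metadata(SAMPLE_OPF))
```

The check only confirms presence and the id match. It does not verify that the language value is a well-formed language tag.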

4026-451: The top 50 relevant corporate and community sponsored education programs. In conjunction with other organizations, BISG has produced reports on African-American book buyers, small and independent book publishers, and the state of used book sales in the U.S. Through BISAC (Book Industry Standards and Communications), BISG has been involved with technological advances such as bar codes and electronic business communications formats. It developed


4092-704: The translation was done mechanically, or in accordance with a published standard. Extension T is described in the informational RFC 6497, published in February 2012. The Registration Authority is the Unicode Consortium . Extension U allows a wide variety of locale attributes found in the Common Locale Data Repository (CLDR) to be embedded in language tags. These attributes include country subdivisions, calendar and time zone data, collation order, currency, number system, and keyboard identification. Some examples include: Extension U

4158-421: The variety of a language "as used in" a particular region. They are appropriate when the variety is regional in nature, and can be captured adequately by identifying the countries involved, as when distinguishing British English ( en-GB ) from American English ( en-US ). When the difference is one of script or script variety, as for simplified versus traditional Chinese characters, it should be expressed with

4224-627: Was completed and published in April 1976. The report confirmed the feasibility of a program of major research studies by and about the industry. As an organization BISG is concerned with the publishing industry as a whole and its membership consists of companies from all sectors of the industry. Trade and professional associations such as the Association of American Publishers , the American Booksellers Association ,

4290-579: Was published in September 2009. The main purpose of this revision was to incorporate three-letter codes from ISO 639-3 and 639-5 into the Language Subtag Registry, in order to increase the interoperability between ISO 639 and BCP 47. Each language tag is composed of one or more "subtags" separated by hyphens (-). Each subtag is composed of basic Latin letters or digits only. With the exceptions of private-use language tags beginning with an x- prefix and grandfathered language tags (including those starting with an i- prefix and those previously registered in

4356-458: Was updated by RFC 3066, which added the use of ISO 639-2 three-letter codes, permitted subtags with digits, and adopted the concept of language ranges from HTTP/1.1 to help with matching of language tags. The next revision of the specification came in September 2006 with the publication of RFC 4646 (the main part of the specification), edited by Addison Phillips and Mark Davis, and RFC 4647 (which deals with matching behaviour). RFC 4646 introduced
