Tuesday, March 5, 2019

Announcing The Unicode® Standard, Version 12.0

Version 12.0 of the Unicode Standard is now available, including the core specification, annexes, and data files. This version adds 554 characters, for a total of 137,928 characters. These additions include four new scripts, for a total of 150 scripts, as well as 61 new emoji characters.

The new scripts and characters in Version 12.0 add support for lesser-used languages and unique writing requirements worldwide, including:
  • Elymaic, historically used to write Achaemenid Aramaic in the southwestern portion of modern-day Iran
  • Nandinagari, historically used to write Sanskrit and Kannada in southern India
  • Nyiakeng Puachue Hmong, used to write modern White Hmong and Green Hmong languages in Laos, Thailand, Vietnam, France, Australia, Canada, and the United States
  • Wancho, used to write the modern Wancho language in India, Myanmar, and Bhutan
Additional support for lesser-used languages and scholarly work was extended worldwide, including:
  • Miao script additions to write several Miao and Yi dialects in China
  • Hiragana and Katakana small letters, used to write archaic Japanese
  • Tamil historic fractions and symbols, used in South India
  • Lao letters used to write Pali
  • Latin letters used in Egyptological and Ugaritic transliteration
  • Hieroglyph format controls, enabling full formatting of quadrats for Egyptian Hieroglyphs
The Egyptian temple ceiling painting shown above (from the Wikipedia article on Medinet Habu) includes a line of hieroglyphic text. That exact text is rendered again below the painting, represented in Unicode plain text, illustrating the use of the new hieroglyphic format controls, as well as cartouche brackets and directional controls. The example was developed by Andrew Glass, based on Microsoft’s Segoe UI Historic font, with outlines designed by James P. Allen.
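As a rough illustration of what "plain text with format controls" means, the sketch below stacks two hieroglyphs into a single quadrat by placing a joiner control between them. The code points are assumptions taken from the Unicode 12.0 code charts rather than from this announcement, the two signs are arbitrary and are not the text of the Medinet Habu example, and the helper names `stack` and `code_points` are hypothetical. Whether the result displays as a stacked group depends on font and renderer support.

```python
# Code points assumed from the Unicode 12.0 code charts (not stated in this post).
VERTICAL_JOINER = "\U00013430"    # EGYPTIAN HIEROGLYPH VERTICAL JOINER
HORIZONTAL_JOINER = "\U00013431"  # EGYPTIAN HIEROGLYPH HORIZONTAL JOINER

# Two arbitrary signs, chosen only for illustration.
SIGN_A = "\U00013000"  # EGYPTIAN HIEROGLYPH A001
SIGN_B = "\U00013001"  # EGYPTIAN HIEROGLYPH A002


def stack(upper: str, lower: str) -> str:
    """Return plain text asking a renderer to place `upper` above `lower`."""
    return upper + VERTICAL_JOINER + lower


def code_points(text: str) -> list[str]:
    """List the code points of `text` in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


quadrat = stack(SIGN_A, SIGN_B)
print(code_points(quadrat))  # ['U+13000', 'U+13430', 'U+13001']
```

The joiner is an ordinary character in the string, so the layout information travels with the text itself and needs no markup.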

Popular symbol additions include:
  • 61 emoji characters, including several new emoji for accessibility
  • Marca registrada sign
  • Heterodox and fairy chess symbols
For the full list of new emoji characters, see the emoji additions for Unicode 12.0 and Emoji Counts. For a detailed description of support for emoji characters by the Unicode Standard, see UTS #51, Unicode Emoji. Version 12.0 also adds guidelines on gender and skin tone, provided in UTS #51 and the data files.

Also in Version 12.0, several Unicode Standard Annexes have notable modifications, often in coordination with changes to character properties.
Three other important Unicode specifications have also been updated for Version 12.0.
The Unicode Standard is the foundation for all modern software and communications around the world, including operating systems, browsers, laptops, and smart phones—plus the Internet and Web (URLs, HTML, XML, CSS, JSON, etc.). The Unicode Standard, its associated standards, and data form the foundation for CLDR and ICU releases.

Over 130,000 characters are available for adoption to help the Unicode Consortium’s work on digitally disadvantaged languages.