Tuesday, September 14, 2021

Announcing The Unicode® Standard, Version 14.0

Version 14.0 of the Unicode Standard is now available, including the core specification, annexes, and data files. This version adds 838 characters, for a total of 144,697 characters. These additions include five new scripts, for a total of 159 scripts, as well as 37 new emoji characters.

The new scripts and characters in Version 14.0 add support for modern language groups in Bosnia, India, Indonesia, Iran, Java, Malaysia, Mongolia, Myanmar, Pakistan, and the Philippines, plus other languages in Africa and North America, including:
  • Arabic script additions that include honorifics and additions for Quranic use, and characters used to write languages across Africa, the Balkans, and South and Southeast Asia
  • The Vithkuqi script historically used to write Albanian and currently undergoing a modern revival
  • The Tangsa script used to write the Tangsa language, spoken in India and Myanmar
  • The Toto script used to write the Toto language in northeast India
  • Many Latin script additions for extended IPA
Popular symbol additions include:
  • 37 emoji characters across smileys, hands, animals and nature, food and drink, transport, and activities, including several new emoji for emotions and hand gestures. For the full list of new emoji characters, see emoji additions for Unicode 14.0 and Emoji Counts. For a detailed description of support for emoji characters by the Unicode Standard, see UTS #51, Unicode Emoji.
Other symbol and notational additions include:
  • The som currency sign used in the Kyrgyz Republic
  • Znamenny musical notation developed in Russia
Support for other modern languages and scholarly work extends worldwide, including:
  • Cypro-Minoan, historically used primarily on the island of Cyprus
  • Old Uyghur, historically used in Central Asia and elsewhere to write Turkic, Chinese, Mongolian, Tibetan, and Arabic languages
  • Ahom, Balinese, Brahmi, Canadian aboriginal languages, Glagolitic, Kaithi, Kannada, Mongolian, Tagalog, Takri, and Telugu
  • Arabic support for Hausa, Wolof, Hindko, and Punjabi, and Ethiopic support for Gurage
Important chart font updates, including:
  • Significant updates to the CJK auxiliary blocks and enclosed alphanumerics
Unicode properties and specifications determine the behavior of text on computers and phones. Changes in Version 14.0 include the following Unicode Standard Annexes and Technical Standards that have notable modifications:

Five important Unicode annexes updated for Version 14.0:
Three important Unicode specifications updated for Version 14.0:
The Unicode Standard is the foundation for all modern software and communications around the world, including operating systems, browsers, laptops, and smartphones, plus the Internet and Web (URLs, HTML, XML, CSS, JSON, etc.). The Unicode Standard, its associated standards, and data form the foundation for CLDR and ICU releases.
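
Software picks up the Version 14.0 additions only once the Unicode data it bundles has been updated. As a minimal sketch, assuming Python and its standard `unicodedata` module, the following checks which Unicode version a runtime ships and whether a few of the new characters are known to it. The code points (U+20C0 for the som sign, U+10570 for the first Vithkuqi letter, U+1FAE0 for one of the new emoji) and the helper names are chosen here for illustration and are not part of the announcement.

```python
import unicodedata

# Illustrative sample of Version 14.0 additions. The code points are supplied
# for this sketch; the announcement names the characters, not their code points.
SAMPLES = {
    "som currency sign": 0x20C0,
    "first Vithkuqi letter": 0x10570,
    "melting face emoji": 0x1FAE0,
}


def runtime_unicode_version():
    """Return the Unicode database version bundled with this runtime as a tuple of ints."""
    return tuple(int(part) for part in unicodedata.unidata_version.split("."))


def is_assigned(code_point):
    """Return True if the runtime's database has an assigned character at code_point.

    Unassigned code points (and noncharacters) report general category "Cn".
    """
    return unicodedata.category(chr(code_point)) != "Cn"


def report():
    """Build a short, human-readable support report as a list of lines."""
    version = ".".join(str(part) for part in runtime_unicode_version())
    lines = [f"Unicode data version: {version}"]
    for label, code_point in SAMPLES.items():
        status = "known" if is_assigned(code_point) else "not known to this runtime"
        lines.append(f"U+{code_point:04X} {label}: {status}")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

A runtime whose data predates Version 14.0 reports these code points as not known, even if an installed font can display them; the check reflects character data, not rendering support.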

Over 144,000 characters are available for adoption to help the Unicode Consortium’s work on digitally disadvantaged languages.