Tuesday, May 21, 2024

Unicode 16.0 Beta Review Open

The beta review period for Unicode® 16.0 has started and is open until July 2, 2024.

The beta is intended primarily for review of character property data and changes to algorithm specifications (Unicode Standard Annexes). Also, for the first time, a complete draft of the core specification text is available for review during the beta period.

At this phase of a release, the character repertoire is considered stable. For this release, 5,185 new characters will be added, bringing the total number of encoded characters in Unicode 16.0 to 154,998. The new additions include seven new scripts:
  • Garay is a modern-use script from West Africa
  • Gurung Khema, Kirat Rai, Ol Onal, and Sunuwar are four modern-use scripts from Northeast India and Nepal
  • Todhri is an historic script used for Albanian
  • Tulu-Tigalari is an historic script from Southwest India
Other character additions include seven new emoji characters plus 3,995 additional Egyptian Hieroglyphs and over 700 symbols from legacy computing environments. See the delta code charts for details on all the new scripts and characters.
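
The counts quoted above can be sanity-checked with a minimal Python sketch. The implied pre-16.0 total is plain subtraction on the two figures given here, not an official number. The helper names `version_tuple` and `supports_unicode_16` are hypothetical; `unicodedata.unidata_version` reports the Unicode version of the data bundled with the running Python, which may be older than 16.0.

```python
import unicodedata

NEW_CHARACTERS = 5_185   # additions stated for Unicode 16.0
TOTAL_IN_16_0 = 154_998  # total encoded characters stated for Unicode 16.0

# Implied count before this release (simple subtraction, not an official figure).
implied_previous_total = TOTAL_IN_16_0 - NEW_CHARACTERS


def version_tuple(version: str) -> tuple[int, ...]:
    """Turn a dotted version string such as '15.1.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def supports_unicode_16(version: str = unicodedata.unidata_version) -> bool:
    """Return True if the given Unicode data version is 16.0 or later."""
    return version_tuple(version) >= (16, 0)


print(implied_previous_total)
print(unicodedata.unidata_version, supports_unicode_16())
```

If `supports_unicode_16()` returns `False`, the local `unicodedata` module will not yet know the names or properties of the newly added characters.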

In addition to new characters, new “Moji Jōhō Kiban” (文字情報基盤) Japanese source references will be added for over 36,000 CJK unified ideographs. This will be reflected in the code charts for virtually all CJK unified ideograph blocks by additional representative glyphs in the “J” column. Note that these glyph additions are not reflected in the delta charts mentioned above, but can be seen in the main (“single-block”) charts for the Unicode 16.0 Beta.

Various changes to properties, algorithms, and Unicode Standard Annexes will be made for Unicode 16.0. This version will add two new Unicode Standard Annexes:
  • UAX #53, Unicode Arabic Mark Rendering, provides a specification for interoperable font and shaping implementations for Arabic script. (This was previously published separately from the Unicode Standard as a technical report.)
  • UAX #57, Unicode Egyptian Hieroglyph Database (Unikemet), provides data essential for understanding the identity of over 5,100 Egyptian Hieroglyph characters encoded in Unicode 16.0. (This is similar to data for CJK unified ideographs provided in UAX #38.)
A new UCD file, DoNotEmit.txt, will provide machine-readable data that can be useful for software implementations but that was previously available only as tables within the core specification text. See the Unicode 16.0 Beta landing page for other noteworthy property and algorithm changes.
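
UCD data files generally use a line-oriented layout: fields separated by semicolons, with `#` starting a comment. Assuming DoNotEmit.txt follows that convention (the beta file itself should be checked), a minimal Python sketch for reading such a file might look like the following. The function name `parse_ucd_lines` and the sample lines are hypothetical placeholders, not actual contents of the file.

```python
from typing import Iterable, Iterator


def parse_ucd_lines(lines: Iterable[str]) -> Iterator[list[str]]:
    """Yield the stripped fields of each data line in a UCD-style file."""
    for raw in lines:
        data = raw.split("#", 1)[0].strip()  # drop any trailing comment
        if not data:
            continue  # skip blank and comment-only lines
        yield [field.strip() for field in data.split(";")]


# Hypothetical sample, for illustration only; not real DoNotEmit.txt data.
SAMPLE = """\
# comment-only line
AAAA; BBBB CCCC; Example_Type # made-up record

DDDD; EEEE; Example_Type
"""

records = list(parse_ucd_lines(SAMPLE.splitlines()))
for fields in records:
    print(fields)
```

To read a real file, an open file object can be passed in place of the sample lines, since the function accepts any iterable of strings.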

For full details regarding the Beta, see Public Review Issue #502. Feedback should be reported under PRI #502 using the Unicode contact form by July 2, 2024.
