Monday, February 5, 2024

Highlights from UTC Meeting #178

Unicode Technical Committee (UTC) meeting #178 was held January 23 to 25 in Sunnyvale, California. Many thanks to Google for hosting. Here are some highlights from the meeting.


Preparing Unicode 16.0 Alpha

UTC made final decisions regarding the draft character repertoire for Unicode 16.0 and approved the alpha release. The alpha will be available for public review on February 6th.
UTC had previously approved 1,192 characters for Unicode 16.0, but also anticipated inclusion of a large set of Egyptian Hieroglyph extensions. Those were approved at this meeting — 3,995 additional characters, bringing the number of new characters for Unicode 16.0 to 5,187. See the Pipeline page for all characters currently approved for Unicode 16.0, along with code points provisionally assigned for future encoding.
There was some discussion about certain characters being added in Unicode 16.0 for new scripts (Kirat Rai, Tulu-Tigalari, Gurung Khema). These characters exhibit normalization behaviour not previously seen, which affected normalization optimizations in ICU and could affect other normalization implementations. This raised the question of whether to revisit the encoding model for those scripts, or to keep the encoding that UTC had already accepted and make adjustments in ICU. For various reasons, it was decided to do the latter. For more information, see section F.1 in L2/24-009.
UTC also approved a new data file to be added to the UCD: DoNotEmit.txt will capture, in machine-readable form, information already included in various chapters of the core spec regarding characters or sequences of characters that could occur in data but should not be used. For example, certain sequences of Devanagari characters could appear visually identical to a Devanagari letter without being canonically equivalent to it, and should not be used. See section 19 of L2/24-013 for more information.
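To make the Devanagari case concrete, here is a minimal sketch using Python's standard unicodedata module. The helper name canonically_equivalent and the specific code point pair are chosen for illustration and are not taken from the meeting documents; the sketch does not read DoNotEmit.txt and makes no claim about that file's format or contents. It only shows that two strings which typically render alike can fail a canonical-equivalence check.

```python
import unicodedata


def canonically_equivalent(a: str, b: str) -> bool:
    """Return True if a and b are canonically equivalent (hypothetical helper).

    Two strings are canonically equivalent when their NFD forms are identical.
    """
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# Baseline: precomposed "é" and "e" + COMBINING ACUTE ACCENT are equivalent.
print(canonically_equivalent("\u00E9", "e\u0301"))  # True

# Illustrative pair (an assumption for this example, not from the post):
# DEVANAGARI LETTER AA versus DEVANAGARI LETTER A + VOWEL SIGN AA.
# They usually look the same on screen, but U+0906 has no canonical
# decomposition, so normalization does not unify them.
letter_aa = "\u0906"
lookalike_sequence = "\u0905\u093E"
print(canonically_equivalent(letter_aa, lookalike_sequence))  # False
```

Because normalization cannot reconcile such pairs, a machine-readable list of sequences to avoid gives implementations something they can check against directly.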
Future of UCD #42 UCDXML
Because the people who previously maintained UCDXML would no longer be continuing that work, UTC #177 decided on a plan to stabilize UCDXML at version 15.1. However, public review feedback indicated that several projects continue to depend on UCDXML. In light of that, John Wilcock of Microsoft volunteered to take over maintenance of UCDXML. Thus, UCDXML will be updated for Unicode 16.0 with the latest character repertoire and properties, and will continue to be maintained for future versions, as long as John is available to do that.
Text Terminal Working Group
The Text Terminal Working Group was created by UTC in April 2023 to develop specifications for supporting Unicode in text terminal environments. After a few months, however, the chair of the group no longer had time available for the role. During last week’s UTC meeting, a new chair was nominated and has since been confirmed by Unicode officers: Fraser Gordon. Fraser’s work involving Unicode began many years ago, extending LiveCode to support Unicode. He is currently also active in the C++ standards committee’s Unicode Study Group (ISO/IEC JTC 1/SC 22/SG 16).
Full details on these and other outcomes are provided in the minutes—see L2/24-006.

