Tuesday, May 2, 2023

UTC #175 Highlights

by Peter Constable, UTC Chair

We had another productive Unicode Technical Committee (UTC) meeting last week, hosted at Adobe headquarters in downtown San Jose, California. Here are some highlights from the meeting.

Unicode 15.1 Beta

UTC has authorized the Beta release for Unicode 15.1. Various relatively minor technical changes will be made based on feedback received during the Alpha review period, plus one major change that I’ll describe below. The Beta is scheduled for release on May 23, for a six-week public review period ending July 4. That closing date will give working groups time to review feedback and prepare recommendations for the next UTC meeting, July 25 – 27.

CJK Extension I & GB 18030

A major change decided on for Unicode 15.1 is to encode 622 characters in a new CJK Unified Ideographs Extension I block. (See L2/23-106.) This came out of long discussions about GB 18030-2022 and Amendment 1 of that standard, which China is currently developing. China has an urgent need for these characters, and the draft of their amendment allocates them in reserved code positions of Unicode and ISO/IEC 10646, which is not viable from the perspective of the international standards. So, UTC has taken the initiative to have China's need accommodated in a standards-conforming manner.

There was discussion as to whether the new characters should be added to Unicode 15.1 or to Unicode 16.0: it was generally preferred to wait for 16.0, but 15.1 was tentatively chosen in case that makes a significant difference for China’s process.

UTC recommended the addition of CJK Extension I to the INCITS/CS&I committee (the mirror committee for JTC 1/SC 2, which also met last week), which agreed to recommend to SC 2 the addition of that block in Amendment 2 of ISO/IEC 10646. See L2/23-114 and L2/23-115 for more information.
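For implementers who want to recognize the new block in their own code, here is a minimal sketch of a block-membership check. It assumes the block occupies U+2EBF0..U+2EE5F; that range is not stated in this post, so verify it against Blocks.txt for the Unicode version you target. The constant and function names are hypothetical, chosen only for illustration.

```python
# Assumed range for CJK Unified Ideographs Extension I (not given in this post;
# confirm against Blocks.txt in the Unicode Character Database).
EXT_I_BLOCK_START = 0x2EBF0
EXT_I_BLOCK_END = 0x2EE5F


def in_cjk_extension_i(ch: str) -> bool:
    """Return True if the single character ch falls inside the assumed block range."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    return EXT_I_BLOCK_START <= ord(ch) <= EXT_I_BLOCK_END


def extension_i_chars(text: str) -> list[str]:
    """Collect the characters of text that fall inside the assumed block range."""
    return [ch for ch in text if in_cjk_extension_i(ch)]
```

A check like this only tests the block range. It says nothing about whether a given code point in the block is actually assigned, or whether the fonts on a system can display it.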

Orthographic syllable support in UAX #14

Another significant addition for Unicode 15.1 is that UTC approved extending UAX #14, Unicode Line Breaking Algorithm, to support breaking of various South and Southeast Asian scripts at orthographic syllable boundaries. The algorithm for this is based on a proposal from Norbert Lindenberg and others (see L2/22-086), with details for incorporation into UAX #14 provided by Robin Leroy (see L2/23-072). A prototype implementation had been created as a public review issue (see PRI #472), and feedback had been positive. This enhancement will bring important improvements in support for several South and Southeast Asian scripts.

Unicode display in text terminals

A new UTC project was initiated at this meeting to develop specifications for supporting display of scripts that require complex shaping in text terminals. This was introduced with a presentation by Renzhi Li and Dustin Howett of Microsoft (see L2/23-107). Even though the majority of computing device usage today is via GUIs, text terminals are still used in many scenarios. Thus, there was considerable interest among UTC participants in this proposal. An ad-hoc working group, chaired by Dustin Howett, will be formed to develop specifications. If you are interested in participating, let me know and I’ll connect you with Dustin.

Full details on these and other outcomes will be provided in the draft minutes that will be available soon (as L2/23-076 in the document registry).

Support Unicode
To support Unicode’s mission to ensure everyone can communicate in their languages across all devices, please consider adopting a character, making a gift of stock, or making a donation. As Unicode, Inc. is a US-based open-source, open-standards, non-profit 501(c)(3) organization, your contribution may be eligible for a tax deduction. Please consult with a tax advisor for details.