Tuesday, November 26, 2024

UTC #181 Highlights

Unicode Technical Committee (UTC) meeting #181 was held November 6 – 8 in Cupertino, hosted by Apple. Here are some highlights.

Starting the Unicode 17.0 cycle

UTC approved a plan and timeline for the Unicode 17.0 release. Here’s a summary of the timeline:

 

  • November 2024: UTC #181 approved new character repertoire
  • January 2025: UTC #182 will finalize content for the alpha release
  • February – March: alpha release for public review
  • April: UTC #183 will finalize content for the beta release
  • May – June: beta release for public review
  • July: UTC #184 will finalize 17.0 content
  • September: Unicode 17.0 release

 

Unicode 17.0 character and emoji repertoire

UTC #179 had previously approved 4,301 CJK ideographs for Unicode 17.0, including the addition of the CJK Unified Ideographs Extension J block. At this UTC meeting, a number of additional characters and symbols were approved for Unicode 17.0, including five new scripts:

 

  • Beria Erfe is a modern-use script used for the Zaghawa language in eastern Africa.
  • Chisoi is a modern-use script used for the Kurmali language in eastern India.
  • Sidetic is a historic script that was used in ancient Anatolia.
  • Tai Yo is the traditional script for the Tai Yo language, spoken in Vietnam and Laos.
  • Tolong Siki is a modern-use script used for the Kurukh language in eastern India.

 

A few changes were made to the approved new CJK ideographs repertoire: two ideographs from the CJK Extension J block were removed, while four ideographs were added. UTC also approved 297 other non-emoji character additions for already encoded scripts or symbol blocks.

 

UTC #181 also approved 8 new emoji characters for Unicode 17.0, along with a number of emoji ZWJ sequences; see document L2/24-226R for details.

 

Besides characters approved for Unicode 17.0, code points were provisionally assigned for 365 new characters that are candidates for encoding in a future Unicode version.

 

See the Pipeline page for all characters currently approved for Unicode 17.0, along with code points provisionally assigned for future encoding.

 

Algorithm specs

UTC approved some significant changes related to algorithm specifications for Unicode 17.0. Notably, in UAX #14, a new Line_Break property value, Unambiguous_Hyphen, was approved, along with related changes to various rules of the line-breaking algorithm. Also, for UTS #10, Unicode Collation Algorithm, information about conformance tests was previously published in a companion document; for version 17.0, it will be incorporated into UTS #10 itself. New public review issues will be posted soon to get feedback on the planned changes.

 

UTC also approved proposed drafts for two new algorithm specifications:

 

  • Proposed Draft UTS #58, Unicode Linkification: this proposed standard will specify a mechanism for detecting URLs that contain Unicode characters.
  • Proposed Draft UTR #59, East Asian Spacing: this proposed technical report will specify an algorithm that implements established typographic conventions in East Asian text for spacing between runs of text from different scripts.
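
To give a rough feel for the kind of problem East Asian spacing addresses, here is a minimal Python sketch that finds the positions in a string where a run of East Asian letters meets a run of Western letters or digits. It is not the algorithm from the proposed draft: the function names are hypothetical, and the classification is an assumption that uses the East_Asian_Width property from Python's standard unicodedata module as a stand-in for script information.

```python
import unicodedata
from typing import List


def is_east_asian_letter(ch: str) -> bool:
    # Assumption: letters with East_Asian_Width = W (wide) approximate
    # CJK ideographs, kana, and hangul. This is a simplification.
    return (unicodedata.category(ch).startswith("L")
            and unicodedata.east_asian_width(ch) == "W")


def is_western_alnum(ch: str) -> bool:
    # Assumption: any letter or digit that is not wide or fullwidth
    # counts as "Western" for the purposes of this sketch.
    return ch.isalnum() and unicodedata.east_asian_width(ch) in ("Na", "N", "A")


def spacing_boundaries(text: str) -> List[int]:
    """Return the indices i where text[i-1] and text[i] belong to
    different classes (East Asian letter vs. Western letter or digit)."""
    boundaries = []
    for i in range(1, len(text)):
        prev, cur = text[i - 1], text[i]
        if (is_east_asian_letter(prev) and is_western_alnum(cur)) or \
           (is_western_alnum(prev) and is_east_asian_letter(cur)):
            boundaries.append(i)
    return boundaries


if __name__ == "__main__":
    sample = "日本語Unicode文字"
    print(spacing_boundaries(sample))
```

A renderer could use boundaries like these as candidate positions for extra inter-script spacing; which positions actually receive spacing, and how much, is what the proposed technical report is intended to specify.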

 

A public review issue has been posted for review of PD UTS #58 (see PRI #509). A public review issue for PD UTR #59 will be posted soon.

 

Update on Text Terminal Working Group

At UTC #175, a temporary working group was formed to improve support for Unicode text in text terminal environments. After a slow start because the original chairperson was no longer available, Fraser Gordon was chosen as the new chair, and the group has begun working with several interested participants. Fraser Gordon reported on the group’s activity and requested feedback from UTC on some technical questions the working group was facing, including whether it would be in scope to propose requirements for fonts or a text protocol for signaling between applications and terminals. UTC's feedback was that either of these could be considered. See L2/24-264 for more details.

 

UTC coming to Eastern US

Earlier this year, UTC started discussing the possibility of trying new locations to make it easier for people in other regions or time zones to participate. With interested people in many parts of the world and travel constraints on regular participants, there is no perfect answer. However, we received a generous offer from the University of New Hampshire to host a meeting there, and so UTC has decided to switch the location of the July 2025 meeting from Redmond, WA to Manchester, New Hampshire (about an hour's drive north of Boston). Some preliminary logistical information will be provided soon to give plenty of time to consider travel plans.

 

For complete details on outcomes from UTC #181, see the draft minutes.

