Tuesday, December 2, 2025

UTC #185 Highlights

Unicode Technical Committee meeting #185 was held October 27–29 in Cupertino, CA, hosted by Apple. Here are some highlights.

Starting the Unicode 18.0 cycle

Because the Unicode Standard follows an annual September release cycle, the Q4 UTC meeting is the first meeting of each new cycle. Some decisions targeting a release may have been taken at earlier meetings, but this is the first meeting at which the next release is the particular focus. One of the tasks at this meeting is to set the key milestones and dates for the new cycle. Here's a summary of the timeline for Unicode 18.0:

  • October 2025: UTC #185 approved new character repertoire

  • January 2026: UTC #186 will finalize content for the alpha release

  • February – March: alpha release open for public review

  • April: UTC #187 will review alpha feedback and finalize content for the beta release

  • May – June: beta release open for public review

  • July: UTC #188 will finalize 18.0 content

  • September: Unicode 18.0 release

Unicode 18.0 character and emoji repertoire

During a release cycle, the primary focus for the alpha review is on the new character repertoire. The repertoire for the alpha review can be updated at the January UTC meeting; but we like to have that planned repertoire largely determined by the Q4 meeting so that working groups can focus early on preparing content that will be needed for the alpha.

UTC #184 had approved around 60 characters for publication in Unicode 18.0. (Some of those had been planned for Unicode 17.0 but, for various reasons, needed to be postponed.) These included the UAE Dirham sign and the first tranche of a large set of symbols from the writings of Gottfried Leibniz, for which proposals are in development. At UTC #185, nearly 13,000 additional characters were approved for encoding in Unicode 18.0.

The approved additions include the encoding of Small Seal script ("Seal"), a repertoire of 11,328 ideographic characters. Seal is distinct from modern Han ideographs (also known as "CJK"), but it is an important precursor of CJK that resulted from the first efforts to standardize writing across Chinese-speaking regions during China's Qin Dynasty. As such, Seal has important cultural significance in China and for Chinese speakers throughout the world.

Other additions included 1,276 characters allocated in three new blocks: Archaic Cuneiform Numerals, with 311 Cuneiform characters from the fourth millennium BCE; and Jurchen and Jurchen Radicals, with 965 ideographic characters that were used for writing the Jurchen language in the 12th–13th centuries CE.

In addition, 321 other characters were approved as additions to a number of existing blocks. This includes many characters for Arabic and Latin scripts, many characters used in phonetic transcription, a number of symbols used in music notation, and a second set of the Leibniz symbols.
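As a quick sanity check on the figures quoted above, the counts can be tallied in a few lines of Python. This is only arithmetic on the numbers given in this post; the variable names are illustrative, and the sketch makes no assumption about whether the emoji characters are counted separately from these groups.

```python
# Figures quoted in this post for the UTC #185 approvals.
small_seal = 11_328

# The three new blocks, grouped the way the post reports them.
new_blocks = {
    "Archaic Cuneiform Numerals": 311,
    "Jurchen and Jurchen Radicals": 965,
}

# Characters added to existing blocks.
existing_block_additions = 321

new_block_total = sum(new_blocks.values())
approved_total = small_seal + new_block_total + existing_block_additions

print(new_block_total)  # 1276
print(approved_total)   # 12925
```

The three groups sum to 12,925, which is consistent with the "nearly 13,000" figure.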

Finally, the new characters approved for Unicode 18.0 include nine new emoji characters. Note that many emoji are represented as character sequences, so counting only the new emoji characters doesn't give a complete picture. Look for more information about Unicode 18.0 emoji in the coming months.
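To illustrate the difference between an emoji and an emoji character, here is a minimal Python sketch using an already-encoded emoji (woman astronaut, not one of the new Unicode 18.0 additions). It is chosen only as an example of an emoji that is a sequence of several characters joined by ZERO WIDTH JOINER.

```python
import unicodedata

# One emoji as the user sees it, but three code points underneath:
# WOMAN + ZERO WIDTH JOINER + ROCKET.
astronaut = "\U0001F469\u200D\U0001F680"

# Python strings are sequences of code points, so len() counts characters,
# not user-perceived emoji.
code_points = [f"U+{ord(ch):04X}" for ch in astronaut]
names = [unicodedata.name(ch) for ch in astronaut]

print(len(astronaut))  # 3
for cp, name in zip(code_points, names):
    print(cp, name)
```

A new emoji of this kind needs no new code points at all, which is why a count of new emoji characters understates the number of new emoji.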

CJK & Unihan

UTC works on CJK character encoding in collaboration with IRG (Ideographic Research Group), a working group under ISO/IEC JTC 1/SC 2. There are over 100,000 CJK ideographs now encoded in Unicode, and with such a large repertoire, refinements to already-encoded characters continue to be made. At UTC #185, recommendations arising from a recent IRG meeting were reviewed, and a number of changes were approved for Unicode 18.0. Some of these are technical details that are not very visible, such as corrections to source references for certain characters (the references cited when the characters were encoded, which provide evidence of their usage and identity as distinct characters). Among the significant and visible changes approved by UTC are over 700 horizontal extensions, which will be reflected in the Unicode 18.0 code charts as additional glyphs for already-encoded characters.

For complete details on outcomes from UTC #185, see the draft minutes.

About the Unicode Standard

The world relies on digital communications. The Unicode Standard is a vital building block for global digital communications, providing the encoding for more than 155,000 characters used by thousands of languages and scripts throughout the world. 

Each character (letter, diacritic, symbol, emoji, etc.) is represented by a unique numeric code and has defined property data that determine how it behaves in several text processing algorithms.
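As a rough illustration of what "a unique numeric code plus property data" looks like in practice, the sketch below uses Python's standard `unicodedata` module to look up a few properties. The `describe` helper is a hypothetical name introduced here, and the values returned depend on the Unicode version bundled with the Python interpreter in use.

```python
import unicodedata

def describe(ch: str) -> dict:
    """Return the code point and a few standard properties of one character."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch, "<unnamed>"),
        "general_category": unicodedata.category(ch),
        "bidi_class": unicodedata.bidirectional(ch),
        "combining_class": unicodedata.combining(ch),
    }

# A letter, a combining diacritic, and a currency symbol.
for ch in ["A", "\u0301", "\u20AC"]:
    print(describe(ch))
```

Properties such as the general category, bidirectional class, and combining class are among the inputs to text processing algorithms like normalization and bidirectional layout.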

With this combination, the Unicode Standard provides the foundation for implementations to support the world's writing systems, enabling billions of people across the globe to seamlessly communicate with one another across platforms and devices. The Standard is also the foundation for the suite of code, libraries, data, and products that the Unicode Consortium delivers for robust language support.

