Tuesday, February 6, 2024

Unicode 16.0 Alpha Review Opens for Feedback

The repertoire for Unicode Version 16.0 is now open for early review and comment until April 2. As a reminder, during alpha review the repertoire is reasonably mature and stable, but is not yet completely locked down. Discussion regarding whether certain characters should be removed from the repertoire for publication is welcome. Character names and code point assignments are reasonably firm, but suggestions for improvement may still be entertained.

This early review is provided so that reviewers may consider the character repertoire and data file issues prior to the start of beta review (currently scheduled to start in May 2024). Once beta review begins, the repertoire, code points, and character names will all be locked down and will no longer be subject to change.

Notable Changes

Unicode Version 16.0 adds 5,187 new characters, bringing the total number of characters to 155,000. The most significant addition for this release is 3,995 additional Egyptian Hieroglyph characters. There are also seven new scripts and many new symbols. See The Pipeline and the delta code charts for details.


In addition, new “Moji Jōhō Kiban” (文字情報基盤) Japanese source references will be added for over 36,000 CJK unified ideographs. This will be reflected in the code charts for virtually all CJK unified ideograph blocks by additional representative glyphs in the “J” column.


Unicode Emoji 16.0 will include eight new emoji—see PRI #498 Unicode Emoji 16.0 Alpha Candidates.


Some of the new scripts in Unicode 16.0 (Kirat Rai, Tulu-Tigalari, Gurung Khema) include characters that have normalization behavior not seen in earlier versions, which could affect optimized implementations of Unicode normalization, and implementations using “quick check” properties. The relevant data files are available as part of the Unicode 16.0 alpha to allow early review.
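For implementers who want to sanity-check their own normalization code path, here is a minimal sketch in Python. It uses only the standard `unicodedata` module; the helper name `check_nfc` is hypothetical, and the sample strings are ordinary Latin text rather than the new 16.0 characters, since the Unicode data bundled with a given Python build may predate 16.0. It compares a fast "is this already normalized?" check (which an implementation may answer using quick-check data) against a full normalization pass. The two answers must always agree, and that is the invariant an optimized implementation needs to preserve for the new scripts.

```python
import unicodedata


def check_nfc(text: str) -> tuple[bool, bool]:
    """Return (fast, slow): two independent answers to 'is text in NFC?'.

    check_nfc is a hypothetical helper for illustration only.
    """
    # Fast path: library check that may short-circuit (Python 3.8+).
    fast = unicodedata.is_normalized("NFC", text)
    # Slow path: fully normalize and compare with the input.
    slow = unicodedata.normalize("NFC", text) == text
    return fast, slow


if __name__ == "__main__":
    # Shows which Unicode version this Python build actually knows about.
    print("Unicode data version:", unicodedata.unidata_version)
    # Precomposed e-acute, then "e" followed by a combining acute accent.
    for sample in ["\u00e9", "e\u0301"]:
        fast, slow = check_nfc(sample)
        # A disagreement here would indicate a broken optimization.
        assert fast == slow
        print([f"U+{ord(c):04X}" for c in sample], "in NFC:", slow)
```

Once a runtime ships Unicode 16.0 data, the same comparison can be run over text in Kirat Rai, Tulu-Tigalari, and Gurung Khema using the alpha data files.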


Feedback for the alpha review should be reported under PRI #497 using the Unicode contact form by April 2, 2024.


____________________________

Adopt a Character and Support Unicode’s Mission

Looking to give that special someone a special something?
Or maybe something to treat yourself?
🕉️💗🏎️🐨🔥🚀爱₿♜🍀

Adopt a character or emoji to give it the attention it deserves, while also supporting Unicode’s mission to ensure everyone can communicate in their own languages across all devices.

Each adoption includes a digital badge and certificate that you can proudly display!

Have fun and support a good cause

You can also donate funds or gift stock




Monday, February 5, 2024

Highlights from UTC Meeting #178

Unicode Technical Committee (UTC) meeting #178 was held January 23 to 25 in Sunnyvale, California. Many thanks to Google for hosting. Here are some highlights from the meeting.

 

Preparing Unicode 16.0 Alpha

UTC made final decisions regarding the draft character repertoire for Unicode 16.0 and approved the alpha release. The alpha will be available for public review on February 6th.
 
UTC had previously approved 1,192 characters for Unicode 16.0, but also anticipated inclusion of a large set of Egyptian Hieroglyph extensions. Those were approved at this meeting — 3,995 additional characters, bringing the number of new characters for Unicode 16.0 to 5,187. See the Pipeline page for all characters currently approved for Unicode 16.0, along with code points provisionally assigned for future encoding.
 
UTC discussed some of the characters being added in Unicode 16.0 for new scripts (Kirat Rai, Tulu-Tigalari, Gurung Khema), because they have normalization behavior not previously seen, which affected normalization optimizations in ICU and could affect other normalization implementations. This raised the question of whether to revisit the encoding model for those scripts, or to keep the encoding that UTC had already accepted and make adjustments in ICU. For various reasons, UTC decided on the latter. For more information, see section F.1 in L2/24-009.
 
UTC also approved a new data file to be added to the UCD: DoNotEmit.txt will capture, in machine-readable form, information already included in various chapters of the core spec regarding characters or sequences of characters that could occur in data but should not be used. For example, certain sequences of Devanagari characters can appear visually identical to a Devanagari letter without being canonically equivalent to it, and should not be used. See section 19 of L2/24-013 for more information.
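To make the Devanagari case concrete, here is a small sketch in Python using the standard `unicodedata` module. The helper `canonically_equivalent` is a hypothetical name. The example pair is the letter AA (U+0906) versus the sequence letter A plus vowel sign AA (U+0905 U+093E), which typically render alike. Whether this particular sequence is listed in DoNotEmit.txt, and what format that file uses, are not assumed here. For contrast, the sketch also shows a pair that is canonically equivalent.

```python
import unicodedata


def canonically_equivalent(a: str, b: str) -> bool:
    """Two strings are canonically equivalent if their NFD forms match.

    canonically_equivalent is a hypothetical helper for illustration.
    """
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# Look-alike pair: normalization does NOT unify these.
LETTER_AA = "\u0906"                 # DEVANAGARI LETTER AA
A_PLUS_SIGN_AA = "\u0905\u093e"      # LETTER A + VOWEL SIGN AA

# Contrast pair: normalization DOES unify these.
LETTER_NNNA = "\u0929"               # DEVANAGARI LETTER NNNA
NA_PLUS_NUKTA = "\u0928\u093c"       # LETTER NA + SIGN NUKTA

print(canonically_equivalent(LETTER_AA, A_PLUS_SIGN_AA))     # False
print(canonically_equivalent(LETTER_NNNA, NA_PLUS_NUKTA))    # True
```

Because normalization cannot repair the first pair, text processes such as search and comparison treat the two spellings as different strings, which is why a machine-readable list of sequences to avoid is useful.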
 
Future of UAX #42 UCDXML
Because the people who previously maintained UCDXML were no longer going to continue doing so, UTC #177 decided on a plan to stabilize UCDXML at version 15.1. However, public review feedback indicated that several projects continue to depend on UCDXML. In response, John Wilcock of Microsoft volunteered to take over maintenance of UCDXML. Thus, UCDXML will be updated for Unicode 16.0 with the latest character repertoire and properties, and will continue to be maintained for future versions, as long as John is available to do that.
 
Text Terminal Working Group
The Text Terminal Working Group was created by UTC in April 2023 to develop specifications for supporting Unicode in text terminal environments. After a few months, however, the chair no longer had time available to lead the group. During last week’s UTC meeting, a new chair was nominated and has since been confirmed by Unicode officers: Fraser Gordon. Fraser’s work involving Unicode began many years ago, extending LiveCode to support Unicode. He is currently also active in the C++ standards committee’s Unicode Study Group (ISO/IEC JTC 1/SC 22/SG 16).
 
Full details on these and other outcomes are provided in the minutes—see L2/24-006.



Wednesday, January 31, 2024

NEW Event on February 20 – Virtual Open House on MessageFormat

Registration is Now Open!

MessageFormat is a critical API for anyone interested in building fluent, accessible, and well-localized applications. Any part of the user interface that displays data or varies dynamically at runtime needs to provide for the formatting requirements of the locale and the grammatical needs of the user’s language. As such, MessageFormat is “table stakes” for internationalizing applications.

The MessageFormat Working Group is a part of the CLDR Technical Committee of Unicode. After several years of work, they have produced a Technical Preview for MessageFormat 2.0, a next generation specification designed to address critical gaps in current formatting solutions, provide access to new internationalization APIs rooted in CLDR data, and build a syntax that is portable across many programming languages and runtime environments.


Now that the specification is close to being stabilized, the MessageFormat Working Group would like to engage with interested members of the internationalization, developer, localization, and translation communities.


Who: If you are a platform, framework, or programming language developer, or a localization manager, engineer, or translator, you will want to join us for this virtual Open House event to hear more about the progress achieved, and to bring your questions to the people involved.


When: Tuesday, February 20, 2024 starting at 9am (San Francisco), 12pm (New York), and 6pm (Berlin).


Register Now! Please note this session will be recorded and made available via the Unicode YouTube channel.

Getting Started with Message Formatting


MessageFormat GitHub Repo

Why MessageFormat v2?

Goals & Deliverables

Draft Specification and Syntax

UTW MessageFormat v2 (Video)





About the Unicode Consortium

The Unicode Consortium is the premier non-profit open source, open standards body for the internationalization of all software and services. 


For more than 30 years, the Unicode Consortium has coordinated the efforts of a worldwide team of volunteer programmers and linguists to standardize, evolve, and maintain a global software foundation that allows virtually every computer system and service to help people connect using their native language. 


For additional information about Unicode, visit home.unicode.org.



Wednesday, January 17, 2024

Unicode Welcomes New Board Members!

Giammarresi (left) and Chilana (right)
The Unicode Consortium is pleased to welcome Salvatore “Salvo” Giammarresi of Airbnb and Kulpreet Chilana of Apple to its Board of Directors effective this month.

At its annual member meeting last November, the representatives of Unicode’s Full Level members unanimously elected Salvo and Kulpreet and renewed the terms of Brent Getlin (Adobe) and Teresa Marshall (Salesforce).

Salvo is the Head of Localization at Airbnb and a Board Member at Clear Global (formerly known as Translators without Borders). Previously he held global leadership roles at several technology companies including PayPal and Yahoo. He is a published author and speaker on numerous topics, including localization, internationalization, global program management, and international product management.

Kulpreet has worked in software localization at Apple for 8 years and has more than 12 years of experience in the localization and internationalization industry. He is passionate about using all parts of the software stack to preserve the richness of human culture. He currently manages a team of software engineers that evangelize localization across Apple’s platforms, build features for Apple’s international users and own the localization infrastructure in Xcode.

“We’re excited to have Salvo and Kulpreet join us — they both bring extensive experience in localization and internationalization to the Unicode board,” said Mark Davis, Unicode’s board chair and co-founder, “as well as providing different perspectives on technology and priorities. Speaking for the board, I’d also like to thank David Singer, who has retired from the board after 6 years. Aside from many other contributions, David has helped immensely with pivotal transitions in governance.”

Further information on the Unicode Board can be found here.


Adopt a Character and Support Unicode's Mission

Looking to give that special someone a special something? Or maybe something to treat yourself?
🕉️💗🏎️🐨🔥🚀🍀
Adopt a character or emoji to give it the attention it deserves, while also supporting Unicode's mission to ensure everyone can communicate in their own languages across all devices.
Each adoption includes a digital badge and certificate that you can proudly display!
As Unicode, Inc. is a US-based open source, open standards, non-profit, 501(c)3 organization, your contribution may be eligible for a tax deduction. Please consult with a tax advisor for details.