Friday, February 26, 2021

Unicode 14.0 Alpha Review

The repertoire for Unicode 14.0 is now open for early review and comment. During alpha review the repertoire is reasonably mature and stable, but is not yet completely locked down. Discussion regarding whether certain characters should be removed from the repertoire for publication is welcome. Character names and code point assignments are reasonably firm, but suggestions for improvement may still be entertained.

This early review is provided so that reviewers may consider character repertoire issues before the start of beta review (currently scheduled to start in June 2021). Once beta review begins, the repertoire, code points, and character names will all be locked down and will no longer be subject to change.

Feedback for the alpha review should be reported under PRI #428 using the Unicode contact form by April 12, 2021.


Over 140,000 characters are available for adoption to help the Unicode Consortium’s work on digitally disadvantaged languages

[badge]

Wednesday, February 24, 2021

Enhancements to Unicode Regular Expressions

A Proposed Update of UTS #18, Unicode Regular Expressions, is now available for review and feedback.

Regular expressions are a key tool in software development. Back in 2000, few regular expression engines supported Unicode, even at a basic level. UTS #18 set out to raise the bar, describing how regular expression engines could be adapted to deal with Unicode correctly and completely. Since that time, major programming languages and libraries have adopted level 1 features (supporting all Unicode literals, basic character properties, subtraction, intersection, ...), and some also adopted some level 2 features (full character properties, grapheme clusters, ...).
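As a rough illustration of what "basic character properties, subtraction, intersection" mean in practice, the sketch below emulates them with ordinary Python sets rather than with regex syntax. It assumes only the standard `unicodedata` module; the helper name `with_category` and the cutoff `limit` are hypothetical choices made for this example, and the General_Category values come from whatever Unicode version the Python build ships with.

```python
import unicodedata


def with_category(prefix, limit=0x250):
    """Return the code points below `limit` whose General_Category
    starts with `prefix` (for example "L" for letters, "Lu" for
    uppercase letters). Hypothetical helper for illustration only."""
    return {
        cp
        for cp in range(limit)
        if unicodedata.category(chr(cp)).startswith(prefix)
    }


letters = with_category("L")      # any letter
uppercase = with_category("Lu")   # uppercase letters only
ascii_range = set(range(0x80))    # U+0000..U+007F

# Intersection: uppercase letters that are also ASCII.
ascii_upper = uppercase & ascii_range

# Subtraction: letters that are not uppercase.
non_upper_letters = letters - uppercase
```

A regular expression engine with level 1 support would let the same sets be written directly inside a character class; the set arithmetic here only mirrors the idea.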

A major enhancement to UTS #18 in 2020 focused on the addition of Character Classes with strings. The initial impetus for this was to handle emoji effectively in browsers, as most emoji consist of more than one code point. Supporting strings directly in character classes frees programs from having to download large amounts of data or handle complicated syntax. Using a property like RGI_Emoji allows a regular expression to match both individual code points such as "😁" and multi-code-point strings such as "🇫🇷". This extension to strings is also important for internationalization. For example, the alphabets used by many languages contain multi-code-point strings, so this extension allows them to be handled easily.
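The difference between code points and strings can be sketched in Python. The standard `re` module does not support string-valued properties such as RGI_Emoji, so the `string_class` helper below is a hypothetical stand-in that builds an alternation from an explicit list of strings; it is not how UTS #18 specifies the feature, only a way to see why a plain bracket class is not enough.

```python
import re

GRIN = "\U0001F601"                # 😁, a single code point
FLAG_FR = "\U0001F1EB\U0001F1F7"   # 🇫🇷, two regional indicator code points


def string_class(strings):
    """Hypothetical stand-in for a character class with strings:
    an alternation that tries longer strings first."""
    ordered = sorted(strings, key=len, reverse=True)
    return re.compile("|".join(re.escape(s) for s in ordered))


text = "a" + GRIN + "b" + FLAG_FR + "c"

# A bracket class only knows about individual code points,
# so the flag is split into its two halves.
bracket = re.compile("[" + GRIN + FLAG_FR + "]")
split_matches = bracket.findall(FLAG_FR)

# The alternation keeps the flag together as one match.
emoji = string_class([GRIN, FLAG_FR])
whole_matches = emoji.findall(text)
```

Here `split_matches` holds two separate regional indicator symbols, while `whole_matches` holds the grinning face and the complete flag. Maintaining such a list by hand is the burden that a property like RGI_Emoji is meant to remove.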

Additional enhancements are in progress this year, based on working with members of the ECMAScript committee, including more clarifications, better guidance on implementation, and addressing some tricky issues dealing with complementing (inverting) Character Classes. The end goal of all of these enhancements in 2020 and 2021 is to significantly raise the level of Unicode support in programming languages and libraries.

For more information, see https://www.unicode.org/review/pri427/.


Tuesday, February 2, 2021

Unicode Consortium looking to hire an Executive Director

Since its founding, the Unicode Consortium has grown and expanded its charter and scope. We’re embarking on a new chapter in the evolution of the Consortium by initiating the search for a leader with proven executive talents to fill the newly created position of Executive Director. Learn more: https://www.unicode.org/consortium/edappinfo.html


Thursday, January 28, 2021

Unicode Consortium Elects New Directors to its Board

The Consortium is pleased to announce the following Board of Directors election results from its annual Members’ meeting:

Elected to new 3-year board terms:

Brent Getlin, Director of Product Development and General Manager, Fonts and Type, Adobe, Inc.
Brent is the Director of Product Development and General Manager for Adobe Fonts and Type at Adobe. Previously, Brent managed Adobe's mobile gaming engineering and Macromedia Flash video encoder. Brent holds a Bachelor of Science degree in Computer Engineering from Southern Methodist University.

Teresa Marshall, VP of Globalization and Localization, Salesforce, Inc.
As VP of Globalization and Localization, Teresa drives globalization efforts across Salesforce, including internationalization, international product management and localization. She started her career as a German linguist and has held program and operational management positions at a number of Silicon Valley companies as well as academic positions in the field of language translation. Teresa holds an MA in Translation and Interpreting from the Monterey Institute of International Studies.

Re-elected to another 3-year term on the board:

David Singer, Apple, Inc.
David Singer is the senior engineer who coordinates standards activity for software engineering at Apple. In this role, he serves directly in both technical roles (multimedia systems at MPEG and 3GPP) and strategic roles (Advisory Committee and Advisory Board at the W3C, past Blu-ray Director), and indirectly oversees Apple’s involvement in a wide range of standards bodies and consortia, including ITU-T and ITU-R, SMPTE, and INCITS. David holds a BA and PhD from the University of Cambridge, England.

Newly elected to a 2-year term:

Dr. Mark Davis, Google, Inc.
Dr. Mark Davis co-founded the Unicode project and has been the president of the Unicode Consortium since its incorporation in 1991. Having held positions at IBM and Apple, Mark joined Google in 2006, where he has been working on software internationalization, focusing on effective and secure use of Unicode (especially in the index and search pipeline), the software internationalization libraries (including ICU), and stable international identifiers.

“We also wish to thank retiring directors Marypat Meuli and James Robertson for their combined many years of service to the Consortium as board members,” said Davis.


Tuesday, January 26, 2021

Salesforce Joins as Full Member of the Unicode Consortium

The Unicode Consortium is pleased to announce that Salesforce has joined as a full member.

“Salesforce is pleased to join the Unicode Consortium and advance the ability for software to reach people in their native and local languages,” said Teresa Marshall, VP, Globalization and Localization, Salesforce.

In addition to Salesforce joining as a full member, Marshall was also elected to the Consortium’s Board of Directors. At Salesforce, Marshall drives globalization efforts, including internationalization, international product management and localization. She started her career as a German linguist and has held program and operational management positions at a number of Silicon Valley companies, as well as academic positions in the field of language translation. Teresa holds an MA in Translation and Interpreting from the Monterey Institute of International Studies.

“We are delighted to have Salesforce join as our newest full member,” said Mark Davis, President, Unicode Consortium. “As platforms grow to serve a growing international user base, it becomes increasingly important to invest in and develop standards that allow efficient support of local languages. Unicode exists for precisely that purpose.”

Full members of the Consortium have a vote in all technical committees, and in the governance of the Consortium. A full list of Consortium members can be found here: https://home.unicode.org/membership/members/
