Friday, September 27, 2024

Unicode Technology Workshop on October 22-23 – Program Updates!

By the UTW 2024 Program Committee


Join us for two days of community building around the Unicode technology that makes software work for billions of people. With a deeper emphasis on case studies, unconference, and workshop-style sessions, this event will enable participants to collaborate and learn from each other to tackle the latest challenges. Register Now for this in-person-only event hosted at Google in Sunnyvale, CA. The full program, including session details and bios, is available here:

UTW 2024 Event Website!

⭐ Highlights:

  • Build connections within the internationalization community
  • Learn best practices from peers and case studies
  • Network with the developers and users to help shape the future of Unicode technology
  • Deepen knowledge of how to solve tough problems in the i18n and l10n space and how to engineer products that work better for global users

📢 Confirmed Sessions!

  • “A User-Centric Approach to a Bidi Text Interface” with Adil Allawi
  • “Common Locale Data Repository - Using the Survey Tool to Expand Language Coverage” with Conrad Nied
  • "Talking Emoji ๐Ÿ”ฅ๐Ÿ˜ฎ‍๐Ÿ’จ๐Ÿ„๐Ÿชฆ๐Ÿ’€๐Ÿท๐Ÿ™๐Ÿ˜ค" with Jennifer Daniel
  • “Design Deep-Dive” with Mark Davis
  • “How Would You Like Your Text Today?” with John Hudson
  • "Indic Script Policy & Planning in the Digital Age" with Karthik Malli
  • “Language and Direction Metadata on the Web” with Addison Phillips
  • “MessageFormat 2 Technical Preview: Where Are We Now?” with Addison Phillips
  • “Tracking Language Digitization in the UNESCO World Atlas” with Jeannette Stewart and Tex Texin
  • “Why Does Unicode Do That?” with Mark Davis
  • "Volunteers for Keyboards for Indigenous Language Communities" with Tex Texin
  • "Optimizing Glyphs for Real-Time Vector Rendering" with Eric Lengyel
  • "How To Not Run Towards The Bear: Directionality & Emoji" with Kamilé Demir and Ben Joeng (Yang)
  • "What is a Valid Person Name?" with Michael McKenna
  • “Case Study - Solving Inflection” with Nebojša Ćirić (Chair of the Unicode ICU-Language Inflection Working Group) and George Rhoten
  • “Bridging Languages in ICU4X: How Diplomat Brings i18n to the Web and Beyond” with Tyler Knowlton
  • “We Need a New Message Resource Format” with Eemeli Aro
  • "New in CLDR/ICU" with Mark Davis
  • "Could You Give Me an Example? Simplifying the CLDR Survey Tool" with Helena Aytenfisu and Emiyare Ikwut-Ukwa
  • “ICU4X 2.0: Next Level i18n” with Shane F Carr (Chair of the Unicode ICU4X Technical Committee)
  • "From Oral to Digital in One Generation - An Exploration of Amazonian Languages and Their Path to Digital Inclusion" with Samuel Minev-Benzecry
  • “Encoding Expectations: How Long Does It Really Take?” with Anushah Hossain and Ahad Bashir
  • “Date, Time, and Timezone for Netflix Live Events” with Shawn Xu and Chester Fung
  • “Behind the Curtains: Unicode Technical Groups” with Mark Davis (Unicode Co-founder and CTO)
  • “Ask Unicode Anything” with Toral Cowieson, Mark Davis, Cathy Wissink
Please note that sessions are continually being added for the two tracks.

Expect workshops, seminars, free-form discussions, and lightning talks on:

  • i18n libraries
  • locale data frameworks
  • globalization tooling
  • input methods
  • text rendering
  • localization pipelines

❓Who should attend?:

  • Whether you’re an experienced GILT professional, an internationalization or Unicode enthusiast, just starting out, or a student, the UTW 2024 sessions will enrich your understanding of key issues!

❗Space is limited so be sure to secure your spot today!

  • Discounts are available for Unicode members and students. Registration fees include continental breakfast, lunch, refreshments, and Mix & Mingle at the end of the first day.

Register Now!


Adopt a Character and Support Unicode’s Mission

Looking to give that special someone a special something?
Or maybe something to treat yourself?
🕉️💗🏎️🐨🔥🚀爱₿♜🍀

Adopt a character or emoji to give it the attention it deserves, while also supporting Unicode’s mission to ensure everyone can communicate in their own languages across all devices.

Each adoption includes a digital badge and certificate that you can proudly display!

Have fun and support a good cause

You can also donate funds or gift stock


As Unicode, Inc. is a US-based open source, open standards, non-profit, 501(c)3 organization, your contribution may be eligible for a tax deduction. Please consult with a tax advisor for details.

Thursday, September 26, 2024

Unicode CLDR 46 Beta available for specification review

 

The Unicode CLDR 46 Beta is now available for specification review and integration testing. The release is planned for October 24, but any feedback on the specification needs to be submitted well in advance of that date. The beta specification is available at Draft LDML Modifications. The biggest changes in the specification are the updates to Message Format in tech preview, the updates to conformance, and the new tech preview section on semantic skeletons. See also the Migration section of the new release page.


CLDR provides key building blocks for software to support the world's languages (dates, times, numbers, sort-order, etc.). For example, all major browsers and all modern mobile phones use CLDR for language support. (See Who uses CLDR?)


Via the Survey Tool, contributors supply data for their languages — data that is widely used to support much of the world’s software. This data is also a factor in determining which languages are supported on mobile phones and computer operating systems.


The beta has already been integrated into the development version of ICU 76. We would especially appreciate feedback from non-ICU consumers of CLDR data and on Migration issues. 


Feedback can be filed at CLDR Tickets.


The most significant changes to the data in this release are:


  • Updates to Unicode 16.0 (including major changes to collation), 

  • Substantial additions and modifications of Emoji search keyword data, 

  • ‘Upleveling’ the locale coverage (see below).


For the details, see the draft CLDR 46 release page, which has information on accessing the data and reviewing charts of the changes and, importantly, will cover Migration issues.

New / Upleveled Locales

±     New Level   Locales
📈    Modern      Nigerian Pidgin, Tigrinya
📈    Moderate    Akan, Baluchi (Latin), Kangri, Tajik, Tatar, Wolof
📈    Basic       Ewe, Ga, Kinyarwanda, Konkani (Latin), Northern Sotho, Oromo, Sichuan Yi, Southern Sotho, Tswana
📉    Basic*      Chuvash, Anii

For more information


See the draft CLDR 46 release page, which has information on accessing the data, reviewing charts of the changes, and — importantly — Migration issues.




Monday, September 16, 2024

Bloomberg Joins as Supporting Member of the Unicode Consortium

The Unicode Consortium is pleased to announce that Bloomberg has joined as a Supporting Member.

Bloomberg is a global leader in business and financial information, delivering trusted data, news, and insights that bring transparency, efficiency, and fairness to markets. The company helps connect influential communities across the global financial ecosystem via reliable technology solutions, such as the Bloomberg Terminal, that enable financial professionals to make more informed decisions and foster better collaboration.

For more than four decades, the Bloomberg Terminal has revolutionized the financial services industry by bringing transparency and innovation to the capital markets. Trusted by the world’s most influential decision-makers, the Terminal provides real-time access to news, data, insights and trading tools that help our customers turn knowledge into action.

“Bloomberg is excited to join the Unicode Consortium and to help advance the state of internationalized software for global financial applications,” said Matt O’Conor, leader of Bloomberg’s Internationalization Infrastructure Engineering team. “Unicode and its associated technologies are instrumental to the continued success of modern computing. The Bloomberg Terminal is built on Unicode, in order to support our users who speak a variety of languages. We are honored to take part in this worldwide community and to share our own insights and expertise in the global markets with others, as well as to learn from the greater Unicode community as we continue providing first-class products and services to our clients in their respective locales.”

“We are pleased to welcome Bloomberg as a Unicode Consortium member, recognizing the company’s pivotal role in global financial communications and data exchange,” said Toral Cowieson, CEO of Unicode. “As the first financial technology firm to join Unicode, Bloomberg’s expertise and commitment to Unicode’s work will greatly enhance our collective efforts to ensure internationalization standards meet the evolving needs of industries worldwide.”

Supporting Members of the Consortium have representation on up to two technical committees, and a half vote in each one.

Information on Full, Supporting, and Associate memberships and benefits can be found on Unicode’s website along with the list of current members.



Tuesday, September 10, 2024

Announcing The Unicode® Standard, Version 16.0

Version 16.0 of the Unicode Standard is now available. This is a major version update that includes new characters and code charts, new data files and annexes, an updated core specification, and updated annexes and synchronized standards.

This version adds 5,185 new characters, including 3,995 additional Egyptian Hieroglyph characters plus seven new scripts, seven new emoji characters, and over 700 symbols from legacy computing environments, for a total of 154,998 characters. See the delta code charts for details on all the new scripts and characters. For additional details regarding new emoji, see Emoji Recently Added, v16.0.

In addition to new characters, new “Moji Jōhō Kiban” (文字情報基盤) Japanese source references have been added for over 36,000 CJK unified ideographs. This is reflected in the code charts for virtually all CJK unified ideograph blocks by additional representative glyphs in the “J” column.

The core specification for Version 16.0 is now available for browsing online as per-chapter web pages with “breadcrumb” and other links for easy navigation.

Two new annexes have been added to this version:
  • UAX #53, Unicode Arabic Mark Rendering: This annex, which was previously published as a Technical Report, specifies an algorithm for handling combining marks when rendering to ensure correct and consistent display of Arabic script text.
  • UAX #57, Unicode Egyptian Hieroglyph Database (Unikemet): This annex documents the format of the Unikemet.txt data file, which provides information clarifying the identity of Egyptian Hieroglyph characters and properties useful for implementations.
For complete details on Unicode Version 16.0, see https://www.unicode.org/versions/Unicode16.0.0/.



Thursday, September 5, 2024

Unicode CLDR v46 Alpha available for testing

 

The Unicode CLDR v46 Alpha is now available for integration testing. 


CLDR provides key building blocks for software to support the world's languages (dates, times, numbers, sort-order, etc.). For example, all major browsers and all modern mobile phones use CLDR for language support. (See Who uses CLDR?)


Via the Survey Tool, contributors supply data for their languages — data that is widely used to support much of the world’s software. This data is also a factor in determining which languages are supported on mobile phones and computer operating systems.


The alpha has already been integrated into the development version of ICU 76. We would especially appreciate feedback from non-ICU consumers of CLDR data and on Migration issues. Feedback can be filed at CLDR Tickets.


The most significant changes in this release were:


  • Updates to Unicode 16.0 (including major changes to collation), 

  • Further revisions to the Message Format 2.0 tech preview, 

  • Substantial additions and modifications of Emoji search keyword data, 

  • ‘Upleveling’ the locale coverage (see below).


For the details, see the draft CLDR v46 release page, which has information on accessing the data and reviewing charts of the changes and, importantly, will cover Migration issues.

New / Upleveled Locales

±     New Level   Locales
📈    Modern      Nigerian Pidgin, Tigrinya
📈    Moderate    Akan, Baluchi (Latin), Kangri, Tajik, Tatar, Wolof
📈    Basic       Ewe, Ga, Kinyarwanda, Konkani (Latin), Northern Sotho, Oromo, Sichuan Yi, Southern Sotho, Tswana
📉    Basic*      Chuvash, Anii




Unicode Consortium - Updated Terms of Use

The Unicode Consortium has recently updated its Terms of Use. These Terms of Use govern the use of Unicode products and services available on Unicode’s website and from third-party hosting sites such as GitHub. They also govern participation in and contribution to Unicode Consortium activities.

The updated Terms of Use do not materially change the terms and conditions under which Unicode products and services are made available to users.

These changes:
  • accommodate the evolution in Unicode’s operations and distribution, such as the migration of many Unicode projects to GitHub and other third-party websites,
  • consolidate the permissions and licenses applicable to Unicode’s various products and services, making them easier to find and understand,
  • update and more clearly communicate Unicode’s technical contribution policies, 
  • incorporate Unicode’s Code of Conduct and other policies, and
  • otherwise clarify language and improve consistency.


Unicode remains committed to its mission to enable people around the world to use computers in any language by making freely available the standards, specifications, software, and data that form the foundation for software internationalization. 

Your use of Unicode’s products, services, and website, as well as your participation in Unicode activities, constitutes your agreement to these Terms of Use. Accordingly, we encourage you to review them.




Tuesday, August 27, 2024

Highlights from Unicode Technical Meeting #180

by Peter Constable, UTC Chair

Unicode Technical Committee (UTC) meeting #180 was held July 23 – 25 in Redmond, Washington, hosted by Microsoft. Here are some highlights.

Finalizing Unicode 16.0

One priority was to finalize technical decisions for Unicode 16.0 in preparation for a September 10 release. Beta feedback and a small number of new proposals were considered and various decisions affecting 16.0 were taken. Regarding the set of encoded characters and emoji sequences for Unicode 16.0, no changes were made from the Beta.

Unicode 16.0 will include major additions and improvements for Egyptian Hieroglyphs, most of which were already included in the Beta. One aspect of the improvements is a refinement in the encoding model for rotational variants using variation sequences. Since the Beta, it was recognized that ten of the Egyptian Hieroglyphs encoded as characters in Unicode 5.2 would be better represented using rotational variation sequences. This led to some new UTC decisions affecting the 16.0 release:
  • Ten standardized variation sequences for Egyptian Hieroglyph rotational variants were added, while one standardized variation sequence that had been added in Unicode 15.0 was rescinded.
  • In the Unikemet.txt data file with Egyptian Hieroglyph properties, the kEH_Core property has been changed from a binary property to having an enumeration of values, one of which is “L(egacy)” indicating characters encoded in Unicode 5.2 that are not part of the core set and are not expected to be supported in fonts.
Another significant change affecting the 16.0 release is a glyph change for U+0620 ARABIC LETTER KASHMIRI YEH, and a change to its joining group in ArabicShaping.txt (180-C23, 180-C24). This affects not only the glyph shown in the code chart, but also the positional forms shown in the Arabic section of the core spec. The need for this arose from incorrect information in the core spec resulting in fonts that don’t provide a final form that matches users’ expectations. See L2/24-152 for background details.

While no further changes were made to the set of emoji in Unicode 16.0, a change will be made in how emoji characters are displayed in the code charts. The technology used to produce the chart pages is not able to display full-color emoji, and up to now the code charts have not made it clear when pictographic symbols have the Emoji property. In Unicode 16.0, characters with the Emoji property will be indicated in the code charts with a small triangular badge in the top left corner of the cell. A white triangle will indicate an emoji character that should have default emoji (full color) presentation, and a black triangle will indicate an emoji character that should have default text (monochrome) presentation.

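
The distinction between default emoji and default text presentation can also be seen in plain text, where a variation selector may be appended to request one presentation or the other. The following is a minimal Python sketch of that idea, not part of the UTC report: the helper names `with_presentation` and `code_points` are hypothetical, and whether a renderer honors the request depends on the font and platform.

```python
# Variation selectors used in emoji variation sequences.
TEXT_SELECTOR = "\ufe0e"   # VARIATION SELECTOR-15: request text (monochrome) presentation
EMOJI_SELECTOR = "\ufe0f"  # VARIATION SELECTOR-16: request emoji (full color) presentation


def with_presentation(base, emoji):
    """Append the selector that requests emoji or text presentation (hypothetical helper)."""
    return base + (EMOJI_SELECTOR if emoji else TEXT_SELECTOR)


def code_points(text):
    """Format a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


heart = "\u2764"  # HEAVY BLACK HEART, an emoji character with default text presentation
print(code_points(with_presentation(heart, emoji=True)))   # U+2764 U+FE0F
print(code_points(with_presentation(heart, emoji=False)))  # U+2764 U+FE0E
```

The sketch only builds the sequences; it does not check that a given base character actually has registered emoji variation sequences.
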
The script descriptions in the core spec are used to provide background information on each script as well as information to guide implementations. For many scripts, it has been a challenge to provide comprehensive guidance for implementations, particularly when there are complex rendering requirements. However, some implementers have written Unicode Technical Notes providing guidance for implementation of a particular script. Although these are not normative specifications approved by UTC, they can still be valuable information conducive to interoperable implementations. For Unicode 16.0, UTC decided to have two existing UTNs referenced within the core spec sections for the respective scripts.

As mentioned for the Beta, the core spec for Unicode 16.0 will be published as per-chapter HTML pages.

Characters for future versions

At UTC #180, code points were provisionally assigned for 1,063 new characters, including 38 Arabic characters, 45 characters for phonetic transcription, and 965 ideographs and radicals for Jurchen script. With these characters in the pipeline, work can get started on property data, charts, and other content that will be needed for them to be encoded in a future version of the standard.

Some initial decisions were also taken on the character additions for Unicode 17.0: as IRG had finalized its recommendations for CJK Unified Ideographs Extension J, that block of 4,300 new ideographs has been approved for encoding in Unicode 17.0.

Also, a proposal was approved to disunify one existing CJK unified ideograph character, U+5CC0, in Unicode 17.0. When U+5CC0 was encoded in Unicode 1.1, it was deemed that two similar ideographs should be unified. The proposal demonstrated that this unification should not have been made, and that was confirmed earlier this year by IRG. The changes for 17.0 will include encoding of a new character, U+2B73A, and revision of the source references for U+5CC0, U+2B73A, and U+2F879. A complication in this case is that ideographic variation sequences for the two distinct glyphs have been registered for use in Japan. No changes in “J” source references will be made, and it is not expected that implementations for Japanese will be affected. For additional information, see section 7 of L2/24-165.

Variation sequences and historic scripts

People working with historic scripts often deal with glyph variations. Variation sequences seem like an appropriate encoding mechanism to use in such cases, though asking UTC to standardize variation sequences for many historic variations could seem like a challenge. With that in mind, a proposal was presented to encode a block of additional "user-defined" variation selector characters. These would be additional PUA characters with a constraint that they would only be used as variation selectors.

That proposed solution is problematic: existing stability policies and commitments prevent assigning more PUA code points and also prevent constraining existing PUA for certain uses. At the same time, there was opinion within UTC that the need expressed was reasonable, and there was openness to considering alternative solutions. One potential alternative that gained some interest was to establish a registration process, similar to what is defined in UTS #37 for ideographic variation sequences but intended for use with historic scripts.
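
To make the two categories in this discussion concrete, here is a small Python sketch that tells private-use (PUA) code points apart from the existing variation selectors. It is an illustration only, not part of the proposal or the UTC discussion: the function names are hypothetical, and the range check covers only the two generic variation selector blocks (U+FE00..U+FE0F and U+E0100..U+E01EF), not the Mongolian free variation selectors.

```python
import unicodedata


def is_private_use(ch):
    """True if the character has General_Category Co (private use)."""
    return unicodedata.category(ch) == "Co"


def is_variation_selector(ch):
    """True for VS1..VS16 (U+FE00..U+FE0F) and VS17..VS256 (U+E0100..U+E01EF)."""
    cp = ord(ch)
    return 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF


for ch in ("\ue000", "\U000f0000", "\ufe00", "\U000e0100", "A"):
    print(f"U+{ord(ch):04X} private_use={is_private_use(ch)} "
          f"variation_selector={is_variation_selector(ch)}")
```

A variation sequence is a base character followed by one of these selectors, which is why a registration process like the one in UTS #37 can add new sequences without assigning any new code points.
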

For complete details on outcomes from UTC #180, see the draft minutes.



Thursday, July 18, 2024

Bidirectional Text (Part 2): Delving into Bidi -- Registration Now Open!


When: Tuesday, August 13, 2024 starting at 10:00 (San Francisco), 13:00 (New York), 17:00 (UTC) and 19:00 (Berlin).

About the Webinar

A number of scripts, such as Hebrew, Arabic and Urdu, write their letters horizontally on a page or screen, running right to left. A complication for these scripts is that other characters, such as digits, flow left-to-right, and can occur on the same line, or even alongside other left-to-right text, such as Latin. Text that handles both right-to-left and left-to-right text is called “bidirectional” text (“bidi” in short).
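
As a rough illustration of why such text is "bidirectional", the short Python sketch below prints the Bidi_Class property of each character in a mixed string, using the standard library's `unicodedata` module. It is a sketch for orientation, not material from the webinar: the helper name `bidi_classes` and the sample string are assumptions, and it only inspects character properties rather than running the full Unicode Bidirectional Algorithm.

```python
import unicodedata


def bidi_classes(text):
    """Return (character, Bidi_Class) pairs for each character in text."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


# Hebrew letters, a space, European digits, a space, Latin letters.
sample = "\u05e9\u05dc\u05d5\u05dd 123 abc"

for ch, bidi_class in bidi_classes(sample):
    # R = right-to-left letter, EN = European number, L = left-to-right letter,
    # WS = whitespace; Arabic letters would report AL instead of R.
    print(f"U+{ord(ch):04X} {bidi_class}")
```

The Hebrew letters report `R` while the digits and Latin letters report `EN` and `L`, so a single line of this text contains runs that flow in both directions.
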

In the second of a three-part series, Part 2 will delve deeper into more advanced topics and practical applications of bidirectional text processing. Three expert speakers will be on hand to provide detailed insights and answer your questions, ensuring you gain a comprehensive understanding of the subject.

Who Should Register?
  • Translators/Localizers
  • Localization Tooling makers
  • I18n infrastructure developers
  • Linguists and language researchers
  • Application developers
  • Content authors
Session Presenters

Adil Allawi has worked for over 40 years in multilingual engineering. One of his early projects was working on one of the first implementations of right to left text in a personal computer on the Apple II. As such he feels personally responsible for all the bidi problems that have happened since. Adil has been a regular contributor to Unicode, consulting on the definition of Bidi isolates and auto direction and the encoding of Arabic Mathematical Symbols. He currently works at Apple where he has contributed to right-to-left support in multiple products including iWork, App Store, Music and Apple TV.

Ayman Aldahleh began his career in the late 1980s at a small software company named 4C in Kuwait, where he focused on developing PC-DOS applications that supported the Arabic language. He soon moved to Microsoft, where he led the Arabization of several early Microsoft products, including DOS, Works, Windows, and Word. At Microsoft, Ayman’s role expanded to include support for bidirectional and complex script languages, text rendering, font management, and accessibility. He eventually managed the engineering team that scaled the internationalization platform for all Microsoft Office applications, enhancing multilingual and machine translation features. His final role at Microsoft involved overseeing cross-platform user experience for the Microsoft Fluent design. Ayman retired from Microsoft in late 2023 but remains an enthusiastic advocate for technology and internationalization. He has been a member of the Unicode Board of Directors since 2017. Ayman earned a Bachelor of Science in Computer Engineering from the University of Arizona.

Roozbeh Pournader is an internationalization engineer who has been contributing to the Unicode Standard since 1999. He started his internationalization career in Iran in 1994 when he was a high school student. After moving to the United States, he has worked at companies such as Google and WhatsApp. He has received a Unicode Bulldog award for his contributions to Unicode and CLDR’s support for complex scripts, and is Vice Chair of the Unicode Script Encoding Working Group.

Registration: Registration is Open Now! Please note this session will also be recorded and available via the Unicode YouTube channel.



Supporting Resources for Bidirectional Text (Part 2): Delving into Bidi

Bidirectional Text (Part 1): The Basics of Bidi Video Recording: https://youtu.be/TWfvRdS_7x0

Frequently Asked Questions: https://unicode.org/faq/bidi.html

About the Unicode Consortium

The Unicode Consortium is the premier non-profit open source, open standards body for the internationalization of all software and services.

For more than 30 years, the Unicode Consortium has coordinated the efforts of a worldwide team of volunteer programmers and linguists to standardize, evolve, and maintain a global software foundation that allows virtually every computer system and service to help people connect using their native language.

For additional information about Unicode, visit home.unicode.org.



Thursday, July 11, 2024

Unicode Technology Workshop 2 — Call for Submissions Now Open!


Event Dates: October 22-23, 2024

Where: San Francisco Bay Area (Hosted at Google’s Sunnyvale, California campus). For planning purposes, the closest airports are San Francisco International Airport (SFO) and San Jose Mineta International Airport (SJC) and the recommended public transportation option is via VTA Light Rail to the Lockheed Martin Station.

The Second Annual Unicode Technology Workshop (UTW 2) builds upon the success of last year’s event, which brought together more than 80 internationalization enthusiasts for two days of connecting, learning, and envisioning the possibilities. 

UTW 2 is designed to be accessible to GILT professionals and students, while providing enough information and depth to be relevant to established internationalization experts. The primary goals of the workshop are to strengthen the internationalization community and further the adoption of Unicode standards and technology. For this year’s workshop, Unicode will also be adding case studies and speed networking to provide an even more engaging experience for all attendees. 

Call for Submissions Now Open!

Unicode is pleased to announce that session proposals for UTW 2 are now being accepted!

We are seeking proposals for workshops, seminars, free-form discussions, and lightning talks that center around Unicode i18n libraries, locale data frameworks, globalization tooling, localization pipelines, input methods, and text rendering. Come connect with other Unicode users, share your knowledge and experience, and help us envision the future of Unicode technology. You will come away with deeper knowledge on how to solve tough problems in the i18n and l10n space and how to engineer products that work better for global users. Note: To encourage maximum collaboration amongst the attendees, this is an in-person-only event.

If you have an idea for a session that you would like to lead, you can register your interest in contributing by using the following link: Submissions. We are interested in proposals that represent a variety of perspectives, so i18n and l10n technical specialists, working GILT professionals, and university students are all encouraged to propose ideas. Deadline for submissions is August 12, 2024 by 5:00PM PT. Proposals will be reviewed in late August and session hosts will be notified mid-September.

Sponsorship Opportunities

Sponsorship opportunities are available at various levels. Sponsorship benefits include complimentary registrations, opportunities to lead a session or workshop, recognition on the event website, program and event materials, visibility on social media, and much more. Specific offerings vary by sponsorship level.

If you want to demonstrate your industry leadership, enhance your brand, share your knowledge, promote your products and services, and foster community building, contact events@unicode.org today to learn more. Sponsorship discounts are available to Unicode Full and Supporting Members.

About the Unicode Consortium

The Unicode Consortium is the premier non-profit open source, open standards body for the internationalization of all software and services. 

For more than 30 years, the Unicode Consortium has coordinated the efforts of a world-wide team of volunteer programmers and linguists to standardize, evolve, and maintain a global software foundation that allows virtually every computer system and service to help people connect using their native language.

For additional information, visit home.unicode.org.

Adopt a Character and Support Unicode’s Mission

Looking to give that special someone a special something?
Or maybe something to treat yourself?
🕉️💗🏎️🐨🔥🚀爱₿♜🍀

Adopt a character or emoji to give it the attention it deserves, while also supporting Unicode’s mission to ensure everyone can communicate in their own languages across all devices.

Each adoption includes a digital badge and certificate that you can proudly display!

Have fun and support a good cause

You can also donate funds or gift stock


As Unicode, Inc. is a US-based open source, open standards, non-profit, 501(c)(3) organization, your contribution may be eligible for a tax deduction. Please consult with a tax advisor for details.

Tuesday, June 11, 2024

Announcing Updated Technical Group Procedures at Unicode

We are pleased to announce that the Unicode Consortium has updated its Technical Group Procedures. Last revised in January 2022, these new procedures are more closely aligned with our current practices and the evolving needs of our technical governance. This change comes as a result of the collaborative efforts from our technical leadership and Board Governance Committee. Additional input was solicited from the Technical Committee delegates of Unicode’s membership and the Unicode Board. 

Aligning Practices with Needs

Over time, it has become evident that our former procedures were not fully reflective of what our technical groups needed. As a non-profit open source organization at the forefront of developing standards, code, and data that enable people around the world to use computers in any language, our technical procedures must facilitate effective and efficient governance.

Clarity and Efficiency in Governance

The newly revised procedures are designed to provide clear and comprehensive guidelines that will support our technical leadership in managing and directing our technical work more effectively. They also rationalize the structures and terminology so that, for example, there are two kinds of technical groups:
  • The Technical Committees (TCs), responsible for all of the decisions
  • Their Work Groups (WGs) that carry out TC actions, investigate and research issues, and make recommendations to their TCs
The updates to procedures are crucial for ensuring that our decision-making processes remain transparent and aligned with our organizational goals.

Thanks to Our Community

We extend our deepest thanks to everyone involved in the drafting of these updated procedures. The collaboration and dedication of our community members ensure that the Unicode Consortium continues to be a leader in the internationalization of software and services. We also appreciate the Unicode Board for its unanimous consent to adopt these new procedures.




Wednesday, June 5, 2024

Interview with Unicode Volunteer - Addison Phillips

Throughout the year, we engage in in-depth discussions with Unicode contributors to spotlight their vital contributions and share their stories. These conversations are a key part of our initiative to recognize the often unseen efforts of our volunteers, offering a more personal glimpse into the lives of those who drive our mission forward.

In our latest feature, meet Addison Phillips, a dedicated volunteer who brings a wealth of enthusiasm, expertise, and passion to the Unicode community.

Q: What do you do now and what’s your role with Unicode?

A: I am currently Chair of the Unicode Message Format Working Group. I retired after spending the last 14 years at Amazon, having previously worked at a variety of other organizations, including Yahoo, Web Methods (part of Software AG), and AT&T/Lucent Technologies. I’m primarily an “internationalization architect”, but I’ve also worked in the localization, tools, and consulting space.

Q: How long have you been a volunteer at Unicode? Is there an area of focus currently?

A: A long time. I have volunteered on different levels with Unicode since the early 2000s. I had some less consistent involvement in the late 1990s.

Most recently, I’ve been the Chair of the Message Format Working Group, which is part of the CLDR project. We released our Technical Preview a couple of months ago, as part of LDML 45.

A lot of locale data is focused on individual APIs–how do you format a number? How do you format a time? How do you call “January” or “Tuesday” or “Morocco” in a given language? But Message Format, to me, is a much better starting point for the people building the software–and the people localizing it. It’s a format that lets developers make easy-to-translate, grammatical messages and saves all these people from having to learn all the low-level formatting minutiae. Building an open, shared, consistent standard for formatting will unlock so much.
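To make the “individual APIs” mentioned above concrete, here is a minimal sketch using the standard ECMAScript `Intl` formatters, which expose CLDR-style locale data in many JavaScript runtimes. The helper names (`formatNumber`, `monthName`, `weekdayName`) are hypothetical and chosen only for this illustration. This is not Message Format itself, and the outputs noted in the comments assume a runtime that ships full locale data.

```typescript
// Each helper answers one low-level question: the caller chooses a locale and
// a formatter, and the runtime's locale data supplies the pattern or name.
// Helper names are hypothetical; only the Intl calls are standard.

// "How do you format a number?"
function formatNumber(locale: string, value: number): string {
  return new Intl.NumberFormat(locale).format(value);
}

// "What do you call January?" Dates are built and formatted in UTC so the
// result does not depend on the machine's time zone.
function monthName(locale: string, monthIndex: number): string {
  const date = new Date(Date.UTC(2024, monthIndex, 15));
  return new Intl.DateTimeFormat(locale, { month: "long", timeZone: "UTC" }).format(date);
}

// "What do you call Tuesday?"
function weekdayName(locale: string, date: Date): string {
  return new Intl.DateTimeFormat(locale, { weekday: "long", timeZone: "UTC" }).format(date);
}

console.log(formatNumber("de-DE", 1234567.89)); // "1.234.567,89"
console.log(monthName("fr", 0)); // "janvier"
console.log(weekdayName("es", new Date(Date.UTC(2024, 0, 16)))); // "martes" (16 January 2024 was a Tuesday)
```

Each call answers only one question. Assembling the pieces into a full, grammatical sentence is still left to the developer and the translator, which is the gap a message-level format is meant to close.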

Q: How did you first become involved with Unicode?

A: Initially, I attended Unicode conferences, as I was working in localization and the i18n consulting space. I was lobbying for Unicode support, for example from browser makers such as Netscape. I started presenting at Unicode conferences, including the Introduction to Internationalization tutorial. In the early 2000s, I joined the conference review committee and the editorial committee and also engaged my employers to become members of Unicode. In March 2003, I attended the Unicode conference in Prague, where Mark Davis (Unicode’s cofounder) and I cooked up a plan to address issues with locale identifiers–then a hot topic–which resulted in BCP 47. That work is a cornerstone of the locale data work that, today, is CLDR. Since then, I have had steady but what I would consider “lower tier” involvement with Unicode, including lots of communication about needed fixes.

Q: What do you enjoy about contributing to Unicode?

A: The camaraderie. These are the people who “speak my language”: they share the same concerns, and face the same problems. Unicode as an organization has been really effective at delivering impactful things, both as a consumer and promoter of these technologies. It’s been a powerful way to effect change. Really early in my career, I was working on an overseas mainframe project at AT&T. It was scary: I needed to find a system-specific encoding map. There was one guy who was rumored to have the mapping. I had to call the Adobe switchboard and hope they would connect me to this “Ken Lunde” person (luckily he took the call!). It was a tricky world to live in, with every company having its own operating system and each operating system having its own set of character encodings per language. Everything was bespoke. Because of Unicode, this issue no longer exists. Unicode changed how computing works and how it’s thought of; having CLDR data, and ICU as an implementation of it, has made life so much easier.

Q: Do you have a favorite Unicode project you’ve worked on? Why?

A: I have really liked a lot of the projects. I am most excited by the growth of the community engagement area. Education and awareness is the biggest problem we have in the internationalization space. The encoding of text and the support of different languages and cultures is now widely available, but nobody is aware of it. No one learns it in computer science courses. Engineers are busy and they generate this kind of “disinformation bubble” of quick hacks that localization teams in particular have to overcome.

In my roles with previous employers and in my consulting, the guiding idea (and the reason I did the tutorials at Unicode conferences) was that, before we can actually move forward, everyone needs to be on common ground, with a common understanding and a common vocabulary. I couldn’t be happier to see Unicode reaching the community with information and providing standard information so everyone, no matter what environment they come from, can learn this stuff–the right way.

Q: What contribution(s) to the Unicode community are you most proud of?

A: The locale identifier work (BCP 47) was pretty impactful. So were the more personal things: making people aware that Unicode is there and is a reliable source of information. Promoting Unicode has been an impactful thing. Over the years, I’ve taught the internationalization tutorial to thousands of people, which I believe has had a long-term impact.

Q: How did you become involved in computer science?

A: I had a job in the 1980s at a company that built shopping centers, and, among other things, operated a bookstore. They had developed a retail system running on minicomputers that they sold to other independent bookstores, and I worked for the owner developing that system: that was my first professional job and it laid the groundwork for everything. Later, I had a job with the localization/internationalization group at AT&T: once you’ve shipped “not English” there’s no going back. I followed that to internationalization consulting, working with Bill Hall.

Q: What is your favorite book?

A: I am an avid reader so it’s hard to pick just one book. My preferred genres are fantasy and science fiction. (Fun fact: I went to Amazon to work on Kindle!)

Q: Where did you grow up?

A: Born in France, but my parents are Americans. When I was young, I lived in France and Germany. I went to high school and spent my formative years in Carmel, CA.

Q: Beach or Mountains?

A: Beach. I mean, it pretty much has to be “beach”, since I live in a town called “Dillon’s Beach”.

Q: Any advice for anyone interested in volunteering at Unicode?

A: Two things. First, jump in, the water’s fine. The Unicode space can be heavy with jargon and seem full of insider knowledge, but don’t be put off by that. Ask questions, because people are always super excited to share. There truly are no dumb questions. Don’t think that just because things are operating a certain way, you can’t question it, as there might be a new, better, or different way to do it. Maybe nobody said anything before! If you come with well-thought-out questions, there will always be a positive reception. Unicode is a happy and helpful space to work in.

Second, give back. Unicode is an incredibly small organization. The number of contributors is way smaller than the impact Unicode has. And Unicode could do so much more, if only we had more people contributing. Linguistic and cultural support in software could be so much more powerful, if only we had contributions.

Q: Anything else you’d like to share?

A: I’ve spent 25 years at W3C and I’ve been the Chair of the Internationalization Working Group for most of that time. We are a partner organization in promoting internationalization. We need help there too.

