Tuesday, September 14, 2021

Announcing The Unicode® Standard, Version 14.0

Version 14.0 of the Unicode Standard is now available, including the core specification, annexes, and data files. This version adds 838 characters, for a total of 144,697 characters. These additions include five new scripts, for a total of 159 scripts, as well as 37 new emoji characters.

The new scripts and characters in Version 14.0 add support for modern language groups in Bosnia, India, Indonesia, Iran, Java, Malaysia, Mongolia, Myanmar, Pakistan, and the Philippines, plus other languages in Africa and North America, including:
  • Arabic script additions that include honorifics and additions for Quranic use, and characters used to write languages across Africa, the Balkans, and South and Southeast Asia
  • The Vithkuqi script historically used to write Albanian and currently undergoing a modern revival
  • The Tangsa script used to write the Tangsa language, spoken in India and Myanmar
  • The Toto script used to write the Toto language in northeast India
  • Many Latin script additions for extended IPA
Popular symbol additions include:
  • 37 emoji characters, including several new emoji for emotion and hand gestures (smileys, hands, animals and nature, food and drink, transport, and activities). For the full list of new emoji characters, see emoji additions for Unicode 14.0, and Emoji Counts. For a detailed description of support for emoji characters by the Unicode Standard, see UTS #51, Unicode Emoji.
Other symbol and notational additions include:
  • The som currency sign used in the Kyrgyz Republic
  • Znamenny musical notation developed in Russia
Support for other modern languages and scholarly work extends worldwide, including:
  • Cypro-Minoan, historically used primarily on the island of Cyprus
  • Old Uyghur, historically used in Central Asia and elsewhere to write Turkic, Chinese, Mongolian, Tibetan, and Arabic languages
  • Ahom, Balinese, Brahmi, Canadian aboriginal languages, Glagolitic, Kaithi, Kannada, Mongolian, Tagalog, Takri, and Telugu
  • Arabic support for Hausa, Wolof, Hindko, and Punjabi, and Ethiopic support for Gurage
Important chart font updates, including:
  • Significant updates to the CJK auxiliary blocks and enclosed alphanumerics
Unicode properties and specifications determine the behavior of text on computers and phones. Changes in Version 14.0 include the following Unicode Standard Annexes and Technical Standards that have notable modifications:

Five important Unicode annexes updated for Version 14.0:
Three important Unicode specifications updated for Version 14.0:
The Unicode Standard is the foundation for all modern software and communications around the world, including operating systems, browsers, laptops, and smart phones—plus the Internet and Web (URLs, HTML, XML, CSS, JSON, etc.). The Unicode Standard, its associated standards, and data form the foundation for CLDR and ICU releases.
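For readers who want to check whether their own environment already recognizes the Version 14.0 additions, here is a minimal Python sketch. It relies only on the standard `unicodedata` module; the helper names (`parse_version`, `supports_unicode_14`, `known_name`) and the two sample code points are chosen here for illustration and are not part of the announcement. The result depends on which Unicode data version your Python build ships with.

```python
import unicodedata

# Two code points added in Unicode 14.0, picked as illustrative samples.
SAMPLES = {
    0x20C0: "SOM SIGN",       # the som currency sign
    0x1FAE0: "MELTING FACE",  # one of the new emoji
}


def parse_version(text):
    """Turn a version string such as '14.0.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def supports_unicode_14(version_text=unicodedata.unidata_version):
    """Report whether the given Unicode data version is 14.0 or later."""
    return parse_version(version_text) >= (14, 0)


def known_name(code_point):
    """Return the character name this runtime knows, or None if unassigned."""
    return unicodedata.name(chr(code_point), None)


if __name__ == "__main__":
    print("Unicode data version:", unicodedata.unidata_version)
    print("Version 14.0 or later:", supports_unicode_14())
    for code_point, expected in SAMPLES.items():
        name = known_name(code_point) or "(not assigned in this runtime)"
        print(f"U+{code_point:04X} expected {expected!r}, found {name!r}")
```

On a runtime with older Unicode data, the sample code points are simply reported as not assigned. This reflects the data bundled with the interpreter, not whether fonts or the operating system support the characters.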

Over 144,000 characters are available for adoption to help the Unicode Consortium’s work on digitally disadvantaged languages


Thursday, September 9, 2021

Unicode CLDR v40 Alpha available for testing

The Unicode CLDR v40 Alpha is now available for testing. The alpha has already been integrated into the development version of ICU. We would especially appreciate feedback from non-ICU consumers of CLDR data. Feedback can be filed at CLDR Tickets.

Alpha means that the main data and charts are available for review, but the specification, JSON data, and other components are not yet ready for review. Some data may change if showstopper bugs are found. The planned schedule is:
  • Sep 21 — Beta (data)
  • Oct 06 — Beta2 (spec)
  • Oct 27 — Release
In CLDR v40, the main focus is on:
  • Grammatical features (gender and case) for units of measurement in additional locales

    Phase 1 (v39) of grammatical features included just 12 locales (da, de, es, fr, hi, it, nl, no, pl, pt, ru, sv).

    Phase 2 (v40) has expanded the number of locales by 29 (am, ar, bn, ca, cs, el, fi, gu, he, hr, hu, hy, is, kn, lt, lv, ml, mr, nb, pa, ro, si, sk, sl, sr, ta, te, uk, ur), but for a narrower set of units.

  • Emoji v14 names and search keywords
  • Modernized Survey Tool front end.
There are many other changes: to find out more, see the draft CLDR v40 release page, which has information on accessing the data, reviewing charts of the changes, and necessary migration changes.

Unicode CLDR provides key building blocks for software supporting the world’s languages. CLDR data is used by all major software systems (including all mobile phones) for their software internationalization and localization, adapting software to the conventions of different languages.



Tuesday, September 7, 2021

Unicode Consortium Announces Version 14.0 Cover Design

The Unicode Consortium is pleased to announce the new design selected for the cover of the forthcoming print-on-demand publication of The Unicode Standard, Version 14.0. The Unicode Consortium issued an open call for artists and designers to submit cover design proposals. All submitted designs were reviewed by an independent panel.

The selected cover artwork for Version 14.0 is an original design by Sophia Tai, an MA student in Typeface Design at the University of Reading. Her cover art represents type in boxes, which shares a visual language with the arrangement of metal type, as well as the Unicode code charts. She selected a global mix of characters to present a variety of writing systems, using neon colors to create liveliness. The neutral background represents a sense of being down to earth, as well as the longevity and preservation of writing systems.

Two runner-up designs were also selected. One is a contemporary design by Beatriz de Paula Mattos, a graphic design student at the University of Vale do Itajaí, Brazil. The other runner-up design was created by Jesús Barrientos Mora, a professor with a degree in Type Design, who also leads the Talavera Type Workshop foundry in Puebla, Mexico.




Tuesday, August 3, 2021

Announcing Internationalization & Unicode Conference #45 Keynote Speaker Gretchen McCulloch

Taking Playfulness Seriously—When character sets are used in unexpected ways

Gretchen McCulloch, photo by Yvon Huynh
HEEEEELLLLLLOOOO friends of Unicode! HaVE yoU HEard? 🚨🚨 This year’s keynote speaker is Gretchen McCulloch, internet linguist and bestselling author of the 2019 book, Because Internet: Understanding the New Rules of Language. ✍️ You ✍️ may ✍️ also ✍️ know ✍️ Gretchen ✍️ from ✍️ her ✍️ column ✍️ about ✍️ internet ✍️ language ✍️ in ✍️ WIRED. If you aren’t familiar with Gretchen’s book, it includes great insights into how language and technology evolve! Don’t miss this unique opportunity to hear from her in person. (♥ω♥*)

Her talk, “Taking Playfulness Seriously—When character sets are used in unexpected ways,” explores those trailblazing language disrupters, who aren’t checking out the Oxford English dictionary or asking themselves, “What would my college English prof do about this comma?” You know who you are! 👀 She’ll discuss all the creative ways real-life netizens playfully create ASCII art out of text or combine emoji to convey new meanings, as well as the problems that arise when these kinds of creative uses clash with technical tools behind the scenes that aren’t expecting the unexpected—and what some solutions might look like.

In addition to Gretchen McCulloch, the conference offers Unicode tutorials, talks and panels on internationalization, web design, emoji, indigenous languages, historical scripts, and more. Of course, the conference also includes plenty of networking opportunities, as well as a special celebration of the Unicode Consortium’s 30th anniversary!


See What’s Happening at IUC 45

For thirty years the Internationalization & Unicode® Conference (IUC) has been the preeminent event highlighting the latest innovations and best practices of global and multilingual software providers. Join us in Santa Clara to promote your ideas and experiences working with natural languages, multicultural user interfaces, producing and supporting multinational and multilingual products, linguistic algorithms, applying internationalization across mobile and social media platforms, or advancements in relevant standards.

Join expert practitioners and industry leaders as they present detailed recommendations for businesses looking to expand to new international markets and those seeking to improve time to market and cost-efficiency of supporting existing markets. Recent conferences have provided specific advice on designing software for European countries, Latin America, China, India, Japan, Korea, the Middle East, and emerging markets.

Join us for the Internationalization & Unicode Conference 45, October 13-15, 2021, Santa Clara, California. To register and learn more, please visit the Internationalization & Unicode Conference website. Object Management Group® (OMG®) organizes the Internationalization & Unicode Conferences around the world under an exclusive license granted by the Unicode Consortium.

Tuesday, July 20, 2021

The Unicode Consortium Welcomes Toral Cowieson as Executive Director & COO

Since its founding, the Unicode Consortium has grown and expanded its charter and scope. We’re embarking on a new chapter in the evolution of the Consortium and are pleased to announce the appointment of Toral Cowieson in the newly created position of Executive Director & COO.

“We are thrilled to have Toral joining the team,” said Mark Davis, President and cofounder of the Consortium. “She brings a wealth of experience in leadership across non-profits, corporations, and board service. Her recent time at the Internet Society, including as head of Strategy and Impact Measurement, puts Unicode in good stead for this next stage of growth.”

In this senior executive position reporting to the Chair of the Board of Directors, Ms. Cowieson will collaborate with the Board, officers, and team to extend the technical mission and impact, set the future agenda and program priorities, and ensure the long-term health and sustainability of the organization.

“Unicode standards are at the heart of how users seamlessly receive and share information across the nearly 22 billion devices around the world. I’m honored and excited to be joining the Consortium at this juncture, and look forward to working with the Board, staff, and the extended Unicode community to advance the mission and have an even greater impact in the years to come,” commented Ms. Cowieson.

In addition to Ms. Cowieson joining as Executive Director, the Consortium is also pleased to announce the following changes:

Board and Other Leadership Updates

Iris Orriss, who joined the Unicode Board in 2019, has been elected as the Treasurer of the Consortium. She is VP of Internationalization, Product Quality, and Product Experience Analytics at Facebook. Ms. Orriss is also Chair of the Board’s Finance and Funding Committee.

Greg Welch, member of the Board since 2013, has been elected as the Secretary of the Consortium and carries forward the excellent work done in this office by Michel Suignard for more than a decade. Mr. Welch is also Chair of the Board’s Governance & Nominating Committee.

Markus Scherer, the Chair of the ICU Technical Committee, has been appointed a Vice President. He is a member of the Google software internationalization team, focusing on the effective use of Unicode and on the development and deployment of cross-product internationalization libraries.

Announcing Unicode Fellows

The Consortium has recently created a new category for distinguished contributors, whose deep, long-term knowledge of internationalization and dedication to work on standards has greatly benefited the Consortium for many years. The Consortium is pleased to announce its two inaugural Unicode Fellows.

Peter Edberg has been named a Unicode Fellow. He has worked on internationalization, text and language support at Apple since 1988. He has been Apple’s representative to the Consortium for many years, and has been actively involved since 2008 with the Unicode CLDR and ICU projects.

Michel Suignard has been named a Unicode Fellow after serving as Secretary for the Unicode Consortium from 2007 to 2020. He worked for more than twenty-five years at Microsoft, where he held various positions in the development and sales divisions, many involving the development of the Unicode Standard. He is currently an independent consultant working on character encoding related matters, such as Internationalized Domain Names (IDN) and typography. Michel is the code chart editor for the Unicode Standard and is also the project editor of ISO/IEC 10646, which is the ISO standard aligned with the Unicode Standard.
