Thursday, December 2, 2021

The Most Frequently Used Emoji of 2021

The Unicode Emoji Mirror Project

92% of the world’s online population use emoji — but which emoji are we using? The Unicode Consortium, the not-for-profit organization responsible for digitizing the world’s languages, gathers information about how frequently emoji are used. Looking at patterns of usage helps to determine what new emoji should be added to the Unicode Standard. As part of this effort, we are making that data available to the public. 

The new Unicode Emoji Frequency page lists the Unicode v12.0 emoji ranked in order of how frequently they were used in 2021 and what has changed since 2019. Check it out for more analysis, insights and patterns that illustrate our collective experience during a global pandemic.

#UnicodeEmojiMirror


Over 144,000 characters are available for adoption to help the Unicode Consortium’s work on digitally disadvantaged languages


Wednesday, November 17, 2021

Unicode Emoji 15.0 Provisional Candidates

The Unicode Technical Committee has approved the list of provisional candidates for Emoji 15.0. They are slated for release in September 2022 together with Unicode 15.0. These candidates were identified by the Unicode Emoji Subcommittee after reviewing proposals ranked according to previously determined selection factors.

The list of provisional emoji candidates can be found here. Note that they have not yet been assigned code points or properties. For comments on these candidates, please reference PRI #435 in your feedback.

How to Provide Feedback: For information about how to discuss this Public Review Issue and how to supply formal feedback, please see the feedback and discussion instructions.

Feedback is reviewed by the relevant committee according to its meeting schedule.



Wednesday, November 10, 2021

ICU4X 0.4 Released

Unicode® ICU4X 0.4 has just been released. This revision brings an implementation of Unicode Properties, major performance and memory improvements for DateTimeFormat, and extends the data provider data loading models with BlobDataProvider.

ICU4X 0.4 also adds initial time zone support in DateTimeFormat, week of month/year, iteration APIs in Segmenter, and an experimental ListFormatter.

The ICU4X team is now shifting to work on the 0.5 release, in accordance with the roadmap and a product requirements document that set sights on a stable 1.0 release in Q2 2022.

ICU4X aims to develop a highly modular set of internationalization components for resource-constrained environments, portable across programming languages.

Multiple early adopters use ICU4X in pre-release software in Rust, C, C++, and WebAssembly. The team is ready to onboard additional early adopters to refine the APIs, build processes, and feature sets before the 1.0 release. The team is also looking for contributors to write code generation for additional target programming languages. For more information, please open a discussion on the ICU4X GitHub.

For details, please see the changelog.



Thursday, October 28, 2021

Unicode CLDR v40 now available!

Unicode CLDR version 40 is now available, with approximately 140,000 new or modified data fields.

In this release, the focus is on:

Grammatical features (gender and case)

In many languages, forming grammatical phrases requires dealing with grammatical gender and case. Without that, it can sound as bad as "on top of 3 hours" instead of "in 3 hours". The overall goal for CLDR is to supply building blocks so that implementations of advanced message formatting can handle gender and case.
  • Phase 1 (v39) of grammatical features included just 12 locales (da, de, es, fr, hi, it, nl, no, pl, pt, ru, sv) for all units of measurement.
  • Phase 2 (v40) has expanded the number of locales by 29 (am, ar, bn, ca, cs, el, fi, gu, he, hr, hu, hy, is, kn, lt, lv, ml, mr, nb, pa, ro, si, sk, sl, sr, ta, te, uk, ur), but for a more restricted number of units.
  • Phase 3 (v41) will further expand the units.
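
To make the "in 3 hours" problem concrete, here is a minimal Python sketch of how an implementation might pick a unit pattern by grammatical case and plural category. It is an illustration under stated assumptions, not CLDR's data format or any library's API: the `UNIT_PATTERNS` table is a hand-written German sample, and the names `plural_category`, `format_unit`, and `in_duration` are hypothetical.

```python
# Hand-written German sample patterns for illustration only.
# A real implementation would load per-locale patterns from CLDR data.
UNIT_PATTERNS = {
    ("day", "nominative", "one"): "{0} Tag",
    ("day", "nominative", "other"): "{0} Tage",
    ("day", "dative", "one"): "{0} Tag",
    ("day", "dative", "other"): "{0} Tagen",
}


def plural_category(n: int) -> str:
    """Simplified plural rule for German integers (assumption: integers only)."""
    return "one" if n == 1 else "other"


def format_unit(n: int, unit: str, case: str) -> str:
    """Format a quantity, falling back to the nominative pattern if the case is missing."""
    category = plural_category(n)
    pattern = UNIT_PATTERNS.get((unit, case, category))
    if pattern is None:
        pattern = UNIT_PATTERNS[(unit, "nominative", category)]
    return pattern.format(n)


def in_duration(n: int, unit: str) -> str:
    """Build an 'in N units' phrase; German 'in' with a time span takes the dative."""
    return "in " + format_unit(n, unit, "dative")


print(format_unit(3, "day", "nominative"))  # standalone quantity
print(in_duration(3, "day"))                # quantity embedded in a phrase
```

The point of the sketch is that the message, not the unit formatter, knows which case is required; the formatter only needs case-specific patterns to choose from.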

Emoji v14 names and search keywords

CLDR supplies short names and search keywords for the new emoji, so that implementations can build on them to provide, for example, type-ahead in keyboards.
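
As a rough illustration of the type-ahead use case, the following Python sketch builds a prefix index over emoji names and keywords. The `SAMPLE` entries are hypothetical stand-ins written for this example rather than actual CLDR annotations, and `build_index` and `suggest` are assumed helper names, not part of any CLDR or ICU API.

```python
from collections import defaultdict

# Hypothetical sample annotations; real names and keywords would come from CLDR.
SAMPLE = {
    "🫠": {"name": "melting face", "keywords": ["melt", "dissolve", "liquid"]},
    "🫡": {"name": "saluting face", "keywords": ["salute", "ok", "yes"]},
    "🫶": {"name": "heart hands", "keywords": ["love", "heart"]},
}


def build_index(data):
    """Map every prefix of every name word and keyword to the matching emoji."""
    index = defaultdict(set)
    for emoji, entry in data.items():
        terms = entry["name"].split() + entry["keywords"]
        for term in terms:
            term = term.lower()
            for end in range(1, len(term) + 1):
                index[term[:end]].add(emoji)
    return index


def suggest(index, typed):
    """Return emoji whose name or keywords start with the typed text, in code point order."""
    return sorted(index.get(typed.lower(), set()))


index = build_index(SAMPLE)
print(suggest(index, "sal"))
print(suggest(index, "fa"))
```

Precomputing all prefixes trades memory for constant-time lookups per keystroke; a trie would be the usual choice for larger vocabularies.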

Modernized Survey Tool front end

The Survey Tool is used to gather all the data for locales. The outmoded JavaScript infrastructure was modernized to make it easier to add enhancements (such as the split-screen dashboard) and to fix bugs.

Specification Improvements

The LDML specification has some important fixes and clarifications for Locale Identifiers, Dates, and Units of Measurement.



Please see the CLDR v40 Release Note for details.

Unicode CLDR provides key building blocks for software supporting the world's languages. CLDR data is used by all major software systems (including all mobile phones) for their software internationalization and localization, adapting software to the conventions of different languages.



ICU 70 Released

Unicode® ICU 70 has just been released. ICU 70 incorporates updates to Unicode 14, including new characters, scripts, emoji, and corresponding API constants. ICU 70 adds support for emoji properties of strings. It also updates to CLDR 40 locale data with many additions and corrections. ICU 70 also includes many other bug fixes and enhancements, especially for measurement unit formatting, and it can now be built and used with C++20 compilers.

ICU is a software library widely used by products and other libraries to support the world's languages, implementing both the latest version of the Unicode Standard and of the Unicode locale data (CLDR).

For details, please see https://icu.unicode.org/download/70.

Note: Our website has moved. Please adjust your bookmarks.

