Tuesday, April 17, 2018

Submissions open for 2020 Emoji

The deadline for 2019 emoji submissions was April 1, so any submissions received after that date will be considered for release in 2020.

The submission form has undergone some revision, so please be sure to review the new text before putting together a proposal. There is a limited number of emoji characters considered each year, so be sure to follow the form so that you can provide the best case for any proposed emoji.

The emoji subcommittee has also produced a new page which shows the Emoji Requests submitted so far. You can look at what other people have proposed or suggested. In many cases, people have made suggestions, but have not followed through with complete submission forms, or have submitted forms, but not followed through on requested modifications to the forms.


Over 130,000 characters are available for adoption, to help the Unicode Consortium’s work on digitally disadvantaged languages.

[badge]

Friday, April 13, 2018

Last Call on Unicode 11.0 Review

The beta review period for Unicode 11.0 and related technical standards will close on April 23, 2018. This is the last opportunity for technical comments before version 11.0 is released in Q2 2018. Implementers and interested parties are encouraged to download data files, review proposed updates, and submit comments.

Unicode 11.0 adds seven new scripts, including Hanifi Rohingya, and 66 additional emoji characters, including four new components for hair color (for a total of 157 emoji sequences). The set of Georgian Mtavruli capital letters has been added to support modern casing practices.

In addition to the Unicode core specification, five Unicode Standard Annexes and two Unicode Technical Standards have significant specification and/or data file updates that are correlated with the new additions for Unicode 11.0.0. Review of those changes is strongly encouraged during the beta review period.

UAX #14, Unicode Line Breaking Algorithm
  • Uses Extended_Pictographic property for future-proofing
UAX #29, Unicode Text Segmentation
  • New support for Indic virama handling
  • Uses Extended_Pictographic property for future-proofing
  • A new table of formal regex definitions
UAX #31, Unicode Identifier and Pattern Syntax
  • Refines the use of ZWJ in identifiers
  • Broadens the definition of hashtag identifiers
UAX #38, Unicode Han Database (Unihan)
  • Five new fields and improved regular expressions
  • Document extension of Unihan properties to non-Unihan
UAX #44, Unicode Character Database
  • New property Equivalent_Unified_Ideograph
  • New regular expressions for Bidi_Paired_Bracket & Equivalent_Unified_Ideograph
  • More discussion of emoji variation sequences
  • Clarification of values allowed for the Age property
UTS #10, Unicode Collation Algorithm
  • Updates data to Unicode 11.0
  • Clarification of search tailoring in visual-order scripts
UTS #39, Unicode Security Mechanisms
  • Updates data to Unicode 11.0
  • Enhances discussions of joining controls & combining sequences
UTS #46, Unicode IDNA Compatibility Processing
  • Updates data to Unicode 11.0
  • Changes the format of the test file for arbitrary input settings
  • Updates input setting for Transitional_Processing
UTS #51, Unicode Emoji
  • Supplies Extended_Pictographic property for future-proofing
  • Simplifies emoji sequence definitions
  • EBNF and Regex expressions for loose matches
  • More proposed guidelines: gender-neutral emoji, skin-tone modifiers, ZWJ visible fallbacks, hair-style components
  • Mechanism for changing the “facing” direction for emoji
Please review the documentation, adjust your code, test the data files, and report errors and other issues to the Unicode Consortium by April 23, 2018. Feedback instructions are in each public review page. For more information, see the open public review issues.
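
When adjusting code for a new Unicode version, it can help to first confirm which version of the character database a runtime actually ships. The following is a minimal Python sketch using the standard `unicodedata` module. The helper names (`unicode_version`, `supports_unicode`, `is_assigned`) are hypothetical, and U+1C90 is used here only as an assumed probe for the Georgian Mtavruli capital letters added in 11.0.

```python
import unicodedata

def unicode_version():
    """Return the runtime's Unicode Character Database version as a tuple of ints."""
    return tuple(int(part) for part in unicodedata.unidata_version.split("."))

def supports_unicode(major, minor=0):
    """True if the runtime's character database is at least the given version."""
    return unicode_version()[:2] >= (major, minor)

def is_assigned(ch):
    """True if the character has a general category other than Cn (unassigned)."""
    return unicodedata.category(ch) != "Cn"

if __name__ == "__main__":
    print("Unicode data version:", unicodedata.unidata_version)
    # Assumption: U+1C90 is one of the Mtavruli capital letters new in 11.0.
    print("U+1C90 assigned:", is_assigned("\u1C90"))
```

On a runtime whose database predates 11.0, the probe character reports as unassigned, which is a quick signal that the newer data files are not yet in use.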


Monday, April 9, 2018

Last call on UTS #51 Unicode Emoji

The Unicode Consortium is soliciting feedback on the text and data changes in the proposed update of UTS #51, Unicode Emoji. This specification is now synchronized with Unicode Version 11.0 and slated for release at the same time, in early June. Feedback is due by April 23; this is the last chance to provide feedback on any changes and any open review issues.

The recent changes modify the definition of emoji combining sequences, add a section describing emoji property stability (including under operations such as lowercasing), add a section providing EBNF and regex expressions for loose matches on emoji in running text, and clarify the treatment of gender-neutral characters.
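
To give a feel for what a loose match on emoji in running text might look like, here is a rough Python sketch. It is not the expression from the specification: Python's standard `re` module has no Emoji property, so the `BASE` character class below is a hand-picked approximation (a few blocks where many pictographic emoji live) that will both over-match and under-match. It also deliberately leaves out digits, `#`, and `*`, so keycap sequences are not found. All names in the block are hypothetical.

```python
import re

# Approximate stand-in for an "emoji" character class. NOT the real Emoji property.
BASE = "[\u2600-\u27BF\U0001F300-\U0001FAFF]"
RI = "[\U0001F1E6-\U0001F1FF]"                # regional indicators (flag pairs)
EMOD = "[\U0001F3FB-\U0001F3FF]"              # skin-tone modifiers
TAGS = "[\U000E0020-\U000E007E]+\U000E007F"   # tag characters plus terminator

# Optional suffix: a modifier, an emoji presentation selector
# (optionally followed by an enclosing keycap), or a tag sequence.
SUFFIX = f"(?:{EMOD}|\uFE0F\u20E3?|{TAGS})?"
ELEMENT = f"{BASE}{SUFFIX}"

# Either a pair of regional indicators, or one element followed by
# any number of ZWJ-joined elements.
LOOSE_EMOJI = re.compile(f"{RI}{RI}|{ELEMENT}(?:\u200D{ELEMENT})*")

def find_emoji(text):
    """Return every loose emoji match found in text, in order."""
    return [m.group(0) for m in LOOSE_EMOJI.finditer(text)]

if __name__ == "__main__":
    sample = "Hi \U0001F44D\U0001F3FD and \U0001F469\u200D\U0001F9B0!"
    for match in find_emoji(sample):
        print([f"U+{ord(c):04X}" for c in match])
```

The pattern is intentionally permissive: it accepts any ZWJ-joined chain of elements without checking whether that particular sequence is a recognized emoji, which is the sense of "loose" assumed here.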

Note: the emoji characters and properties for Version 11.0 have already been finalized, so this last call is just for the text of the specification, not the emoji characters or properties.

Tuesday, April 3, 2018

Updating Three Specifications Synchronized with Unicode Version 11.0

The Unicode Consortium is soliciting feedback on the text and data changes in the following proposed update specifications. These specifications are synchronized with Unicode Version 11.0 and slated for release at the same time, in early June. Feedback is due by April 23; this is the last chance to provide feedback on any changes and any open review issues.

UTS #39, Unicode Security Mechanisms updates data for Unicode 11.0, adds a new section describing the handling of Joining Controls (ZWJ and ZWNJ), and adds tests to Section 5.4 Optional Detection for checking nonspacing marks and sequences.
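
As an illustration of the kind of checks involved, the sketch below locates joining controls in a string and flags suspicious runs of nonspacing marks. It is an assumption-laden example, not the tests defined in the specification: the function names are hypothetical, and the `max_run` threshold is an illustrative default rather than a value taken from this announcement.

```python
import unicodedata

ZWNJ, ZWJ = "\u200C", "\u200D"

def find_joining_controls(text):
    """Return (index, name) pairs for each ZWJ or ZWNJ in text."""
    names = {ZWNJ: "ZWNJ", ZWJ: "ZWJ"}
    return [(i, names[ch]) for i, ch in enumerate(text) if ch in names]

def suspicious_mark_runs(text, max_run=4):
    """Flag runs of nonspacing/enclosing marks (gc=Mn or gc=Me) that either
    repeat the same mark back to back or are longer than max_run.

    max_run is an illustrative threshold (an assumption of this sketch).
    Returns (start_index, run) pairs.
    """
    findings = []
    run = []
    start = 0
    # The trailing NUL is a sentinel so the final run is flushed.
    for i, ch in enumerate(text + "\0"):
        if unicodedata.category(ch) in ("Mn", "Me"):
            if not run:
                start = i
            run.append(ch)
            continue
        if run:
            repeated = any(a == b for a, b in zip(run, run[1:]))
            if repeated or len(run) > max_run:
                findings.append((start, "".join(run)))
            run = []
    return findings

if __name__ == "__main__":
    print(find_joining_controls("a\u200Db\u200Cc"))
    print(suspicious_mark_runs("e\u0301\u0301"))
```

A real implementation would also need to decide which joining controls are legitimate in context; this sketch only reports where they occur.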

UTS #46, Unicode IDNA Compatibility Processing updates data for Unicode 11.0 and extends the format of the test data file. The new test format allows implementations to determine more precisely where any validity test fails, and allows the implementation to filter for the exact combination of supported features.
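
The idea of filtering test cases by supported features can be sketched as follows. This is a hedged illustration only: it assumes a semicolon-delimited layout with `#` comments and bracketed sets of status codes, and the sample lines are invented for the example rather than copied from the real test file. The header of the actual file should be consulted for the authoritative column definitions. All function names are hypothetical.

```python
import re

# Invented sample lines in the ASSUMED layout: source first, then fields that
# may hold a bracketed set of status codes. Not taken from the real data file.
SAMPLE = """\
# comment lines and blank lines are skipped

plain.example; ; ; ; ; ;
joiner.example; ; [C1]; ; ; ;
hyphen-.example; ; [V3]; ; [V3]; ;
mixed.example; ; [B1, C2]; ; ; ;  # trailing comment
"""

def parse_line(line):
    """Split one data line into stripped fields, or return None if it is empty."""
    body = line.split("#", 1)[0].strip()
    if not body:
        return None
    return [field.strip() for field in body.split(";")]

def status_codes(field):
    """Parse a bracketed set such as '[B1, C2]' into {'B1', 'C2'}."""
    return set(re.findall(r"[A-Za-z]+\d*", field))

def load_tests(lines):
    """Return (source, fields, codes) for each data line, where codes is the
    union of all bracketed status sets on that line."""
    tests = []
    for line in lines:
        fields = parse_line(line)
        if fields is None:
            continue
        codes = set()
        for field in fields[1:]:
            if field.startswith("["):
                codes |= status_codes(field)
        tests.append((fields[0], fields, codes))
    return tests

def runnable(tests, supported):
    """Keep only tests whose status codes are all checks the implementation supports."""
    return [t for t in tests if t[2] <= set(supported)]

if __name__ == "__main__":
    tests = load_tests(SAMPLE.splitlines())
    for source, _, codes in runnable(tests, {"V3"}):
        print(source, sorted(codes))
```

With this approach, an implementation that only performs some of the checks can skip lines that depend on the others instead of reporting them as failures.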

UTS #10, Unicode Collation Algorithm updates data for Unicode 11.0, and otherwise makes no material changes to the text.

Details of the Unicode 11.0 Beta and open Public Review Issues are available on the Unicode website.

Friday, March 30, 2018

ICU 61 Released

Unicode® ICU 61 has just been released. This version upgrades to CLDR 33, has a new Java implementation for number and currency parsing, and includes many small API additions, improvements, and bug fixes.

ICU is a software library widely used by products and other libraries to support the world's languages, implementing both the latest version of the Unicode encoding standard and of the Unicode locale data (CLDR).

For details please see http://site.icu-project.org/download/61