Wednesday, May 9, 2018

Emoji Draft Candidates for 2019

104 proposed Emoji Candidates (60 characters plus variants) have advanced to Draft Candidate status for 2019. These are the short-listed candidates for Emoji 12.0, which is planned for release in Q1 2019 together with Unicode 12.0.

The draft candidates include the following:

Guide dog, kite, white heart

See Emoji Candidates for the full list.

That list of draft candidates will be reviewed and finalized this September. Feedback is solicited on short names, keywords, and ordering. See also the Emoji 11.0 charts.


Over 130,000 characters are available for adoption, to help the Unicode Consortium’s work on digitally disadvantaged languages.


Tuesday, April 17, 2018

Submissions open for 2020 Emoji

The deadline for emoji for 2019 was April 1, so any submissions received after that date are considered for release in 2020.

The submission form has undergone some revision, so please be sure to review the new text before putting together a proposal. Only a limited number of emoji characters are considered each year, so follow the form closely to make the best case for any proposed emoji.

The emoji subcommittee has also produced a new page which shows the Emoji Requests submitted so far. You can look at what other people have proposed or suggested. In many cases, people have made suggestions, but have not followed through with complete submission forms, or have submitted forms, but not followed through on requested modifications to the forms.



Friday, April 13, 2018

Last Call on Unicode 11.0 Review

The beta review period for Unicode 11.0 and related technical standards will close on April 23, 2018. This is the last opportunity for technical comments before version 11.0 is released in Q2 2018. Implementers and interested parties are encouraged to download data files, review proposed updates, and submit comments.

Unicode 11.0 adds seven new scripts, including Hanifi Rohingya, and 66 additional emoji characters, including four new components for hair color (for a total of 157 emoji sequences). The set of Georgian Mtavruli capital letters has been added to support modern casing practices.
In addition to the Unicode core specification, five Unicode Standard Annexes and two Unicode Technical Standards have significant specification and/or data file updates that are correlated with the new additions for Unicode 11.0.0. Review of those changes is strongly encouraged during the beta review period.

UAX #14, Unicode Line Breaking Algorithm
  • Uses Extended_Pictographic property for future-proofing
UAX #29, Unicode Text Segmentation
  • New support for Indic virama handling
  • Uses Extended_Pictographic property for future-proofing
  • A new table of formal regex definitions
UAX #31, Unicode Identifier and Pattern Syntax
  • Refines the use of ZWJ in identifiers
  • Broadens the definition of hashtag identifiers
UAX #38, Unicode Han Database (Unihan)
  • Five new fields and improved regular expressions
  • Document extension of Unihan properties to non-Unihan
UAX #44, Unicode Character Database
  • New property Equivalent_Unified_Ideograph
  • New regular expressions for Bidi_Paired_Bracket & Equivalent_Unified_Ideograph
  • More discussion of emoji variation sequences
  • Clarification of values allowed for the Age property
UTS #10, Unicode Collation Algorithm
  • Updates data to Unicode 11.0
  • Clarification of search tailoring in visual-order scripts
UTS #39, Unicode Security Mechanisms
  • Updates data to Unicode 11.0
  • Enhances discussions of joining controls & combining sequences
UTS #46, Unicode IDNA Compatibility Processing
  • Updates data to Unicode 11.0
  • Changes the format of the test file for arbitrary input settings
  • Updates input setting for Transitional_Processing
UTS #51, Unicode Emoji
  • Supplies Extended_Pictographic property for future-proofing
  • Simplifies emoji sequence definitions
  • EBNF and regex definitions for loose matches
  • More proposed guidelines: gender-neutral emoji, skin-tone modifiers, ZWJ visible fallbacks, hair-style components
  • Mechanism for changing the “facing” direction for emoji
Please review the documentation, adjust your code, test the data files, and report errors and other issues to the Unicode Consortium by April 23, 2018. Feedback instructions are in each public review page. For more information, see the open public review issues.



Monday, April 9, 2018

Last call on UTS #51 Unicode Emoji

The Unicode Consortium is soliciting feedback on the text and data changes in the proposed update UTS #51 Unicode Emoji. This specification is now synchronized with Unicode Version 11.0, and slated for release at the same time, in early June. Feedback is due by April 23; this is the last chance to provide feedback on any changes and any open review issues.

The recent changes modify the definition of emoji combining sequences, add a section describing emoji property stability (including under operations like lowercasing), add a section providing EBNF and regex definitions for loose matches on emoji in running text, and clarify the treatment of gender-neutral characters.
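A loose match on emoji in running text can be sketched in Python roughly as follows. This is an illustration only, not the expression from UTS #51: Python's built-in re module has no \p{Emoji} support, so the character classes below are hand-picked code point ranges (an assumption made for this example), and the names LOOSE_EMOJI and find_emoji are hypothetical.

```python
import re

# Hand-picked stand-ins for Unicode properties (assumptions, not the real data).
RI = "[\U0001F1E6-\U0001F1FF]"                   # regional indicators (flag letters)
EMOD = "[\U0001F3FB-\U0001F3FF]"                 # skin-tone modifiers
BASE = "[\u2600-\u27BF\U0001F300-\U0001FAFF]"    # rough approximation of emoji bases
TAGS = "[\U000E0020-\U000E007E]+\U000E007F"      # tag characters plus terminator

# One element: a base, optionally followed by a modifier,
# a variation selector (optionally with a keycap mark), or a tag sequence.
ELEMENT = f"{BASE}(?:{EMOD}|\uFE0F\u20E3?|{TAGS})?"

# Either a pair of regional indicators, or elements joined by ZWJ (U+200D).
LOOSE_EMOJI = re.compile(f"{RI}{RI}|{ELEMENT}(?:\u200D{ELEMENT})*")


def find_emoji(text):
    """Return every loosely matched emoji or emoji sequence in text, in order."""
    return [m.group(0) for m in LOOSE_EMOJI.finditer(text)]
```

Because the match is loose, it accepts sequences that no font may support. The real Emoji property also covers characters such as digits, # and *, which this sketch deliberately leaves out of BASE to avoid matching ordinary text.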

Note: the emoji characters and properties for Version 11.0 have already been finalized, so this last call is just for the text of the specification, not the emoji characters or properties.

Tuesday, April 3, 2018

Updating Three Specifications Synchronized with Unicode Version 11.0

The Unicode Consortium is soliciting feedback on the text and data changes in the following proposed update specifications. These specifications are synchronized with Unicode Version 11.0, and slated for release at the same time, in early June. Feedback is due by April 23; this is the last chance to provide feedback on any changes and any open review issues.

UTS #39, Unicode Security Mechanisms updates data for Unicode 11.0, adds a new section describing the handling of Joining Controls (ZWJ and ZWNJ), and adds tests to Section 5.4 Optional Detection for checking nonspacing marks and sequences.

UTS #46 Unicode IDNA Compatibility Processing updates data for Unicode 11.0, and extends the format of the test data file. The new test format allows implementations to determine more precisely where any validity test fails, and allows the implementation to filter for the exact combination of supported features.

UTS #10 Unicode Collation Algorithm updates data for Unicode 11.0, and otherwise makes no material changes to the text.

Details of the Unicode 11.0 Beta and open Public Review Issues are available on the Unicode website.

Friday, March 30, 2018

ICU 61 Released

Unicode® ICU 61 has just been released. This version upgrades to CLDR 33, has a new Java implementation for number and currency parsing, and includes many small API additions, improvements, and bug fixes.

ICU is a software library widely used by products and other libraries to support the world's languages, implementing both the latest version of the Unicode encoding standard and of the Unicode locale data (CLDR).

For details please see http://site.icu-project.org/download/61

Wednesday, March 28, 2018

CLDR Version 33 Released

Unicode CLDR 33 provides an update to the key building blocks for software supporting the world’s languages. This data is used by all major software systems for their software internationalization and localization, adapting software to the conventions of different languages for common software tasks.

This release had a limited submission phase. The focus was on improvements to emoji keywords and to the Odia and Assamese locales, addition of typographic names data, and improvements to the structure for specifying keyboard layouts. Improvements include:
  • Structure
    • New structure for typographicNames translations (such as terms for Bold, Italic, ...), with data for 33 locales.
    • The structure for specifying keyboard layouts was significantly enhanced, with many new elements and attributes, and expanded syntax for some preëxisting attribute values.
  • Additional Translations/Data
    • Annotations (emoji keywords) for a limited set of locales had a full review (ar, en_GB, de, es, ja, ru).
    • Two additional locales (Odia, Assamese) were brought up to Modern coverage level; some missing items were added in other locales.
    • Added 4 new transforms, and number spellout rules for 6 additional languages.
  • Property files
    • The emoji property data file ExtendedPictographic.txt has been removed from CLDR data, since the contents are now part of the UTS #51 “Unicode Emoji” data.
    • labels.txt was added for emoji categories and subcategories. 
For further details and links to documentation, see the CLDR Release Notes.

About the Unicode Consortium

The Unicode Consortium is a non-profit organization founded to develop, extend and promote use of the Unicode Standard and related globalization standards.

The membership of the consortium represents a broad spectrum of corporations and organizations, many in the computer and information processing industry. Members include: Adobe, Apple, Emojipedia, Facebook, Google, Government of Bangladesh, Government of India, Huawei, IBM, Microsoft, Monotype Imaging, Netflix, Shopify, Sultanate of Oman MARA, Oracle, Rajya Marathi Vikas Sanstha, SAP, Symantec, Tamil Virtual University, The University of California (Berkeley), plus well over a hundred Associate, Liaison, and Individual members. For a complete member list go to http://www.unicode.org/consortium/members.html

For more information, please contact the Unicode Consortium at http://www.unicode.org/contacts.html.

Wednesday, March 14, 2018

Unicode 11.0 Beta Review

The beta review period for Unicode 11.0 has started. The Unicode Standard is the foundation for all modern software and communications around the world, including all modern operating systems, browsers, laptops, and smart phones, plus the Internet and Web (URLs, HTML, XML, CSS, JSON, etc.). The Unicode Standard, its associated standards, and data form the foundation for CLDR and ICU releases. Thus it is important to ensure a smooth transition to each new version of the standard.

Unicode 11.0 includes a number of changes. Some of the Unicode Standard Annexes have modifications for Unicode 11.0, often in coordination with changes to character properties. In particular, there are major changes to UAX #29, Unicode Text Segmentation. Seven new scripts have been added in Unicode 11.0, including Hanifi Rohingya. A major adjustment has been made to the Georgian script, with the introduction of uppercase Georgian letters. There are also 66 additional emoji characters.
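To illustrate the new uppercase Georgian letters: in Unicode 11.0 the Mtavruli capitals occupy U+1C90..U+1CBA and U+1CBD..U+1CBF, parallel to the Mkhedruli letters at U+10D0..U+10FA and U+10FD..U+10FF. The Python sketch below uppercases by a fixed offset instead of relying on the runtime's character database, which may predate Unicode 11.0. The function name to_mtavruli is hypothetical.

```python
# Distance between a Mkhedruli letter and its Mtavruli capital.
MTAVRULI_OFFSET = 0x1C90 - 0x10D0


def _has_mtavruli_capital(cp):
    """True for Mkhedruli code points that have a Mtavruli counterpart."""
    return 0x10D0 <= cp <= 0x10FA or 0x10FD <= cp <= 0x10FF


def to_mtavruli(text):
    """Map Mkhedruli letters to Mtavruli capitals; leave everything else as-is."""
    return "".join(
        chr(ord(ch) + MTAVRULI_OFFSET) if _has_mtavruli_capital(ord(ch)) else ch
        for ch in text
    )
```

For example, to_mtavruli("\u10D0") yields "\u1C90" (GEORGIAN MTAVRULI CAPITAL LETTER AN), while U+10FB GEORGIAN PARAGRAPH SEPARATOR passes through unchanged. On a runtime whose character database already includes Unicode 11.0, the built-in uppercasing may give the same result for these letters.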

Please review the documentation, adjust your code, test the data files, and report errors and other issues to the Unicode Consortium by April 23, 2018. Feedback instructions are on the beta page.

See http://unicode.org/versions/beta-11.0.0.html for more information about testing the 11.0.0 beta.

See http://unicode.org/versions/Unicode11.0.0/ for the current draft summary of Unicode 11.0.0.



Wednesday, March 7, 2018

Call for Unicode 11.0 and 12.0 Cover Design Art

The Unicode Consortium is inviting artists and designers to submit cover design proposals for Versions 11.0 and 12.0 of The Unicode Standard. This call is being issued simultaneously for the next two versions of the standard, scheduled for publication in 2018 and 2019, respectively.

The two selected cover designs will appear on the Unicode Standard 11.0 and 12.0 web pages, in the print-on-demand publications, and in associated promotional literature on the Unicode website. The two artists whose designs are selected for the covers will receive full credit in the colophon of the publication for which the art is used, and wherever else the design appears, and will each receive $700. Two selected runner-up artists will receive $150 apiece.

Please see the announcement page for requirements and more details.

Tuesday, February 27, 2018

Unicode CLDR 33 alpha available for testing

The alpha version of Unicode CLDR 33 is available for testing. The alpha period lasts until the beta release on March 7, which will include updates to the LDML spec. The final release is expected on March 21.

CLDR 33 provides an update to the key building blocks for software supporting the world's languages. This data is used by all major software systems for their software internationalization and localization, adapting software to the conventions of different languages for common software tasks.

CLDR 33 included a limited Survey Tool data collection phase focusing on emoji names/annotations and certain specific locales (Odia, Assamese). Other enhancements include a new typographicNames element, four new transforms, changes to properties data files for emoji, and other specific fixes. The draft release page at http://cldr.unicode.org/index/downloads/cldr-33 lists the major features, and has pointers to the newest data and charts. It will be fleshed out over the coming weeks with more details, migration issues, known problems, and so on. Particularly useful for review are:
Please report any problems that you find using a CLDR ticket. We'd also appreciate it if programmatic users of CLDR data download the xml files and do a trial integration to see if any problems arise.

Unicode Emoji 11.0 characters now ready for adoption!

The 157 new emoji are now available for adoption, to help the Unicode Consortium’s work on digitally disadvantaged languages.

The main goal of the Unicode Consortium is to enable modern software and computing systems to support the widest range of human languages, present and past. There are approximately 7,000 living human languages, but fewer than 100 of these languages are well-supported on computers, mobile phones, and other devices. Adopt-a-character donations are used to improve Unicode support for digitally disadvantaged languages, and to help preserve the world’s linguistic heritage.

For more information on the program, and to adopt a character, see the Adopt-a-Character Page.

And by the way, we have updated charts for the new emoji, with some fixed glyphs (thanks to Emojipedia!).

Monday, February 26, 2018

Adopt-A-Character Grant to Support Three Historic Scripts

The Adopt-a-Character Program has awarded a grant to support development of proposals for encoding the following three historic scripts in the Unicode Standard:
  • Book Pahlavi, an Aramaic-based script important to Zoroastrian and Parsi communities worldwide
  • Persian Siyaq Numbers, a numerical system used in Iran from the 9th to 20th centuries for accounting and administration
  • Uighur, a script used in the region spanning Uzbekistan to Mongolia from the 8th to 19th centuries
The work will be done by Anshuman Pandey under the direction of Deborah Anderson (SEI, UC Berkeley) and Rick McGowan (Unicode Consortium).


Wednesday, February 7, 2018

Unicode Emoji 11.0 characters now final for 2018

🧨 Emoji 11.0 data has been released, with 157 new emoji such as:
  • 🥵 hot face
  • 🥴 woozy face
  • 👩🏻‍🦰 woman, red haired: light skin tone
  • 👨🏿‍🦱 man, curly haired: dark skin tone
  • 🦸‍♀️ woman superhero
  • 🥎 softball
  • 🦟 mosquito
  • 🏴‍☠️ pirate flag
  • 🥿 flats
  • 🦞 lobster

The new Emoji 11.0 set is fixed and final, and includes the data needed for vendors to begin working on their emoji fonts and code ahead of the release of Unicode 11.0, scheduled for June 2018. The new emoji typically start showing up on mobile phones in August or September.

The man and woman emoji can now have various hair styles (red-haired, curly-haired, white-haired, and bald), and the new superhero and supervillain support genders and skin tones. The new leg and foot also support skin tones.
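As an illustration of how these combinations are encoded: a person with a hair style is a ZWJ sequence made of the base person emoji, an optional skin-tone modifier, U+200D ZERO WIDTH JOINER, and one of the four hair components. A minimal Python sketch follows; the helper names with_hair and code_points are hypothetical, and the code points are those assigned in Unicode 11.0.

```python
ZWJ = "\u200D"          # ZERO WIDTH JOINER
WOMAN = "\U0001F469"
MAN = "\U0001F468"

# Skin-tone modifiers (U+1F3FB..U+1F3FF).
SKIN_TONES = {
    "light": "\U0001F3FB",
    "medium-light": "\U0001F3FC",
    "medium": "\U0001F3FD",
    "medium-dark": "\U0001F3FE",
    "dark": "\U0001F3FF",
}

# Hair components added in Unicode 11.0 (U+1F9B0..U+1F9B3).
HAIR = {
    "red": "\U0001F9B0",
    "curly": "\U0001F9B1",
    "bald": "\U0001F9B2",
    "white": "\U0001F9B3",
}


def with_hair(person, hair, skin_tone=None):
    """Build person [+ skin tone] + ZWJ + hair component."""
    tone = SKIN_TONES[skin_tone] if skin_tone else ""
    return person + tone + ZWJ + HAIR[hair]


def code_points(text):
    """Format a string as space-separated U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)
```

For instance, code_points(with_hair(WOMAN, "red", "light")) gives "U+1F469 U+1F3FB U+200D U+1F9B0". A platform without a glyph for the sequence will typically fall back to showing the person and the hair component as separate images.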

The new emoji are listed in Emoji Recently Added v11.0, with sample images. These images are just samples: vendors for mobile phones, PCs, and web platforms will typically use images that fit their overall emoji designs. The Emoji Ordering v11.0 chart shows how the new emoji sort relative to the others, with new emoji marked by rounded rectangles. The other Emoji Charts for Version 11.0 have also been updated to show the new emoji.

The version number for this release of Unicode emoji is jumping from the previously-released Emoji 5.0 to Emoji 11.0 (instead of 6.0) — starting with this release, the version number for emoji is synchronized with the corresponding version number of the Unicode Standard.

To be considered for Emoji 12.0, new emoji proposals must be submitted before the end of March 2018. This schedule aligns with the 2019 release of the Unicode Standard.



The 157 new emoji will soon be available for adoption to help the Unicode Consortium’s work on digitally disadvantaged languages.
