Friday, September 14, 2018

Unicode CLDR 34 alpha available for testing

The alpha version of Unicode CLDR 34 is available for testing. The alpha period lasts until the beta release on September 26, which will include updates to the LDML spec. The final release is expected on October 10.

CLDR 34 provides an update to the key building blocks for software supporting the world's languages. This data is used by all major software systems for their software internationalization and localization, adapting software to the conventions of different languages for common software tasks.

CLDR 34 included a full Survey Tool data collection phase. Other enhancements include several changes to prepare for the new Japanese calendar era starting 2019-05-01; updated emoji names, annotations, collation and grouping; and other specific fixes. The draft release page at http://cldr.unicode.org/index/downloads/cldr-34 lists the major features, and has pointers to the newest data and charts. It will be fleshed out over the coming weeks with more details, migration issues, known problems, and so on. Particularly useful for review are:

Please report any problems that you find using a CLDR ticket. We'd also appreciate it if programmatic users of CLDR data download the xml files and do a trial integration to see if any problems arise.

Thursday, September 6, 2018

New Japanese Era

A new era in the Japanese calendar is expected to begin on May 1, 2019, following the announced abdication of Japanese Emperor Akihito. This era will be represented in dates by two names: one consisting of a sequence of two existing kanji and one consisting of a new single Japanese character that combines those two. (Similarly, the current era Heisei can be represented by either “平成” or “㍻”.)
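
As a rough illustration of how the two spellings relate, the sketch below uses Python's standard unicodedata module to expand the single-character Heisei form (U+337B) into its two-kanji form through compatibility normalization (NFKC). The helper name expand_era_ligature is made up for this example. The new era character is expected to behave similarly once it is published, but that depends on the Unicode data shipped with your runtime.

```python
import unicodedata

# U+337B SQUARE ERA NAME HEISEI, the single-character form of the era name.
HEISEI_SQUARE = "\u337b"


def expand_era_ligature(text):
    """Expand square era characters to their multi-kanji spelling.

    Hypothetical helper: it simply applies NFKC, which also rewrites other
    compatibility characters (full-width forms, for example), so it is only
    suitable where that side effect is acceptable.
    """
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    expanded = expand_era_ligature(HEISEI_SQUARE)
    # Print the code points rather than relying on the terminal font.
    print([f"U+{ord(ch):04X}" for ch in expanded])
```

Comparing or searching dates after this expansion lets both spellings match the same era name.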

The Japanese calendar system and support for era names is essential for important public sector business functions. Therefore, most software distributed in Japan will need to adopt the new era name and add font support for the new character.

The current Heisei era has been in place since 1989 — during the evolution of modern computer systems. Because of this, most software systems have not been tested for such an event. The exact date of the announcement of the new era name is unknown, but current expectations are that there will be a very narrow window for implementing the new era information in IT environments, perhaps less than a month. Until the announcement, dates in 2019 and beyond will continue to be written with the Heisei era name and its year numbering.

To prepare as well as possible for this unprecedented event, the Unicode Consortium has taken the following actions:

  • The code point U+32FF has been reserved for the new era character.
  • Once the new era name is announced, the Unicode Consortium will quickly issue a dot-release (Version 12.1) that will add that character at the reserved code point, U+32FF, with an appropriate character name, decomposition, and representative glyph.
  • Unicode CLDR and ICU are including test mechanisms in the 2018 October releases of CLDR 34 and ICU 63. Systems that use CLDR or ICU (all smartphones, for example) can test using these mechanisms.
  • Systems and applications that do not use CLDR or ICU will need to take similar steps for testing.
The short time window between the actual announcement and the effective date will present challenges to the IT industry. IT systems in Japan will be expected to have the support in place seamlessly. Because of the narrow timeframe and the need to upgrade or patch legacy software, it is important to start now to determine how soon your application/system can add support to your current implementations, stacks, and dependencies.
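
For systems that do not use CLDR or ICU, one way to prepare is to keep era boundaries in data rather than in code, so that the new name can be dropped in once it is announced. The following is a minimal sketch under that assumption. The table layout, the function name japanese_era_year, and the placeholder name "NEW_ERA" are all invented for illustration; only the start dates (1989-01-08 for Heisei, 2019-05-01 for the new era) are real.

```python
from datetime import date

# Era table, newest first. "NEW_ERA" is a placeholder: the real name had not
# been announced when this was written.
ERAS_WITH_NEW = [
    (date(2019, 5, 1), "NEW_ERA"),
    (date(1989, 1, 8), "Heisei"),
]

# Table as it stands before the announcement: Heisei simply continues.
ERAS_BEFORE_ANNOUNCEMENT = [
    (date(1989, 1, 8), "Heisei"),
]


def japanese_era_year(d, eras):
    """Return (era name, year within era) for date d using the given table."""
    for start, name in eras:
        if d >= start:
            # The first calendar year of an era is year 1.
            return name, d.year - start.year + 1
    raise ValueError("date precedes the earliest era in the table")


if __name__ == "__main__":
    for day in (date(2019, 4, 30), date(2019, 5, 1)):
        print(day, japanese_era_year(day, ERAS_BEFORE_ANNOUNCEMENT),
              japanese_era_year(day, ERAS_WITH_NEW))
```

Running both tables against dates on either side of May 1, 2019 is a cheap way to check that formatting, parsing, and storage code copes with the era change before the real name arrives.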

Thursday, August 23, 2018

IUC 42: Keynote Speaker Announced

The Advent of Mayan Script Encoding: Mapping the Last Frontiers of Mayan Hieroglyphic Decipherment

Carlos Pallan Gayol
Archaeologist & Epigrapher, Dept. of Old American Studies & Ethnology, University of Bonn


Mayan hieroglyphs rank among the most visually complex writing systems ever created. Deciphering them has entailed a 200+ year scholarly quest, but this task is not yet completed and poses an inviting challenge for applying new tools from the information age, culminating in the encoding of the Mayan script. Join us Tuesday morning, September 11th, as this keynote highlights the latest milestones attained in this pursuit by the NcodeX Project, where Carlos Pallan collaborates with Dr. Deborah Anderson, Researcher, Dept. of Linguistics, UC Berkeley, the Script Encoding Initiative and members of the Unicode advisory board. Stemming from research funded by Unicode’s Adopt-a-Character Program, it has been possible to produce new database tools and advanced functionalities, capable of mapping and analyzing all the textual contents of the extant Mayan books or Codices by relying on a novel catalog of Mayan signs with assigned code points.

See What’s Happening At IUC 42

For over 27 years the Internationalization & Unicode® Conference (IUC) has been the preeminent event highlighting the latest innovations and best practices of global and multilingual software providers. Join us in Santa Clara to promote your ideas and experiences working with natural languages, multicultural user interfaces, producing and supporting multinational and multilingual products, linguistic algorithms, applying internationalization across mobile and social media platforms, or advancements in relevant standards.

Join expert practitioners and industry leaders as they present detailed recommendations for businesses looking to expand to new international markets and those seeking to improve time to market and cost-efficiency of supporting existing markets. Recent conferences have provided specific advice on designing software for European countries, Latin America, China, India, Japan, Korea, the Middle East, and emerging markets.

Adopt-a-Character

Over 130,000 characters are available for adoption, to help the Unicode Consortium’s work on digitally disadvantaged languages.

Thursday, August 9, 2018

More Emoji Draft Candidates for 2019

There are now 179 proposed Emoji Draft Candidates (61 characters plus variants) for 2019. These are the short-listed candidates for Emoji 12.0, which is planned for release in 2019Q1 together with Unicode 12.0.

The following changes were made in the recent Unicode Technical Committee (UTC) meeting:
  1. Added a candidate emoji for deaf person
  2. Changed service animal vest to safety vest, and added a candidate emoji sequence using it: service dog
  3. Added candidate emoji sequences for couple holding hands, with 55 combinations of skin tone and gender
  4. Changed names and ordering for various characters
The list of draft candidates will be reviewed and finalized in the next UTC meeting, this coming September. Feedback is solicited on short names, keywords, and ordering. See also the Emoji 11.0 charts.

Eight Emoji Provisional Candidates for 2020 were also added (ninja, military helmet, mammoth, feather, dodo, magic wand, carpentry saw, screwdriver).

Between now and March 2019, these and other Provisional Candidates will be collected. The Unicode emoji subcommittee will then assess the whole set, and make recommendations to the UTC for which emoji to advance to Draft Candidate status for 2020.

Wednesday, July 18, 2018

ICU moves to GitHub and Jira

International Components for Unicode (ICU) is a mature, widely used set of C/C++ and Java libraries providing Unicode and Globalization support for software applications. ICU is widely portable and gives applications the same results on all platforms and between C/C++ and Java software.

As of this week, ICU has moved from a self-hosted source code and bug tracking environment, to git on GitHub and Jira on Atlassian Cloud, respectively. Pull requests are welcome, as are bug reports on the new issue tracking system.

For more information, please see the following links:

ICU Repository Access: http://site.icu-project.org/repository
ICU Bug Tracking: http://site.icu-project.org/bugs

Friday, July 13, 2018

Unicode 11.0 Paperback Available

The Unicode 11.0 core specification is now available in paperback book form with a new, original cover design. This edition consists of a pair of modestly priced print-on-demand volumes containing the complete text of the core specification of Version 11.0 of the Unicode Standard.

Each of the two volumes is a compact 6×9 inch US trade paperback size. The two volumes may be purchased separately or together, although they are intended as a set. The cost for the pair is US $16.58, plus postage and applicable taxes. Please visit the description page to order.

Note that these volumes do not include the Version 11.0 code charts, nor do they include the Version 11.0 Standard Annexes and Unicode Character Database, which are all freely available on the Unicode website.

Purchase The Unicode Standard, Version 11.0 - Core Specification

Thursday, July 5, 2018

Unicode Consortium Announces Version 11.0 and Version 12.0 Cover Designs

The Unicode Consortium is pleased to announce the design selected for the cover of the forthcoming print-on-demand publication of The Unicode Standard, Version 11.0. The Unicode Consortium issued an open call for artists and designers to submit cover design proposals. An independent panel reviewed all submitted designs. Because of the accelerated release schedule for Version 12.0 (March 2019), the design for the print-on-demand publication of The Unicode Standard, Version 12.0 was also selected at this time.

The cover for Version 11.0 is an original design by Joyce S. Lee, a graduate student in the UC Berkeley School of Information. Her artwork was inspired by the well-known early 20th-century Bauhaus design school. She explains, “I see numerous parallels between the Bauhaus and the Unicode Consortium, including an intersection of workmanship and technological reproduction, a spirit of collaboration, as well as a widespread cultural influence. With this Bauhaus inspired cover, I thus aim to represent the Unicode Standard as a form of instructional reference for technologists around the world.”

Cover artwork for Version 12.0 was created by Monica Tang, a computer science student at UC Berkeley. Her design was inspired by the simplicity of the geometric shapes that comprise the diversity of characters and symbols represented in the Unicode Standard. She notes, “Incorporating a variety of shapes and colors into a patterned design, I seek to convey the sheer breadth of the languages covered in the Unicode Standard as well as a sense of commonality.”

Runner-up designs by Feixiong “Hasutai” Liu and Maurice Meilleur were also selected. Hasutai is the founder and chief designer of Sir Sebsihiyan Sibe-Manchu Culture Center. Maurice Meilleur is Assistant Professor of Graphic Design at Appalachian State University.

Wednesday, June 27, 2018

New Gold Sponsor dotFM .FM TLD

The Unicode Consortium is pleased to announce that dotFM .FM TLD is now a gold sponsor for:

dotFM .FM TLD's sponsorship directly funds the work of the Unicode Consortium in enabling modern software and computing systems to support the widest range of human languages. There are approximately 7,000 living human languages. Fewer than 100 of these languages are well-supported on computers, mobile phones, and other devices. Adopt-a-Character (AAC) donations are used to improve support for digitally disadvantaged languages, and to help preserve the world’s linguistic heritage.

BRS Media’s dotFM is pleased to sponsor Adopt a Character. This year, dotFM launched Emoji Domains within the .FM Top-Level Domain. Emoji domain is a domain name with an expressive digital image or icon in it. dotFM pioneered the ‘multimedia’ domain space since launching the .FM Top Level Domains in 1998. Today, the comprehensive portfolio of registrants not only includes broadcasters, Internet radio and the music community, but also interactive companies, premier social media ventures and podcast entrepreneurs worldwide.  — dotFM .FM TLD

The Unicode Consortium thanks dotFM .FM TLD for their support!

All sponsors are listed on Sponsors of Adopted Characters. More than 140,000 other characters are available for adoption — see Adopt a Character

Wednesday, June 20, 2018

ICU 62 Released

Unicode® ICU 62 has just been released. It upgrades to Unicode 11 and to CLDR 33.1 locale data. A new syntax for locale-neutral number skeleton strings can be used in MessageFormat for more control over number formatting. Several still-draft NumberFormatter methods and helper classes have been modified or renamed. In C++, DecimalFormat wraps the new NumberFormatter code, and there is a new implementation for number parsing.

ICU is a software library widely used by products and other libraries to support the world's languages, implementing both the latest version of the Unicode encoding standard and of the Unicode locale data (CLDR).

For details please see http://site.icu-project.org/download/62

CLDR Version 33.1 Language/Locale Data Released for Unicode 11.0

Unicode CLDR 33.1 adds support for the recently released Unicode 11.0. Version 33.1 is the latest version of CLDR, the core open-source language data that major software systems use to adapt software to the conventions of over 80 different languages. The open-source Unicode ICU library incorporates the CLDR Version 33.1 data as part of its update to Unicode 11.0 in its ICU 62 release. ICU code is used by many products for Unicode and language support, including Android, Cloudant, ChromeOS, Db2, iOS, macOS, Windows, and many others.

The CLDR 33.1 release focuses on updates for Unicode 11.0: new names and keywords for the Unicode 11.0 emoji, Chinese collation stroke order, and script metadata. In addition, there are major improvements for names and annotations for the pre-11.0 emoji in CLDR languages. More extensive updates are planned for CLDR 34 (release expected in early October), with data submission still continuing.

For further details and links to documentation, see the CLDR 33.1 Release Notes.

Tuesday, June 5, 2018

Announcing The Unicode® Standard, Version 11.0

Version 11.0 of the Unicode Standard is now available, both the core specification and data files. Version 11.0 adds 684 characters, for a total of 137,374 characters. These additions include seven new scripts, for a total of 146 scripts, as well as 145 new emoji.

The new scripts and characters in Version 11.0 add support for lesser-used languages and unique written requirements worldwide, including:
  • Georgian Mtavruli capital letters, newly added to support modern casing practices
  • Hanifi Rohingya, used to write the modern Rohingya language in Southeast Asia
  • Medefaidrin, used for modern liturgical purposes in Africa
  • Mazahua, a Mesoamerican language recognized by law in Mexico
  • Mayan numerals used in printed materials in Central America
  • Historic Sanskrit, Gurmukhi, and the Buryats
  • Five urgently needed CJK unified ideographs: three for chemical names and two for Japan's government administration
Popular symbol additions:
  • Copyleft symbol
  • Half stars for rating systems
  • More astrological symbols
  • Xiangqi Chinese chess symbols
  • New emoji characters including:
🦸 👨🏽‍🦰
🧸 🦞
🧨 🥳

For the full list of emoji characters, see emoji additions for Unicode 11.0, and Emoji Counts. For a detailed description of support for emoji characters by the Unicode Standard, see UTS #51, Unicode Emoji. Version 11.0 also includes other improvements for emoji handling:
  • a mechanism to request the glyph direction for emoji
  • descriptions of the four new emoji hair components
  • descriptions of gender neutral emoji
  • simplified statements of emoji-related rules for grapheme cluster boundaries and for word boundaries.
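
To make the hair components concrete, here is a small sketch of how an emoji with a hair component is assembled: a base person emoji, an optional skin tone modifier, a zero width joiner (ZWJ), and then the component. The code point values are from Unicode 11.0; the helper names with_hair and code_points are invented for this example. How the result is displayed depends on the font, so the sketch prints code points rather than glyphs.

```python
# Code points from Unicode 11.0 and earlier, written as escapes so the
# example does not depend on the editor's font or encoding.
MAN = "\U0001F468"               # U+1F468 MAN
MEDIUM_SKIN_TONE = "\U0001F3FD"  # U+1F3FD emoji modifier (medium skin tone)
ZWJ = "\u200D"                   # U+200D ZERO WIDTH JOINER
RED_HAIR = "\U0001F9B0"          # U+1F9B0 red hair component (new in 11.0)


def with_hair(base, hair, skin_tone=""):
    """Build a ZWJ sequence: base, optional skin tone, ZWJ, hair component."""
    return base + skin_tone + ZWJ + hair


def code_points(text):
    """List the code points of a string in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


if __name__ == "__main__":
    sequence = with_hair(MAN, RED_HAIR, MEDIUM_SKIN_TONE)
    print(code_points(sequence))
```

A font without support for the sequence typically falls back to showing the person followed by the separate component, which is why the sequence remains readable on older systems.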
Three other important Unicode specifications have been updated for Version 11.0:

Unicode 11.0 includes a number of changes. Some of the Unicode Standard Annexes have modifications, often in coordination with changes to character properties. In particular, there are changes to:

The Unicode Standard is the foundation for all modern software and communications around the world, including all modern operating systems, browsers, laptops, and smart phones—plus the Internet and Web (URLs, HTML, XML, CSS, JSON, etc.). The Unicode Standard, its associated standards, and data form the foundation for CLDR and ICU releases.

Adopt-a-Character

All the new characters including the new emoji are now available for adoption to help the Unicode Consortium’s work on digitally disadvantaged languages.

Wednesday, May 9, 2018

Emoji Draft Candidates for 2019

104 proposed Emoji Candidates (60 characters plus variants) have advanced to Draft Candidate status for 2019. These are the short-listed candidates for Emoji 12.0, which is planned for release in 2019Q1 together with Unicode 12.0.

The draft candidates include guide dog, kite, and white heart.

See Emoji Candidates for the full list.

That list of draft candidates will be reviewed and finalized this September. Feedback is solicited on short names, keywords, and ordering. See also the Emoji 11.0 charts.


Tuesday, April 17, 2018

Submissions open for 2020 Emoji

The deadline for emoji for 2019 was April 1, so any submissions received after that date are considered for release in 2020.

The submission form has undergone some revision, so please be sure to review the new text before putting together a proposal. There is a limited number of emoji characters considered each year, so be sure to follow the form so that you can provide the best case for any proposed emoji.

The emoji subcommittee has also produced a new page which shows the Emoji Requests submitted so far. You can look at what other people have proposed or suggested. In many cases, people have made suggestions, but have not followed through with complete submission forms, or have submitted forms, but not followed through on requested modifications to the forms.


Friday, April 13, 2018

Last Call on Unicode 11.0 Review

The beta review period for Unicode 11.0 and related technical standards will close on April 23, 2018. This is the last opportunity for technical comments before version 11.0 is released in Q2 2018. Implementers and interested parties are encouraged to download data files, review proposed updates, and submit comments.

Unicode 11.0 adds seven new scripts, including Hanifi Rohingya, and 66 additional emoji characters, including four new components for hair color (for a total of 157 emoji sequences). The set of Georgian Mtavruli capital letters has been added to support modern casing practices.

In addition to the Unicode core specification, five Unicode Standard Annexes and four Unicode Technical Standards, listed below, have significant specification and/or data file updates that are correlated with the new additions for Unicode 11.0.0. Review of those changes is strongly encouraged during the beta review period.

UAX #14, Unicode Line Breaking Algorithm
  • Uses Extended_Pictographic property for future-proofing
UAX #29, Unicode Text Segmentation
  • New support for Indic virama handling
  • Uses Extended_Pictographic property for future-proofing
  • A new table of formal regex definitions
UAX #31, Unicode Identifier and Pattern Syntax
  • Refines the use of ZWJ in identifiers
  • Broadens the definition of hashtag identifiers
UAX #38, Unicode Han Database (Unihan)
  • Five new fields and improved regular expressions.
  • Document extension of Unihan properties to non-Unihan
UAX #44, Unicode Character Database
  • New property Equivalent_Unified_Ideograph
  • New regular expressions for Bidi_Paired_Bracket & Equivalent_Unified_Ideograph
  • More discussion of emoji variation sequences
  • Clarification of values allowed for the Age property
UTS #10, Unicode Collation Algorithm
  • Updates data to Unicode 11.0
  • Clarification of search tailoring in visual-order scripts
UTS #39, Unicode Security Mechanisms
  • Updates data to Unicode 11.0
  • Enhances discussions of joining controls & combining sequences
UTS #46, Unicode IDNA Compatibility Processing
  • Updates data to Unicode 11.0
  • Changes the format of the test file for arbitrary input settings
  • Updates input setting for Transitional_Processing
UTS #51, Unicode Emoji
  • Supplies Extended_Pictographic property for future-proofing
  • Simplifies emoji sequence definitions
  • EBNF and Regex expressions for loose matches
  • More proposed guidelines: gender-neutral emoji, skin-tone modifiers, ZWJ visible fallbacks, hair-style components
  • Mechanism for changing the “facing” direction for emoji
Please review the documentation, adjust your code, test the data files, and report errors and other issues to the Unicode Consortium by April 23, 2018. Feedback instructions are in each public review page. For more information, see the open public review issues.


Monday, April 9, 2018

Last call on UTS #51 Unicode Emoji

The Unicode Consortium is soliciting feedback on the text and data changes in the proposed update of UTS #51 Unicode Emoji. This specification is now synchronized with Unicode Version 11.0, and slated for release at the same time, in early June. Feedback is due by April 23; this is the last chance to provide feedback on any changes and any open review issues.

The recent changes modify the definition of emoji combining sequences, add a section describing the stability of emoji properties (including under operations like lowercasing), add a section providing EBNF and Regex expressions for loose matches on emoji in running text, and clarify the treatment of gender-neutral characters.

Note: the emoji characters and properties for Version 11.0 have already been finalized, so this last call is just for the text of the specification, not the emoji characters or properties.

Tuesday, April 3, 2018

Updating Three Specifications Synchronized with Unicode Version 11.0

The Unicode Consortium is soliciting feedback on the text and data changes in the following proposed update specifications. These specifications are synchronized with Unicode Version 11.0, and slated for release at the same time, in early June. Feedback is due by April 23; this is the last chance to provide feedback on any changes and any open review issues.

UTS #39, Unicode Security Mechanisms updates data for Unicode 11.0, adds a new section describing the handling of Joining Controls (ZWJ and ZWNJ), and adds tests to Section 5.4 Optional Detection for checking nonspacing marks and sequences.

UTS #46, Unicode IDNA Compatibility Processing updates data for Unicode 11.0, and extends the format of the test data file. The new test format allows implementations to determine more precisely where any validity test fails, and allows the implementation to filter for the exact combination of supported features.

UTS #10 Unicode Collation Algorithm updates data for Unicode 11.0, and otherwise makes no material changes to the text.

Details of the Unicode 11.0 Beta and open Public Review Issues are available on the Unicode website.

Friday, March 30, 2018

ICU 61 Released

Unicode® ICU 61 has just been released. This version upgrades to CLDR 33, has a new Java implementation for number and currency parsing, and includes many small API additions, improvements, and bug fixes.

ICU is a software library widely used by products and other libraries to support the world's languages, implementing both the latest version of the Unicode encoding standard and of the Unicode locale data (CLDR).

For details please see http://site.icu-project.org/download/61

Wednesday, March 28, 2018

CLDR Version 33 Released

Unicode CLDR 33 provides an update to the key building blocks for software supporting the world’s languages. This data is used by all major software systems for their software internationalization and localization, adapting software to the conventions of different languages for common software tasks.

This release had a limited submission phase. The focus was on improvements to emoji keywords and to the Odia and Assamese locales, addition of typographic names data, and improvements to the structure for specifying keyboard layouts. Improvements include:
  • Structure
    • New structure for typographicNames translations (such as terms for Bold, Italic, ...), with data for 33 locales.
    • The structure for specifying keyboard layouts was significantly enhanced, with many new elements and attributes, and expanded syntax for some preëxisting attribute values.
  • Additional Translations/Data
    • Annotations (emoji keywords) for a limited set of locales had a full review (ar, en_GB, de, es, ja, ru).
    • Two additional locales (Odia, Assamese) were brought up to Modern coverage level; some missing items were added in other locales.
    • Added 4 new transforms, and number spellout rules for 6 additional languages.
  • Property files
    • The emoji property data file ExtendedPictographic.txt has been removed from CLDR data, since the contents are now part of the UTS #51 “Unicode Emoji” data.
    • labels.txt was added for emoji categories and subcategories. 
For further details and links to documentation, see the CLDR Release Notes.

About the Unicode Consortium

The Unicode Consortium is a non-profit organization founded to develop, extend and promote use of the Unicode Standard and related globalization standards.

The membership of the consortium represents a broad spectrum of corporations and organizations, many in the computer and information processing industry. Members include: Adobe, Apple, Emojipedia, Facebook, Google, Government of Bangladesh, Government of India, Huawei, IBM, Microsoft, Monotype Imaging, Netflix, Shopify, Sultanate of Oman MARA, Oracle, Rajya Marathi Vikas Sanstha, SAP, Symantec, Tamil Virtual University, The University of California (Berkeley), plus well over a hundred Associate, Liaison, and Individual members. For a complete member list go to http://www.unicode.org/consortium/members.html

For more information, please contact the Unicode Consortium http://www.unicode.org/contacts.html.

Wednesday, March 14, 2018

Unicode 11.0 Beta Review

The beta review period for Unicode 11.0 has started. The Unicode Standard is the foundation for all modern software and communications around the world, including all modern operating systems, browsers, laptops, and smart phones, plus the Internet and Web (URLs, HTML, XML, CSS, JSON, etc.). The Unicode Standard, its associated standards, and data form the foundation for CLDR and ICU releases. Thus it is important to ensure a smooth transition to each new version of the standard.

Unicode 11.0 includes a number of changes. Some of the Unicode Standard Annexes have modifications for Unicode 11.0, often in coordination with changes to character properties. In particular, there are major changes to UAX #29, Unicode Text Segmentation. Seven new scripts have been added in Unicode 11.0, including Hanifi Rohingya. A major adjustment has been made to the Georgian script, with the introduction of uppercase Georgian letters. There are also 66 additional emoji characters.
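
As an illustration of what the Georgian change means for code, the sketch below maps Mkhedruli letters to the new Mtavruli capitals by code point arithmetic. It relies on the two ranges being laid out in parallel (U+10D0..U+10FA and U+10FD..U+10FF map to U+1C90..U+1CBA and U+1CBD..U+1CBF). The function name to_mtavruli is invented here, and a real implementation should use the case mappings from the Unicode Character Database rather than a hard-coded offset.

```python
# Distance between a Mkhedruli letter and its Mtavruli capital.
MTAVRULI_OFFSET = 0x1C90 - 0x10D0


def _is_mkhedruli_letter(cp):
    """True for Mkhedruli letters that have a Mtavruli counterpart."""
    return 0x10D0 <= cp <= 0x10FA or 0x10FD <= cp <= 0x10FF


def to_mtavruli(text):
    """Uppercase Georgian Mkhedruli letters; leave everything else untouched.

    Illustrative only: U+10FB (punctuation) and U+10FC (modifier letter)
    are deliberately skipped because they have no Mtavruli form.
    """
    return "".join(
        chr(ord(ch) + MTAVRULI_OFFSET) if _is_mkhedruli_letter(ord(ch)) else ch
        for ch in text
    )


if __name__ == "__main__":
    word = "\u10e1\u10d0\u10e5"  # three Mkhedruli letters
    print([f"U+{ord(ch):04X}" for ch in to_mtavruli(word)])
```

Code that assumed Georgian has no case, for example when comparing identifiers or building search indexes, is the kind of code worth retesting during the beta.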

Please review the documentation, adjust your code, test the data files, and report errors and other issues to the Unicode Consortium by April 23, 2018. Feedback instructions are on the beta page.

See http://unicode.org/versions/beta-11.0.0.html for more information about testing the 11.0.0 beta.

See http://unicode.org/versions/Unicode11.0.0/ for the current draft summary of Unicode 11.0.0.

Wednesday, March 7, 2018

Call for Unicode 11.0 and 12.0 Cover Design Art

The Unicode Consortium is inviting artists and designers to submit cover design proposals for Versions 11.0 and 12.0 of The Unicode Standard. This call is being issued simultaneously for the next two versions of the standard, scheduled for publication in 2018 and 2019, respectively.

The two selected cover designs will appear on the Unicode Standard 11.0 and 12.0 web pages, in the print-on-demand publications, and in associated promotional literature on the Unicode website. The two artists whose designs are selected for the covers will receive full credit in the colophon of the publication for which the art is used, and wherever else the design appears, and will each receive $700. Two selected runner-up artists will receive $150 apiece.

Please see the announcement page for requirements and more details.

Tuesday, February 27, 2018

Unicode CLDR 33 alpha available for testing

The alpha version of Unicode CLDR 33 is available for testing. The alpha period lasts until the beta release on March 7, which will include updates to the LDML spec. The final release is expected on March 21.

CLDR 33 provides an update to the key building blocks for software supporting the world's languages. This data is used by all major software systems for their software internationalization and localization, adapting software to the conventions of different languages for common software tasks.

CLDR 33 included a limited Survey Tool data collection phase focusing on emoji names/annotations and certain specific locales (Odia, Assamese). Other enhancements include a new typographicNames element, four new transforms, changes to properties data files for emoji, and other specific fixes. The draft release page at http://cldr.unicode.org/index/downloads/cldr-33 lists the major features, and has pointers to the newest data and charts. It will be fleshed out over the coming weeks with more details, migration issues, known problems, and so on. Particularly useful for review are:

Please report any problems that you find using a CLDR ticket. We'd also appreciate it if programmatic users of CLDR data download the xml files and do a trial integration to see if any problems arise.

Unicode Emoji 11.0 characters now ready for adoption!

The 157 new emoji are now available for adoption, to help the Unicode Consortium’s work on digitally disadvantaged languages.

The main goal of the Unicode Consortium is to enable modern software and computing systems to support the widest range of human languages, present and past. There are approximately 7,000 living human languages, but fewer than 100 of these languages are well-supported on computers, mobile phones, and other devices. Adopt-a-character donations are used to improve Unicode support for digitally disadvantaged languages, and to help preserve the world’s linguistic heritage.

For more information on the program, and to adopt a character, see the Adopt-a-Character Page.

And by the way, we have updated charts for the new emoji, with some fixed glyphs (thanks to Emojipedia!).

Monday, February 26, 2018

Adopt-A-Character Grant to Support Three Historic Scripts

The Adopt-a-Character Program has awarded a grant to support development of proposals for encoding the following three historic scripts in the Unicode Standard:
  • Book Pahlavi, an Aramaic-based script important to Zoroastrian and Parsi communities worldwide
  • Persian Siyaq Numbers, a numerical system used in Iran from the 9th to 20th centuries for accounting and administration
  • Uighur, a script used in the region spanning Uzbekistan to Mongolia from the 8th to 19th centuries
The work will be done by Anshuman Pandey under the direction of Deborah Anderson (SEI, UC Berkeley) and Rick McGowan (Unicode Consortium).

Over 130,000 characters are available for adoption, to help the Unicode Consortium’s work on digitally disadvantaged languages.

Wednesday, February 7, 2018

Unicode Emoji 11.0 characters now final for 2018

🧨 Emoji 11.0 data has been released, with 157 new emoji such as:
  • 🥵 hot face
  • 🥴 woozy face
  • 👩🏻‍🦰 woman, red haired: light skin tone
  • 👨🏿‍🦱 man, curly haired: dark skin tone
  • 🦸‍♀️ woman superhero
  • 🥎 softball
  • 🦟 mosquito
  • 🏴‍☠️ pirate flag
  • 🥿 flats
  • 🦞 lobster

The new Emoji 11.0 set is fixed and final, and includes the data needed for vendors to begin working on their emoji fonts and code ahead of the release of Unicode 11.0, scheduled for June 2018. The new emoji typically start showing up on mobile phones in August or September.

The man and woman emoji can now have various hair styles (red-haired, curly-haired, white-haired, and bald), and the new superhero and supervillain support genders and skin tones. The new leg and foot also support skin tones.
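For implementers, here is a rough sketch of how the hair-style variants are put together. It assumes the usual emoji ZWJ sequence layout (person, optional skin tone modifier, ZERO WIDTH JOINER, hair component); the helper names `with_hair` and `code_points` are made up for this illustration and are not part of any Unicode data file or library.

```python
# Illustrative only: compose Emoji 11.0 hair-style sequences by hand.
ZWJ = "\u200d"  # ZERO WIDTH JOINER

WOMAN = "\U0001F469"
MAN = "\U0001F468"

# Skin tone modifiers (only two of the five are listed here).
LIGHT_SKIN_TONE = "\U0001F3FB"
DARK_SKIN_TONE = "\U0001F3FF"

# Hair components added in Emoji 11.0.
HAIR = {
    "red": "\U0001F9B0",
    "curly": "\U0001F9B1",
    "bald": "\U0001F9B2",
    "white": "\U0001F9B3",
}


def with_hair(person: str, hair: str, skin_tone: str = "") -> str:
    """Hypothetical helper: person + optional skin tone + ZWJ + hair component."""
    return person + skin_tone + ZWJ + HAIR[hair]


def code_points(text: str) -> str:
    """Hypothetical helper: render a string as space-separated U+XXXX values."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


if __name__ == "__main__":
    # woman, red haired: light skin tone
    print(code_points(with_hair(WOMAN, "red", LIGHT_SKIN_TONE)))
    # man, curly haired: dark skin tone
    print(code_points(with_hair(MAN, "curly", DARK_SKIN_TONE)))
```

Whether such a sequence shows up as a single glyph depends on the font and platform; systems without Emoji 11.0 support will typically show the person and the hair component as separate images.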

The new emoji are listed in Emoji Recently Added v11.0, with sample images. These images are just samples: vendors for mobile phones, PCs, and web platforms will typically use images that fit their overall emoji designs. The Emoji Ordering v11.0 chart shows how the new emoji sort relative to the others, with new emoji marked by rounded rectangles. The other Emoji Charts for Version 11.0 have been updated to show the new emoji.

The version number for this release of Unicode emoji is jumping from the previously-released Emoji 5.0 to Emoji 11.0 (instead of 6.0) — starting with this release, the version number for emoji is synchronized with the corresponding version number of the Unicode Standard.

To be considered for Emoji 12.0, new emoji proposals must be submitted before the end of March 2018. This schedule aligns with the 2019 release of the Unicode Standard.



The 157 new emoji will soon be available for adoption to help the Unicode Consortium’s work on digitally disadvantaged languages.
