Friday, 1 January 2016
Unicode version 9.0 is scheduled for release in June 2016. The final repertoire is now fixed, and 7,500 characters (including 72 emoji) will be added to Unicode 9.0. This will bring the total number of graphic and format characters in the Unicode Standard to 128,172 characters (in case you are concerned that Unicode is running out of space, that still leaves room for another 846,293 characters to be encoded). In summary, Unicode 9.0 wil include 11 new blocks (named ranges of characters) and cover 6 new scripts (Osage, Newa, Bhaiksuki, Marchen, Tangut, and Adlam), making a total of 270 blocks and 135 scripts.
74 Emoji characters have been accepted for encoding in Unicode 9.0. However, two of these characters have been de-emojified at the request of Apple: U+1F946 RIFLE (representing Shooting or Hunting) and U+1F93B MODERN PENTATHLON (which includes Pistol Shooting as one of its disciplines) will have no Unicode properties to suggest that they are emoji. So the two characters will still be encoded in Unicode 9.0, but as plain symbols not as emoji characters; and it is unlikely that any major vendors will implement them as emoji.
Provisional Code Point |
Sample Colour Emoji Image |
Provisional Character Name | Source |
---|---|---|---|
U+1F57A | MAN DANCING Encoded to match U+1F483 đ DANCER (typically implemented as a female dancer) |
L2/15-054 | |
U+1F5A4 | BLACK HEART unequivocally represented as black in all variants Encoded because there is a need for a black-coloured heart emoji, and U+2764 †HEAVY BLACK HEART is typically implemented as a red heart |
L2/15-054 | |
U+1F6D1 | OCTAGONAL SIGN stop sign |
L2/15-054 | |
U+1F6D2 | SHOPPING TROLLEY shopping cart |
L2/15-195 | |
U+1F6F4 | SCOOTER | L2/15-054 | |
U+1F6F5 | MOTOR SCOOTER | L2/15-054 | |
U+1F6F6 | CANOE | L2/15-196 | |
U+1F919 | CALL ME HAND | L2/15-054 | |
U+1F91A | RAISED BACK OF HAND | L2/15-054 | |
U+1F91B | LEFT-FACING FIST | L2/15-054 | |
U+1F91C | RIGHT-FACING FIST | L2/15-054 | |
U+1F91D | HANDSHAKE | L2/15-054 | |
U+1F91E | HAND WITH INDEX AND MIDDLE FINGERS CROSSED | L2/15-054 | |
U+1F920 | FACE WITH COWBOY HAT | L2/15-054 | |
U+1F921 | CLOWN FACE | L2/15-054 | |
U+1F922 | NAUSEATED FACE | L2/15-054 | |
U+1F923 | ROLLING ON THE FLOOR LAUGHING | L2/15-054 | |
U+1F924 | DROOLING FACE | L2/15-054 | |
U+1F925 | LYING FACE | L2/15-054 | |
U+1F926 | FACE PALM | L2/15-054 | |
U+1F927 | SNEEZING FACE gesundheit |
L2/15-195 | |
U+1F930 | PREGNANT WOMAN | L2/15-054 | |
U+1F933 | SELFIE typically used with face or human figure |
L2/15-054 | |
U+1F934 | PRINCE Encoded to match U+1F478 đž PRINCESS |
L2/15-054 | |
U+1F935 | MAN IN TUXEDO groom Encoded to match U+1F470 đ° BRIDE WITH VEIL |
L2/15-054 | |
U+1F936 | MOTHER CHRISTMAS Mrs. Claus Encoded to match U+1F385 đ FATHER CHRISTMAS |
L2/15-054 | |
U+1F937 | SHRUG | L2/15-054 | |
U+1F938 | PERSON DOING CARTWHEEL gymnastics |
L2/15-196 | |
U+1F939 | JUGGLING | L2/15-195 | |
U+1F93A | FENCER fencing |
L2/15-196 | |
U+1F93B | MODERN PENTATHLON NOT AN EMOJI (see above) |
L2/15-196 | |
U+1F93C | WRESTLERS wrestling |
L2/15-196 | |
U+1F93D | WATER POLO | L2/15-196 | |
U+1F93E | HANDBALL | L2/15-196 | |
U+1F940 | WILTED FLOWER wiltered rose |
L2/15-054 | |
U+1F941 | DRUM WITH DRUMSTICKS | L2/15-195 | |
U+1F942 | CLINKING GLASSES | L2/15-054 | |
U+1F943 | TUMBLER GLASS typically shown with iced drink whisky |
L2/15-195 | |
U+1F944 | SPOON | L2/15-195 | |
U+1F945 | GOAL NET | L2/15-196 | |
U+1F946 | RIFLE marksmanship, shooting (Olympic sport) hunting NOT AN EMOJI (see above) |
L2/15-196 | |
U+1F947 | FIRST PLACE MEDAL gold medal |
L2/15-196 | |
U+1F948 | SECOND PLACE MEDAL silver medal |
L2/15-196 | |
U+1F949 | THIRD PLACE MEDAL bronze medal |
L2/15-196 | |
U+1F94A | BOXING GLOVE boxing |
L2/15-196 | |
U+1F94B | MARTIAL ARTS UNIFORM judo and other martial arts |
L2/15-196 | |
U+1F950 | CROISSANT | L2/15-054 | |
U+1F951 | AVOCADO | L2/15-054 | |
U+1F952 | CUCUMBER | L2/15-054 | |
U+1F953 | BACON | L2/15-054 | |
U+1F954 | POTATO | L2/15-054 | |
U+1F955 | CARROT | L2/15-054 | |
U+1F956 | BAGUETTE BREAD french bread |
L2/15-195 | |
U+1F957 | GREEN SALAD | L2/15-195 | |
U+1F958 | SHALLOW PAN OF FOOD paella, casserole |
L2/15-195 | |
U+1F959 | STUFFED FLATBREAD döner kebab, falafel, gyro, shawarma |
L2/15-195 | |
U+1F95A | EGG | L2/15-267 | |
U+1F95B | GLASS OF MILK | L2/15-267 | |
U+1F95C | PEANUTS | L2/15-267 | |
U+1F95D | KIWIFRUIT | L2/15-267 | |
U+1F95E | PANCAKES | L2/15-267 | |
U+1F985 | EAGLE | L2/15-054 | |
U+1F986 | DUCK | L2/15-054 | |
U+1F987 | BAT | L2/15-054 | |
U+1F988 | SHARK | L2/15-054 | |
U+1F989 | OWL | L2/15-054 | |
U+1F98A | FOX FACE | L2/15-054 | |
U+1F98B | BUTTERFLY | L2/15-195 | |
U+1F98C | DEER | L2/15-195 | |
U+1F98D | GORILLA | L2/14-092 L2/15-195 |
|
U+1F98E | LIZARD | L2/15-195 | |
U+1F98F | RHINOCEROS | L2/15-195 | |
U+1F990 | SHRIMP | L2/15-267 | |
U+1F991 | SQUID | L2/15-267 |
NB The above code points and character names are subject to change, and should not be relied on at this point in time.
Sources
These characters are currently under ISO ballot for inclusion in ISO/IEC 10646:2016 (5th ed.) (see WG2 N4705 pages 130, 131, 135, and 137â138). Most of the 8,514 characters in this document will feed into Unicode version 10.0 in June 2017, but due to the urgent need of netizens to be able to use new emoji at the earliest possible date, the Unicode Technical Committee (UTC) has a habit (policy?) of fast-tracking emoji characters into the Unicode standard out of synchronization with the corresponding ISO standard (ISO/IEC 10646). On January 26 these 74 emoji characters were authorized for inclusion in the Unicode 9.0 beta, and unless any national bodies have strong and compelling objections to any of these emoji characters in the current CD ballot (which closes 29 February 2016), then these 74 emoji characters will definitely be in Unicode 9.0. A final decision will be made when the UTC meets in early May 2016.
In the end, at the UTC meeting in May 2016, the UTC decided to only accept 72 emoji characters. At the request of Apple (in response to several well-publicized emoji gun incidents, and a campaign against adding more violent emoji to Unicode), U+1F946 RIFLE and U+1F93B MODERN PENTATHLON (which includes shooting as one of its disciplines) were de-emojified, and will be encoded in Unicode 9.0 as plain non-emoji symbols. Of course, people can still use U+1F946 đ„ RIFLE (or various combinations of the letters A-Z, and many other Unicode characters) to threaten other people in text messages, but the threats will not need to be taken seriously because the rifle character will not be displayed in colour (and it is quite likely that major vendors will not support this character at all in their fonts).
More Emoji to Look Forward to ...
Proposals by Jennifer 8. Lee and friends to encode emoji characters representing chopsticks, dumplings, fortune cookies, and Chinese takeout boxes were joyfully accepted by the shadowy Emoji subcommittee at the January UTC meeting, but they were submitted too late for inclusion in Unicode 9.0 â we can look forwrd to welcoming them into Unicode 10.0 in June 2017.
Alolita Sharma (@alolita) : #UTC146 Peter Edberg accepts #dumpling #chopsticks #fortunecookie #takeoutbox originals from emoji designer YiyingLu (25 January 2016)
Emoji make up 99% of the noise and hype surrounding Unicode 9.0, but they account for only 1% of the new characters.
7,227 of the 7,426 non-emoji characters to be added to Unicode 9.0 are included in ISO/IEC 10646:2014 (4th ed.) Amendment 2, and are highlighted in this document (along with one currency sign, nine CJK unified ideographs, 36 emoji characters, and 5 emoji modifier characters which were fast-tracked into Unicode 8.0). These characters have all been through at least two rounds of ISO technical ballots, and they are now stable (they cannot be moved, removed, or renamed). The remaining 199 characters are included in the Committe Draft for ISO/IEC 10646:2016 (5th ed.) (full draft is downloadable as N4446). This edition has not yet completed its two rounds of technical ballots by ISO national bodies, but the UTC has decided to fast-track the Adlam script, the Newa script, and Japanese TV symbols (in addition to the 74 emoji discussed above) into Unicode 9.0. It is not unusual for the UTC to fast-track urgently-required characters (such as currency symbols and emoji) into a version of Unicode before they have completed their final technical ballot, but it is unprecedented to fast-track complete scripts, especially when the first technical ballot has not yet completed.
Newa in particular has been a very difficult script to get encoded because of technical and political differences of opinion about what characters to include and the encoding model to use (see the long list of documents relating to Newa in the table below). As recently as the first ballot on the Committee Draft for ISO/IEC 10646 in August 2015 the UK national body expressed concerns over the encoding of murmured resonants as atomic characters (L2/15-262 p. 16), so the encoding of Newa cannot be considered to be uncontroversial. By fast-tracking Adlam and Newa into Unicode 9.0, the UTC has effectively stiffled any ISO national body opposition to the Newa repertoire that the UTC has agreed upon. The CD ballot for ISO/IEC 10646 closes 29 February 2016, which theoretically allows the UTC time to tweak (or even withdraw) any of the fast-tracked characters in response to ballot comments by ISO national bodies, but any requests to change the character repertoire, character positions or character names for Newa or Adlam in the final ISO technical ballot (DIS ballot) later this year will have to be rejected as the encoding of Newa and Adlam is already a fait accompli.
Fast-tracked characters from the ISO/IEC 10646 CD are marked ** in the tables below.
7,297 of the 7,500 new characters in Unicode 9.0 belong to six new scriptsâŻ:
Inscription in the Marchen script on the library of the Yungdrung Bon Monastery in Dolanji (Himachal Pradesh)
Photograph © Chris Hatchell
Of the 7,500 characters added to Unicode 9.0 (including the 74 emoji), 7,357 characters are included in 11 new blocks, and 143 characters are added to existing blocks, as detailed in the two tables below. The code points and character names for all these characters are now fixed, and will not be changed. Draft official Unicode data files are available here, and I have made a plain text list of all the new characters to be added to Unicode 9.0 available here.
Block Name | Range | Characters / Source Documents |
---|---|---|
Cyrillic Extended-C | 1C80..1C8F |
9 letters used in early Church Slavonic (1C80..1C88). Aleksandr Andreev, Yuri Shardt, and Nikita Simmons, "Proposal to Use Standardized Variation Sequences to Encode Church Slavonic Glyph Variants in Unicode" (2014-07-20) [L2/13-153] Aleksandr Andreev, Yuri Shardt, and Nikita Simmons, "Proposal to Encode Additional Cyrillic Characters used in Early Church Slavonic Printed Books" (2014-08-20) [WG2 N4607 || L2/14-196] |
Osage | 104B0..104FF |
72 letters for Osage: 36 uppercase letters (104B0..104D3) and 36 lowercase letters (104D8..104FB). Michael Everson, Herman Mongrain Lookout, and Cameron Pratt, "Preliminary proposal to encode the Osage script in the UCS" (2014-02-20) [WG2 N4548 || L2/14-068] Michael Everson, Herman Mongrain Lookout, and Cameron Pratt, "Proposal to encode Latin characters for Osage in the UCS" (2014-07-30) [WG2 N4587 || L2/14-175] Michael Everson, Herman Mongrain Lookout, and Cameron Pratt, "Final proposal to encode the Osage script in the UCS" (2014-09-21) [WG2 N4619 || L2/14-214] |
Newa ** | 11400..1147F |
92 characters for Newa: 53 letters (11400..11434); 13 vowel signs (11435..11441); 7 other signs (11442..11448); an Om character (11449); a Siddhi character (1144A); 5 punctuation marks (1144B..1144F); 10 digits (11450..11459); a placeholder mark (1145B); and an insertion sign (1145D). Anshuman Pandey, "Preliminary Proposal to Encode the Prachalit Nepal Script in ISO/IEC 10646" (2011-05-03) [WG2 N4038 || L2/11-152] Anshuman Pandey, "Preliminary Proposal to Encode the Newar Script in ISO/IEC 10646" (2012-02-29) [WG2 N4184 || L2/12-003] Dev Dass Manandhar, Samir Karmacharya and Bishnu Chitrakar, "Proposal for the NepÄlalipi script in the UCS" (2012-02-05) [WG2 N4322 || L2/12-120] Ken Whistler, "On the encoding of the âNepaalalipiâ / âNewarâ script" (2012-05-11) [L2/12-200] Dev Dass Manandhar, "Response to L2/12-200 âOn the encoding of âNepaalalipiâ/âNewarâ scriptâ" (2012-07-21) [L2/12-244] Dev Dass Manandhar, "Ancillary materials on âbreathy consonantsâ in âNepaalalipiâ" (2012-07-21) [L2/12-245] Iain Sinclair, "Letter in support of N4184 and encoding the Newar script in ISO/IEC 10646" (2012-10-22) [WG2 N4372 || L2/12-336] Dev Dass Manandhar, Samir Karmacharya and Bishnu Chitrakar, "Proposal for the Nepaalalipi script in the UCS" (2012-10-29) [L2/12-349] Pat Hall, "Proposal to Encode Nepal Himalayish Scripts in ISO/IEC 10646" (2012-10-08) [WG2 N4347 || L2/12-365] Deborah Anderson, "Comparison between Newar and Nepaalalipi proposals (L2/12â003 and L2/12â349)" (2012-11-08) [L2/12-390] Dev Dass Manandhar, Bishnu Chitrakar and Samir Karmacharya, "To Unicode Technical Committee (UTC)" (2013-01-28) [L2/13-029] Dev Dass Manandhar, Samir Karmacharya and Bishnu Chitrakar, "Proposal to Encode Nepaalalipi Script in ISO/IEC 10646" (2014-04-10) [L2/14-086] Deborah Anderson, "Comparison between Newar and Nepaalalipi proposals (L2/12â003 and L2/14â086)" (2014-09-23) [L2/14-220] Deborah Anderson, "Recommendations to UTC from Script Meeting in Nepal" (2014-10-06) [L2/14-253] Anshuman Pandey, "Response to the Recommendation for Nepalese Scripts in L2/14-253" (2014-10-21) [WG2 N4602 || L2/14-258] Ken Whistler, "Rationale for Atomic Encoding of Murmured Resonants in Newa" (2014-10-27) [L2/14-281] Anshuman Pandey, "Specimen Showing Representation of Murmured Consonants in the Newar Script" (2014-10-28) [WG2 N4552 || L2/14-290] Ken Whistler, "Towards a Consensus Encoding of Newa" (2014-12-04) [WG2 N4660 || L2/14-285] |
Mongolian Supplement | 11660..1167F |
13 head marks for Mongolian (11660..1166C). Aaron Bell, Greg Eck, Andrew Glass, and Andrew West, "Encoding Mongolian head letters" (2014-01-17) [L2/14-030] Aaron Bell, Greg Eck, Andrew Glass, and Andrew West, "Proposal to encode five Mongolian head marks" (2014-02-06) [WG2 N4542 || L2/14-067] China, "Comments on N4542 Five Mongolian Head Marks" (2014-02-19) [WG2 N4547 || L2/14-081] China, "A Letter to the Authors of N4542 (5 Birgas in Mongolian Block" (2014-09-23) [WG2 N4632 || L2/14-240] |
Bhaiksuki | 11C00..11C6F |
97 characters for Bhaiksuki: 46 letters (11C00..11C08, 11C0A..11C2E); 12 vowel signs (11C2F..11C36, 11C38..11C3B); 5 other signs (11C3C..11C40); 2 dandas (11C41..11C42); a word separator (11C43); 2 gap fillers (11C44..11C45); 10 decimal digits (11C50..11C59); 18 numbers (11C5A..11C6B); and a hundreds unit mark (11C6C). Anshuman Pandey and Dragomir Dimitrov, "Proposal to Encode the Bhaiksuki Script in ISO/IEC 10646" (2013-07-22) [WG2 N4469 || L2/13-167] Anshuman Pandey and Dragomir Dimitrov, "Revised Proposal to Encode the Bhaiksuki Script in ISO/IEC 10646" (2013-10-27) [WG2 N4489 || L2/13-194] Anshuman Pandey and Dragomir Dimitrov, "Revised Proposal to Encode the Bhaiksuki Script in ISO/IEC 10646" (2014-01-27) [L2/14-036] Anshuman Pandey and Dragomir Dimitrov, "Final Proposal to Encode the Bhaiksuki Script in ISO/IEC 10646" (2014-04-23) [WG2 N4573 || L2/14-091] |
Marchen | 11C70..11CBF |
68 characters for Marchen: 2 marks (11C70..11C71); 30 letters (11C72..11C8F); 29 subjoined letters (11C92..11CA7, 11CA9..11CAF); 5 vowel signs (11CB0..11CB4); and 2 other signs (11CB5..11CB6). Andrew West, "Proposal to encode the Marchen script in the SMP of the UCS" (2011-04-30) [WG2 N4032 || L2/11-140] Andrew West, "Final proposal to encode the Marchen script in the SMP of the UCS" (2013-10-22) [WG2 N4491 || L2/13-197] |
Ideographic Symbols and Punctuation | 16FE0..16FFF |
1 iteration mark for Tangut (16FE0). See under Tangut. |
Tangut | 17000..187FF |
6,125 Tangut ideographs (17000..187EC) [characters are named algorithmically based on their code point, as TANGUT IDEOGRAPH-hhhhh]. Richard Cook (UC Berkeley Script Encoding Initiative), "Proposal to encode Tangut characters in UCS Plane 1" (2007-05-09) [WG2 N3297 || L2/07-143] [Multi Column Chart : WG2 N3297A || L2/07-144] [Single Column Chart: WG2 N3297B || L2/07-145] Richard Cook (UC Berkeley Script Encoding Initiative), "Tangut Proposal Code Chart Update" (2007-07-24) [L2/07-229] Richard Cook (UC Berkeley Script Encoding Initiative), "Tangut Background" (2007-09-01) [WG2 N3307 || L2/07-289] China, "Response to UC Berkeleyâs proposals on Tangut" (2007-09-16) [WG2 N3338 || L2/07-301] Richard Cook, "Expert feedback on Chinese NB input on WG2/N3297 Tangut Encoding Proposal" (2007-09-17) [WG2 N3343 || L2/07-302] UK, "Comments on N3297: Proposal to encode Tangut characters in UCS Plane 1 and Charts" (2008-04-19) [WG2 N3448 || L2/08-175] China and US, "Comments on N3297: Proposal to encode Tangut characters in UCS Plane 1 and charts" (2008-04-22) [WG2 N3467 || L2/08-187] Richard Cook, [Five column chart] (2008-07-15 / 2010-04-16) [WG2 N3822 || L2/08-259] Richard Cook, "Single-Column Tangut Code Chart (using Column G font)" (2008-09-03) [L2/08-336] UK, "Review of Proposed Tangut Repertoire" (2008-09-01 Michael Everson and Andrew West, "Expert Feedback on the proposed Tangut character set in PDAM 6.2" (2008-09-24) [WG2 N3498 || L2/08-341] Richard Cook and Ken Lunde, "The UCS Tangut Repertory" (2008-10-10) [WG2 N3521 || L2/08-349] China, "Response from Tangut scholars of China on the Tangut Unicode proposal" (2008-10-13) [N3539 || L2/08-376] Erkki I. Kolehmainen, "Report from the Ad Hoc on Tangut" (2008-10-13) [N3541 || L2/08-377] Michael Everson, Nathan Hill, Guillaume Jacques, Andrew West, Viacheslav Zaytsev, "Proposal for a revised Tangut character set for encoding in the SMP of the UCS" (2009-03-01) [WG2 N3577 || L2/09-095] Michael Everson, Nathan Hill, Guillaume Jacques, Andrew West, Viacheslav Zaytsev, "Proposal for a revised Tangut character set for encoding in the SMP of the UCS" (2009-04-08) [WG2 N3577R || L2/09-115] [Appendix A: WG2 N3577R-A || L2/09-116] [Appendix B: WG2 N3577R-B || L2/09-117] Deborah Anderson and Richard Cook, "Request for Tangut font and mappings from N3577 to Amendment 7 repertoire" (2009-03-04) [WG2 N3586] Peter Constable, "Tangut Ad-Hoc Meeting Report" (2009-04-20) [WG2 N3629 || L2/09-169] China, Ireland, UK, "Final proposal for encoding the Tangut script in the SMP of the UCS" (2010-04-05) [WG2 N3797 || L2/10-095] [Appendix A: WG2 N3797-A] [Appendix B: WG2 N3797-B] Deborah Anderson and Richard Cook, "Comments on Tangut proposal N3797" (2010-04-16) [WG2 N3821 || L2/10-131] Deborah Anderson, "Tangut Ad hoc report" (2010-04-21) [WG2 N3833 || L2/10-141] UK, "Report on Tangut Encoding" (2011-05-22) [WG2 N4033 || L2/11-214] [Appendix A: WG2 N4033A || L2/11-214] [Appendix B: WG2 N4033B || L2/11-214] Michael Everson and Andrew West, "Tangut chart to supplement N4033 'Report on Tangut Encoding'" (2011-05-26) [WG2 N4083 || L2/11-204] Richard Cook and Deborah Anderson, Script Encoding Initiative, UC Berkeley, "Comments on Tangut report N4033" (2011â06â01) [WG2 N4094] Andrew West, Viacheslav Zaytsev, Michael Everson, "Proposal to encode the Tangut script in the UCS" (2012-10-02) [WG2 N4325 || L2/12-313] Michael Everson and Andrew West, "Code chart for Tangut ideographs and Tangut radicals" (2012-10-02) [WG2 N4327 || L2/12-315] China, "Comments on N4325, 4326 and N4327 (Tangut)" (2012-10-20) [WG2 N4370] China, "Explanation on the Re-facture of Tangut Fonts" (2013-06-10) [WG2 N4455] Deborah Anderson, SEI, UC Berkeley, "Summary of Tangut meeting (Beijing, China)" (2013-12-10) [WG2 N4516 || L2/13-241] Andrew West, Michael Everson, Han Xiaomang, Jia Changye, Jing Yongshi, Viacheslav Zaytsev, "Proposal to encode the Tangut script in the UCS" (2014-01-21) [WG2 N4522 || L2/14-023] Andrew West, Michael Everson, Han Xiaomang, Jia Changye, Jing Yongshi, Viacheslav Zaytsev, "Code chart for the Tangut script" (2014-01-21) [WG2 N4525 || L2/14-021] Andrew West, Viacheslav Zaytsev, Sun Bojun, Michael Everson, "Tangut glyph corrections" (2014-10-01) [WG2 N4588R2 || L2/14-209] China, "Review of N4558R Tangut glyph corrections" (2014-09-29) [WG2 N4640] Deborah Anderson, "Ad Hoc Reports for Tangut and Khitan Large Script" (2014-09-29) [WG2 N4642 || L2/14-246] Andrew West, Viacheslav Zaytsev, Michael Everson, "Discussion of Tangut character L2008-4148" (2014-12-01) [WG2 N4650 || L2/14-301] Andrew West, Michael Everson, Viacheslav Zaytsev, "Review of Tangut repertoire in DAM ballot" (2015-07-16) [WG2 N4667 || L2/15-175] China, "Reply to WG2N4650 and WG2N4667 on Tangut" (2015-10-13) [WG2 N4684 || L2/15-279] |
Tangut Components | 18800..18AFF |
755 Tangut radicals and character components (18800..18AF2). Michael Everson and Andrew West, "Proposal to encode Tangut Radicals and CJK Strokes in the UCS" (2008-09-01) [WG2 N3495 || L2/08-335] Richard Cook and Deborah Anderson, "Comments on the Tangut radicals and strokes proposal (N3495 = L2/08â335)" (2008-10-29) [L2/08-399] Andrew West, Viacheslav Zaytsev, Michael Everson, "Proposal to encode Tangut radicals in the UCS" (2012-10-02) [WG2 N4326 || L2/12-314] Michael Everson and Andrew West, "Code chart for Tangut ideographs and Tangut radicals" (2012-10-02) [WG2 N4327 || L2/12-315] Andrew West, Viacheslav Zaytsev, Sun Bojun, Michael Everson, "Proposal to encode Tangut radicals in the UCS" (2014-09-30) [WG2 N4636 || L2/14-228] |
Glagolitic Supplement | 1E000..1E02F |
38 combining letters for Glagolitic (1E000..1E006, 1E008..1E018, 1E01B..1E021, 1E023..1E024, 1E026..1E02A). Aleksandr Andreev, Heinz Miklas, and Yuri Shardt, "Proposal to Encode Combining Glagolitic Letters in Unicode" (2014-08-20) [WG2 N4608 || L2/14-087] Ralph Cleminson and David Birnbaum, "Expert Feedback on L2/14-087 Proposal to Encode Additional Glagolitic Characters" (2014-04-27) [WG2 N4608 || L2/14-103] Ralph Cleminson and David Birnbaum, "Additional Expert Feedback on L2/14â087 Proposal to Encode Additional Glagolitic Characters" (2014-07-21) [WG2 N4608 || L2/14-165] |
Adlam ** | 1E900..1E95F |
87 characters for Adlam: 34 uppercase letters (1E900..1E921); 34 lowercase letters (1E922..1E943); 7 marks (1E944..1E94A); 10 digits (1E950..1E959); and 2 punctuation marks (1E95E..1E95F). Michael Everson, "Preliminary proposal for encoding the Adlam script in the SMP of the UCS" (2012-10-28) [WG2 N4488 || L2/13-191] Michael Everson, "Proposal for encoding the Adlam script in the SMP of the UCS" (2014-09-23) [WG2 N4628 || L2/14-219] |
Leaf from a Tangut Buddhist manuscript (Great Perfection of Wisdom Sutra)
Block Name | Range | Characters / Source Documents |
---|---|---|
Arabic Extended-A | 08A0..08FF |
5 Arabic letters for Bravanese (08B6..08BA). Hamid Banafunzi, Marghani Banafunzi, and Maxamed Nuur, "Proposal to encode five Arabic script characters for the Bravanese (Chimiini)" (2014-08-31) [L2/13-178] Roozbeh Pournader, "Proposal to encode four Arabic characters for Bravanese" (2014-11-06) [WG2 N4498 || L2/13-223] Roozbeh Pournader and Shervin Afshar, "Proposal to Encode Arabic Letter Teh with Small Teh Above for Bravanese" (2014-11-01) [L2/13-293] 3 Arabic letters for Warsh-based orthographies (08BB..08BD). Lorna Evans (SIL International), "Supporting the Warsh orthography for Arabic script" (2014-04-29) [L2/14-104] Lorna Evans (SIL International), "Proposal to encode Warshâbased Arabic script characters" (2014-08-15) [WG2 N4597 || L2/14-211] 15 Quranic marks used in Pakistani printing (08D4..08E2). Lateef Sagar Shaikh, "Proposal to encode Quranic marks used in Quran published in Pakistan" (2014-04-24) [L2/14-095] Lateef Sagar Shaikh, "Proposal to encode Quranic Alternate Dammatan used in Quran published in Pakistan" (2014-04-25) [L2/14-096] Roozbeh Pournader, "Proposal to encode fourteen Pakistani Quranic marks" (2014-07-27) [WG2 N4589 || L2/14-105] Lateef Sagar Shaikh, "Proposal to encode Quranic mark Ar-Rub used in Quran published in Pakistan" (2014-08-11) [WG2 N4592 || L2/14-148] |
Kannada | 0C80..0CFF |
1 spacing candrabindu sign (0C80). Vinodh Rajan, "Proposal to encode Kannada Sign Spacing Candrabindu" (2014-07-18) [WG2 N4591 || L2/14-153] |
Malayalam | 0D00..0D7F |
3 chillu letters (0D54..0D56). Cibu Johny, "Proposal to encode MALAYALAM LETTER CHILLU LLL" (2013-05-15) [WG2 N4428 || L2/13-063] Cibu Johny, "Proposal to encode MALAYALAM LETTER CHILLU M" (2014-01-08) [WG2 N4539 || L2/14-013] Cibu Johny, "Proposal to encode MALAYALAM LETTER CHILLU Y" (2013-12-26) [WG2 N4539 || L2/14-017] 1 para sign (0D4F). Cibu Johny, "Proposal to encode MALAYALAM SIGN PARA" (2014-01-16) [WG2 N4538 || L2/14-016] 10 characters for fractions (0D58..0D5E, 0D76..0D78). Shriramana Sharma, "Proposal to encode Malayalam minor fractions" (2013-04-25) [WG2 N4429 || L2/13-051] |
Combining Diacritical Marks Supplement ** | 1DC0..1DFF |
1 combining deletion mark for Newa (1DFB). Ken Whistler, "Towards a Consensus Encoding of Newa" (2014-11-07) [WG2 N4660 || L2/14-285] |
Miscellaneous Technical | 2300..23FF |
4 power button symbols (23FB..23FE). Terence Eden, Joe Loughry, and Bruce Nordman, "Proposal to Include IEC Power Symbols" (2014-02-14) [WG2 N4567 || L2/14-009] Michael Everson, "Towards a proposal to encode power symbols in the UCS" (2014-02-04) [WG2 N4535 || L2/14-059] |
Supplemental Punctuation | 2E00..2E7F |
1 punctuation mark for Slavonic (2E43: DASH WITH LEFT UPTURN). Aleksandr Andreev, Yuri Shardt, and Nikita Simmons, "Proposal to Encode a Slavonic Punctuation Mark in Unicode" (2014-02-04) [WG2 N4534 || L2/13-238] 1 suspension mark for Byzantine Greek (2E44: DOUBLE SUSPENSION MARK). Dumbarton Oaks (Joel Kalvesmaki), "Proposal to encode GREEK BYZANTINE DOUBLE SUSPENSION MARK" (2014-07-18) [WG2 N4595 || L2/14-157] |
Latin Extended-D | A720..A7FF |
1 letter for Unifon (A7AE: LATIN CAPITAL LETTER SMALL CAPITAL I). Michael Everson, "Proposal to encode âUnifonâ and other characters in the UCS" (2012-04-29) [WG2 N4262 || L2/12-138] Michael Everson, "Revised proposal to encode Unifon characters in the UCS" (2014-02-24) [WG2 N4549 || L2/14-070] |
Saurashtra | A880..A8DF |
1 candrabindu sign (A8C5). Vinodh Rajan, "Proposal to encode Saurashtra Sign Candrabindu" (2014-08-07) [WG2 N4590 || L2/14-163] |
Ancient Greek Numbers | 10140..1018F |
2 signs for ancient Greek (1018D..1018E). Dumbarton Oaks (Joel Kalvesmaki), "Proposal to encode GREEK BYZANTINE INDICTION SIGN" (2014-07-18) [WG2 N4596 || L2/14-156] Dumbarton Oaks (Joel Kalvesmaki), "Proposal to encode GREEK BYZANTINE NOMISMA SIGN" (2014-07-18) [WG2 N4594 || L2/14-158] |
Khojki | 11200..1124F |
1 sukun sign for Arabic transliteration in the Khojki script (1123E). Anshuman Pandey, "Proposal to Encode the Khojki Sign SUKUN in ISO/IEC 10646" (2014-05-05) [WG2 N4575 || L2/14-133] |
Enclosed Alphanumeric Supplement ** | 1F100..1F1FF |
18 Japanese TV symbols required for ARIB STD-B62 (1F19B..1F1AC). Japan National Body, "Proposal to include additional Japanese TV symbols to ISO/IEC 10646" (2015-07-23) [WG2 N4671 || L2/15-238] |
Enclosed Ideographic Supplement ** | 1F200..1F2FF |
1 Japanese TV symbol required for ARIB STD-B62 (1F23B). Japan National Body, "Proposal to include additional Japanese TV symbols to ISO/IEC 10646" (2015-07-23) [WG2 N4671 || L2/15-238] |
Miscellaneous Symbols and Pictographs ** | 1F300..1F5FF |
2 emoji (see top of post): 1F57A : MAN DANCING 1F5A4 : BLACK HEART |
Transport and Map Symbols ** | 1F680..1F6FF |
5 emoji (see top of post): 1F6D1 : OCTAGONAL SIGN 1F6D2 : SHOPPING TROLLEY 1F6F4 : SCOOTER 1F6F5 : MOTOR SCOOTER 1F6F6 : CANOE |
Supplemental Symbols and Pictographs ** | 1F900..1F9FF |
67 emoji and emoticons (see top of post): 1F919 : CALL ME HAND 1F91A : RAISED BACK OF HAND 1F91B : LEFT-FACING FIST 1F91C : RIGHT-FACING FIST 1F91D : HANDSHAKE 1F91E : HAND WITH INDEX AND MIDDLE FINGERS CROSSED 1F920 : FACE WITH COWBOY HAT 1F921 : CLOWN FACE 1F922 : NAUSEATED FACE 1F923 : ROLLING ON THE FLOOR LAUGHING 1F924 : DROOLING FACE 1F925 : LYING FACE 1F926 : FACE PALM 1F927 : SNEEZING FACE 1F930 : PREGNANT WOMAN 1F933 : SELFIE 1F934 : PRINCE 1F935 : MAN IN TUXEDO 1F936 : MOTHER CHRISTMAS 1F937 : SHRUG 1F938 : PERSON DOING CARTWHEEL 1F939 : JUGGLING 1F93A : FENCER 1F93B : MODERN PENTATHLON 1F93C : WRESTLERS 1F93D : WATER POLO 1F93E : HANDBALL 1F940 : WILTED FLOWER 1F941 : DRUM WITH DRUMSTICKS 1F942 : CLINKING GLASSES 1F943 : TUMBLER GLASS 1F944 : SPOON 1F945 : GOAL NET 1F946 : RIFLE 1F947 : FIRST PLACE MEDAL 1F948 : SECOND PLACE MEDAL 1F949 : THIRD PLACE MEDAL 1F94A : BOXING GLOVE 1F94B : MARTIAL ARTS UNIFORM 1F950 : CROISSANT 1F951 : AVOCADO 1F952 : CUCUMBER 1F953 : BACON 1F954 : POTATO 1F955 : CARROT 1F956 : BAGUETTE BREAD 1F957 : GREEN SALAD 1F958 : SHALLOW PAN OF FOOD 1F959 : STUFFED FLATBREAD 1F95A : EGG 1F95B : GLASS OF MILK 1F95C : PEANUTS 1F95D : KIWIFRUIT 1F95E : PANCAKES 1F985 : EAGLE 1F986 : DUCK 1F987 : BAT 1F988 : SHARK 1F989 : OWL 1F98A : FOX FACE 1F98B : BUTTERFLY 1F98C : DEER 1F98D : GORILLA 1F98E : LIZARD 1F98F : RHINOCEROS 1F990 : SHRIMP 1F991 : SQUID |
Last modified: 2016-05-24
Index of BabelStone Blog Posts