Unicode Encoding of the Yi Script


Unicode Encoding of the Standardised Liangshan Yi Script

Since Version 3.0, Unicode has included two blocks covering the standardised Liangshan Yi script (devised in 1974 and officially promulgated in 1980 as the "Scheme for the Standardisation of the Yi Script" 彝文规范方案): Yi Syllables (U+A000 to U+A48F) and Yi Radicals (U+A490 to U+A4CF).


For details of the history of the encoding of the Yi script see N1187 "Encoding the Yi script" (1995-03-24) by Michael Everson.



Yi Syllables

The Unicode implementation of the Yi Syllables block is based on the 1991 PRC standard GB 13134-91.

The Yi Syllables block comprises 1,165 characters, of which 1,164 represent all of the allowable syllables of the Liangshan Yi dialect (a syllable comprising consonant initial, vowel medial and tone final components), and one is a syllable repetition symbol [U+A015]. Of the 1,164 syllable characters, 819 are unique characters representing syllables with a t, null or p tone, and 345 are diacritical modifications of one of the 819 primary characters, representing syllables with an x tone. Of the 345 x-tone characters, 342 are modifications of the character for the corresponding null-tone syllable (e.g. ix [U+A001] is a diacritical modification of i [U+A002]). In three instances there is no corresponding null-tone syllable, and the x-tone character is a diacritical modification of the character for the corresponding p-tone syllable (hnox from hnop, nzox from nzop, and vex from vep). In all cases the x-tone modification is indicated by an inverted breve over the primary character.

It could be argued that the x-tone characters need not have been given individual code points, and could instead have been encoded by means of a combining diacritic appended to a primary character. However, this would allow users to enter invalid character/diacritic combinations, so encoding the modified characters individually is a far simpler solution.
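The relationship between x-tone syllables and their primary forms can be explored through the Unicode character names, which spell out each syllable including its tone letter. The following is a minimal sketch, not part of any standard: it assumes that the assigned syllables run from U+A000 to U+A48C, that an x-tone syllable is any name ending in "X", and that Python's `unicodedata` module carries the Yi names. The helper `x_tone_base` is a name invented here for illustration.

```python
import unicodedata

PREFIX = "YI SYLLABLE "
YI_FIRST, YI_LAST = 0xA000, 0xA48C  # assumed assigned range (1,165 code points)


def x_tone_base(ch):
    """Return the character an x-tone syllable is derived from, else None."""
    name = unicodedata.name(ch, "")
    if not (name.startswith(PREFIX) and name.endswith("X")):
        return None
    stem = name[:-1]  # drop the tone letter X
    # Prefer the null-tone syllable; fall back to the p-tone syllable.
    for candidate in (stem, stem + "P"):
        try:
            return unicodedata.lookup(candidate)
        except KeyError:
            pass
    return None


# All x-tone syllables for which a primary form was found.
x_tone = [chr(cp) for cp in range(YI_FIRST, YI_LAST + 1)
          if x_tone_base(chr(cp)) is not None]

# Those whose primary form is a p-tone syllable rather than a null-tone one.
p_based = sorted(unicodedata.name(ch)[len(PREFIX):] for ch in x_tone
                 if unicodedata.name(x_tone_base(ch)).endswith("P"))

print(len(x_tone), p_based)
```

If the figures above hold, the count printed should be 345 and the p-tone list should contain only hnox, nzox and vex.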

The names given to the syllables in the Unicode standard are the spellings of the syllables in the Latin alphabet designed in 1956 for representing the Liangshan Yi language. Note that the syllable repetition symbol w is called "YI SYLLABLE WU" in the Unicode standard, which is a misnomer, as this symbol has no fixed phonetic value and simply indicates a repetition of the preceding syllable (sometimes with tonal mutation). In the Latin alphabet designed for Liangshan Yi this symbol is represented by the letter "w" appended to the preceding word (e.g. ꃪꀕ vatw). To the best of my knowledge, it is never represented by "wu".

The order of the syllables within the Unicode block is according to the spelling order of the Liangshan Yi Latin alphabet as used in dictionaries of the Liangshan Yi language :


Again, the only exception is the syllable repetition symbol w [U+A015], which is placed between the null-initial syllables and the b-initial syllables. It would have been more logical to place this syllable at the end of the block.
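The position of the repetition symbol can be inspected by listing the Unicode names of the code points on either side of U+A015. This is only an illustrative sketch using Python's `unicodedata` module; the helper `names_around` is a hypothetical name introduced here.

```python
import unicodedata


def names_around(code_point, width=3):
    """Return (code point, Unicode name) pairs for the neighbours of code_point."""
    return [(cp, unicodedata.name(chr(cp)))
            for cp in range(code_point - width, code_point + width + 1)]


# List the three code points before and after the repetition symbol.
for cp, name in names_around(0xA015):
    print(f"U+{cp:04X}  {name}")
```

The names before U+A015 should begin with a vowel (null-initial syllables) and those after it with "B".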

There are no punctuation marks in the Yi block (other than the syllable repetition symbol), as the PRC standard for the Liangshan Yi script specifies the use of modern Chinese punctuation. This follows English usage except that the ideographic comma [U+3001] replaces U+002C in list enumeration, the ideographic full stop [U+3002] replaces U+002E, and single and double angle brackets [U+3008-300B] are used for book titles.
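As a small illustration of the punctuation cited above, the sketch below prints the Unicode names of the six code points and wraps a piece of text in title brackets. The function `book_title` is a hypothetical helper, and the Yi text passed to it is simply the example word from this page used as a placeholder, not a real title.

```python
import unicodedata

# Code points cited in the text for Chinese-style punctuation.
CHINESE_PUNCTUATION = [0x3001, 0x3002, 0x3008, 0x3009, 0x300A, 0x300B]


def book_title(title, double=True):
    """Wrap a title in double (default) or single angle brackets."""
    left, right = ("\u300A", "\u300B") if double else ("\u3008", "\u3009")
    return f"{left}{title}{right}"


for cp in CHINESE_PUNCTUATION:
    print(f"U+{cp:04X} {chr(cp)} {unicodedata.name(chr(cp))}")

print(book_title("\uA0EA\uA015"))  # placeholder text (vatw), not a real title
```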

There are also no numbers in the Yi block, as numbers may be represented either by the appropriate syllable (e.g. cyp "one") or by Arabic digits 0-9 [U+0030-0039].



Yi Radicals

The Yi Radicals block comprises 55 characters (Unicode 3.0 specified 50 characters, but five subsidiary radical forms omitted from Unicode 3.0 were added in Unicode 3.2). These radicals were artificially created by analogy with Chinese radicals (although, unlike most Chinese radicals, the Yi radicals do not have any intrinsic meaning). The fifty-five radical characters represent basic component strokes that are common to subsets of the set of standardised Liangshan Yi syllables. Note that these radicals relate only to the Unicode Yi Syllables block, not to all Yi characters. A glossary of all known (non-standardised) Liangshan Yi characters recognises 170 radicals, whilst a dictionary of Guizhou Yi characters recognises 119 radicals.

The radicals are arbitrarily assigned based on the most commonly occurring strokes. In the case of the standardised Liangshan Yi script, twenty-six radicals are officially recognised, and each of the 1,165 Liangshan Yi syllables is assigned to one (and only one) of these radicals for indexing purposes. Each radical is assigned between nine and a hundred and two syllables, as shown in Table 1 below. For example, the radical ssi is associated with the syllables at, tip, hna, ggut, kie, nzop, etc. Thus the syllable bip may be found in a dictionary by looking in the index under the radical ꒕ ssi. Within each radical set, syllables are normally sub-divided by number of strokes. See Yiwen Jianzi Ben 彝文检字本 for an example of syllable lookup using these radicals.

Although there are only twenty-six radicals, the Yi Radicals block comprises fifty-five characters. This is because certain stroke forms which act as radicals, but which do not occur with sufficient frequency to justify being given individual radical status, are included as subsidiary forms under one of the twenty-six main radicals. For example, Radical 2 (li) has a subsidiary radical form kit. Of the ninety-three syllables assigned to Radical 2, ninety-one have the main li form, and only two have the subsidiary kit form. Sixteen of the twenty-six radicals have between one and four subsidiary forms.

Radicals are intended purely to facilitate the lookup of Yi syllables in dictionaries or for non-phonetic indexing, and would not normally be expected to be found in ordinary textual contexts.


Table 1: Yi Radicals

Radical  Primary Form     Subsidiary Forms                                           Total Syllables
Number                                                                               (Excluding X-Tone Syllables)
1        qot [U+A490]     -                                                          101 (66)
2        li [U+A491]      kit [U+A492]                                               93 (61)
3        nyip [U+A493]    -                                                          102 (70)
4        cyp [U+A494]     -                                                          101 (69)
5        ssi [U+A495]     -                                                          52 (39)
6        ggop [U+A496]    -                                                          40 (28)
7        gep [U+A497]     mi [U+A498]                                                47 (34)
8        hxit [U+A499]    -                                                          39 (28)
9        lyr [U+A49A]     bbut [U+A49B]                                              14 (12)
10       mop [U+A49C]     yo [U+A49D], put [U+A49E]                                  28 (22)
11       hxuo [U+A49F]    tat [U+A4A0]                                               62 (41)
12       ga [U+A4A1]      zup [U+A4A2], cyt [U+A4A3]                                 64 (48)
13       ddur [U+A4A4]    bur [U+A4A5], gguo [U+A4A6]                                46 (30)
14       nyop [U+A4A7]    -                                                          9 (7)
15       tu [U+A4A8]      -                                                          38 (28)
16       op [U+A4A9]      -                                                          90 (65)
17       jjut [U+A4AA]    zot [U+A4AB]                                               44 (32)
18       pyt [U+A4AC]     hmo [U+A4AD], yit [U+A4AE], vur [U+A4AF]                   26 (20)
19       shy [U+A4B0]     vep [U+A4B1], za [U+A4B2]                                  21 (15)
20       jo [U+A4B3]      nzup [U+A4B4], jjy [U+A4B5], got [U+A4B6], jjie [U+A4B7]   31 (23)
21       wo [U+A4B8]      du [U+A4B9], shur [U+A4BA], lie [U+A4BB]                   10 (6)
22       cy [U+A4BC]      cuop [U+A4BD]                                              30 (22)
23       cip [U+A4BE]     hxop [U+A4BF]                                              18 (13)
24       shat [U+A4C0]    zur [U+A4C1], shop [U+A4C2], che [U+A4C3]                  28 (19)
25       zziet [U+A4C4]   nbie [U+A4C5]                                              22 (16)
26       ke [U+A4C6]      -                                                          9 (6)


The name assigned to each radical is the name of the Yi syllable that it most closely resembles.
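The figures in Table 1 can be cross-checked against the counts given for the two blocks. The sketch below transcribes the table as tuples (the variable name `TABLE_1` and the tuple layout are choices made here for illustration) and sums the columns.

```python
# (radical number, subsidiary forms, total syllables, total excluding x-tone)
TABLE_1 = [
    (1, 0, 101, 66), (2, 1, 93, 61), (3, 0, 102, 70), (4, 0, 101, 69),
    (5, 0, 52, 39), (6, 0, 40, 28), (7, 1, 47, 34), (8, 0, 39, 28),
    (9, 1, 14, 12), (10, 2, 28, 22), (11, 1, 62, 41), (12, 2, 64, 48),
    (13, 2, 46, 30), (14, 0, 9, 7), (15, 0, 38, 28), (16, 0, 90, 65),
    (17, 1, 44, 32), (18, 3, 26, 20), (19, 2, 21, 15), (20, 4, 31, 23),
    (21, 3, 10, 6), (22, 1, 30, 22), (23, 1, 18, 13), (24, 3, 28, 19),
    (25, 1, 22, 16), (26, 0, 9, 6),
]

total = sum(row[2] for row in TABLE_1)             # all indexed syllables
non_x_tone = sum(row[3] for row in TABLE_1)        # excluding x-tone syllables
radical_forms = len(TABLE_1) + sum(row[1] for row in TABLE_1)
with_subsidiary = sum(1 for row in TABLE_1 if row[1] > 0)

print(total)               # 1165
print(non_x_tone)          # 820 (819 primary syllables plus the repetition symbol)
print(total - non_x_tone)  # 345 x-tone syllables
print(radical_forms)       # 55 characters in the Yi Radicals block
print(with_subsidiary)     # 16 radicals with subsidiary forms
```

The sums agree with the counts stated in the text: 1,165 syllables in total, 345 of them x-tone, and 55 radical forms.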



Traditional Yi Script

The existing Yi Syllables and Yi Radicals Unicode blocks relate to the standardised script now used for writing the Liangshan dialect of the Yi language in Southern Sichuan. This script was developed in 1974 in order to replace the existing Yi script, which no longer accurately reflected the phonetic characteristics of the modern Yi language, had numerous internal inconsistencies (characters with multiple pronunciations and/or meanings, or multiple characters representing the same syllable), and had numerous variant and local forms.

Whilst the adoption of a new standardised script is an important achievement, and I believe a necessary one if the Yi script was to remain viable, this does mean that Unicode cannot be used to encode the many pre-modern texts (mostly manuscript, some printed) and epigraphic inscriptions (dating back over 500 years) that exist in the pre-standardised script, or modern texts written in the non-standardised scripts of the Yi peoples used in Guizhou, Yunnan and Guangxi.

The Roadmap to the BMP shows that U+A500 through U+A72F have been provisionally reserved for "Yi Extensions". This allows space for an additional 560 code points. However, the number of code points needed to fully encode the greater Yi script will be considerably more than this value. Wu Zili 武自立 (Chuantong Yiwen 传统彝文 p.104) states that the total number of Yi syllables found in the various Yi dialectal regions is :


This makes a total of nearly 30,000 syllables. However, this figure is far larger than the actual number of syllables that will need to be encoded to cover the entire corpus of Yi texts.
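The size of the reserved range quoted above follows directly from its end points, as the short calculation below shows. The figure of 30,000 is taken as a round upper bound for "nearly 30,000", and the helper name `range_size` is introduced here only for illustration.

```python
def range_size(first, last):
    """Number of code points in the inclusive range first..last."""
    return last - first + 1


reserved = range_size(0xA500, 0xA72F)  # provisional "Yi Extensions" range
raw_total = 30_000                     # round upper bound for Wu Zili's total

print(reserved)              # 560
print(reserved < raw_total)  # True: the range cannot hold the raw total
```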

Firstly, as writing was mainly used for religious, magical or medical texts that were handed down from generation to generation by the priests (pimu) of individual villages, and not as a means of communication between different communities or for the general dissemination of knowledge, there was a strong tendency for localised versions of the script to develop, with variant syllable forms being used by individual scribes. Variant forms of the same basic syllable do not need to be encoded, and indeed should not be encoded.

This situation is exemplified by a 1989 draft dictionary of the Guizhou dialect Yi script (Yi-Han Zidian 彝汉字典), which gives a total of about 8,000 individual syllables, but reduces these to about 1,700 basic syllable forms, the other 6,300 being simple graphic variants. A total of 1,700 basic syllables is fairly close to the figure of 1,840 syllables that tradition relates were devised by the creator of the Yi script during the Tang dynasty. It should be noted that quite a few of the 1,700 basic syllables have the same phonetic value as each other, which can probably be ascribed to a reduction in the phonetic range of the language since the original creation of the script, resulting in the phonetic convergence of characters that originally represented different syllables.

A similar reduction in numbers can be expected for the scripts used in Yunnan and Sichuan, as the total number of basic syllables should be limited by the number of possible syllables for the language at the time of the script's creation.

Secondly, although the scripts used in the different dialect regions are mutually unintelligible, they do share many common syllables. In some cases the syllables are identical, in other cases the syllables have been rotated through 90 or 180 degrees (due to the differing orientation of writing in different regions), and in other cases the graphic appearance of the syllables has been modified. Guoji Ningha 果吉·宁哈 has published a pioneering comparative study of the Yi scripts of Yunnan, Sichuan, Guizhou and Guangxi (Lun Dian, Chuan, Qian, Gui Yizu Wenzi 论滇、川、黔、桂彝族文字) which demonstrates that many syllables span the various dialectal regions. Often the syllables have changed their graphic appearance to a greater or lesser extent, or their meanings have undergone a semantic shift, or they have acquired different pronunciations; but it is nevertheless clear that they derive from a common proto-syllable. Of course it is to be expected that there will have been some innovations during the history of the script, and that some new forms will have been developed locally. However, further comparative work should be carried out in order to try to reconstruct the core common syllables on which the regional scripts are based.

I think there are two possible approaches to the problem of encoding the superset of Yi syllables for all dialectal regions in all extant texts and epigraphic inscriptions.

The first approach would be to attempt to reconstruct the common set of basic syllables, and allocate each basic syllable a single code point. Variant forms of a basic syllable, including rotated and simplified forms, could be indicated where necessary by means of variation selectors. Any isolate forms would also be allocated a single code point. This would probably result in a block of about 2,000-3,000 code points. This extended superset of Yi syllables would presumably include syllables that correspond to most or all of the 819 basic syllables in the Yi Syllables block, although possibly with differences in graphic appearance. Whether such syllables should be excluded from the Extended Yi block is a question that cannot yet be answered with certainty, although I would be inclined to treat the modern standardised Liangshan Yi syllables separately from the superset of syllables from which they derive. The creation of the standardised Liangshan Yi script was an attempt to completely reinvent the script for modern times rather than simply tinker with the pre-existing script, and this merits separate script status for the standardised Liangshan Yi script, in the same way that the Gothic script merits separate script status even though all of the Gothic letters correspond to pre-existing Greek, Latin or Runic letters. If an Extended Yi block were to be proposed for inclusion in Unicode, I would suggest that the new block should be added to the Supplementary Multilingual Plane (Plane 1) rather than to the Basic Multilingual Plane.
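To make the variation selector idea concrete, the sketch below shows how a variant form might be stored as a basic character followed by a variation selector, and how plain-text processing could fall back to the basic character. This is purely hypothetical: no variation sequences are defined for Yi, the base character used is a modern syllable standing in for a reconstructed basic form, and the function names are invented for illustration.

```python
import unicodedata

VS_FIRST, VS_LAST = "\uFE00", "\uFE0F"  # VARIATION SELECTOR-1 to -16


def with_variant(base, selector_number):
    """Append variation selector 1..16 to a single base character."""
    if not 1 <= selector_number <= 16:
        raise ValueError("selector_number must be in 1..16")
    return base + chr(ord(VS_FIRST) + selector_number - 1)


def strip_variants(text):
    """Remove variation selectors, leaving only the basic characters."""
    return "".join(ch for ch in text if not VS_FIRST <= ch <= VS_LAST)


base = "\uA002"                  # stand-in base character (YI SYLLABLE I)
rotated = with_variant(base, 1)  # hypothetical "rotated form" sequence

print([f"U+{ord(ch):04X}" for ch in rotated])
print(unicodedata.name(rotated[1]))
print(strip_variants(rotated) == base)
```

Under such a scheme, searching and sorting could operate on the stripped text, while fonts that know the sequence could display the regional form.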

A second approach would be to create standardised scripts on the model of the Liangshan Yi syllabary for the Yi dialects of Guizhou, Yunnan and Guangxi. Each of these scripts would then be assigned a separate Unicode block in the same way that the various Brahmi scripts each have a separate block even though they share a common set of letters. If the non-Liangshan Yi dialects are to survive in the 21st century as living languages with their own literature and media, then I believe this is the only way forward (the phonetic differences of the various dialects preclude the possibility of other dialects sharing the Liangshan Yi syllabary). Were the Guizhou and Yunnan scripts to be standardised, this would probably mean that a high proportion of all existing syllables would be encoded in one or more of the standardised Yi scripts, in which case it would be sufficient to encode all the unused syllables as a miscellaneous "Yi Extensions" block.

Whichever approach succeeds, I believe that it is important that the full set of Yi syllables does become encoded within Unicode at some stage. However, I suspect that no proposals will be forthcoming in the near future, as much work still remains to be done before all existing Yi syllables have been fully catalogued and analysed.


