The Myth of the Tangut Ritual Language



I have previously discussed how complex the Tangut script is, and how Tangut characters are constructed like interconnected jigsaw puzzles using a method of character construction that I believe is not used for any other known writing system. But it is not only the Tangut script that is difficult and mysterious; the Tangut language is also hard to fathom, and has features that are difficult to explain. One of the most puzzling features of the Tangut language is that the surviving monolingual Tangut dictionaries define about twice as many characters as are actually used in almost all extant Tangut texts; that is to say, about half of the approximately 6,000 Tangut characters defined in dictionaries do not seem to be needed for writing actual Tangut text, but appear to have been designed solely with the intention of teasing future generations of linguists.

However, in 1986 (see bibliography at the bottom of the post) the Japanese Tangutologist Nishida Tatsuo noted that some of the unused characters were to be found in use in several ritual poems or odes [Tangut 𗊱] (five odes preserved in a woodblock print held at the Institute of Oriental Studies in St Petersburg [Tang. 125, No. 121]). In these poems each line was written twice, once using ordinary Tangut vocabulary (Nishida's "Vocabulary II") and once using an unusual vocabulary that in many cases was constructed from the otherwise unused characters in the Tangut dictionaries (Nishida's "Vocabulary I"). Nishida hypothesised that Vocabulary I represents the language of the "black-headed people" who he considers were a nomadic people that formed the ruling class of Tangut society; and that Vocabulary II represents the language of the "red-faced people" who he considers were a sedentary, agricultural people that formed the bulk of Tangut society. In other words, Vocabulary I may represent a linguistic substratum that was only preserved in the odes of the ruling class.

In 1996 the Russian Tangutologist Ksenia Kepping formulated the terms "common language" and "ritual language" to refer to these two forms of vocabulary, suggesting that the ritual language was an artificial language (lacking grammatical morphemes) created by Tangut shamans in ancient times, before the adoption of Buddhism, for ritual purposes. She supposes that this ancient ritual language only survived in writing in a few ancient ritual odes.

Recently Marc Miyake has been discussing Ritual Tangut [RT] and Common Tangut [CT] in a series of blog posts about the Tangut words for "camel", which has encouraged me to take a closer look at the one ritual ode that I have the text for (given in Nishida 1997), the Ode on Monthly Pleasures 𗼑𗼑𗫉𗊱 (translated into Chinese as 月月樂詩). My preliminary study of the text of this ode has caused me to doubt the theory that it was originally written in a special ritual language, and although I haven't yet gone through the entire Tangut text of the ode, I think that there is some value in sharing my initial impressions.

My first observation is that the "common language" version of the ode does not seem to me to be a translation of the "ritual language" version as Kepping suggests (Kepping 1996 page 28). Rather, I would suggest that the "common language" version represents the original text of the ode, and that the "ritual language" version is a gloss on this version, similar in nature to the interlinear vernacular glosses in some medieval Latin manuscripts. For example, in the 8th century Vespasian Psalter the Latin text of Psalm 67:2, exsurgat Deus et dissipentur inimici eius et fugiant qui oderunt eum a facie eius ["Let God arise, let his enemies be scattered: let them also that hate him flee before him"], is glossed word for word, in Latin word order, in Old English as a-rise god ⁊ sien to-strogdne feond his ⁊ flen from on-siene his ða fiodun hine. The resultant Old English is unnatural in a similar way that the Tangut "ritual language" appears unnatural and grammatically awkward compared with the common Tangut language.

In the only extant edition of the Odes, for each line of the ode the "ritual language" text precedes the "common language" text, which gives the impression that the "ritual language" text is the primary version of the ode, and the "common language" text is a secondary or derived version of the text, which on the surface would seem to argue against my gloss theory. However, this layout may simply reflect the fact that the "ritual language" interlinear gloss was originally written in small-sized characters to the right of the main "common language" text (in the same way that interlinear annotations to Chinese novels are written), and that during the course of textual transmission the small-sized interlinear glosses were transformed into ordinary-sized text running parallel and to the right (i.e. preceding) of the main text.

At this point it is worth taking a detailed look at a couple of lines from the Ode on Monthly Pleasures. My first example is the first line of the ode, the first of two introductory lines that precede the verse relating to the First Month.

Ode on Monthly Pleasures Introduction line 1B (common language)
𗨛  L2511  rjɨr  "to arise"

Ode on Monthly Pleasures Introduction line 1A (ritual language)
𘘞  L1846  ka  "month" [RT]
𘛮  L5288  khjij  "happy" [CT]
𗫉  L2480  bie̱j  "entertainment" [CT]
𗥦  L2750  ɣu  "head", "start" [CT]
𗤄  L2082  ·jɨr  "to ask" [CT]

The "common language" version of this opening line reads perfectly naturally, and can be translated as "How did the celebrations for each month arise?", referring to the various seasonal activities that take place throughout the calendar year. On the other hand, the "ritual language" version is awkward and difficult to translate without filling in some missing gaps: "[Someone] asks [what] the origin [of the] happy entertainment [for each] month [is]". The ordinary Common Tangut verb "to ask" in "ritual language" text corresponding to questions in the "common language" version has been taken as a ritual interrogative marker, but I think it is simpler to take it as an explanatory gloss on the "common language" question: "[the text] asks". Likewise, instead of trying to form the string of preceding nouns into a coherent sentence, I find it more reasonable to see them as a sequence of disconnected glosses on the corresponding words in the "common language" text: "month [means] month"; "entertainment [means] happy entertainment"; "origin [means] start"; "how did it arise asks [a question]".

This line also illustrates an unexpected feature of the text, namely that the "ritual language" version of the ode is not exclusively composed of "ritual" vocabulary, but also includes a great deal of ordinary Tangut vocabulary. In this line only the first word, ka "month", can be considered to be a ritual word; all the other words in the "ritual language" version of this line are ordinary Tangut words that also occur in ordinary Tangut texts. Thus the "ritual language" version of the ode cannot be said to be written in a special ritual language, merely that it includes certain vocabulary items that are not found in other Tangut texts.

My second example is the first line of the verse relating to the third lunar month.

Ode on Monthly Pleasures Month 3 line 1B (common language)
𘀑  L3911  pu  "pigeon" (borrowing from Chinese 鵓[鴿] bó[gē])
𗚦  L4176  tju  "turtledove" (borrowing from Chinese 鳩[鴿] jiū[gē])
𘙇  L0795  rjɨr  adverbal prefix
𗠜  L4519  bji  "to call"
𗴴  L0140  lhejr  "peaceful and happy"

Ode on Monthly Pleasures Month 3 line 1A (ritual language)
𗛰  L4344  lhejr  "three" [RT]
𘘞  L1846  ka  "month" [RT]
𘝁  L0673  thə  "wing" [CT]
𘋳  L5598  gjwi  "clothes" [CT]
𗍫  L4027  njɨ̱  "two" [CT]
𘋠  L5932  "kind", "variety" [CT]
𗛝  L4246  lhejr  "woods" [RT] (L2769 phjo "divination" may be a mistake for L3890 bo which forms a collocation with L4246 meaning "woods")
𗑲  L4980  tśhji̱w  "to speak" [CT]
𗫶  L3092  djij  adverbal suffix [CT]
𗊪  L2029  low  "country" [CT]
𗲋  L1529  lụ  "happy" [RT]

The "common language" version translates as "In the third month the pigeons and turtledoves call among the trees, and the country is at peace". The "ritual language" version has a couple of interesting features that are worth noting.

The two specific bird names in the main text (pigeons and turtledoves) are glossed as "two kinds of birds", where "birds" is represented by a kenning of two ordinary Tangut words ("wing-clothes" = "bird") and "two kinds" is written with ordinary Tangut characters, using the ordinary Tangut word for "two" rather than the "ritual language" word for "two". It makes sense to imagine that the glossist could not think of any other words for these two specific bird names, and so simply glossed them as "two kinds of birds"; but on the other hand it is not plausible that "pigeons and turtledoves" is a translation of "two kinds of birds", which is strong evidence against Kepping's theory that the "common language" version is a translation of an original "ritual language" text.

Kepping has noted that the "ritual language" version lacks grammatical morphemes, which is generally true, and to be expected if the "ritual language" version comprises glosses on the words of the "common language" version; but, as can be seen in this example, grammatical morphemes are sometimes used in the glosses. The adverbal prefix rjɨr in the "common language" version corresponds to an adverbal suffix djij in the "ritual language" version. Both are ordinary Tangut words, and both function to indicate a continuative mode (the birds in the woods called continuously, not just once).

Kepping has also noted that the "ritual language" version favours two-syllable words over one-syllable words (other than verbs) in the "common language" text (Kepping 1996 page 27), and this does seem to be generally true, although, as can be seen from both the above examples, single-syllable nouns are not uncommon in the "ritual language" version. My explanation for the greater number of disyllabic words in the "ritual language" version is simply that two-syllable words are less ambiguous and thus provide a more certain gloss. This explains the use of two-syllable "ritual language" words that are composed of two ordinary Tangut words (e.g. "happy entertainment" as a gloss for "entertainment"), but it does not explain how words that are unique to the "ritual language" text and which cannot be understood by reference to ordinary Tangut words were able to be understood by readers as glosses on the corresponding ordinary Tangut word. For example, how would a Tangut reader know that ka ·o means "moon", and why would this word be more understandable than the ordinary word for "month"? Indeed, if "ritual language" words such as this had such extremely restricted usage, and were only very rarely encountered, how would anyone ever even learn to read them in the first place?

I have doubts that "ritual language" words such as ka ·o are the remnants of a linguistic substratum or that they represent an artificial ritual language, but I do have an alternative theory to explain this otherwise unattested vocabulary. I think it is possible that they are in fact archaic words from a culturally important and universally known Tangut text (something equivalent to the Bible in European culture) that we do not know about simply because no copies of it have survived. A text such as this could have been required reading for all Tangut students, and its archaisms could have been as familiar to the Tangut people as biblical or Shakespearean expressions are to us. If this were the case, an otherwise unattested word such as ka ·o for "moon" could have been an obvious and unambiguous gloss for an ordinary, modern Tangut word. Compared with Khitan and Jurchen, a wealth of Tangut manuscripts and printed texts have survived down to the present day, but this just highlights how much more must have been lost forever. We study Tangut through the distorted prism of the sands of Kharakhoto, only able to see a fraction of all the Tangut books that there must once have been. One lost book is all it takes to confuse us into believing in the myth of an ancient ritual language.

Tangut Numbers

For the rest of this post I am going to look in more detail at one particular category of words used in the Ode on Monthly Pleasures, the numbers one to ten (see also Nishida 1997 pages 141–145 where he covers the same topic). Tangut numbers are relatively easy, and are clearly cognate to numbers in other Tibeto-Burman languages such as Nuosu (Liangshan Yi).

Tangut Numbers 1-10
Number  Tangut  LFW No.  Reconstruction  Nuosu  Notes
1  𘈩  L0100  lew  cyp [ʦhɿ²¹]  𘂪 L4855 dzjij "single" is probably the TB cognate for "one"
2  𗍫  L4027  njɨ̱  nyip [ȵi²¹]
3  𘕕  L5865  sọ  suo [sɔ³³]
4  𗥃  L2205  ljɨr  ly [lɿ³³]
5  𗏁  L1999  ŋwə  nge [ŋɯ³³]
6  𗤁  L3200  tśhjiw  fut [fu⁵⁵]  Nuosu [fu⁵⁵] is anomalous; other Yi dialects have e.g. [tɕhɔ¹³]
7  𗒹  L4778  śjạ  shyp [ʂɿ²¹]
8  𘉋  L4602  ·jar  hxit [hi⁵⁵]
9  𗢭  L3113  gjɨ̱  ggu [gu³³]
10  𗰗  L1084  ɣạ  ci [ʦhi³³]  𗫥 L3231 dźjɨ̣ "ten" is probably the TB cognate for "ten"

The ordinary month names are equally unproblematic, composed of a number followed by the word lhjị "month" (cf. Nuosu ꆪ hlep [ɬɯ²¹] "moon", "month"), but with special words for the first and last lunar month.

Ordinary Tangut Month Names
𗩭𗼑  L2105/L2814  tśjow lhjị  first lunar month (正月)
𗍫𗼑  L4027/L2814  njɨ̱ lhjị  second lunar month (二月)
𘕕𗼑  L5865/L2814  sọ lhjị  third lunar month (三月)
𗥃𗼑  L2205/L2814  ljɨr lhjị  fourth lunar month (四月)
𗏁𗼑  L1999/L2814  ŋwə lhjị  fifth lunar month (五月)
𗤁𗼑  L3200/L2814  tśhjiw lhjị  sixth lunar month (六月)
𗒹𗼑  L4778/L2814  śjạ lhjị  seventh lunar month (七月)
𘉋𗼑  L4602/L2814  ·jar lhjị  eighth lunar month (八月)
𗢭𗼑  L3113/L2814  gjɨ̱ lhjị  ninth lunar month (九月)
𗰗𗼑  L1084/L2814  ɣạ lhjị  tenth lunar month (十月)
𗰗𘈩𗼑  L1084/L0100/L2814  ɣạ lew lhjị  eleventh lunar month (十一月)
𗎊𗼑  L4082/L2814  rejr lhjị  twelfth lunar month (臘月)

These are the month names that are used in the "common language" version of the Ode on Monthly Pleasures, but the corresponding month names given in the "ritual language" version are very different: they use a different, two-syllable word for "month" (ka ·o) and a different set of numbers.

Special Tangut Month Names
𗤒𗆧𘘞𗳝  L3305/L3457/L1846/L0863  kjiw sjiw ka ·o  "new year month" = first lunar month
𘙇𘂚𘘞𗳝  L0795/L5855/L1846/L0863  rjɨr lọ ka ·o  second lunar month
𗛰𘕻𘘞𗳝  L4344/L5565/L1846/L0863  lhejr gju ka ·o  third lunar month
𗲛𗝝𘘞𗳝  L1341/L4362/L1846/L0863  kwej ŋwər ka ·o  fourth lunar month
𗉨𗘝𘘞𗳝  L1783/L1615/L1846/L0863  tśjɨ̱r lu ka ·o  fifth lunar month
𘀐𘍼𘘞𗳝  L3849/L5081/L1846/L0863  źjiw we ka ·o  sixth lunar month
𗘎𗘋𘘞𗳝  L0332/L1347/L1846/L0863  ŋwər kạ ka ·o  seventh lunar month
𗍫𗥃𘘞𗳝  L4027/L2205/L1846/L0863  njɨ̱ ljɨr ka ·o  "two four month" = eighth lunar month
𗥃𗏁𘘞𗳝  L2205/L1999/L1846/L0863  ljɨr ŋwə ka ·o  "four five month" = ninth lunar month
𗍫𗏁𘘞𗳝  L4027/L1999/L1846/L0863  njɨ̱ ŋwə ka ·o  "two five month" = tenth lunar month
𗏁𗤁𘘞𗳝  L1999/L3200/L1846/L0863  ŋwə tśhjiw ka ·o  "five six month" = eleventh lunar month
𗌽𗎓𘘞𗳝  L0804/L4051/L1846/L0863  djɨ kjiwr ka ·o  "cold month" = twelfth lunar month

The first and last lunar month have special names ("new year month" and "cold month" respectively, written with the ordinary Tangut words for "new", "year" and "cold"), and for the 8th through 11th months combinations of two ordinary Tangut numbers are used to represent the numbers eight through eleven disyllabically ("two [times] four" for "eight"; "four [plus] five" for "nine"; "two [times] five" for "ten"; "five [plus] six" for "eleven"). The remaining months, 2nd through 7th, prefix the word "month" (ka ·o) with a special set of disyllabic numbers that are (as far as I can tell) unique to the odes. Elsewhere in the odes, cardinal numbers are sometimes written using the ordinary ("common language") Tangut number characters (as seen in the first line of the verse for the 3rd Month, given as an example above), and they are sometimes written with the special disyllabic words (e.g. lhejr gju for "three" in the 6th line of the verse for the 8th Month; and njɨ̱ ŋwə [2×5] for "ten" in the 7th line of the verse for the 6th Month).
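
The arithmetic behind the 8th through 11th month names can be spelled out in a few lines of Python. This is only an illustrative sketch: the readings and values come from the table of ordinary numbers above, while the names `ORDINARY`, `COMPOUNDS` and `combine`, and the rule "try multiplication first, then addition", are my own assumptions for the sake of the example and not anything stated in the Tangut sources.

```python
# Readings of the ordinary Tangut numbers involved (from the numbers table).
TWO, FOUR, FIVE, SIX = "njɨ̱", "ljɨr", "ŋwə", "tśhjiw"

# Hypothetical lookup table: reading -> numeric value.
ORDINARY = {TWO: 2, FOUR: 4, FIVE: 5, SIX: 6}

# (first number, second number, lunar month) for the 8th-11th month names.
COMPOUNDS = [
    (TWO, FOUR, 8),
    (FOUR, FIVE, 9),
    (TWO, FIVE, 10),
    (FIVE, SIX, 11),
]


def combine(first, second, target):
    """Say whether the two numbers reach the target by 'times' or 'plus'.

    Multiplication is tried first, then addition (an assumed ordering);
    None is returned if neither operation gives the target.
    """
    a, b = ORDINARY[first], ORDINARY[second]
    if a * b == target:
        return "times"
    if a + b == target:
        return "plus"
    return None


def explain_compounds():
    """Return (first, second, target, operation) for every compound."""
    return [(f, s, t, combine(f, s, t)) for f, s, t in COMPOUNDS]


for first, second, target, operation in explain_compounds():
    print(f"{first} {second}: {ORDINARY[first]} {operation} "
          f"{ORDINARY[second]} = {target}")
```

Each of the four compounds is accounted for by exactly one of the two operations, which is why the bracketed "times" and "plus" have to be supplied separately for each month name.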

The special numbers for two through eight used in the odes are paralleled by special ordinal numbers used (I'm not sure where, in one of the other odes perhaps) to indicate the relative seniority of sons, where the "common language" character ·jiw meaning "man" or "son" is prefixed by a single character to form a disyllabic word meaning "eldest son" through "eighth son".

Special Words for Sons
𗤙𗷸  L2645/L1448  da ·jiw  eldest son
𘂈𗷸  L5914/L1448  lọ ·jiw  second son
𗬏𗷸  L2465/L1448  rjɨj ·jiw  third son
𗓟𗷸  L4934/L1448  ŋwər ·jiw  fourth son
𘗤𗷸  L5053/L1448  tsjɨ̱r ·jiw  fifth son
𘊚𗷸  L3649/L1448  we ·jiw  sixth son
𗸨𗷸  L1423/L1448  ŋwər ·jiw  seventh son
𗸪𗷸  L1257/L1448  ·jar ·jiw  eighth son

The character da used in the word "eldest son" may be a borrowing from Chinese 大 "big", "eldest"; and the character for "eighth", 𗸪 ·jar, is phonetically identical to the ordinary Tangut word for "eight", and so can be considered to be simply "eight" written with a different character (constructed from the left hand side of 𗸩 L0384 ljịj "child" and the whole of 𘉋 ·jar "eight"). The other six characters, corresponding to "second" through "seventh" are not related to the ordinary Tangut number characters, but do correspond phonetically to one of the characters in the disyllabic words for "two" through "seven" in the names for the second through seventh months in the "ritual language".

Month Numbers and Son Numbers
Number  Month Numbers  Son Numbers
two  𘙇𘂚  L0795/L5855  rjɨr lọ  𘂈  L5914  lọ
three  𗛰𘕻  L4344/L5565  lhejr gju  𗬏  L2465  rjɨj
four  𗲛𗝝  L1341/L4362  kwej ŋwər  𗓟  L4934  ŋwər
five  𗉨𗘝  L1783/L1615  tśjɨ̱r lu  𘗤  L5053  tsjɨ̱r
six  𘀐𘍼  L3849/L5081  źjiw we  𘊚  L3649  we
seven  𗘎𗘋  L0332/L1347  ŋwər kạ  𗸨  L1423  ŋwər
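
The two columns of this table can be compared mechanically. The following Python sketch is only an illustration: the readings are copied from the table, while the names `PAIRS`, `exact_match_position` and `compare_all` are hypothetical. It tests only for exact identity of the reconstructed readings, so a merely "similar" pair is reported as no match.

```python
import unicodedata


def nfc(text):
    """Normalise to NFC so that visually identical readings compare equal."""
    return unicodedata.normalize("NFC", text)


# number -> ((first syllable, second syllable) of month number, son number)
PAIRS = {
    "two": (("rjɨr", "lọ"), "lọ"),
    "three": (("lhejr", "gju"), "rjɨj"),
    "four": (("kwej", "ŋwər"), "ŋwər"),
    "five": (("tśjɨ̱r", "lu"), "tsjɨ̱r"),
    "six": (("źjiw", "we"), "we"),
    "seven": (("ŋwər", "kạ"), "ŋwər"),
}


def exact_match_position(month_syllables, son_syllable):
    """Return 1 or 2 if the son number is identical to that syllable of
    the month number, or None if neither syllable is an exact match."""
    for position, syllable in enumerate(month_syllables, start=1):
        if nfc(syllable) == nfc(son_syllable):
            return position
    return None


def compare_all():
    """Map each number word to the position of its exact match (or None)."""
    return {
        number: exact_match_position(month, son)
        for number, (month, son) in PAIRS.items()
    }


for number, position in compare_all().items():
    print(number, "->", position if position else "no exact match")
```

With the readings as given, "two", "four" and "six" match the second syllable exactly and "seven" matches the first, while "three" and "five" have no exact match and need a human judgement of phonetic similarity.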

The son numbers for "two" (lọ), "four" (ŋwər) and "six" (we) are phonetically identical to the second character of the corresponding month names; whereas the son numbers for "three" (rjɨj), "five" (tsjɨ̱r) and "seven" (ŋwər) are phonetically identical or similar to the first character of the corresponding month names. This correspondence cannot be a coincidence, and the only reasonable explanation is that the son numbers two through seven are cognate to the month numbers two through seven, or indeed that in four out of six cases they are essentially just different ways of writing the same number. The characters for the son numbers two through seven are bound characters, and do not appear to occur other than modifying the word ·jiw "son"; but five of the six corresponding characters from the month numbers do occur by themselves with their numeric meaning in combination with ordinary ("common language") words in various Tangut texts:

Even in the Odes the shortened monosyllable form of the month number is sometimes used instead of the disyllabic word, for example, in the 4th line of the verse for the 12th Month the character 𗛰 lhejr is used by itself to mean "three". On the other hand, the disyllabic month numbers do not appear to occur outside the odes, and would seem to be expansions of the simple monosyllabic numbers. Quite why or how the single syllable numbers have been expanded to two syllables is unclear; except for the first character of the disyllabic word for "two" (L0795 rjɨr), all the other halves of the disyllabic numbers (L5565 gju, L1341 kwej, L1615 lu, L3849 źjiw, L1347 kạ) are bound characters and do not seem to occur by themselves or in combination with any other character anywhere else. The Wen Hai dictionary defines these five characters as their corresponding number when used as a disyllabic collocation, and their character construction also implies that they are numbers, but it does not seem likely that they were originally monosyllabic numbers, so I will not discuss them further.

This leaves us with six monosyllabic words for the numbers two through seven that can be used instead of the normal Tangut numbers :

These alternative numbers are used both in "common language" and "ritual language" texts, and do not seem to me to have any particular connection with a pre-Buddhist ritual language that Kepping has posited; indeed, some of the words that incorporate these numbers, such as Tripitaka and Wutai Shan, are overtly Buddhist. But why are these alternative numbers used sometimes? The mountain name "Five Platforms Mountain" (Wutai Shan) is written with the ordinary Tangut words for "platform" and "mountain", so why is "five" alone written with the non-standard word for "five"? What, then, are the origins of this series of numbers?

At the present time I don't have an answer to these questions, but would note that I have been unable to find cognates for these numbers in any other language, or find any language with identical or very similar words for four and seven that are phonetically similar to ŋwər.


Addendum A

Coincidentally, at the same time that I was writing this post, Marc Miyake has been discussing the different words for "two" on his blog :

Addendum B

On 25 April 2013 a Day of Tangut Studies was held at SOAS, University of London, and I presented a talk entitled "The Ode on Monthly Pleasures—a new interpretation" which elaborates on the theme of this blog post. A PDF of my presentation may be downloaded from academia.

Last modified: 2017-01-01 (updated with Unicode Tangut characters)

If Tangut characters do not display correctly, please download and install the Tangut Yinchuan font.
