Jörg Rhiemeier's Conlang Page
The Caucasus mountain range (fig. 1) is a linguistic wonderland, with about 50 different languages in an area the size of France. (It is definitely not the kind of place I want to live, though, given all the political trouble and economic problems; field linguists working in this area require the courage and caution of a war reporter.) There are almost as many languages in the Caucasus as there are in the rest of Europe. Some of these languages belong to families found elsewhere. The Indo-European family is represented by Armenian, Ossetian, Talysh and Tati (the last three are Iranian languages), and of course the modern lingua franca of the area, Russian. The Azeri, Balkar, Karachay, Kumyk and Nogay languages belong to the Turkic family, and Aisor, a descendant of Aramaic, is a Semitic language.
Most languages of the Caucasus, however, belong to three families not found elsewhere. These (about 40) languages are the ones meant when linguists speak of Caucasian languages (as is done in this essay). They are sometimes also called Palaeo-Caucasian, Old Caucasian or, especially in eastern Europe, Ibero-Caucasian (the 'Ibero-' in the latter name has nothing to do with the Iberian peninsula, but refers to the ancient kingdom of Iberia in what is now Georgia; the term is deprecated because it is so misleading).
Fig. 1. Ethnolinguistic groups in the Caucasus region. (Source: CIA)
This classification follows Catford (1977); the numbers of speakers are taken from the 1970 Soviet census. Some languages will have more speakers today, some fewer. Several languages also have speakers outside the former USSR.
Fig. 1a. Distribution of the Caucasian languages.
One ought to keep in mind that these three families are distinct families and not branches of a single "Caucasian" family. No relationship between any two of the three families has been established so far. Nor has any of them been proven to be related to any family elsewhere in the world. Many authors have raised such claims, but most of them (especially those who claim that "Caucasian" was related to Basque or whatever, without realizing that there are three different families) can be dismissed out of hand. Any serious relationship hypothesis either involves only one of the three families, or if it involves more than one, all involved families have to be considered separately.
One of the first scholars to propose a relationship between a Caucasian family and a non-Caucasian family was Franz Bopp, one of the founding fathers of Indo-European comparative linguistics, who assumed in 1847 that South Caucasian (SC) was related to Indo-European. The idea of such a relationship is also entertained by the proponents of the 'Nostratic' hypothesis, who group SC with Indo-European, Uralic, Altaic, Afro-Asiatic and Dravidian. Some linguists also assume that the Hurrian and Urartean languages (two related ancient languages of eastern Anatolia) are related to Northeast Caucasian (NEC). None of these proposals has won much acceptance.
Even though the three Caucasian families are not known to be related to each other, there are some similarities between them: they form a Sprachbund or convergence area. However, they are also very different from each other in many respects.
The Caucasian languages are famed for their rich consonant inventories. All of them have labial, dental, velar and uvular stops as well as alveolar, postalveolar and uvular fricatives. There are voiceless, voiced and glottalized (ejective) stops.
The actual number of consonants varies from language to language. Georgian has the smallest consonant inventory (fig. 2); the other languages add more consonants to it. The Northwest Caucasian (NWC) languages have labialized and palatalized consonants as well as lateral fricatives; the latter are also found in many NEC languages. The most copious inventory is that of Ubykh (fig. 3), with 80 consonants.
Fig. 2. Consonant inventory of Georgian.
The SC languages are also famed for their formidable consonant clusters; forms such as Georgian brts'q'inva 'to shine' are nothing unusual. The NWC languages are also rich in consonant clusters, though not to the same extent as SC, and the NEC languages have few clusters.
|Voiceless stops||p pʕ||t tw||[k] kj kw||q qʕ qj qw qʕw|
|Voiced stops||b bʕ||d dw||[g] gj gw|
|Glottalized stops||p' pʕ'||t' tw'||[k'] kj' kw'||q' qʕ' qj' qw' qʕw'|
|Voiceless affricates||ts||tɕ tɕw||tʃ||tʂ|
|Glottalized affricates||ts'||tɕ' tɕw'||tʃ'||tʂ'|
|Voiceless fricatives||f||s||ɕ ɕw||ʃ ʃw||ʂ||x||χ χʕ χj χw χʕw||h|
|Voiced fricatives||v||z||ʑ ʑw||ʒ ʒw||ʐ||γ||ʁ ʁʕ ʁj ʁw ʁʕw|
|Laterals||l ɬ ɬ'|
Fig. 3. Consonant inventory of Ubykh.
The Caucasian languages also have the reputation of being poor in vowels. This, however, is undeserved. Georgian has an ordinary five-vowel system /a e i o u/; some languages, such as Chechen, have even more. Only the NWC languages can be said to have just two vowel phonemes, a high one and a low one. However, these vowel phonemes have broad ranges of allophones, depending on the neighbouring consonants, which are spread across the whole vowel space.
The only Caucasian language with a long-standing tradition of writing is Georgian, which is written in its own beautiful script known as Mχedruli (fig. 4; there are also other variants of this alphabet that look quite different and are little used today; the term Mχedruli refers specifically to the one chiefly used in modern Georgia). It has been a literary language since about 400 A.D.; the Georgian alphabet is specifically designed for the Georgian language and the spelling is mostly phonemic.
The other Caucasian languages had no official status in Czarist Russia, and were only occasionally written (using Georgian, Arabic or Cyrillic letters) or, in most cases, not at all. This changed after the revolution, when, at least nominally, all languages of the USSR were granted equal rights (though in practice, Russian remained the primus inter pares). Several Caucasian languages were developed as written languages taught in schools, first using the Latin alphabet but later the Cyrillic alphabet. Some languages gained official status in various autonomous republics and provinces set up in the Caucasus area by the Soviet government.
Fig. 4. The Georgian alphabet. (Source: Omniglot)
All Caucasian languages are rich in inflection, but the inflectional systems in the three families are very different from each other. The NWC languages have only small numbers of noun cases (1 to 4), while many NEC languages have very many (40 or more, most of them bimorphemic local cases). The SC languages are somewhere in the middle, with case inventories similar to those found in the older Indo-European languages. Almost all Caucasian languages are ergative at least to some degree. Most NEC languages have a system of noun classes.
The verb is always inflected. The NWC languages have especially complex verb morphology. The verb agrees with subject and object in person and number; there are also many tenses, moods and other inflections. The NEC and SC verbs also show complex morphology; in most NEC languages, verbs show noun class agreement.
As already mentioned, the ergative construction is typical of Caucasian languages: the subject of an intransitive verb is treated like the object of a transitive verb. The word order is SOV in most languages, and most Caucasian languages use postpositions rather than prepositions.
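The difference between ergative and accusative case marking can be sketched in a few lines of Python. This is only an illustration of the general principle stated above; the function name `core_cases` and the case labels ERG, ABS, NOM and ACC are conventional shorthand chosen here, not forms from any particular Caucasian language.

```python
def core_cases(alignment, transitive):
    """Return case labels for the core arguments of a clause.

    Illustrative only: 'alignment' is either "ergative" or "accusative",
    and the labels are generic abbreviations, not actual case endings.
    """
    if alignment == "ergative":
        # Intransitive subject patterns with the transitive object (ABS).
        if transitive:
            return {"subject": "ERG", "object": "ABS"}
        return {"subject": "ABS"}
    if alignment == "accusative":
        # Intransitive subject patterns with the transitive subject (NOM).
        if transitive:
            return {"subject": "NOM", "object": "ACC"}
        return {"subject": "NOM"}
    raise ValueError(f"unknown alignment: {alignment}")


print(core_cases("ergative", transitive=False))
print(core_cases("ergative", transitive=True))
```

In the ergative pattern the intransitive subject and the transitive object share a label, whereas in the accusative pattern the two kinds of subject do.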
The NWC family consists of five languages, one of which is now extinct. Abkhaz is spoken in Abkhazia, which nominally forms the northwestern tip of the Republic of Georgia, but is actually under Russian control. Abaza is closely related to Abkhaz. It is spoken in the Karachay-Cherkess Republic of Russia. Ubykh was formerly spoken near Sochi, Russia. It went extinct in 1992, when its last native speaker, Tevfik Esenç, died. Adyghe is spoken in the Republic of Adyghea, Russia; its close relative Kabardian in the Kabardino-Balkar and Karachay-Cherkess Republics, Russia. When Russia conquered the Caucasus in 1864, the majority of NWC speakers (including all Ubykhs) emigrated to Turkey, where some NWC-speaking communities still exist today.
The NWC languages are known among linguists for their large consonant inventories, ranging from 47 in Kabardian (fig. 5) to 80 in Ubykh (fig. 3).
|Voiceless stops||p||t||kw||q qw|
|Glottalized stops||p'||t'||kw'||q' qw'||ʔ ʔw|
|Voiceless fricatives||f||s||ɕ||ʃ||ɬ||x xw||χ χw||ħ|
|Voiced fricatives||v||z||ʑ||ʒ||ɮ||γ||ʁ ʁw|
Fig. 5. Consonant inventory of Kabardian.
The vowel inventories are small. The NWC languages are often described as having only two vowel phonemes, a high one and a low one. These two phonemes, however, have different allophones depending on the adjacent consonants (fronted next to palatalized consonants, rounded next to labialized consonants). Thus, the high vowel can be realized as [i], [ɨ], [u]; the low vowel as [ɛ], [a], [ɔ]. This means that on the phonetic level, the languages show rather normal vowel inventories. It is likely that Proto-NWC had a normal vowel inventory, but transferred the features of fronting and rounding from the vowels to the neighbouring consonants.
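The conditioning described above can be written out as a small lookup table. The following Python sketch is a simplification under stated assumptions: it treats the neighbouring consonant as belonging to one of three types, and it assumes that [ɨ] and [a] are the realizations next to plain consonants, which the text implies but does not say outright. The names `ALLOPHONES` and `realize` are made up for this illustration.

```python
# Two vowel phonemes ("high", "low") and their allophones by consonant type.
# Assumption: the central allophones occur next to plain consonants.
ALLOPHONES = {
    "high": {"palatalized": "i", "plain": "ɨ", "labialized": "u"},
    "low": {"palatalized": "ɛ", "plain": "a", "labialized": "ɔ"},
}


def realize(vowel, neighbour="plain"):
    """Return the phonetic realization of a vowel phoneme.

    'vowel' is "high" or "low"; 'neighbour' is the type of the adjacent
    consonant: "plain", "palatalized" or "labialized".
    """
    return ALLOPHONES[vowel][neighbour]


for height in ("high", "low"):
    print(height, [realize(height, c) for c in ("palatalized", "plain", "labialized")])
```

Two phonemes thus yield six surface vowels, which is why the phonetic inventory looks quite ordinary.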
Stress is distinctive in the NWC languages, e.g. Abkhaz 'aχwaʂa 'unfortunate', a'χwaʂa 'Friday', aχwa'ʂa 'piece'. The languages have mostly open syllables and rather complex morphophonemic alternations.
Nouns in NWC are inflected for case, number and definiteness. The case systems are rather simple. Abkhaz has only an unmarked absolutive and an adverbial case. Adyghe has a four-case system (fig. 6). Possession is indicated by pronominal prefixes.
Fig. 6. Nominal inflection in Temirgoi Adyghe.
Personal pronouns distinguish gender in the 2nd and 3rd persons (fig. 7). Only Abkhaz and Abaza have true 3rd person pronouns; Ubykh and the Circassian languages use demonstratives instead. Demonstratives in Ubykh distinguish 'this' and 'that'; in the other languages, 'this (near me)', 'that (near you)' and 'that (yonder)'.
Fig. 7. Personal pronouns in Abkhaz.
Verbs show very complex morphology in the NWC languages. They are conjugated for subject, direct object and indirect object. There are three sets of conjugation prefixes: set I for intransitive subjects and transitive objects (all NWC languages are ergative), set II for indirect objects, set III for transitive subjects (fig. 8).
|Person||Sg. set I||Sg. set II||Sg. set III||Pl. set I||Pl. set II||Pl. set III|
|1st||s-||s-||s- (/z-)||ħ-||ħ-||ħ- (/a:-)|
|2nd masc.||w-||w-||w-||ʃw-||ʃw-||ʃw- (/ʒw-)|
|2nd fem.||b-||b-||b-||ʃw-||ʃw-||ʃw- (/ʒw-)|
|3rd masc.||d-||j-||j-||j-||r- (/d-)||r- (/d-)|
|3rd fem.||d-||l-||l-||j-||r- (/d-)||r- (/d-)|
|3rd non-human||j-/Ø-||a-/Ø-||(n)a-||j-||r- (/d-)||r- (/d-)|
Fig. 8. Verb agreement prefixes in Abkhaz.
The NWC languages have complex tense systems. Abkhaz, for instance, has present, imperfect, two futures, two conditionals, aorist, past indefinite, perfect and pluperfect (fig. 9).
|Future I||-p'||Conditional I||-rɘn|
|Future II||-ʂt'||Conditional II||-ʂan|
Fig. 9. Tense suffixes in Abkhaz.
The NWC languages have SOV word order and are postpositional. Genitives precede the nouns, but most adjectives follow. Subordinate clauses precede the main verb; the verb in the subordinate clause appears in a non-finite form called a converb (similar to Turkish or Japanese).
Because NWC languages are ergative, intransitive and transitive clauses are constructed differently, and different verb inflections are required depending on whether an object is present or absent, as in the following examples from Abkhaz:
'The dog bites.'
a-'la a-tsgwə-'kwa jə-'rə-tsħa-wa-n
the-dog the-cat-PL it(ERG)-them(ABS)-bite-DYN-IMPF
'The dog was biting the cats.'
The NWC languages do not use finite subordinate clauses; instead, non-finite verb forms (participles and converbs) are used.
An example of an Ubykh relative clause:
'the man for whom I give you an apple'
An example of an Abkhaz complement clause:
'It is a lie that he spoke thus.'
The NEC family is deeply divided into two branches: Nakh and Daghestanian. Nakh and Daghestanian languages are different enough in their vocabularies that some linguists doubt their relationship, but that is a minority position; anyway, the languages are similar enough to be treated together (there are some typological differences, though). The Nakh group consists of three languages: Chechen, spoken mainly in the Chechen Republic (Russia), Ingush in the neighbouring Ingush Republic (Russia), and Bats, spoken in the Pankishi valley in northeastern Georgia. The Daghestanian group consists of 26 languages, most of which are spoken in the Republic of Daghestan (Russia). The most important Daghestanian languages are Avar and Lezgian. Some Daghestanian languages are only spoken in a single village.
No external relationship of the NEC languages has been established so far. Some linguists have proposed a relationship of NEC with two interrelated extinct ancient languages of eastern Anatolia, Hurrian and Urartean. Another relationship candidate is Burushaski, an isolate in northeastern Pakistan, which is typologically similar to the Daghestanian languages. However, typological similarity is not sufficient to establish relationship.
The NEC languages have rich consonant inventories, though less so than the NWC languages. As in other Caucasian languages, there are ejective and uvular consonants as well as two or more series of sibilants. Many NEC languages have lateral fricatives as well as 'intensive' (tense or geminate) consonants. The inventory of Avar (fig. 10) is given below.
|Voiceless stops||p||t||k k:||q:|
|Glottalic stops||(p')||t'||k' k':||q':||ʔ|
|Voiceless affricates||ts ts:||tʃ tʃ:||tɬ:|
|Glottalic affricates||ts' ts':||tʃ' tʃ':||tɬ':|
|Voiceless fricatives||s s:||ʃ ʃ:||ɬ ɬ:||x:||χ χ:||ħ||h|
Fig. 10. Consonant inventory of Avar.
Some languages also have pharyngealized, labialized and palatalized consonants. The vowel systems vary considerably. Avar has a e i o u; some languages have front rounded vowels, back unrounded vowels or both. Khinalug and the Nakh languages also have diphthongs. Some Daghestanian languages are tonal.
Nouns in NEC languages are grouped into noun classes. These are usually not marked on the noun itself (with exceptions such as Avar w-ats: 'brother', j-ats: 'sister'), but adjectives and verbs agree with them. The number of noun classes varies from language to language. Most languages have three (male human, female human, non-human) or four classes. The Nakh languages have more; Tabasaran has only two (human vs. non-human); Lezgian, Agul and Udi have no noun class distinction at all. There is thus a cline running from the northwest (many classes) to the southeast (few classes) within the family.
|Demonstrative||Adjective||Noun||Translation|
|bəd||Ø-iʔeru||oʒe||'this small boy'|
|bodu||j-iʔeru||kid||'this small girl'|
|bəd||j-iʔeru||tselu||'this small drum'|
|bodu||b-iʔeru||wə||'this small dog'|
|bəd||r-iʔeru||tʃ'it'||'this small knife'|
Fig. 11. Noun class agreement in Hunzib NPs.
Fig. 11a. Noun class inventory sizes in NEC.
The nouns themselves inflect for case and number. Usually, the base stem is also the absolutive singular. From this stem, the oblique singular stem is formed with a suffix, the absolutive plural with a different suffix, and the oblique plural stem from the absolutive plural. Other patterns also occur.
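The usual stem-formation pattern just described can be sketched as follows. The suffixes `-OBL` and `-PL` are abstract placeholders, not morphemes of any NEC language, and the function name `build_stems` is made up for this illustration; it is also an assumption that the oblique plural adds a further suffix to the absolutive plural.

```python
def build_stems(base, obl_sg="-OBL", pl="-PL", obl_pl="-OBL"):
    """Derive the four stems of a noun from its base stem.

    All suffix values are hypothetical placeholders for illustration.
    """
    absolutive_sg = base                 # the base stem is the absolutive singular
    oblique_sg = base + obl_sg           # oblique singular: base + suffix
    absolutive_pl = base + pl            # absolutive plural: base + another suffix
    oblique_pl = absolutive_pl + obl_pl  # oblique plural: built on the absolutive plural
    return {
        "absolutive_singular": absolutive_sg,
        "oblique_singular": oblique_sg,
        "absolutive_plural": absolutive_pl,
        "oblique_plural": oblique_pl,
    }


for name, stem in build_stems("N").items():
    print(f"{name}: {stem}")
```

Case endings would then attach to the oblique stems, while the absolutive forms stand on their own.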
The Nakh languages have case inventories comparable to those of SC and the older Indo-European languages. Most Daghestanian languages, in contrast, have very rich inventories of local cases. These cases are formed by the combination of suffixes indicating orientation and direction, and in some languages, also a non-distal/distal opposition (fig. 12).
Fig. 12. Local cases in Tsez.
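The combinatorial nature of such local case systems explains how the inventories grow so large. The Python sketch below multiplies out three small sets of categories. The category labels in `ORIENTATIONS`, `DIRECTIONS` and `DISTANCE` are generic placeholders chosen for this illustration; they are not the actual Tsez inventory.

```python
from itertools import product

# Hypothetical category sets, for illustration only.
ORIENTATIONS = ["IN", "ON", "UNDER", "AT"]
DIRECTIONS = ["essive", "lative", "ablative"]
DISTANCE = ["non-distal", "distal"]


def local_cases(orientations, directions, distance):
    """Return every orientation + direction + distance combination."""
    return [
        f"{o}-{d} ({dist})"
        for o, d, dist in product(orientations, directions, distance)
    ]


cases = local_cases(ORIENTATIONS, DIRECTIONS, DISTANCE)
print(len(cases))  # 4 * 3 * 2 = 24
print(cases[:3])
```

Even these modest placeholder sets yield 24 distinct local cases, each expressed by a sequence of short suffixes rather than by an unanalysable ending.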
Verb morphology in NEC languages is simpler than in the NWC languages. Verbs usually inflect for tense, aspect and mood, and in many languages, take class prefixes agreeing with the noun class of the absolutive argument. Some of the more southerly languages have person-number markers on the verb.
The NEC languages allow for much freedom in word order, due to the extensive case marking; the basic, unmarked order is SOV. Adjectives and genitives precede the noun, postpositions follow it. Objects are focused by OVS order, subjects by OSV order, as in the following Archi examples:
hunter-ERG bear.ABS kill:III.AOR
'The hunter killed a bear.'
bear.ABS kill:III.AOR hunter-ERG
'The hunter killed a bear.'
bear.ABS hunter-ERG kill:III.AOR
'The hunter killed a bear.'
Relative clauses are formed with participles in most NEC languages, e.g. in Godoberi:
wac̄u-di b.axi-bu hamaxi
brother-ERG III-buy.PST-PART donkey
'the donkey which my brother bought'
The South Caucasian (Kartvelian) family consists of four languages. Georgian is the largest Caucasian language. It is the official language of Georgia and the only Caucasian language with a long-standing literary tradition, written in an alphabet of its own (fig. 4). Mingrelian (also spelled Megrelian), spoken in northwestern Georgia, is closely related to Laz, which is spoken mainly along the Black Sea in Turkey near the Georgian border, and also in a small area in the southwestern corner of Georgia. Svan in northern Georgia is the most distantly related member of the family.
The Kartvelian languages are the least consonant-rich of the Caucasian languages. The consonant inventory of Georgian is shown in fig. 2; the other languages of the family have similar inventories. These consonants form extensive clusters. Georgian has the five vowels a e i o u. Mingrelian adds schwa, Laz front rounded vowels, Svan both. Vowel gradations similar to Indo-European ablaut occur in the SC languages; Svan also has i-umlaut.
The South Caucasian languages, like all Caucasian languages, are rich in inflectional morphology. They are intermediate between the NWC languages with their rich verbal and less rich nominal morphology, and the NEC languages with their simpler verbs and more richly inflected nouns. Of the three Caucasian families, South Caucasian is the one most similar to Indo-European.
Nouns in SC languages are inflected for case and number. There are no genders or noun classes. The numbers are singular and plural; the case systems are more developed than in NWC, but not as luxuriant as in the Daghestanian languages. They are indeed comparable to the systems of the older Indo-European languages (fig. 13).
|Singular||Plural (old)||Plural (modern)|
Fig. 13. Nominal inflection in Georgian (kali 'woman').
The case alignment in Georgian is split. In the present tense, the case marking is accusative, with the dative case used as an accusative. In the aorist, it is split-S. Direct objects of transitive verbs and subjects of stative verbs are in the nominative, while subjects of transitive and of active intransitive verbs are in the ergative.
The Kartvelian verb is a complex affair, and this section will only touch on the basics. The verb is inflected for tense, aspect and mood as well as the person and number of subject and object. There are numerous prefixes, suffixes and circumfixes.
|Person||Subject sg.||Subject pl.||Object sg.||Object pl.|
|1st person||v-||v- -t||m-||gv-|
|2nd person||Ø/χ-||Ø/χ- -t||g-||g- -t|
|3rd person||-a/o/s||-n||Ø-||Ø- -t|
Fig. 14. Personal affixes in Georgian.
The tense/aspect/mood forms ('screeves', from Georgian mts'k'rivi 'row') of Georgian are grouped in three series: the present series, the aorist series and the perfect series.
The three series show different morphosyntactic alignment. In the present series, it is accusative, using the nominative case to mark subjects and the dative case to mark direct objects. In the aorist series, it is active/stative, using the ergative to mark agents and the nominative case to mark patients. In the perfect series, it is also active/stative, but agents appear in the dative case. An exception to this are verbs of perception and emotion, which always have dative subjects and nominative objects.
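The series-dependent case marking can be summarized as a lookup, sketched here in Python. This is a simplification: it covers only the case of the subject (or agent) and the direct object (or patient), it ignores the difference between active and stative intransitives, and it assumes that patients in the perfect series stand in the nominative, which the description above implies but does not state. The names `SERIES_CASES` and `georgian_cases` are made up for this illustration.

```python
# Case of (subject/agent, direct object/patient) per screeve series.
# Assumption: perfect-series patients are nominative.
SERIES_CASES = {
    "present": ("NOM", "DAT"),
    "aorist": ("ERG", "NOM"),
    "perfect": ("DAT", "NOM"),
}


def georgian_cases(series, perception_or_emotion=False):
    """Return (subject_case, object_case) for a transitive clause.

    Verbs of perception and emotion take dative subjects and nominative
    objects regardless of the series.
    """
    if perception_or_emotion:
        return ("DAT", "NOM")
    return SERIES_CASES[series]


for series in ("present", "aorist", "perfect"):
    print(series, georgian_cases(series))
```

The lookup makes the oddity visible at a glance: the nominative marks the subject in one series and the object in the other two.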
Of the three Caucasian stocks, the Kartvelian languages are syntactically most like the Indo-European languages. There are, for instance, relative pronouns, a feature that is common in Europe but rare elsewhere:
is ena romel-zeda-c daiγiγina
that tongue which-on-REL he.hummed Rustaveli-ERG
'the language in which Rustaveli hummed'
Similarly, complement clauses have a fairly European look to them:
naxa rom γvino aγar iq'o
3SG.saw.it that wine not.anymore it.was
'(S)he saw that there was no wine anymore.'
Modern Georgian is a mostly head-final SOV language with postpositions. Old Georgian is less consistently head-final: modifiers often follow their heads. Old Georgian genitives show suffixaufnahme, i.e. they agree with the head noun in case and number. Example:
q'ovel-i igi sisχl-u saχl-isa-j m-is
all-NOM that blood-NOM house-GEN-NOM that-GEN Saul-GEN-GEN-NOM
'all the blood of the house of Saul' (2 Kings 16:8)
This construction is no longer productive in Modern Georgian.
Boeder, W. 2005. The South Caucasian languages. Lingua 115:5-89.
Catford, J.C. 1977. Mountain of Tongues: The Languages of the Caucasus. Annual Review of Anthropology 6:283-314.
Deeters, G. 1963. Die kaukasischen Sprachen. Handbuch der Orientalistik 1.7:1-79.
Hewitt, G. 2005. North West Caucasian. Lingua 115:91-145.
Klimov, G. 1994. Einführung in die kaukasische Sprachwissenschaft. Hamburg: Buske.
van den Berg, H. 2005. The East Caucasian language family. Lingua 115:147-190.
© 2007-2010 Jörg
Last update: 2010-10-10