The Celtic languages

The Celtic languages form a branch of the great Indo-European language family. In antiquity, Celtic languages were spoken across large parts of western and central Europe. Today, they are spoken only in small areas of northwestern Europe, by fewer than 2 million people altogether. This makes Celtic the smallest of the surviving branches of Indo-European in terms of speakers.

It should be noted that the term "Celtic" for this language family is a fairly recent invention. It was coined in the early 18th century by the Welsh scholar Edward Lhuyd, who first recognized that the non-English languages of the British Isles were related to each other and to the extinct Gaulish language. It is derived from the term Keltoi, which Greek geographers used for the people of Gaul. Before Lhuyd, there was no native tradition in the British Isles that linked Brythonic to Goidelic or either of the two to Gaul, and none of the Insular Celtic peoples referred to themselves as "Celts". Nevertheless, the relationship of the Celtic languages to each other and the validity of the Celtic node in the Indo-European family tree have been established beyond doubt.

There are lots of myths surrounding the Celts. Here are some in roughly increasing order of absurdity:

All of that is utterly unsubstantiated. There is nothing "special" about the Celts and their languages in any "esoteric" way. I am not going to discuss these ideas here; they have been refuted elsewhere. But even without such embellishments, the Celtic languages are interesting and a popular source of inspiration for conlangers, and that is what this essay is about.


Two different, intersecting schemes are used to classify the Celtic languages. One scheme distinguishes between Q-Celtic and P-Celtic. The criterion is the development of the Proto-Celtic consonant */kw/, which shifted to /p/ in P-Celtic but not in Q-Celtic (e.g., Welsh ["P"] pedwar '4' vs. Irish ["Q"] ceathair, both from Proto-Celtic *kwetwār). The other scheme draws the line between Continental Celtic and Insular Celtic. All modern Celtic languages belong to the Insular group (even Breton; the Bretons are, at least linguistically, not descendants of Gauls but of Britons who left Britain around 500 AD). A combination of both classification schemes gives the following chart (fig. 1):

             Q-Celtic                P-Celtic
Insular      Goidelic:               Brythonic:
             Irish (250,000)         Welsh (700,000)
             Manx (see below)        Cumbric (extinct)
             Scots Gaelic (60,000)   Cornish (see below)
                                     Breton (500,000)
                                     ?Pictish (extinct)
Continental  Hispano-Celtic:         Gaulish group:
             (all extinct)           (all extinct)

Fig. 1. The four branches of Celtic.
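
For readers who think in data structures, the chart can be read as the cross product of two independent features. The following is only a minimal sketch: the names BRANCHES, LANGUAGES, branch_of and chart are invented for this illustration, and Pictish is left out because its classification is uncertain.

```python
from collections import defaultdict

# Two independent features, as in Fig. 1: geography (Insular/Continental)
# and the reflex of Proto-Celtic */kw/ ("Q" = retained, "P" = shifted to /p/).
# Each combination corresponds to one of the four branches named in the essay.
BRANCHES = {
    ("Insular", "Q"): "Goidelic",
    ("Insular", "P"): "Brythonic",
    ("Continental", "Q"): "Hispano-Celtic",
    ("Continental", "P"): "Gaulish group",
}

# Only the individually named languages of Fig. 1 are listed here.
# Pictish is omitted on purpose: its position is marked as doubtful.
LANGUAGES = {
    "Irish": ("Insular", "Q"),
    "Manx": ("Insular", "Q"),
    "Scots Gaelic": ("Insular", "Q"),
    "Welsh": ("Insular", "P"),
    "Cumbric": ("Insular", "P"),
    "Cornish": ("Insular", "P"),
    "Breton": ("Insular", "P"),
}


def branch_of(language: str) -> str:
    """Return the branch that a language's two features select."""
    return BRANCHES[LANGUAGES[language]]


def chart() -> dict:
    """Group the languages by feature pair, i.e. rebuild the cells of the chart."""
    cells = defaultdict(list)
    for name, features in LANGUAGES.items():
        cells[features].append(name)
    return dict(cells)


print(branch_of("Breton"))        # Brythonic
print(chart()[("Insular", "Q")])  # ['Irish', 'Manx', 'Scots Gaelic']
```

Note that the sketch only encodes the two labelling schemes; it says nothing about whether any of the four cells is a valid node in a family tree.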

[Map of ancient Celtic languages]

Fig. 1a. Distribution of Celtic languages ca. 200 BC.

[Map of modern Celtic languages]

Fig. 1b. Distribution of Celtic languages today.

The figures given above are from Wikipedia; other sources give different figures. All the Continental Celtic languages died out in antiquity, and few writings have survived. The Celtic membership of Lusitanian and Tartessian (both languages were spoken on the Iberian peninsula in ancient times) is disputed: while Lusitanian is certainly an Indo-European language, it is not certain whether it is Celtic; in the case of Tartessian, a plausible-looking reading of the extant inscriptions as Celtic has been proposed (Koch 2009), but some scholars doubt that it is even Indo-European.

Only four of the Insular Celtic languages survived into our century as native languages. Cumbric disappeared about 1000 years ago. Cornish died out in the late 18th century, Manx in the 20th. There are revival movements for the latter two languages, with several thousand practitioners each. Another language that may have been Brythonic is Pictish, but very little is known about it. Some scholars have argued that it was not Indo-European at all, but few maintain that position today.


The general history of the Celtic languages is fairly well known. The ancient Celts did not write much, but they left us several hundred inscriptions and thousands of geographical names. The Celtic languages clearly are a branch of the Indo-European stock; the branch belongs to the so-called kentum group of Indo-European, in which the Proto-Indo-European palatovelars were not assibilated (other kentum branches are Italic, Germanic, Greek and, at the opposite end of the Indo-European world, Tocharian). Proto-Celtic was probably spoken in southern Central Europe at the beginning of the Iron Age (ca. 800-600 BC). It has been tentatively associated with the Hallstatt culture of that time; the later, more sophisticated La Tène culture was certainly Celtic as well. By the year 200 BC, most of Western and Central Europe was Celtic-speaking (fig. 1a). Only a few centuries later, however, the spread of Latin in the Roman Empire led to the demise of Continental Celtic, and by 500 AD, only the Insular Celtic languages had survived. Today, the Celtic languages are spoken only in small areas on the northwestern fringe of Europe (fig. 1b).

Some linguists assume that the extinct Continental Celtic languages exerted a substratum influence on the Romance languages (especially French) by influencing how the people in the Roman provinces adopted Latin. Features famously attributed to this influence include the shift of Vulgar Latin /u/ to /y/ in French, and liaison in the same language. However, the notion that liaison is connected to the Insular Celtic initial mutations via a common Celtic tendency to run words together is now discredited. If liaison had existed in the Vulgar Latin of Gaul, sound changes the language underwent, such as the lenition of intervocalic stops, would have produced initial mutations in French similar to the Insular Celtic ones, and that did not happen. Also, there is nothing specifically "Celtic" about the /u/ > /y/ shift. All that can be safely ascribed to Celtic are several hundred loanwords in French and other Romance languages (such as Fr. charrue 'plough' from Celtic *karrūkā).

It is difficult to draw up a family tree of the Celtic languages. The "Q-Celtic" and "Continental Celtic" groups are paraphyletic, i.e. the corresponding other group ("P-Celtic" and "Insular Celtic", respectively) evolved from their middle. But neither the "P-Celtic" nor the "Insular Celtic" group is necessarily a valid node in the tree. It is controversial whether Brythonic, which is both "Insular" and "P-Celtic", is more closely related to Goidelic or to Gaulish. Most likely, the family tree model is not very helpful at all here, and the ancient Celtic world is better understood as a dialect continuum within which innovation spread wave-like, resulting in the intersecting isoglosses we see in the known Celtic languages.

It may well be that the "P-Celtic" innovation, the shift of Proto-Celtic */kw/ to /p/ closing the gap left by the loss of */p/ in Proto-Celtic, spread through the ancient Celtic world together with the La Tène culture as a shibboleth of a more sophisticated lifestyle, failing to reach the outliers in Spain and Ireland. Some scholars also consider it possible that the change simply happened independently in Gaul and Britain; after all, there was a gap to close in the Celtic stop system, and a similar change affected a part of the Italic language family as well, showing that it could happen more than once.

The innovations common to the Insular Celtic languages are also not necessarily indicative of a closer relationship within the Celtic family. The Continental and Insular Celtic languages differ not only in location, but also in their structure. The Continental Celtic languages were conservative Indo-European languages, similar in structure and appearance to Greek or Latin. The Insular Celtic languages certainly evolved from similar languages, but in their modern forms they are very different. In fact, they differ so much from other Indo-European languages that the inclusion of the Insular Celtic languages in the Indo-European family was a subject of controversy in the early years of Indo-European linguistics. Most of the changes responsible for this were complete by about 600 AD, to judge from the earliest written attestations.

It is controversial why these languages are so different from the other languages of western Europe, including the Continental Celtic languages. Some scholars assume that a non-Indo-European substratum language spoken in the British Isles before the arrival of Celtic was involved here, but many are deeply sceptical of that. A popular notion has been that the enigmatic substratum language was related to Semitic; but actually, apart from verb-initial word order, which in itself is not rare enough to indicate a connection, Insular Celtic and Semitic languages are not particularly similar. This leaves us with an unknown substratum, which has little explanatory value, and the substratum theory has been abandoned by most scholars. The fact remains that the changes happened.

The position of Celtic within the Indo-European stock is also a matter of debate. The Celtic languages share some features with the Italic languages (Latin and its closest relatives); but whether this justifies an "Italo-Celtic" node in the Indo-European family tree, or whether the similarities are the result of prolonged contact between the Italic and Celtic groups, is disputed.

Features of the Insular Celtic languages


The consonant systems of the Insular Celtic languages are characterized by lenition of intervocalic stops. Voiced stops are lenited to voiced fricatives in both branches; voiceless stops are lenited to voiced stops in Brythonic and to voiceless fricatives in Goidelic. Geminate voiceless stops became voiceless fricatives in Brythonic, whereas in Goidelic they were simplified to plain voiceless stops (compare Welsh cath with Irish cat 'cat'). As a result, the Insular Celtic languages are very rich in fricative sounds.

In some contexts, this lenition rule operated across word boundaries, causing an initial consonant to be lenited because of the vocalic ending of the preceding word. The result is a famous feature of the Insular Celtic languages: the initial mutations.
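
The lenition of single intervocalic stops described above can be sketched as a small rewrite rule. This is a toy model under stated assumptions, not a reconstruction: the phonemic symbols (f θ x for voiceless and v ð ɣ for voiced fricatives), the five-vowel set, the function names and the sample strings are all made up for illustration, and geminates are simply left alone.

```python
# Assumed symbols (not the essay's notation): stops p t k b d g,
# voiceless fricatives f θ x, voiced fricatives v ð ɣ.
VOWELS = set("aeiou")

LENITION = {
    # Brythonic: voiceless stops become voiced stops
    "Brythonic": {"p": "b", "t": "d", "k": "g", "b": "v", "d": "ð", "g": "ɣ"},
    # Goidelic: voiceless stops become voiceless fricatives
    "Goidelic": {"p": "f", "t": "θ", "k": "x", "b": "v", "d": "ð", "g": "ɣ"},
}


def lenite(word: str, branch: str) -> str:
    """Lenite every single stop that stands between two vowels."""
    table = LENITION[branch]
    out = []
    for i, ch in enumerate(word):
        between_vowels = (
            0 < i < len(word) - 1
            and word[i - 1] in VOWELS
            and word[i + 1] in VOWELS
        )
        out.append(table[ch] if between_vowels and ch in table else ch)
    return "".join(out)


def lenite_phrase(first: str, second: str, branch: str) -> str:
    """Apply the rule across a word boundary by treating both words as one string.

    Every replacement is one symbol for one symbol, so the boundary can be
    restored at the same position afterwards.
    """
    joined = lenite(first + second, branch)
    return joined[: len(first)] + " " + joined[len(first):]


# Invented strings, for illustration only.
print(lenite("ata", "Brythonic"))                # ada
print(lenite("ata", "Goidelic"))                 # aθa
print(lenite_phrase("esa", "tota", "Goidelic"))  # esa θoθa
print(lenite_phrase("es", "tota", "Goidelic"))   # es toθa
```

The last two calls show the point of the paragraph above: the same word begins with a lenited consonant after a vowel-final word and with its original consonant after a consonant-final one.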

Manner                Labial  Dental  Alveolar  Alveolar lateral  Postalveolar  Palatal  Velar  Glottal
Voiceless stops       p       -       t         -                 -             -        c      -
Voiced stops          b       -       d         -                 -             -        g      -
Voiceless fricatives  ff      th      s         ll                si            -        ch     h
Voiced fricatives     f       dd      -         -                 -             -        -      -
Nasals                m       -       n         -                 -             -        ng     -
Trill                 -       -       r         -                 -             -        -      -
Approximants          w       -       -         l                 -             i        -      -

Fig. 2. Consonant inventory of Welsh (in Welsh orthography).

Manner                Labial  Dental  Alveolar  Velar  Glottal
Voiceless stops       p       -       t         c      -
Voiced stops          b       -       d         g      -
Voiceless fricatives  ph      th      s         ch     sh
Voiced fricatives     bh      dh      -         gh     -
Nasals                m       -       n         ng     -
Liquids               -       -       l, r      -      -

Fig. 3. Consonant inventory of Old Irish (in Irish orthography; all consonants have palatalized counterparts). The voiceless labial fricative is spelled f when from earlier *w and ph as mutation product of p. In Modern Irish, th is pronounced [h] and dh like gh ([ɣ]).

The vowels inherited from Indo-European also underwent far-reaching changes. Vowel assimilations occur; many unstressed vowels and final syllables are lost. These changes altered the appearance of the languages dramatically. The Proto-Indo-European mobile accent is not preserved. In Proto-Celtic, the first syllable was stressed; this is preserved in Goidelic, while in Brythonic, the accent has shifted to the penultimate syllable (after loss of final syllables).


The Insular Celtic languages are, like most Indo-European languages, inflecting. Due to the numerous sound changes that the languages have undergone, the inflections are often complex and irregular and do not clearly show the inherited Indo-European patterns. The noun has two numbers (singular and plural); Goidelic also shows remnants of a case system (nominative, vocative and genitive; in some dialects also a dative), while in the Brythonic languages, noun cases are completely lost. However, nouns undergo initial mutations in some contexts.

All Insular Celtic languages have a definite article, Breton also an indefinite article. There are two genders, masculine and feminine (Old Irish also has a neuter).

Radical  Lenited  Nasalized  Spirantized
p        b        mh         ph
t        d        nh         th
c        g        ngh        ch
b        f        m          -
d        dd       n          -
g        Ø        ng         -
m        f        -          -
ll       l        -          -
rh       r        -          -

Fig. 4. Initial mutations in Welsh.
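
For conlangers who want to experiment, Fig. 4 translates directly into a lookup table. The sketch below is an assumption-laden toy: the names MUTATIONS, DIGRAPHS, onset_of and mutate are invented here, words are expected in lowercase, and nothing is said about which grammatical contexts trigger which mutation.

```python
# Mutation table transcribed from Fig. 4; "" stands for Ø (the consonant is dropped).
MUTATIONS = {
    "lenited": {"p": "b", "t": "d", "c": "g", "b": "f", "d": "dd",
                "g": "", "m": "f", "ll": "l", "rh": "r"},
    "nasalized": {"p": "mh", "t": "nh", "c": "ngh", "b": "m", "d": "n", "g": "ng"},
    "spirantized": {"p": "ph", "t": "th", "c": "ch"},
}

# Digraphs that count as single letters in Welsh orthography. Those that are not
# radicals in Fig. 4 (ch, dd, ff, ng, ph, th) are listed so that, for example,
# "chwaer" is not mistaken for c followed by h.
DIGRAPHS = ("ch", "dd", "ff", "ng", "ll", "ph", "rh", "th")


def onset_of(word: str) -> str:
    """Return the first orthographic letter of a lowercase word, digraph-aware."""
    return word[:2] if word[:2] in DIGRAPHS else word[:1]


def mutate(word: str, kind: str) -> str:
    """Apply one mutation from Fig. 4; words that the table does not cover are unchanged."""
    onset = onset_of(word)
    table = MUTATIONS[kind]
    if onset not in table:
        return word
    return table[onset] + word[len(onset):]


print(mutate("merch", "lenited"))     # ferch
print(mutate("pen", "nasalized"))     # mhen
print(mutate("cath", "spirantized"))  # chath
print(mutate("gardd", "lenited"))     # ardd
```

The first call gives the form that appears after the article in y ferch 'the girl' in the Welsh examples of this essay.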

Radical  Lenited  Nasalized
p        ph       b
t        th       d
c        ch       g
b        bh       m
d        dh       n
g        gh       ng
m        mh       -
s        sh [h]   -
f        fh [Ø]   -

Fig. 5. Initial mutations in Irish.

A morphological peculiarity of the Insular Celtic languages is the so-called "conjugated prepositions". These are prepositions that have fused with a following personal pronoun, e.g. Welsh arnaf 'on me'.

Verbs inflect, as in other Indo-European languages, for tense and mood as well as the person and number of the subject. In the modern Insular Celtic languages, there is an increasing tendency to use periphrastic verb constructions (i.e., an inflected auxiliary followed by a verbal noun). The most important non-finite form (in Brythonic the only one) is the verbal noun.

Form         Present  Imperfect  Preterite  Pluperfect  Subjunctive  Imperative
1st sg.      caraf    carwn      cerais     caraswn     carwyf       -
2nd sg.      ceri     carit      ceraist    carasit     cerych       câr
3rd sg.      câr      carai      carodd     carasai     caro         cared
1st pl.      carwn    carem      carasom    carasem     carom        carwn
2nd pl.      cerwch   carech     carasoch   carasech    caroch       cerwch
3rd pl.      carant   carent     carasant   carasent    caront       carent
Impersonal   cerir    cerid      carwyd     carasid     carer        -
Verbal noun  caru 'to love'

Fig. 6. Verb inflection in Welsh.

Form             Present       Past habitual  Preterite  Future        Conditional    Subjunctive  Imperative
1st sg.          ligim         liginn         lig mé     ligfidh mé    ligfinn        lige mé      ligim
2nd sg.          ligeann tú    ligteá         lig tú     ligfidh tú    ligfeá         lige tú      lig
3rd sg.          ligeann sé    ligeadh sé     lig sé     ligfidh sé    ligfeadh sé    lige sé      ligeadh sé
1st pl.          ligimid       ligimis        ligeamar   ligfimid      ligfimis       ligimid      ligimis
2nd pl.          ligeann sibh  ligeadh sibh   lig sibh   ligfidh sibh  ligfeadh sibh  lige sibh    ligigí
3rd pl.          ligeann siad  ligidís        lig siad   ligfidh siad  ligfidís       lige siad    ligidís
Impersonal       ligtear       ligtí          ligeadh    ligfear       ligfí          ligtear      ligtear
Verbal noun      ligean 'to allow'
Past participle  ligthe

Fig. 7. Verb inflection in Irish.


The Insular Celtic languages show VSO word order (less prominently in Breton, where it is obscured by frequent fronting of the subject, but quite consistently in the other languages). Indirect objects, prepositional phrases and adverbs are placed at the end of the clause:

(1) Welsh
Taflodd y ferch y bêl dros y gwrych.
threw the girl the ball over the hedge
'The girl threw the ball over the hedge.'

In the noun phrase, the order is article-numeral-noun-adjective (the example below also shows that numerals are used with singular rather than plural nouns):

(2) Welsh
y tri dyn dall
the three man blind
'the three blind men'

Demonstratives are always placed at the end of the NP, which also carries a definite article:

(3) Welsh
y ferch fach hon
the girl little this
'this little girl'
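
The ordering rules illustrated in examples (1) to (3) amount to two fixed templates, which can be sketched in a few lines. This is only a minimal illustration: the function names noun_phrase and clause and their parameters are invented here, and all words must be passed in their already mutated forms, since the sketch does not model mutation.

```python
def noun_phrase(noun, article=None, numeral=None, adjectives=(), demonstrative=None):
    """Linearize an NP as article, numeral, noun, adjective(s), demonstrative."""
    parts = [article, numeral, noun, *adjectives, demonstrative]
    return " ".join(p for p in parts if p)


def clause(verb, subject, obj=None, adjuncts=()):
    """Linearize a clause as verb, subject, object, then PPs and adverbs."""
    parts = [verb, subject, obj, *adjuncts]
    return " ".join(p for p in parts if p)


# Example (1): verb first, prepositional phrase last.
print(clause("Taflodd",
             noun_phrase("ferch", article="y"),
             noun_phrase("bêl", article="y"),
             adjuncts=["dros y gwrych"]) + ".")
# Taflodd y ferch y bêl dros y gwrych.

# Examples (2) and (3): numeral before the noun, adjective and demonstrative after it.
print(noun_phrase("dyn", article="y", numeral="tri", adjectives=["dall"]))
# y tri dyn dall
print(noun_phrase("ferch", article="y", adjectives=["fach"], demonstrative="hon"))
# y ferch fach hon
```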

Genitives follow the possessum; while Goidelic has a genitive case, in Brythonic, the genitive relation is expressed by word order alone. A possessed NP is definite and carries no article. Relative clauses likewise follow the head noun; they are introduced by a particle:

(4) Welsh
y dyn a ddygodd yr arian
the man REL stole the money
'the man who stole the money'

Embedded clauses use the verbal noun, with the object encoded as possessor:

(5) Welsh
Bwriadai'r athro i'r plant ddarllen llyfr arall.
intended the teacher for the children read.VN book another
'The teacher intended the children to read another book.'

The verbal noun is used in quite a few syntactic constructions, such as the progressive aspect:

(6) Irish
Tá Mícheál ag labhairt Gaeilge le Cáit anois.
is M. at speak.VN Irish with C. now
'M. is speaking Irish with C. now.'

There are two different verbs translated as 'to be'. The existential verb expresses existence, location or condition (like Spanish estar):

(7) Irish
Tá na húlla ar an mbord.
are the apples on the table
'The apples are on the table.'

The copula is used for definitions and identifications (like Spanish ser):

(8) Irish
Is é Seán an múinteoir.
is he S. the teacher.
'S. is the teacher.'

The copula is also used in cleft constructions:

(9) Irish
Is mise a dúirt é.
is me REL said it
'It is me who said it.'

There are no simple words for 'yes' and 'no'. Instead, yes-no questions are answered with small sentences:

(10) Irish
An éisteann Seán lena mháthair ariamh?
PART listen S. with.his mother ever?
'Does S. ever listen to his mother?'
Éisteann. / Ní éisteann.
(He) listens. / Not (he) listens.
'He listens.' / 'He doesn't listen.'
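
The answering strategy shown in (10) can be sketched as a function that echoes the verb of the question. This is a deliberately naive illustration: the function name answer and its parameters are invented here, and the sketch ignores the lenition that the negative particle ní causes on a following consonant, so it is only adequate for vowel-initial verbs such as éisteann.

```python
def answer(verb: str, affirmative: bool, negator: str = "Ní") -> str:
    """Answer a yes-no question by repeating its verb, negated if required.

    Simplification: no initial mutation is applied after the negator.
    """
    verb = verb.strip()
    if affirmative:
        return verb[0].upper() + verb[1:] + "."
    return f"{negator} {verb[0].lower() + verb[1:]}."


print(answer("éisteann", True))   # Éisteann.
print(answer("éisteann", False))  # Ní éisteann.
```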


© 2007-2010 Jörg Rhiemeier
Last update: 2010-10-10