IN SEARCH OF THE PROTO-LANGUAGE: The Trace of «Ukrainian Indo-Europeans» in the Caucasus

The Indo-European language family is the most widespread on the planet, encompassing dozens of languages — from Albanian, which has no close relatives, to English, which is understood almost everywhere except in the most impenetrable jungles and deserts. All of these languages originated from a once-unified proto-language.
Today, more than 3 billion speakers of Indo-European languages live across all continents, accounting for nearly half of the world’s population. But how did this language family manage to dominate the world? Scientists have found the answer by tracing the spread of an ancient DNA lineage.
IN SEARCH OF THE PROTO-LANGUAGE
This language family has gone by many names. It was once called «Japhetic», after Japheth, one of the three sons of Noah, the biblical builder of the Ark. It has also been called «Aryan», «Indo-Iranian», and «Indo-Germanic». In 1813, the English scholar Thomas Young proposed the term «Indo-European».
Since then, the term has become firmly established in linguistics, as it best reflects the family's geographic range from India to Europe. Alongside Sanskrit and the Iranian languages, the Indo-European family includes nearly all modern European languages: Romance, Germanic, Slavic, Baltic, Celtic, and others.
The main exceptions in Europe are Basque, Hungarian, and a few others, such as Finnish and Estonian. Linguists have long suspected that the roughly 400 Indo-European languages known today, both living and extinct, evolved from a single proto-language.
THE SEARCH FOR THE «ARYAN» HOMELAND
For at least the past 200 years, the history of the Indo-European family has posed a mystery for scholars. The issue first arose in the 18th century when researchers noticed similarities between Classical Greek, Latin, and Sanskrit — the ancient language of South Asia.
This led to the hypothesis of a common origin. The quest to find the Indo-European homeland has occupied scholars ever since, with various regions proposed as candidates: the Asian and Black Sea steppes, Asia Minor, the Caucasus, the Iranian Plateau, and even India.
Unfortunately, none of these hypotheses could be scientifically confirmed — until the latest advancements in genetic analysis made it possible. Using these cutting-edge methods, researchers in ancient genomics have finally identified the homeland of a nomadic tribe that transformed the culture and genetics of the entire world.
THE «STEPPE» HYPOTHESIS VS. THE «ANATOLIAN» HYPOTHESIS
Among the theories explaining the wide distribution of Indo-European languages, one long dominated academic thought. Most scholars supported it, only to be proven wrong. According to this theory, ancient farmers from Anatolia, a region of modern-day Turkey, spread Indo-European languages across the world.
This linguistic expansion was believed to have been driven by the spread of the agricultural revolution, which began around 9,000 years ago. The «Anatolian hypothesis» held sway in academia for quite some time. However, by 2015, it faced strong criticism.
That year, several groundbreaking studies in ancient genomics revealed a migration of Yamnaya culture pastoralists across Eurasia, starting about 5,500 years ago. This discovery lent strong support to the older «Steppe hypothesis», first proposed in the mid-20th century, which explains the presence of steppe ancestry in most regions where Indo-European languages are spoken today.
WERE THE YAMNAYA THE FIRST INDO-EUROPEANS?
The Yamnaya were Bronze Age pastoralists who roamed the steppes of what is now Ukraine and Russia. Their name comes from one of their distinctive burial practices (yama means «pit» in Ukrainian and Russian): interring the dead in mound-covered pit graves, laid on their backs with bent knees.
This culture played a crucial role in the diffusion of Indo-European languages, and it became the strongest argument in favor of the «Steppe hypothesis». Researchers likened the Yamnaya culture to a drop of dye on the surface of water: from this single point, Indo-European languages spread across the world like an expanding stain over the centuries.
To pinpoint the «source» of the Indo-European language family, whose languages are spoken today by nearly half of the world’s population, scientists analyzed hundreds of genomes extracted from ancient remains uncovered by archaeologists.
The data, published in the scientific journal Nature, provides clear evidence that the Yamnaya culture, associated with the earliest Indo-Europeans, emerged and developed in the region of the northern shores of the Black Sea.
A CULTURAL EXPLOSION WITHOUT AN EPICENTER
Even with genetic evidence, the «Steppe hypothesis» faced a significant challenge — it failed to explain the presence of the ancient, now-extinct Anatolian branch of Indo-European languages. This included Hittite, a language spoken by a highly developed civilization that rivaled Ancient Egypt.
The Hittite Empire thrived in what is now Turkey during the second millennium BCE. However, genetic studies found no trace of Yamnaya ancestry among the Hittites. Seeking answers, researchers once again turned to genetics for clues.
The Yamnaya culture clearly carried Indo-European languages across Eurasia, but where did the Yamnaya themselves originate? Determining their genetic roots turned out to be a complex task. The reason lies in their rapid expansion: Yamnaya groups carried a nearly identical genetic signature wherever they went, so the DNA alone did not point back to a starting place.
This expansion resembled an explosion but with no obvious epicenter. Scientists also used another metaphor to describe it: a tumor with widely spread metastases but an elusive point of origin.
IT ALL BEGAN IN THE NORTHERN CAUCASUS
To solve this mystery, an international team of researchers, including Ukrainian scientists, analyzed genomic data from 428 ancient individuals. They examined and compared the genetic profiles of both the Yamnaya and the populations that preceded them in the Black Sea steppe and further southeast along the Caucasus Mountains.
As a result, they identified a genetic «signature» of people who lived in the region between the Caucasus Mountains and the lower Volga. This signature turned out to be the crucial link shared by the earliest Yamnaya and the ancient Anatolians. Further calculations revealed that around 6,000 years ago, a group of people from the Caucasus–Lower Volga region migrated westward.
Upon reaching the Black Sea region, they encountered and intermingled with local hunter-gatherers. This fusion gave rise to the Yamnaya. Around the same time, another group from the same Caucasus–Lower Volga region moved toward what is now Anatolia, where they also mixed with the local population. This led to the emergence of the now-extinct Indo-European languages and peoples of Asia Minor.
A CASE OF SCIENTIFIC DIPLOMACY
Today, we are witnessing how new technologies — particularly in genetics — are accelerating scientific progress. Right before our eyes, the «Anatolian» and «Steppe» hypotheses are being replaced by a more refined understanding. Scientists now have strong evidence to suggest that it was not the Yamnaya but rather the earlier populations of the Caucasus and Lower Volga who spoke the earliest form of Indo-European languages.
Their descendants, the Yamnaya, then carried this language outward from the Black Sea steppes, and it spread across the world at an astonishing rate. Beyond its scientific significance, this research also serves as an example of scientific diplomacy. Because the project began before 2022, both Ukrainian and Russian researchers were equally involved. However, after Russia’s full-scale invasion of Ukraine, collaboration between them became impossible.
As a result, two parallel research teams emerged, with the same Western scholars serving as co-authors for both Ukrainian and Russian scientists. This allowed the project to continue and reach completion without being put on hold due to the ongoing conflict.