A brief note on some new developments regarding the genomics of Indians

When we wrote a previous article on this matter we had stated that new data would alter the details of our understanding of the picture discussed therein. Indeed, two new manuscripts deposited in the past month, by McColl et al and Narasimhan et al, have done so. These are still deposited manuscripts and have not been formally published. Further, other data might also come in the near future. Hence, we are not launching into any detailed presentation of the revised scenarios in this note. What we intend here is simply to provide a few illustrations of the authors’ results without much critical investigation.


A screen shot of McColl et al Figure 4

First, the study of McColl et al focuses on the Far East, bringing in new ancient DNA data. The main point of interest to the Indian scenario is that the Andaman Onge are part of a major push of hunter-gatherers into the Far East and Pacific, which spawned several branches that in turn mixed among themselves in various combinations, giving rise, among others, to the Austronesian groups and East Asians. Further, in the deep Pacific there were admixtures of the basal branches of this radiation with the Denisovans, the signal of which is very clearly seen in Papuans and Australian aborigines. The basal-most branch of this group analyzed by McColl et al is the 40000 YBP Tianyuan man, suggesting that these populations were in the east by then. A basal branch of this radiation also seems to have contributed to the ancestry of only a subset of Native Americans (independently of the East Asian branch that also originated from this group). This suggests that they might have reached the New World independently in an earlier wave or mixed with one strand of the main East Asian line of Native American ancestry as they entered the New World. A deep sister group of the Onge, and probably a basal member of this eastern radiation, was an ancient hunter-gatherer group that settled India, where they might have undergone admixtures with one or more preexisting non-sapiens species of Homo. This population is now defined as Ancient Ancestral South Indian (AASI) by Narasimhan et al, refining the earlier definition of “Ancestral South Indian” by Reich et al. We may term them the Indian hunter-gatherers.

The key point which Narasimhan et al make is that Neolithicization of the north-western Indian subcontinent proceeded via the entry of Iranian farmers from the west. This clarifies a situation that was uncertain on the basis of archaeology alone. The entry of these Iranian farmers could have happened as early as the Mehrgarh Neolithic or in more than one wave of closely related western populations. In any case the authors posit that it had happened by 6700-5000 YBP. This Iranian farmer group mixed with the AASI in the NW of the Indian subcontinent, and this admixture likely formed the population of the Harappan civilization that arose in this region. They term this population Indus periphery. Narasimhan et al also show that the Bactria-Margiana complex (BMAC) received some admixture from this population, likely of Harappan provenance, but did not contribute notably to the ancestry of the Indian subcontinent. Starting around 4100 YBP they see Sintashta steppe contributions appear for the first time in BMAC. This ancestry appears to have filtered south and reached the core Indian subcontinent thereafter. By 3700-3500 YBP they start seeing East Asian admixture on the Central Asian steppes, which continues down to the Scythian Iron Age. However, this East Asian ancestry is not visible in Indian populations. Hence, it appears that we are left with a window of 4100-3500 YBP when the Aryan invasion of the subcontinent happened. This is at the upper end of the mainstream invasion scenarios. Further, it is not inconsistent with the possibility that the invasion triggered the collapse of the Harappan urbanization around 3900 YBP. But it is also possible that the Aryans entered and occupied a landscape where the Harappan urban civilization had already collapsed or was in its last throes. It also provides support for the young age of the Veda, especially if one chose to place the Ṛgveda in the Panjab. Further, it lends some support to the scenario that the Soma cult was acquired by the Indo-Iranians and integrated with the older fire-cult as they reached the BMAC sites. It is notable in this context that the Kaśyapa clan, one of the main proponents of the Soma cult in the Vaidika system, was the default gotra for a brāhmaṇa who did not know his own. There are issues with each of these points and interesting complications, but we desist from discussing any of these now.

Interestingly, there was another recent publication by Kolipakam et al applying Bayesian phylogenetic methods to Dravidian languages. The results suggested a possible expansion of Dravidian happening around 4500 YBP. Narasimhan et al seem to mildly favor a Harappan origin of Dravidian. However, both the linguistic date estimate and several other linguistic arguments are against the Harappan civilization being that of Dravidian speakers. Rather, we suspect the Dravidians arose in the south as part of the southern Neolithicization. This might have had genetic and memetic contributions from the Indus periphery, but the Dravidian languages themselves were likely of southern provenance, probably in the upper Godavari valley. In the aftermath of the Indo-Aryan reconfiguration of the north, it is likely that the Dravidians had their own expansions both south and north, adopting various Indo-Aryan technologies and ideologies. This led to the Dravidianization of many AASI hunter-gatherers, who might have earlier spoken other languages.

Narasimhan et al model extant Indian populations as a three-way mixture of the Indus periphery, the Indian hunter-gatherer (AASI) and the steppe population related to the Sintashta complex. Below are some figures based on their model to illustrate the situation.

Figure 1. A box plot showing the three modeled components of Indian ancestry for the 140 populations studied by the authors. The gray line indicates the position of the genuine brāhmaṇa population with the lowest steppe ancestry (i.e. leaving out some groups which are not conventional brāhmaṇa, e.g. viśvakarman). It is clear that the brāhmaṇa-s show above average steppe ancestry and below average Indian hunter-gatherer ancestry.

Figure 2. The same data represented as a histogram. It is clear that whereas the Indus-periphery and Indian hunter-gatherer ancestries are unimodal, the steppe ancestry is not, with distinct groups showing low and high steppe ancestry. This explains the authors’ earlier model of ASI and ANI.

We then sorted the populations into five categories: 1) brāhmaṇa-s (here we retained the viśvakarman); 2) warrior castes (traditional kṣatriya-s) and their equivalents; 3) middle castes: vaiśya-s, cattle-breeders and agriculturalists; 4) service castes: traditional service jāti-s, often included among the other backward, backward and scheduled castes; 5) tribes. For this we had to drop generic groups like Gujarati, Punjabi, Muslim and the like. This left us with 124 populations. These are plotted as a ternary diagram.

Figure 3. Ternary diagram of the 3 strands of Indian ancestry. The 5 caste-tribal groups defined above are colored: 1-red; 2-orange; 3-aquamarine; 4-blue; 5-violet. One can see the effect of the two admixtures, with the effect of the steppe ancestry being predominant in the varṇa populations.
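For readers who wish to draw a similar ternary diagram, the following is a minimal sketch of how three ancestry fractions can be mapped to a point inside a triangle. The function name, the argument names and the placement of the triangle's vertices are assumptions made for illustration; they are not taken from Narasimhan et al or from the plot above.

```python
import math


def ternary_to_xy(steppe, indus_periphery, hunter_gatherer):
    """Map three non-negative ancestry fractions to 2D ternary coordinates.

    Vertex placement (an arbitrary choice for this sketch):
      pure steppe          -> (0, 0)
      pure Indus periphery -> (1, 0)
      pure hunter-gatherer -> (0.5, sqrt(3) / 2)
    """
    parts = (steppe, indus_periphery, hunter_gatherer)
    if any(p < 0 for p in parts):
        raise ValueError("ancestry fractions must be non-negative")
    total = sum(parts)
    if total <= 0:
        raise ValueError("ancestry fractions must sum to a positive value")
    # Normalise so that the three components sum to 1.
    _, i, h = (p / total for p in parts)
    # Barycentric combination of the three vertices listed above.
    x = i + 0.5 * h
    y = (math.sqrt(3) / 2) * h
    return x, y
```

A population with equal thirds of each component lands at the centroid of the triangle, and a population lacking one component lands on the opposite edge.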

A closer examination of this is seen in the next three figures:

Figure 4. A box plot of the inferred steppe ancestry in the above-defined five groups. The steppe ancestry is arrayed in accordance with the caste ladder and tribals have the least of it on average.
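The quantities that such a box plot displays for each group can be sketched as below. This assumes the common convention of drawing the minimum, the two quartiles, the median and the maximum; the quartile convention actually used for the figures is not stated, and the function and group names are hypothetical.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for one group of ancestry fractions."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values per group")
    # The "inclusive" method treats the data as the whole population of
    # interest; other quartile conventions give slightly different Q1 and Q3.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]


def summarize_by_group(fractions_by_group):
    """Apply the five-number summary to every group in a dict of lists."""
    return {
        group: five_number_summary(values)
        for group, values in fractions_by_group.items()
    }
```

Feeding in the per-population steppe fractions keyed by the five categories would give one tuple per box in the plot.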

Figure 5. A box plot of the inferred Indus-periphery ancestry in the above-defined five groups. It is interesting to note that, unlike the steppe ancestry, the Indus-periphery ancestry is greater in the warrior and middle caste groups than in brāhmaṇa-s, who have a lower median value of this component. However, this difference is only mildly significant in the current data (p = 0.033) and sampling bias cannot be ruled out.
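The test behind the quoted p-value is not named. As one way of checking a difference in medians between two groups of populations, a permutation test can be sketched as follows. The choice of test, the function name and its parameters are assumptions of this sketch and need not match how the figure above was obtained.

```python
import random
import statistics


def permutation_p_value(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation test on the absolute difference of medians."""
    rng = random.Random(seed)  # fixed seed so that reruns give the same answer
    observed = abs(statistics.median(group_a) - statistics.median(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        # Reassign the group labels at random, keeping the group sizes fixed.
        rng.shuffle(pooled)
        diff = abs(
            statistics.median(pooled[:n_a]) - statistics.median(pooled[n_a:])
        )
        if diff >= observed:
            hits += 1
    # The add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)
```

Here `group_a` and `group_b` would hold the Indus-periphery fractions of the populations in the two categories being compared. With only a handful of populations per category, a value near 0.03 would not survive correction for the several pairwise comparisons one could make, which is in line with the caution expressed above.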

Figure 6. A box plot of the inferred Indian hunter-gatherer ancestry in the above-defined five groups. Here for the four groups from the warrior castes to the tribes we see a reverse of the scenario seen for the steppe ancestry. However, the brāhmaṇa-s show a slightly higher median value of this component. While again we should be clear that this could be due to sampling bias, taken together with the above plot, it might reflect some sociological reality. The brāhmaṇa-s probably did not mix much with the preexisting populations of the subcontinent to start with, but as they expanded, especially while moving south, they mixed directly with populations with lower Indus-periphery and higher hunter-gatherer components.

Together, these plots suggest a picture which was long suspected from the physical appearance of Indians. The Indo-Aryans established themselves in the subcontinent entering via the NW, where they mixed with the older Indus-periphery populations that were likely part of, or survivors of, the old Harappan civilization. The groups with a wide range of older Indian hunter-gatherer-Iranian farmer mixture were incorporated across the upper caste ladder, but especially in the warrior and middle castes, where we see considerable dispersion (e.g. southern agnikula-kṣatriya with low steppe ancestry). The movement of brāhmaṇa-s into the south possibly also involved admixture with these groups.

Finally, a brief political note. The pro-Hindu pakṣa had acquired an aberration, mainly in the past 3 decades, known as OIT or the out of India theory for the origin of the Indo-Europeans. This never had a leg to stand on but is now dead and cremated. Unfortunately, the pro-Hindu side and mainstream Hindu nationalism have invested so much in making Indo-Aryan autochthonism a centerpiece of their thought that they have mostly ceded the writing of data-based Hindu prehistory to parties who are never going to be favorable to them. Even more tragically, they do not even seem to recognize how wrong they were: there is a real possibility that most of the OIT proponents are going to continue that way. Further, there is an unsubstantiated rumor making the rounds that the Indian side might have prevented the use of Indian aDNA in the current analysis, fearing the inevitable end to OIT. If this were true then it would add to the scandal and only provide more fuel for the usual enemies of the Hindus. This intellectual failure of mainstream Hindu nationalism in framing its foundations is quite worrisome as it might reflect a deeper systematic failure in thought.

The Genomic Formation of South and Central Asia, Narasimhan et al. https://www.biorxiv.org/content/early/2018/03/31/292581

Ancient Genomics Reveals Four Prehistoric Migration Waves into Southeast Asia, McColl et al. https://www.biorxiv.org/content/early/2018/03/08/278374

A Bayesian phylogenetic study of the Dravidian language family, Kolipakam et al. http://rsos.royalsocietypublishing.org/content/5/3/171504

