    The origin and dispersal of the Austronesian language family, one of the largest and most widespread in the world, have long attracted the attention of linguists, archaeologists, and geneticists. Although there is a growing consensus that Taiwan is the source of the spread of Austronesian languages, little is known about the migration patterns of the early Austronesians who settled in and left Taiwan, i.e., the "Into-Taiwan" and "Out-of-Taiwan" events. In particular, the genetic diversity and structure within Taiwan, and how they relate to the Into- and Out-of-Taiwan events, are largely unexplored, primarily because most genomic studies have used data from just two of the 16 recognized Highland Austronesian groups in Taiwan. In this study, we generated the largest genome-wide data set of Taiwanese Austronesians to date, including six Highland groups and one Lowland group from across the island and two Taiwanese Han groups. We identified fine-scale genomic structure in Taiwan, inferred the ancestry profile of the ancestors of Austronesians, and found that the southern Taiwanese Austronesians show excess genetic affinities with the Austronesians outside of Taiwan. Our findings thus shed new light on the Into- and Out-of-Taiwan dispersals.






    Ancient anatomically modern humans (AMHs) encountered archaic human species, most notably Neanderthals and Denisovans, when they left Africa and spread across Europe and Asia ~60,000 years ago. They interbred with them, and modern human genomes retain DNA inherited from these interbreeding events. High-quality (high-coverage) ancient human genomes have recently been sequenced, allowing direct estimation of individual heterozygosity, which has shown that genetic diversity in these archaic human groups was very low, indicating small population sizes. In this study, we analyze genome-wide data from ten ancient humans, including four sequenced at high coverage. We screened these data for pathogenic mutations associated with monogenic diseases and identified an unusual aggregation of pathogenic mutations in individual subjects, including quadruple homozygous cases of pathogenic variants in the PAH gene, associated with phenylketonuria, in a ~120,000-year-old Neanderthal. Such aggregation of pathogenic mutations is extremely rare in contemporary populations, and its existence in ancient humans could be explained by less significant clinical manifestations coupled with small community sizes, leading to higher inbreeding levels. Our results suggest that pathogenic variants associated with rare diseases might be the result of introgression from archaic human species, and that archaic admixture thus could have influenced disease risk in modern humans.
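    The individual heterozygosity mentioned above can be illustrated with a minimal sketch. It assumes genotype calls are already available as alternate-allele counts per site; the function name and the toy data are hypothetical, and a real estimate from a high-coverage genome would also need quality filtering and a per-base-pair denominator.

```python
def heterozygosity(genotypes):
    """Fraction of called sites that are heterozygous.

    genotypes: iterable of alternate-allele counts per site
    (0 = homozygous reference, 1 = heterozygous, 2 = homozygous alternate).
    None marks a missing call and is skipped.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes")
    return sum(1 for g in called if g == 1) / len(called)


# Toy individual (assumed data): ten sites, one missing, two heterozygous.
toy = [0, 0, 1, 2, None, 0, 1, 0, 2, 0]
print(heterozygosity(toy))
```

    A low value across many sites is what one would expect from a small, inbred population, which is the pattern the abstract describes.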






    The Mongolian population exceeds six million and is the largest population among the Mongolic speakers in China. However, the genetic structure and admixture history of the Mongolians are still unclear because of the limited number of samples and the low coverage of single-nucleotide polymorphisms (SNPs) in earlier studies. In this study, we genotyped over 700,000 genome-wide SNPs in 38 Mongolian individuals from Fuxin in Liaoning Province to explore their genetic structure and population history using standard and advanced population genetic methods [principal component analysis (PCA), ADMIXTURE, FST, f3-statistics, f4-statistics, qpAdm/qpWave, qpGraph, ALDER, and TreeMix]. We found that Fuxin Mongolians had a close genetic relationship with Han people, northern Mongolians, other Mongolic speakers, and Tungusic speakers in East Asia. We also found that Neolithic millet farmers in the Yellow River Basin and West Liao River Basin and Neolithic hunter-gatherers in the Mongolian Plateau and Amur River Basin were the dominant ancestral sources, and that there was additional gene flow related to Eurasian Steppe pastoralists and Neolithic Iranian farmers in the gene pool of Fuxin Mongolians. These results shed light on the dynamic demographic history, complex population admixture, and multiple sources of genetic diversity in Fuxin Mongolians.
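    The f3- and f4-statistics listed above are averages of allele-frequency differences across SNPs. The sketch below shows the basic definitions on toy population frequencies. The population names and numbers are hypothetical, and it omits the sample-size correction and block-jackknife standard errors that dedicated tools apply, so it is an illustration rather than the study's pipeline.

```python
def f4(a, b, c, d):
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d)."""
    terms = [(pa - pb) * (pc - pd) for pa, pb, pc, pd in zip(a, b, c, d)]
    return sum(terms) / len(terms)


def f3(target, source1, source2):
    """f3(Target; Source1, Source2): mean of (t - s1) * (t - s2).

    A clearly negative value suggests the target is admixed between
    populations related to the two sources.
    """
    terms = [(pt - p1) * (pt - p2) for pt, p1, p2 in zip(target, source1, source2)]
    return sum(terms) / len(terms)


# Toy allele frequencies at four SNPs (assumed data).
north = [0.10, 0.80, 0.30, 0.60]
west = [0.70, 0.20, 0.90, 0.10]
mixed = [0.5 * n + 0.5 * w for n, w in zip(north, west)]  # 50/50 mixture
outgroup = [0.40, 0.50, 0.60, 0.35]

print(f3(mixed, north, west))           # negative for the simulated mixture
print(f4(north, mixed, west, outgroup))
```

    In practice, significance is judged from a Z-score computed with a block jackknife over the genome, which this sketch does not attempt.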






    Delimiting species across a speciation continuum is a complex task, as the process of species origin is not generally instantaneous. The use of genome-wide data provides unprecedented resolution to address convoluted species delimitation cases, often unraveling cryptic diversity. However, because genome-wide approaches based on the multispecies coalescent model are known to confound population structure with species boundaries, often resulting in taxonomic over-splitting, it has become increasingly evident that species delimitation research must consider multiple lines of evidence. In this study, we used phylogenomic, population genomic, and coalescent-based species delimitation approaches, and examined those in light of morphological and ecological information, to investigate the number of species and their boundaries within the Chirostoma "humboltianum group" (family Atherinidae). The humboltianum group is a taxonomically controversial species complex where previous morphological and mitochondrial studies produced conflicting species delimitation outcomes. We generated ddRADseq data for 77 individuals representing the nine nominal species in the group, spanning their distribution range in the central Mexican plateau.
    Our results conflict with the morphospecies and ecological delimitation hypotheses, identifying four independently evolving lineages organized in three geographically cohesive clades: (i) chapalae and sphyraena groups in Lake Chapala, (ii) estor group in Lakes Pátzcuaro and Zirahuén, and (iii) humboltianum sensu stricto group in Lake Zacapu and Lerma river system.
    Overall, our study provides an atypical example where genome-wide analyses delineate fewer species than previously recognized on the basis of morphology. It also highlights the influence of the geological history of the Chapala-Lerma hydrological system in driving allopatric speciation in the humboltianum group.






    Genome-wide genotype data from 48 carefully selected population samples of Szeklers and non-Szekler Hungarians living in Transylvania were analyzed in a comparative framework. Our analyses also involved contemporary Hungarians living in Hungary, other Europeans, and Eurasian samples, 530 individuals altogether. The source of the Szekler samples was the commune of Korond, Transylvania. The analyzed non-Szekler Hungarian samples were collected from villages with a history dating back to the era of the Árpád Dynasty. Principal component analysis and ancestry analysis revealed high within-group similarity in both the analyzed Szeklers and the non-Szekler Transylvanian Hungarians, and the two groups showed similar genetic patterns to each other. Haplotype analyses using identity-by-descent segment discovery tools showed that average pairwise identity-by-descent sharing is similar in the investigated populations, but the Korond Szekler samples had higher average sharing with the Hungarians from Hungary than the non-Szekler Transylvanian Hungarians did. Average sharing results showed that both groups are isolated compared to other Europeans, and indicated that the non-Szekler Transylvanian Hungarian inhabitants of the investigated Árpád Age villages are more isolated than the investigated Szeklers from Korond. This was confirmed by our autozygosity analysis as well. Identity-by-descent segment analyses and 4-population tests also confirmed that these Hungarian-speaking Transylvanian ethnic groups are strongly related to Hungarians living in Hungary.
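    Average pairwise identity-by-descent (IBD) sharing, as used above, can be sketched as follows. The sketch assumes IBD segments have already been called by some external tool and reduced to (individual, individual, length in cM) tuples; the sample identifiers, lengths, and function name are hypothetical.

```python
from itertools import combinations, product


def mean_pairwise_ibd(segments, group_a, group_b=None):
    """Mean total IBD length (cM) per pair of individuals.

    segments: iterable of (id1, id2, length_cm) tuples.
    With only group_a, averages over all pairs within that group;
    with group_b, averages over all pairs between the two groups.
    Pairs with no detected segment count as zero sharing.
    """
    totals = {}
    for id1, id2, length_cm in segments:
        key = frozenset((id1, id2))
        totals[key] = totals.get(key, 0.0) + length_cm
    if group_b is None:
        pairs = list(combinations(group_a, 2))
    else:
        pairs = list(product(group_a, group_b))
    if not pairs:
        raise ValueError("no pairs to average over")
    return sum(totals.get(frozenset(p), 0.0) for p in pairs) / len(pairs)


# Toy data (assumed): two Szekler and two Hungary samples.
segments = [("s1", "s2", 10.0), ("s1", "s2", 5.0), ("s1", "h1", 4.0), ("s2", "h2", 2.0)]
szekler = ["s1", "s2"]
hungary = ["h1", "h2"]

print(mean_pairwise_ibd(segments, szekler))           # within-group sharing
print(mean_pairwise_ibd(segments, szekler, hungary))  # between-group sharing
```

    Higher within-group than between-group sharing is one simple indicator of the isolation the abstract reports.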






    Basques have historically lived along the Western Pyrenees, in the Franco-Cantabrian region, straddling the current Spanish and French territories. Over the last decades, they have been the focus of intense research due to their singular cultural and biological traits that, with high controversy, placed them as a heterogeneous, isolated, and unique population. Their non-Indo-European language, Euskara, is thought to be a major factor shaping the genetic landscape of the Basques. Yet there is still a lively debate about their history and assumed singularity due to the limitations of previous studies. Here, we analyze genome-wide data of Basque and surrounding groups that do not speak Euskara at a micro-geographical level. A total of ∼629,000 genome-wide variants were analyzed in 1,970 modern and ancient samples, including 190 new individuals from 18 sampling locations in the Basque area. For the first time, local- and wide-scale analyses from genome-wide data have been performed covering the whole Franco-Cantabrian region, combining allele frequency and haplotype-based methods. Our results show a clear differentiation of Basques from the surrounding populations, with the non-Euskara-speaking Franco-Cantabrians located in an intermediate position. Moreover, a sharp genetic heterogeneity within Basques is observed with significant correlation with geography. Finally, the detected Basque differentiation cannot be attributed to an external origin compared to other Iberian and surrounding populations. Instead, we show that such differentiation results from genetic continuity since the Iron Age, characterized by periods of isolation and lack of recent gene flow that might have been reinforced by the language barrier.






    Trans-Eurasian cultural and genetic exchanges have significantly influenced the demographic dynamics of Eurasian populations. The Hexi Corridor, located along the southeastern edge of the Eurasian steppe, served as an important passage of the ancient Silk Road in Northwest China and intensified the transcontinental exchange and interaction between populations on the Central Plain and in Western Eurasia. Historical and archeological records indicate that the Western Eurasian cultural elements were largely brought into North China via this geographical corridor, but there is debate on the extent to which the spread of barley/wheat agriculture into North China and subsequent Bronze Age cultural and technological mixture/shifts were achieved by the movement of people or dissemination of ideas. Here, we presented higher-resolution genome-wide autosomal and uniparental Y/mtDNA SNP or STR data for 599 northwestern Han Chinese individuals and conducted 2 different comprehensive genetic studies among Neolithic-to-present-day Eurasians. Genetic studies based on lower-resolution STR markers via PCA, STRUCTURE, and phylogenetic trees showed that northwestern Han Chinese individuals had increased genetic homogeneity relative to northern Mongolic/Turkic/Tungusic speakers and Tibeto-Burman groups. The genomic signature constructed based on modern/ancient DNA further illustrated that the primary ancestry of the northwestern Han was derived from northern millet farmer ancestors, which was consistent with the hypothesis of Han origin in North China and more recent northwestward population expansion. This was subsequently confirmed via excess shared derived alleles in f3/f4 statistical analyses and by more northern East Asian-related ancestry in the qpAdm/qpGraph models. 
    Interestingly, we identified one western Eurasian admixture signature that was present in northwestern Han but absent from southern Han, with an admixture time dated to approximately 1000 CE (Tang and Song dynasties). Generally, we provided supporting evidence that historic Trans-Eurasian communication was primarily maintained through population movement, not simply cultural diffusion. The observed population dynamics in northwestern Han Chinese not only support the North China origin hypothesis but also reflect the multiple sources of the genetic diversity observed in this population.






    Bull fertility is an indispensable trait for farm economics, since successful conception in a cow provides the calf crop along with the ensuing lactation, which ensures the sustainability of a dairy farm. Traditionally, bull fertility did not receive much attention from farm managers, and breeding animals were evaluated solely on phenotypic predictors, namely sire conception rate and seminal parameters in bulls. With the advent of the molecular era in animal breeding, attempts were made to unravel the genetic complexity of bull fertility by identifying genetic markers related to the trait. Marker-Assisted Selection (MAS) is a methodology that aims at utilizing the genetic information at markers and selecting improved populations for important traits. Traditionally, MAS was pursued using a candidate gene approach, which identifies markers in genes already known to have a physiological function related to the trait, but this approach had certain shortcomings, such as stringent criteria for significance testing. Now, with the availability of genome-wide data, the number of markers identified and the variance explained in relation to bull fertility have gone up. This presents a unique opportunity to revisit MAS by basing selection on the information of a large number of genome-wide markers, thus improving the accuracy of selection.
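    Selection on many genome-wide markers, as described above, amounts in its simplest form to ranking animals by a weighted sum of marker genotypes. The sketch below assumes per-marker effect estimates are already available from an earlier association or prediction step; the effect values, bull names, and function names are hypothetical.

```python
def marker_score(genotype, effects):
    """Sum of allele counts (0, 1, or 2) weighted by per-marker effects."""
    if len(genotype) != len(effects):
        raise ValueError("genotype and effects must have the same length")
    return sum(g * e for g, e in zip(genotype, effects))


def rank_bulls(genotypes, effects):
    """Return bull names ordered from highest to lowest marker score."""
    scores = {name: marker_score(g, effects) for name, g in genotypes.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (assumed): three markers with estimated effects on a fertility trait.
effects = [0.5, -0.2, 0.1]
bulls = {
    "bull_A": [2, 0, 1],
    "bull_B": [0, 2, 2],
    "bull_C": [1, 1, 0],
}

print(rank_bulls(bulls, effects))
```

    Using more markers with well-estimated effects is what would raise the accuracy of such a ranking; the sketch says nothing about how the effects are estimated.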






    Current technology allows rapid assessment of DNA sequences and methylation levels at a single-site resolution for hundreds of thousands of sites in the human genome, in thousands of individuals simultaneously. This has led to an increase in epigenome-wide association studies (EWAS) of complex traits, particularly those that are poorly explained by previous genome-wide association studies (GWAS). However, the genome and epigenome are intertwined: DNA methylation is known to affect gene expression through, for example, genomic imprinting. There is thus a need to go beyond single-omics data analyses and develop interaction models that allow a meaningful combination of information from EWAS and GWAS.
    We present two new methods for genetic association analyses that treat offspring DNA methylation levels as an environmental exposure. Our approach searches for statistical interactions between SNP alleles and DNA methylation (G×Me) and between parent-of-origin effects and DNA methylation (PoO×Me), using case-parent triads or dyads. We use methylation levels summarized over nearby genomic regions to ease biological interpretation. The methods were tested on a dataset of parent-offspring dyads, with EWAS data on the offspring. Our results showed that methylation levels around a SNP can significantly alter the estimated relative risk. Moreover, we show how a control dataset can identify false positives.
    The new methods, G×Me and PoO×Me, integrate DNA methylation in the assessment of genetic relative risks and thus enable a more comprehensive biological interpretation of genome-wide scans. Moreover, our strategy of condensing DNA methylation levels within regions helps overcome specific disadvantages of using sparse chip-based measurements. The methods are implemented in the freely available R package Haplin, enabling fast scans of multi-omics datasets.
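    The idea of summarizing methylation around a SNP and using it as an exposure can be sketched as below. This is not Haplin's interface: the function names, the window size, and the rank-based stratification are assumptions made for illustration, and the relative-risk estimation within each stratum is left out.

```python
from statistics import fmean


def summarize_methylation(cpg_positions, beta_values, snp_position, window=5000):
    """Mean methylation (beta value) per individual near a SNP.

    cpg_positions: base-pair position of each CpG column.
    beta_values: one row per individual, one column per CpG.
    Only CpGs within +/- window bp of snp_position are averaged.
    """
    idx = [k for k, pos in enumerate(cpg_positions) if abs(pos - snp_position) <= window]
    if not idx:
        raise ValueError("no CpGs within the window")
    return [fmean([row[k] for k in idx]) for row in beta_values]


def stratify(values, n_strata=3):
    """Assign each individual to a stratum (0 = lowest) by rank of its value."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    strata = [0] * len(values)
    for rank, i in enumerate(order):
        strata[i] = rank * n_strata // len(values)
    return strata


# Toy data (assumed): three offspring, three CpGs, SNP at position 500.
cpg_positions = [100, 900, 20000]
beta_values = [[0.1, 0.3, 0.9], [0.5, 0.7, 0.1], [0.8, 1.0, 0.2]]

summary = summarize_methylation(cpg_positions, beta_values, snp_position=500, window=1000)
print(summary)
print(stratify(summary))
```

    An interaction analysis would then estimate the SNP's relative risk separately in each stratum and test whether the estimates differ.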







    Here, we report genome-wide data analyses from 110 ancient Near Eastern individuals spanning the Late Neolithic to Late Bronze Age, a period characterized by intense interregional interactions for the Near East. We find that 6th millennium BCE populations of North/Central Anatolia and the Southern Caucasus shared mixed ancestry on a genetic cline that formed during the Neolithic between Western Anatolia and regions in today's Southern Caucasus/Zagros. During the Late Chalcolithic and/or the Early Bronze Age, more than half of the Northern Levantine gene pool was replaced, while in the rest of Anatolia and the Southern Caucasus, we document genetic continuity with only transient gene flow. Additionally, we reveal a genetically distinct individual within the Late Bronze Age Northern Levant. Overall, our study uncovers multiple scales of population dynamics through time, from extensive admixture during the Neolithic period to long-distance mobility within the globalized societies of the Late Bronze Age.





