Approaching Archaeogenetics

ADMIXTOOLS (Software Package)

ADMIXTOOLS is a software package that has been used in numerous publications. It is a collection of programs that use f-statistics, computed from direct genetic data (SNPs), to infer genetic relationships between populations. The original ADMIXTOOLS was developed by David Reich and Nick Patterson for Linux and Mac. A revised version, ADMIXTOOLS 2, was developed by Robert Maier, Pavel Flegontov, Ulas Isildak, David Reich, and Nick Patterson for Linux, Mac, and Windows; it improves on the original and uses more efficient procedures for analyzing admixture. All ADMIXTOOLS programs are built on f-statistics (specifically f2 statistics), which describe how populations are related to each other.
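As a rough illustration of what an f2 statistic measures, the sketch below computes a simple, uncorrected f2 between two populations as the mean squared difference in allele frequency across SNPs. The function name `f2_naive` and the allele frequencies are made up for illustration; ADMIXTOOLS itself applies sample-size corrections and estimates standard errors, both of which are omitted here.

```python
def f2_naive(freqs_a, freqs_b):
    """Mean squared allele-frequency difference across SNPs (no bias correction)."""
    if len(freqs_a) != len(freqs_b):
        raise ValueError("both populations must be typed at the same SNPs")
    diffs = [(pa - pb) ** 2 for pa, pb in zip(freqs_a, freqs_b)]
    return sum(diffs) / len(diffs)

# Hypothetical allele frequencies at four SNPs in three populations.
pop_a = [0.10, 0.50, 0.90, 0.30]
pop_b = [0.20, 0.40, 0.80, 0.30]
pop_c = [0.60, 0.10, 0.40, 0.70]

f2_ab = f2_naive(pop_a, pop_b)  # small: A and B have similar frequencies
f2_ac = f2_naive(pop_a, pop_c)  # larger: A and C have diverged more
```

A smaller value means the two populations have more similar allele frequencies, which is the sense in which f2 describes how closely populations are related.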

Admixture

The presence, in an individual's or population's genome, of ancestry from multiple genetic populations as a result of interbreeding. An admixture analysis estimates the proportion contributed by each of those populations.

ADMIXTURE (Program)

ADMIXTURE is a program developed by David H Alexander, John Novembre, and Kenneth Lange for Linux and Mac. It takes a model-based approach to estimating ancestry: given a dataset of samples (ancient or modern, sequenced by any method) and a chosen number K, it estimates the fraction of each sample's genetic ancestry derived from each of K populations. It does not model genetic drift. The K populations are hypothetical and are not designated by the user, so ADMIXTURE does not directly test for admixture between populations; it examines individual samples rather than populations as a whole. Admixture proportions are almost invariably displayed as bar plots.
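The model-based idea can be sketched as follows: each individual's expected allele frequency at a SNP is a weighted average of the K hypothetical populations' frequencies, weighted by that individual's ancestry fractions, and genotypes are scored with a binomial likelihood. This is a minimal sketch of that likelihood only, with made-up frequencies and hypothetical function names; it evaluates candidate ancestry fractions rather than searching for the best ones, as the real program does.

```python
import math

def expected_freqs(q_row, p_matrix):
    """Expected allele frequency per SNP for one individual.

    q_row: ancestry fractions over K populations (should sum to 1).
    p_matrix: K lists of allele frequencies, one list per population.
    """
    n_snps = len(p_matrix[0])
    return [sum(q * p[j] for q, p in zip(q_row, p_matrix)) for j in range(n_snps)]

def log_likelihood(genotypes, q_row, p_matrix):
    """Binomial log-likelihood of genotypes coded as 0, 1 or 2 allele copies."""
    total = 0.0
    for g, f in zip(genotypes, expected_freqs(q_row, p_matrix)):
        total += math.log(math.comb(2, g)) + g * math.log(f) + (2 - g) * math.log(1 - f)
    return total

# Hypothetical allele frequencies for K = 2 populations at three SNPs.
P = [[0.9, 0.8, 0.1],
     [0.1, 0.2, 0.7]]
sample = [2, 2, 0]  # genotypes that resemble the first population

ll_mostly_first = log_likelihood(sample, [0.9, 0.1], P)
ll_mostly_second = log_likelihood(sample, [0.1, 0.9], P)
```

Here the ancestry fractions that put most weight on the first population give the higher log-likelihood, which is the kind of comparison the estimation relies on.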

Archaeogenetics

The application of population genetics to ancient DNA samples. Colin Renfrew defined it as "the study of the past by use of the techniques of molecular genetics". The field is interdisciplinary, existing at the intersection of genetics and archaeology.

EIGENSOFT (Software Package)

EIGENSOFT is a software package developed by Nick Patterson, Alkes L Price, Samuela Pollack, Kevin Galinsky, Chris Chang, Sasha Gusev, and David Reich. It contains various programs, including SmartPCA (a program for principal component analysis) and related programs, EIGENSTRAT (a stratification correction program), and convertf (a program for converting datasets between formats).

Founder Effect

A reduction in genetic variation that occurs when a population is descended from a small number of individuals. It is a form of genetic drift.

F-Statistics

Coming soon

Haplogroup

A group of individuals who share a common direct maternal or direct paternal ancestor. A haplogroup is based on a collection of haplotypes, which are sets of genetic variants inherited together from a single parent. There are two types of haplogroups: mitochondrial (maternal) haplogroups and Y-chromosome (paternal) haplogroups.

Hardy-Weinberg Principle

Coming soon

LINADMIX (Program)

LINADMIX was developed by Lily Agranat-Tamir, Shamam Waldman, Naomi Rosen, Benjamin Yakir, Shai Carmi, and Liran Carmel. It works in tandem with ADMIXTURE and relies on ADMIXTURE's output. LINADMIX models a target population as a mixture of source populations, estimating the admixture proportions as mixing coefficients, and computes a plausibility value for the model, so it can also be used to identify plausible models. LINADMIX can be used to model modern populations and can cope with missing data and genetic drift (whereas ADMIXTURE cannot model genetic drift, LINADMIX is robust to it). Although LINADMIX performs better when source populations are highly diverged, genetically similar source populations can still be used.

Linkage Disequilibrium

The non-random association of alleles at different loci: certain combinations of alleles occur together in a population more or less often than would be expected if the loci were inherited independently.

Nucleotide

"Letters" of a DNA sequence.

PLINK (Program)

PLINK is a software toolset used for curating datasets, quality control, and other genetic analyses.

Population Bottleneck

A significant reduction in the size of a population, which leaves the population with limited genetic diversity.
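The loss of diversity can be illustrated with a standard textbook approximation, which is an assumption brought in here rather than something stated in this glossary: in an idealized randomly mating diploid population of size N, expected heterozygosity shrinks by a factor of (1 - 1/(2N)) each generation. The sketch below compares a population of constant size with one that passes through a short bottleneck; the function name and all numbers are hypothetical.

```python
def expected_heterozygosity(h0, pop_sizes):
    """Expected heterozygosity after one generation at each size in pop_sizes."""
    h = h0
    for n in pop_sizes:
        h *= 1 - 1 / (2 * n)  # per-generation loss due to drift
    return h

# Twenty generations at constant size versus ten of them spent at size 10.
stable = expected_heterozygosity(0.5, [10_000] * 20)
bottleneck = expected_heterozygosity(0.5, [10_000] * 5 + [10] * 10 + [10_000] * 5)
```

Under these assumptions the bottlenecked population retains noticeably less heterozygosity than the stable one, even though it has recovered its original size.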

Principal Component Analysis (PCA)

PCA is used to infer a population's admixture indirectly. It reduces the many dimensions of an individual's genetic data to a few dimensions, so that each sample becomes a set of coordinates on a plane whose axes are principal components; in population genetics, each component represents a genetic signature or "shift". Plotting multiple individuals from a population allows the admixture of that population as a whole to be inferred. A sample's coordinates depend on the principal components being used, and these components are derived from the data rather than designated by the user. The coordinates can then be graphed, allowing the user to visualize genetic clines and clusters and determine how different populations are related to each other.
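The reduction to a few coordinates can be sketched with a plain PCA on a small genotype matrix. This is a minimal sketch that assumes NumPy is available and uses made-up genotypes and a hypothetical function name; dedicated population-genetics programs typically add further steps, such as per-SNP normalization, that are left out here.

```python
import numpy as np

def genotype_pca(genotypes, n_components=2):
    """Project samples onto the top principal components.

    genotypes: (n_samples, n_snps) array of 0/1/2 allele counts.
    Returns (coordinates, explained_variance_ratio).
    """
    x = np.asarray(genotypes, dtype=float)
    x = x - x.mean(axis=0)  # center each SNP column
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    coords = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return coords, explained[:n_components]

# Hypothetical toy data: rows are samples, columns are SNPs.
toy = [
    [2, 2, 0, 0, 2, 0],  # population X
    [2, 1, 0, 0, 2, 0],  # population X
    [0, 0, 2, 2, 0, 2],  # population Y
    [0, 0, 2, 1, 0, 2],  # population Y
    [1, 1, 1, 1, 1, 1],  # intermediate sample
]
coords, explained = genotype_pca(toy)
```

Plotting the first column of `coords` against the second would place the two populations at opposite ends of the first component, with the intermediate sample between them, which is how clusters and clines become visible.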

qpAdm (Program)

qpAdm is a program included in the ADMIXTOOLS package. It uses f-statistics to estimate the admixture proportions of a given target population; its primary use is to identify plausible models of admixture for that target and eliminate implausible ones. The user designates a target population/sample (the population/sample whose admixture will be modeled), up to four source populations (which will be used to model the admixture of the target), and outgroups. The term "outgroups" is technically a misnomer, as qpAdm outgroups should include populations that are differentially related to the source and target populations. Outgroups in qpAdm are populations that did not contribute to the ancestry of the target; they serve as reference populations against which the source and target populations are compared, using a "branching" model.

SmartPCA (Program)

SmartPCA is one of the most common programs used to generate PCAs in population genetics. It was developed by Nick Patterson, Alkes L Price, and David Reich, and is part of the EIGENSOFT package. SmartPCA outputs coordinates for each sample along each principal component (eigenvector). An additional program in the EIGENSOFT package can be used to plot SmartPCA's output, though other programs can be used to plot the data as well.

SNP

Single Nucleotide Polymorphism: a "mutation" at a specific nucleotide ("letter" of the DNA sequence).
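As a simple illustration, the sketch below scans a set of aligned sequences and reports the positions where the "letters" differ between samples. The sequences and the function name are hypothetical, and the approach is deliberately simplified: real SNP calling works from sequencing reads and takes quality and allele frequency into account.

```python
def variable_sites(sequences):
    """Return (position, set_of_bases) for each column where aligned sequences differ."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    sites = []
    for pos in range(length):
        alleles = {s[pos] for s in sequences}
        if len(alleles) > 1:  # more than one base observed at this position
            sites.append((pos, alleles))
    return sites

# Four hypothetical aligned sequences, one per sample.
aligned = ["ACGTAC", "ACGTAC", "ACATAC", "ACATAT"]
snps = variable_sites(aligned)
```

Positions are zero-based, so the two variable sites in this toy alignment are the third and sixth letters.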