Genetic analyses can be used to understand human history, evolution, and migration. However, there are several types of genetic analysis, each with its own applications.
The first type of genetic analysis (in the context of population genetics) is haplogroup analysis. The term "haplogroup" refers to a group of people who share a common direct maternal or direct paternal ancestor. A haplogroup is defined by a collection of haplotypes, which are sets of genetic variants inherited together from a single parent. There are two types of haplogroups: mitochondrial (maternal) haplogroups and Y-chromosome (paternal) haplogroups.
Mitochondrial haplogroups trace one's direct maternal line. They are determined by a collection of haplotypes in the mitochondrial DNA containing specific SNPs (Single Nucleotide Polymorphisms), which are "mutations" of a specific nucleotide (a "letter" of the DNA sequence). Every individual possesses a mitochondrial haplogroup (as every individual possesses mitochondria), inherited from their mother.
Y-chromosome haplogroups trace one's direct paternal line. They are determined by a collection of haplotypes in the Y chromosome containing specific SNPs. Unlike mitochondrial haplogroups, Y-chromosome haplogroups are possessed only by those with a Y chromosome, who inherit it from their father.
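As a rough illustration of how defining SNPs place someone within a haplogroup, here is a minimal Python sketch. The tree, the haplogroup names and the SNP labels (such as snp_a) are all invented for the example and do not correspond to real haplogroups or markers; real assignments rely on curated trees with thousands of SNPs.

```python
# Toy haplogroup tree: each haplogroup has a parent and the SNPs that define it.
# All names are hypothetical placeholders, not real haplogroups or markers.
TOY_TREE = {
    "Root": {"parent": None, "snps": set()},
    "A": {"parent": "Root", "snps": {"snp_a"}},
    "A1": {"parent": "A", "snps": {"snp_a1"}},
    "A1b": {"parent": "A1", "snps": {"snp_a1b"}},
    "B": {"parent": "Root", "snps": {"snp_b"}},
}


def lineage(haplogroup, tree=TOY_TREE):
    """Return the path from the root of the tree down to the given haplogroup."""
    path = []
    while haplogroup is not None:
        path.append(haplogroup)
        haplogroup = tree[haplogroup]["parent"]
    return path[::-1]


def assign_haplogroup(derived_snps, tree=TOY_TREE):
    """Pick the deepest haplogroup whose entire ancestral line is supported."""
    best = "Root"
    for name in tree:
        path = lineage(name, tree)
        # Every haplogroup on the path must have all of its defining SNPs present.
        supported = all(tree[node]["snps"] <= derived_snps for node in path)
        if supported and len(path) > len(lineage(best, tree)):
            best = name
    return best


# SNPs observed in a hypothetical sample.
sample_snps = {"snp_a", "snp_a1"}
assigned = assign_haplogroup(sample_snps)
print(assigned, lineage(assigned))
```

Because each haplogroup sits beneath its parent in the tree, the assigned haplogroup comes with its whole ancestral line, which is what lets a newer haplogroup be traced back to older ones.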
Initially, haplogroup analyses were quite common because it is not necessary to sequence the entire genome to determine one's haplogroup, which made them more efficient. Additionally, unlike most of the genome, the DNA that defines haplogroups (mitochondrial DNA and most of the Y chromosome) is not subject to recombination. This means that if a new mutation arises, creating a new haplogroup, this new haplogroup can still be traced to the common ancestor of the progenitor haplogroup using the SNPs that define its ancestral line. However, haplogroups do not tell the entire story of one's DNA. They only depict one's direct maternal and paternal lines, neglecting the fact that our DNA is not derived from just two specific lines of ancestors. In reality, individuals have an enormous number of ancestors, though not all of them passed down their DNA to that individual (the genome is finite and is inherited in segments that recombination reshuffles each generation, so distant ancestors can drop out entirely).
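The point that not every ancestor can pass down DNA can be checked with simple arithmetic. The sketch below counts the "slots" in a family tree at each generation and compares them with the size of the genome. The genome size is an assumed round figure of about 3.1 billion base pairs per copy, and the function names are made up for this example.

```python
# Approximate size of one copy of the human genome in base pairs (assumed round figure).
HAPLOID_GENOME_BP = 3_100_000_000
# A person carries two copies, one from each parent.
DIPLOID_GENOME_BP = 2 * HAPLOID_GENOME_BP


def ancestor_slots(generations_back):
    """Number of positions in the family tree at a given generation (parents = 1)."""
    return 2 ** generations_back


def expected_share(generations_back):
    """Average fraction of the genome traced to one ancestor slot at that generation."""
    return 1 / ancestor_slots(generations_back)


def first_generation_exceeding_genome():
    """First generation with more ancestor slots than base pairs to hand down."""
    generation = 1
    while ancestor_slots(generation) <= DIPLOID_GENOME_BP:
        generation += 1
    return generation


for g in (1, 2, 5, 10, 20):
    print(g, ancestor_slots(g), expected_share(g))
print(first_generation_exceeding_genome())
```

These are slots rather than distinct people, since the same person can appear in a family tree more than once. The comparison is only an upper bound: DNA is handed down in large segments rather than single letters, so in practice ancestors start dropping out considerably earlier than this count suggests.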
Since an individual's DNA is not derived entirely from their direct maternal and paternal lines (the uniparental lines that haplogroups track), the second type of genetic analysis, admixture analysis, is significantly more effective at determining the ancestry of an individual. "Admixture" is a broad term that covers many different analyses, all with one commonality: they analyse one's ancestry based on every ancestor that passed down DNA. Admixture analysis estimates one's "mixture" of the ancestral populations that they derive ancestry from.
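To make the idea of a "mixture" concrete, here is a deliberately simplified Python sketch. It assumes only two source populations and models the target's allele frequency at each SNP as a weighted average of the two sources, then solves for the weight by least squares. The frequencies are made up, and this is not the method used by any particular admixture program, which rely on more sophisticated statistical models.

```python
def estimate_two_way_mixture(target, source_a, source_b):
    """Least-squares estimate of the share of source_a in the target.

    Models each target allele frequency as
        w * source_a + (1 - w) * source_b
    and solves for w in closed form, then clips it to the range [0, 1].
    """
    numerator = 0.0
    denominator = 0.0
    for t, a, b in zip(target, source_a, source_b):
        numerator += (t - b) * (a - b)
        denominator += (a - b) ** 2
    if denominator == 0:
        raise ValueError("sources are identical at every site")
    weight = numerator / denominator
    return min(1.0, max(0.0, weight))


# Made-up allele frequencies at five SNPs, purely for illustration.
source_a = [0.10, 0.80, 0.50, 0.30, 0.90]
source_b = [0.70, 0.20, 0.40, 0.60, 0.10]
# Build a target that is exactly 25% source_a and 75% source_b.
target = [0.25 * a + 0.75 * b for a, b in zip(source_a, source_b)]

share_a = estimate_two_way_mixture(target, source_a, source_b)
print(round(share_a, 3), round(1 - share_a, 3))
```

Because the target here is constructed as an exact blend, the estimate recovers the blend proportions; real data is noisy, and the sources themselves are only approximations of the true ancestral populations.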
Admixture can be determined using DNA that is sequenced in various ways. The first type of genome sequencing is whole genome sequencing, a term that covers several sequencing methods. Whole genome sequencing proper entails sequencing every letter of one's entire genome. This specific method is commonly used in academia. However, with ancient samples, whole genome sequencing does not guarantee that the entire genome will be covered: DNA is subject to contamination and degradation over time, so very old samples will only have a portion of their genome covered, that portion being all that remained of the sample's genome.
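The effect of degradation can be pictured as missing data. In the minimal sketch below, a sample is represented as a list of genotype calls at a set of sites, with None marking a site where no usable DNA survived. The samples and the function name are hypothetical.

```python
def covered_fraction(genotypes):
    """Share of sites with a genotype call; None marks a site with no data."""
    if not genotypes:
        raise ValueError("no sites supplied")
    called = sum(1 for g in genotypes if g is not None)
    return called / len(genotypes)


# Hypothetical calls at ten sites: 0, 1 or 2 copies of an allele, None = missing.
well_preserved = [0, 1, 2, 2, 0, 1, 1, 0, 2, 1]
degraded = [0, None, None, 2, None, None, 1, None, None, None]

print(covered_fraction(well_preserved))
print(covered_fraction(degraded))
```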
One type of whole genome sequencing that is commonly used is shotgun sequencing. Shotgun sequencing entails sequencing randomly sampled smaller fragments of the genome; at low coverage, these fragments cover only part of the genome rather than its entirety. This can still be effective because not every SNP in the genome is useful when determining admixture, so sequencing every SNP in a genome's entirety may not be necessary. Shotgun sequenced samples will typically be designated as ".SG", unless they are of a high enough coverage to enable accurate sequencing and determination of diploid genotypes. If that is the case, then the shotgun sample will be designated as ".DG".
Another type of sequencing that is frequently used in more common datasets is in-solution target capture. In datasets, these captures are typically "1240k captures", which means that samples sequenced with 1240k captures have data reported for a set of over 1.2 million sites in the genome. Alongside the capture data, datasets often include samples genotyped on the "Affymetrix Human Origins array" (sometimes abbreviated to "HO"), a SNP genotyping array rather than an in-solution capture, which has data reported for a set of under 600,000 sites in the genome.
Outside of whole genome sequencing, there is a method called Sanger diploid genotyping. Sanger sequencing is considered a "first generation" method that sequences a single fragment of DNA at a time. It is not cost-effective when sequencing a high volume of segments, and is not as sensitive as newer sequencing methods.
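When browsing a dataset, the ".SG" and ".DG" suffixes can be read off a sample's label with a few lines of Python. This is only a sketch: the sample labels are invented, and labels without either suffix are simply reported as such, since the text above does not say how other data types are marked.

```python
def data_type_from_label(label):
    """Describe a sample's data type based on the suffix of its label."""
    if label.endswith(".SG"):
        return "shotgun"
    if label.endswith(".DG"):
        return "shotgun, diploid genotypes"
    return "no shotgun suffix"


# Hypothetical sample labels, not taken from any real dataset.
for label in ("SampleA.SG", "SampleB.DG", "SampleC"):
    print(label, "->", data_type_from_label(label))
```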
Genomes that are sequenced using these methods can then be used to analyse the admixture of these sequences using various programs. I will be discussing many of these programs in the future, and will be releasing tutorials explaining how to use some of these programs!
Sources
Haplotypes: a cut-out-and-keep guide - Genomics Education Programme
Who We Are and How We Got Here by David Reich
Y-SNP analysis versus Y-haplogroup predictor in the Slovak population