Single Nucleotide Polymorphism, SNP
a germline substitution of a single nucleotide at a specific position, present in a substantial proportion of the population (generally >1%)
Single Nucleotide Variation, SNV
a single nucleotide substitution that is usually uncommon in the population
Insertion-Deletion, INDEL
an insertion or deletion of bases in the genome of an organism; a major source of structural variation in genomes
Simple Tandem Repeats, STR
a sequence of one or more nucleotides repeated several times in a row, with the copies directly adjacent to each other
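A minimal sketch of how adjacent repeats might be located in a sequence string, assuming plain uppercase A/C/G/T input. The function name `find_tandem_repeats` and the thresholds (unit length 1 to 6, at least 3 copies) are illustrative assumptions, not a standard definition.

```python
import re

def find_tandem_repeats(seq, min_unit=1, max_unit=6, min_copies=3):
    """Return (start, unit, copies) for each run of adjacent repeats in seq."""
    # Capture a candidate unit, then require it to repeat immediately after itself.
    # The lazy quantifier makes the regex prefer the shortest unit that works.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    results = []
    for m in pattern.finditer(seq):
        unit = m.group(1)
        copies = len(m.group(0)) // len(unit)
        results.append((m.start(), unit, copies))
    return results

# Three adjacent copies of GACA starting at position 2.
print(find_tandem_repeats("TTGACAGACAGACAGG"))  # [(2, 'GACA', 3)]
```

Dedicated tools handle imperfect repeats and very long sequences; this regular expression only finds exact copies.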
Variable Number Tandem Repeats, VNTR
a tandem repeat whose length (number of copies) varies between individuals, which makes it useful for DNA fingerprinting and forensics
Copy Number Variation, CNV
sections of the genome that are present in different numbers of copies between individuals, arising through deletion or duplication affecting a significant number of base pairs
Deletion
loss of nucleotides from the genome
Duplication
creation of new genetic material through copying an existing series of nucleotides or gene(s)
Inversion
flipping a section of the genome so that the order of its nucleotides is reversed
Chromosomal Abnormality
a variation in chromosome number or structure: a missing, extra, or irregular chromosome
Monosomy
aneuploidy resulting in a single chromosome where there should be a pair
Aneuploidy
the presence of an abnormal number of chromosomes in a cell
Trisomy
three chromosomes where there should be two
Haplotype
a group of alleles inherited together from a single parent
Linkage Disequilibrium
a measure of the non-random association between alleles at different loci in a given population; compares the frequency with which two alleles are detected together with the frequency expected from how often each allele is detected on its own (the product of the two allele frequencies). Linkage disequilibrium is present where the frequency of being detected together is higher or lower than expected if the two alleles were under independent segregation
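As a hedged illustration of this comparison, the sketch below computes two commonly used statistics, D (observed joint frequency minus the product of the allele frequencies) and r² (D² scaled by the allele frequencies), from phased haplotypes. The function name and the 0/1 allele coding are assumptions made for the example.

```python
def linkage_disequilibrium(haplotypes):
    """haplotypes: list of (allele_at_locus_1, allele_at_locus_2), alleles coded 0 or 1.

    Returns (D, r_squared) for the allele coded 1 at each locus.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n           # frequency of allele 1 at locus 1
    p_b = sum(b for _, b in haplotypes) / n           # frequency of allele 1 at locus 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n  # seen together
    d = p_ab - p_a * p_b                              # zero under independence
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else float("nan")
    return d, r_squared

# Alleles always travel together: complete disequilibrium.
print(linkage_disequilibrium([(1, 1)] * 4 + [(0, 0)] * 4))   # (0.25, 1.0)
# All four combinations equally common: no association.
print(linkage_disequilibrium([(1, 1), (1, 0), (0, 1), (0, 0)]))  # (0.0, 0.0)
```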
Haplotype Block
an area of an organism’s genome where genetic recombination rarely occurs, showing high levels of linkage disequilibrium; boundaries cannot be directly observed, so bioinformatic approaches are necessary to infer them
HapMap
a project that set out to develop a haplotype map of the human genome describing common human variation, with the goal of identifying genetic variants that affect health, disease, and drug responses
1000 Genomes Project
research project aiming to catalogue human genetic variation, initially by sequencing the genomes of at least 1,000 individuals from different ethnic groups (the final phase covered more than 2,500)
100,000 Genomes Project
research project that sequenced 100,000 genomes from patients in the UK, focusing on rare diseases and cancers to improve understanding of causes and treatments
UK Biobank
UK study analysing biological samples and healthcare data from about 500,000 participants; aims to discover improved ways to prevent, diagnose, and treat diseases
All of Us
research project in the US investigating personalised medical care through phenotypic and genotypic study
Dimensionality Reduction
a method to transform high-dimensional datasets into fewer dimensions (often 2 or 3 for plotting), allowing data sets with many variables to be simplified for pattern analysis
Principal Component Analysis
a technique of dimensionality reduction that projects data onto a small number of new axes (principal components, often the first 2 or 3), ordered so that each captures the greatest remaining variation in the data
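A minimal sketch of the idea using NumPy's singular value decomposition, assuming a small samples-by-variants genotype matrix coded 0/1/2. The function name `pca` and the toy data are illustrative; real analyses usually add steps such as variant filtering and scaling.

```python
import numpy as np

def pca(X, n_components=2):
    """Project rows of X onto the top principal components.

    Returns (scores, explained_variance_ratio) for the first n_components.
    """
    X_centered = X - X.mean(axis=0)               # centre each column (variant)
    # Rows of Vt are the principal axes, ordered by singular value (largest first).
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    scores = X_centered @ components.T            # coordinates of each sample
    explained = (S ** 2) / np.sum(S ** 2)         # share of total variance per axis
    return scores, explained[:n_components]

# Toy data: samples 0-1 and samples 2-3 differ at the first three variants.
genotypes = np.array([
    [0, 0, 0, 1],
    [0, 0, 0, 0],
    [2, 2, 2, 1],
    [2, 2, 2, 0],
], dtype=float)

scores, explained = pca(genotypes)
print(scores[:, 0])   # the first component separates the two groups by sign
print(explained)      # the first component carries most of the variance
```

The sign of each component is arbitrary, so only the relative positions of samples along an axis are meaningful.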
Human Pangenome Project
project aiming to create a reference human genome sequence incorporating the genetic diversity observed across all populations