The Vertebrate Genomes Project (VGP) aims to generate high-quality, complete reference genomes of all 66,000 vertebrate species. Launched in February 2017, it is an international collaboration with members from more than 50 institutions. [1] [2] [3] [4] [5]
In October 2021, VGP partnered with Colossal Biosciences to sequence and assemble elephant genomes for preservation purposes. [6]
In April 2022, VGP partnered with the Human Genome Project [7] and the African BioGenome Project for sequencing research. [8]
In July 2022, VGP and Colossal Biosciences announced that they had successfully sequenced the entire Asian elephant genome; this was the first time that a mammalian genome had been fully sequenced to this degree since the Human Genome Project was completed in the early 2000s. [9]
In November 2022, VGP successfully sequenced the Nile rat genome to facilitate research on type 2 diabetes and the health effects of circadian rhythm disruption. Researchers sequenced not only an individual rat but also both of its parents, allowing them to separate the original rat’s alleles by parental haplotype. The resulting sequence accounted for the vast majority of expected protein-coding genes. [10] [11]
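One common way to separate reads by parental haplotype when both parents have been sequenced is to look for short subsequences (k-mers) that occur in only one parent. The following is a minimal sketch of that idea and is not necessarily the method used for the Nile rat assembly. The function names, the value of k, and the toy sequences are all illustrative assumptions.

```python
def kmers(seq: str, k: int) -> set[str]:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def parent_specific_kmers(mother: str, father: str, k: int) -> tuple[set[str], set[str]]:
    """Return (k-mers seen only in the mother, k-mers seen only in the father)."""
    maternal, paternal = kmers(mother, k), kmers(father, k)
    return maternal - paternal, paternal - maternal


def bin_read(read: str, maternal_only: set[str], paternal_only: set[str], k: int) -> str:
    """Assign a read from the offspring to the parent whose private k-mers it shares most."""
    read_kmers = kmers(read, k)
    maternal_hits = len(read_kmers & maternal_only)
    paternal_hits = len(read_kmers & paternal_only)
    if maternal_hits > paternal_hits:
        return "maternal"
    if paternal_hits > maternal_hits:
        return "paternal"
    return "unassigned"  # no parent-specific evidence, or a tie


# Toy parental sequences that differ at a single position (hypothetical data).
MOTHER = "ACGTACGTTTGACCA"
FATHER = "ACGTACGTTAGACCA"
maternal_only, paternal_only = parent_specific_kmers(MOTHER, FATHER, k=5)
```

Reads that overlap a position where the parents differ can be assigned to one haplotype, while reads from regions the parents share remain unassigned in this sketch.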
The human genome is a complete set of nucleic acid sequences for humans, encoded as DNA within the 23 chromosome pairs in cell nuclei and in a small DNA molecule found within individual mitochondria. These are usually treated separately as the nuclear genome and the mitochondrial genome. Human genomes include both protein-coding DNA sequences and various types of DNA that do not encode proteins. The latter is a diverse category that includes DNA coding for non-translated RNA, such as that for ribosomal RNA, transfer RNA, ribozymes, small nuclear RNAs, and several types of regulatory RNAs. It also includes promoters and their associated gene-regulatory elements, DNA playing structural and replicatory roles, such as scaffolding regions, telomeres, centromeres, and origins of replication, plus large numbers of transposable elements, inserted viral DNA, non-functional pseudogenes and simple, highly repetitive sequences. Introns make up a large percentage of non-coding DNA. Some of this non-coding DNA is non-functional junk DNA, such as pseudogenes, but there is no firm consensus on the total amount of junk DNA.
Genomics is an interdisciplinary field of biology focusing on the structure, function, evolution, mapping, and editing of genomes. A genome is an organism's complete set of DNA, including all of its genes as well as its hierarchical, three-dimensional structural configuration. In contrast to genetics, which refers to the study of individual genes and their roles in inheritance, genomics aims at the collective characterization and quantification of all of an organism's genes, their interrelations and influence on the organism. Genes may direct the production of proteins with the assistance of enzymes and messenger molecules. In turn, proteins make up body structures such as organs and tissues as well as control chemical reactions and carry signals between cells. Genomics also involves the sequencing and analysis of genomes through the use of high-throughput DNA sequencing and bioinformatics to assemble and analyze the function and structure of entire genomes. Advances in genomics have triggered a revolution in discovery-based research and systems biology to facilitate understanding of even the most complex biological systems such as the brain.
A DNA sequencer is a scientific instrument used to automate the DNA sequencing process. Given a sample of DNA, a DNA sequencer is used to determine the order of the four bases: G (guanine), C (cytosine), A (adenine) and T (thymine). This is then reported as a text string, called a read. Some DNA sequencers can also be considered optical instruments, as they analyze light signals originating from fluorochromes attached to nucleotides.
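Because a read is reported as a text string over the four bases, it can be handled with ordinary string operations. The following is a minimal sketch that counts the bases in a read and computes its GC fraction. The function names are illustrative, and the sketch assumes a read contains only the letters G, C, A and T (real sequencer output may also contain ambiguity codes such as N, which this example rejects).

```python
from collections import Counter

VALID_BASES = set("ACGT")


def base_composition(read: str) -> dict[str, int]:
    """Count each of the four bases in a read, rejecting any other character."""
    read = read.upper()
    unexpected = set(read) - VALID_BASES
    if unexpected:
        raise ValueError(f"unexpected characters in read: {sorted(unexpected)}")
    counts = Counter(read)
    return {base: counts[base] for base in "ACGT"}


def gc_fraction(read: str) -> float:
    """Return the fraction of bases in the read that are G or C."""
    counts = base_composition(read)
    total = sum(counts.values())
    return (counts["G"] + counts["C"]) / total if total else 0.0
```

For the read "GATTACA", for example, base_composition returns three A, one C, one G and two T.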
George McDonald Church is an American geneticist, molecular engineer, chemist, serial entrepreneur, and pioneer in personal genomics and synthetic biology. He is the Robert Winthrop Professor of Genetics at Harvard Medical School, Professor of Health Sciences and Technology at Harvard University and Massachusetts Institute of Technology, and a founding member of the Wyss Institute for Biologically Inspired Engineering at Harvard. Through his Harvard lab, Church has co-founded around 50 biotech companies, making the lab a hotbed of biotech startup activity in Boston. In 2018, the Church lab set a record by spinning off 16 biotech companies in one year. The lab's research projects span diverse areas of modern biology, including developmental biology, neurobiology, information processing, medical genetics, genomics, gene therapy, diagnostics, chemistry and bioengineering, space biology and space genetics, and ecosystems. Research and technology developments at the Church lab have influenced or contributed directly to nearly all next-generation sequencing (NGS) methods and companies. In 2017, Time magazine included him in the Time 100, its list of the 100 most influential people in the world. In 2022, he was featured among the most influential people in biopharma by Fierce Pharma and was listed among the eight most famous geneticists of all time. As of January 2023, Church serves as a member of the Bulletin of the Atomic Scientists' Board of Sponsors.
The Human Genome Project (HGP) was an international scientific research project with the goal of determining the base pairs that make up human DNA, and of identifying, mapping and sequencing all of the genes of the human genome from both a physical and a functional standpoint. It remains the world's largest collaborative biological project. Planning for the project started after it was adopted in 1984 by the US government, and it officially launched in 1990. It was declared complete on April 14, 2003, at which point it included about 92% of the genome. The "complete genome" level was achieved in May 2021, with only 0.3% of bases still covered by potential issues. The final gapless assembly was finished in January 2022.
The Baylor College of Medicine Human Genome Sequencing Center (BCM-HGSC) was established by Richard A. Gibbs in 1996 when Baylor College of Medicine was chosen as one of six worldwide sites to complete the final phase of the international Human Genome Project. Gibbs is the current director of the BCM-HGSC.
Single-molecule real-time (SMRT) sequencing is a parallelized single-molecule DNA sequencing method. It utilizes a zero-mode waveguide (ZMW). A single DNA polymerase enzyme is affixed at the bottom of a ZMW with a single molecule of DNA as a template. The ZMW is a structure that creates an illuminated observation volume that is small enough to observe only a single nucleotide of DNA being incorporated by DNA polymerase. Each of the four DNA bases is attached to one of four different fluorescent dyes. When a nucleotide is incorporated by the DNA polymerase, the fluorescent tag is cleaved off and diffuses out of the observation area of the ZMW, where its fluorescence is no longer observable. A detector records the fluorescent signal of the nucleotide incorporation, and the base call is made according to the corresponding fluorescence of the dye.
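The final step, calling a base from the dye that fluoresced, can be illustrated with a heavily simplified sketch. It assumes that each incorporation event has already been reduced to one intensity value per dye channel, and it simply picks the brightest channel. The channel names, the dye-to-base assignment and the function name are hypothetical and do not describe any real instrument's signal processing.

```python
# Hypothetical assignment of the four dye channels to the four bases.
DYE_TO_BASE = {"dye_1": "A", "dye_2": "C", "dye_3": "G", "dye_4": "T"}


def call_bases(pulses: list[dict[str, float]]) -> str:
    """Turn a series of incorporation events into a base string.

    Each pulse maps a dye channel name to its measured intensity; the
    base attached to the brightest dye is called for that event.
    """
    calls = []
    for pulse in pulses:
        brightest_dye = max(pulse, key=pulse.get)
        calls.append(DYE_TO_BASE[brightest_dye])
    return "".join(calls)


# Three made-up incorporation events, one dictionary per event.
example_pulses = [
    {"dye_1": 0.9, "dye_2": 0.1, "dye_3": 0.0, "dye_4": 0.2},
    {"dye_1": 0.1, "dye_2": 0.2, "dye_3": 0.8, "dye_4": 0.1},
    {"dye_1": 0.0, "dye_2": 0.1, "dye_3": 0.1, "dye_4": 0.7},
]
```

A real base caller must also separate pulses from background noise and decide when one incorporation event ends and the next begins, which this sketch leaves out.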
A knockout rat is a genetically engineered rat in which a single gene has been turned off through a targeted mutation. Knockout rats are used in academic and pharmaceutical research: they can mimic human diseases and are important tools for studying gene function and for drug discovery and development. The production of knockout rats was not economically or technically feasible until 2008.
Whole genome sequencing (WGS), also known as full genome sequencing, complete genome sequencing, or entire genome sequencing, is the process of determining the entirety, or nearly the entirety, of the DNA sequence of an organism's genome at a single time. This entails sequencing all of an organism's chromosomal DNA as well as DNA contained in the mitochondria and, for plants, in the chloroplast.
A reference genome is a digital nucleic acid sequence database, assembled by scientists as a representative example of the set of genes in one idealized individual organism of a species. As they are assembled from the sequencing of DNA from a number of individual donors, reference genomes do not accurately represent the set of genes of any single individual organism. Instead, a reference provides a haploid mosaic of different DNA sequences from each donor. For example, one of the most recent human reference genomes, assembly GRCh38/hg38, is derived from >60 genomic clone libraries. There are reference genomes for multiple species of viruses, bacteria, fungi, plants, and animals. Reference genomes are typically used as a guide on which new genomes are built, enabling them to be assembled much more quickly and cheaply than the initial Human Genome Project. Reference genomes can be accessed online at several locations, using dedicated browsers such as Ensembl or UCSC Genome Browser.
The existence of frozen soft-tissue remains and DNA of woolly mammoths has led to the possibility that the species could be regenerated by scientific means. In 2003 the Pyrenean ibex was briefly revived, giving credence to the idea that the mammoth could be successfully revived. To date, several methods have been proposed to achieve this goal, including cloning, artificial insemination, and genome editing. Whether it is ethical to create a live mammoth is not universally agreed on.
Disease gene identification is a process by which scientists identify the mutant genotypes responsible for an inherited genetic disorder. Mutations in these genes can include single nucleotide substitutions, single nucleotide additions/deletions, deletion of the entire gene, and other genetic abnormalities.
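The mutation types listed above can be told apart by comparing a reference allele with the allele observed in a patient. The following is a minimal sketch, assuming insertions and deletions are written so that the shorter allele is a prefix of the longer one (a convention loosely modeled on how indels are commonly recorded in variant files). The function name and the category labels are illustrative.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Classify a variant from its reference (ref) and observed (alt) alleles."""
    if ref == alt:
        return "no change"
    if len(ref) == 1 and len(alt) == 1:
        return "single nucleotide substitution"
    if alt.startswith(ref):
        # alt extends ref, so bases were added after the shared prefix.
        added = len(alt) - len(ref)
        return "single nucleotide insertion" if added == 1 else "larger insertion"
    if ref.startswith(alt):
        # ref extends alt, so bases were removed after the shared prefix.
        removed = len(ref) - len(alt)
        return "single nucleotide deletion" if removed == 1 else "larger deletion"
    return "complex"
```

Under this convention, a reference allele "AT" observed as "A" is a single nucleotide deletion, and an empty observed allele stands for deletion of the whole reference sequence.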
The $1,000 genome refers to an era of predictive and personalized medicine during which the cost of fully sequencing an individual's genome (WGS) is roughly one thousand USD. It is also the title of a book by British science writer and founding editor of Nature Genetics, Kevin Davies. By late 2015, the cost to generate a high-quality "draft" whole human genome sequence was just below $1,500.
In DNA sequencing, a read is an inferred sequence of base pairs corresponding to all or part of a single DNA fragment. A typical sequencing experiment involves fragmentation of the genome into millions of molecules, which are size-selected and ligated to adapters. The set of fragments is referred to as a sequencing library, which is sequenced to produce a set of reads.
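The workflow described above can be sketched as a toy simulation: cut random fragments from a genome string, keep those within a size range, and read a fixed number of bases from each. This is an illustration under simplifying assumptions only. Adapter ligation, sequencing errors and reads from the opposite strand are left out, and the function names and parameters are hypothetical.

```python
import random


def make_library(genome: str, n_fragments: int, min_len: int, max_len: int,
                 seed: int = 0) -> list[str]:
    """Randomly fragment a genome and keep fragments of min_len..max_len bases."""
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    fragments = []
    for _ in range(n_fragments):
        start = rng.randrange(len(genome))
        length = rng.randint(1, 2 * max_len)
        fragments.append(genome[start:start + length])
    # Size selection: discard fragments that are too short or too long.
    return [f for f in fragments if min_len <= len(f) <= max_len]


def sequence_library(library: list[str], read_len: int) -> list[str]:
    """Produce one read per fragment by reading its first read_len bases."""
    return [fragment[:read_len] for fragment in library]
```

With a read length no greater than the minimum fragment size, every read has the same length and corresponds to part of a single fragment, matching the definition of a read given above.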
Bat1K is a project to sequence the genomes of all living bat species to the level of chromosomes and then make the data publicly available. The project began in 2017.
Karen Elizabeth Hayden Miga is an American geneticist who co-leads the Telomere-to-Telomere (T2T) consortium, which released a fully complete assembly of the human genome in March 2022. She is an assistant professor of biomolecular engineering at the University of California, Santa Cruz and Associate Director of Human Pangenomics at the UC Santa Cruz Genomics Institute. She was named as "One to Watch" in the 2020 Nature's 10 and one of Time 100’s most influential people of 2022.
Colossal Biosciences is a biotechnology and genetic engineering company working to de-extinct the woolly mammoth, the Tasmanian tiger, and the dodo. In 2023, it stated that it wants to have woolly mammoth hybrid calves by 2028, and wants to reintroduce them to the Arctic tundra habitat. Likewise, it plans to launch a thylacine research project to release Tasmanian tiger joeys back to their original Tasmanian and broader Australian habitat after a period of observation in captivity.
Adam M. Phillippy is an American bioinformatician serving as senior investigator and head of the Genome Informatics Section at the National Human Genome Research Institute at the National Institutes of Health. He is known for his work that resulted in the first complete sequence of a human genome.
Circular consensus sequencing (CCS) is a DNA sequencing method that is used in conjunction with single-molecule real-time sequencing to yield highly accurate long-read sequencing datasets with read lengths averaging 15–25 kb and median accuracy greater than 99.9%. These long reads, which are created by forming a consensus sequence from multiple passes over a single DNA molecule, can be used to improve results for complex applications such as single nucleotide and structural variant detection, genome assembly, assembly of difficult polyploid or highly repetitive genomes, and assembly of metagenomes.
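The idea of building a consensus from multiple passes can be illustrated with a per-position majority vote. The following is a minimal sketch, assuming the passes have already been aligned and have equal length. Real consensus callers must also handle insertions and deletions and typically use statistical models rather than a simple vote. The function name and the example passes are illustrative.

```python
from collections import Counter


def consensus(passes: list[str]) -> str:
    """Return the per-position majority base across pre-aligned passes."""
    if not passes:
        raise ValueError("at least one pass is required")
    length = len(passes[0])
    if any(len(p) != length for p in passes):
        raise ValueError("this sketch assumes pre-aligned passes of equal length")
    # zip(*passes) yields one column of bases per position; on a tie,
    # most_common keeps the base that appeared first in that column.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*passes))


# Five made-up passes over the same molecule; two contain one error each.
example_passes = ["ACGTAC", "ACCTAC", "ACGTAC", "ACGTTC", "ACGTAC"]
```

Because the errors in individual passes fall at different positions, each is outvoted by the other passes, which is why accuracy improves as the number of passes grows.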
The African BioGenome Project, or AfricaBP, is an international effort to sequence the genomes of all animals, plants, fungi, and protists that are native to Africa, at an estimated cost of US$1 billion. The project prioritizes doing its sequencing work and data storage within the African continent.