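In genetics, the term synteny refers to two related concepts: in the classical sense, the location of genetic loci on the same chromosome; in the modern sense, the conservation of blocks of gene order between the chromosomes of different species.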
The Encyclopædia Britannica gives the following description of synteny, using the modern definition: [2]
Genomic sequencing and mapping have enabled comparison of the general structures of genomes of many different species. The general finding is that organisms of relatively recent divergence show similar blocks of genes in the same relative positions in the genome. This situation is called synteny, translated roughly as possessing common chromosome sequences. For example, many of the genes of humans are syntenic with those of other mammals—not only apes but also cows, mice, and so on. Study of synteny can show how the genome is cut and pasted in the course of evolution.
Synteny is a neologism meaning "on the same ribbon"; Greek: σύν, syn "along with" + ταινία, tainiā "band". This can be interpreted classically as "on the same chromosome", or in the modern sense of having the same order of genes on two (homologous) strings of DNA (or chromosomes).
The classical concept is related to genetic linkage: Linkage between two loci is established by the observation of lower-than-expected recombination frequencies between them. In contrast, any loci on the same chromosome are by definition syntenic, even if their recombination frequency cannot be distinguished from unlinked loci by practical experiments. Thus, in theory, all linked loci are syntenic, but not all syntenic loci are necessarily linked. Similarly, in genomics, the genetic loci on a chromosome are syntenic regardless of whether this relationship can be established by experimental methods such as DNA sequencing/assembly, genome walking, physical localization or hap-mapping.
Students of (classical) genetics employ the term synteny to describe the situation in which two genetic loci have been assigned to the same chromosome but still may be separated by a large enough distance in map units that genetic linkage has not been demonstrated.
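The distinction between syntenic and linked loci can be illustrated with a small sketch. It assumes that linkage is tested with a one-sided exact binomial test against free recombination (a recombination fraction of 0.5); the function names, the significance level `alpha` and the example counts are hypothetical and are not taken from any cited source.

```python
from math import comb


def linkage_p_value(recombinants: int, total: int) -> float:
    """One-sided exact binomial P(X <= recombinants) under free recombination (r = 0.5)."""
    return sum(comb(total, k) for k in range(recombinants + 1)) / 2 ** total


def classify_pair(chrom_a: str, chrom_b: str, recombinants: int, total: int,
                  alpha: float = 0.05) -> str:
    """Classify two loci from their chromosome assignments and offspring counts."""
    # Loci on different chromosomes are not syntenic in the classical sense.
    if chrom_a != chrom_b:
        return "not syntenic"
    # Same chromosome: syntenic by definition. Linkage additionally requires
    # significantly fewer recombinants than free recombination would give.
    if linkage_p_value(recombinants, total) < alpha:
        return "syntenic and linked"
    return "syntenic, linkage not demonstrated"


# Two loci on the same chromosome, 10 recombinants among 100 offspring.
print(classify_pair("chr1", "chr1", 10, 100))
# Two loci on the same chromosome, but far enough apart to recombine freely.
print(classify_pair("chr1", "chr1", 48, 100))
```

In this sketch the chromosome assignment alone decides synteny, while the recombination data decide only whether linkage has been demonstrated.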
Shared synteny (also known as conserved synteny) describes preserved co-localization of genes on chromosomes of different species. During evolution, rearrangements to the genome such as chromosome translocations may separate two loci, resulting in the loss of synteny between them. Conversely, translocations can also join two previously separate pieces of chromosomes together, resulting in a gain of synteny between loci. Stronger-than-expected shared synteny can reflect selection for functional relationships between syntenic genes, such as combinations of alleles that are advantageous when inherited together, or shared regulatory mechanisms. [3]
In light of the more recent shift in the meaning of synteny, this conservation of gene content and linkage without preservation of order has also been termed mesosynteny. [4]
Since about 2000, the term has more commonly been used to describe preservation of the precise order of genes on a chromosome passed down from a common ancestor, [5] [6] [7] [8] although more traditional ("old school") geneticists reject what they perceive as a misappropriation of the term, [9] preferring collinearity instead. [10]
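The difference between the two senses can be made concrete with a minimal sketch. It assumes each species is represented as a hypothetical mapping from an ortholog name to a (chromosome, rank) pair, and it treats a fully reversed order as collinear, since a block may simply be read from the opposite end; both choices are illustrative assumptions.

```python
from typing import Dict, List, Tuple

Location = Tuple[str, int]  # (chromosome name, gene rank along that chromosome)


def shares_synteny(genes: List[str], species_a: Dict[str, Location],
                   species_b: Dict[str, Location]) -> bool:
    """True if the genes lie on a single chromosome in each of the two species."""
    chroms_a = {species_a[g][0] for g in genes}
    chroms_b = {species_b[g][0] for g in genes}
    return len(chroms_a) == 1 and len(chroms_b) == 1


def is_collinear(genes: List[str], species_a: Dict[str, Location],
                 species_b: Dict[str, Location]) -> bool:
    """True if the genes share synteny and also keep the same (or reversed) order."""
    if not shares_synteny(genes, species_a, species_b):
        return False
    order_a = sorted(genes, key=lambda g: species_a[g][1])
    order_b = sorted(genes, key=lambda g: species_b[g][1])
    return order_a == order_b or order_a == order_b[::-1]


genes = ["g1", "g2", "g3"]
sp_a = {"g1": ("chr1", 1), "g2": ("chr1", 2), "g3": ("chr1", 3)}
sp_b = {"g1": ("chr4", 7), "g2": ("chr4", 9), "g3": ("chr4", 8)}  # same chromosome, order shuffled

print(shares_synteny(genes, sp_a, sp_b))  # co-localization is preserved
print(is_collinear(genes, sp_a, sp_b))    # gene order is not
```

Here the three genes remain together on one chromosome in both species, which satisfies shared synteny in the classical sense, but their order differs, so they are not collinear.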
The analysis of synteny in the gene order sense has several applications in genomics. Shared synteny is one of the most reliable criteria for establishing the orthology of genomic regions in different species. Additionally, exceptional conservation of synteny can reflect important functional relationships between genes. For example, the order of genes in the "Hox cluster", which are key determinants of the animal body plan and which interact with each other in critical ways, is essentially preserved throughout the animal kingdom. [11]
Synteny is widely used in studying complex genomes, because comparative genomics allows the presence, and possibly the function, of genes in a more complex organism to be inferred from those in a simpler model organism. For example, wheat has a very large, complex genome which is difficult to study. In 1994 research from the John Innes Centre in England and the National Institute of Agrobiological Research in Japan demonstrated that the much smaller rice genome had a similar structure and gene order to that of wheat. [12] Further study found that many cereals are syntenic [13] and thus plants such as rice or the grass Brachypodium could be used as a model to find genes or genetic markers of interest which could be used in wheat breeding and research. In this context, synteny was also essential in identifying a highly important region in wheat, the Ph1 locus involved in genome stability and fertility, which was located using information from syntenic regions in rice and Brachypodium. [14]
Synteny is also widely used in microbial genomics. In Hyphomicrobiales and Enterobacteriales, syntenic genes encode a large number of essential cell functions and represent a high level of functional relationships. [15]
Patterns of shared synteny or synteny breaks can also be used as characters to infer the phylogenetic relationships among several species, and even to infer the genome organization of extinct ancestral species. A qualitative distinction is sometimes drawn between macrosynteny, preservation of synteny in large portions of a chromosome, and microsynteny, preservation of synteny for only a few genes at a time.
Shared synteny between different species can be inferred from their genomic sequences. This is typically done using a version of the MCScan algorithm, which finds syntenic blocks between species by comparing their homologous genes and looking for common patterns of collinearity on a chromosomal or contig scale. Homologies are usually determined on the basis of high bit score BLAST hits that occur between multiple genomes. From here, dynamic programming is used to select the best scoring path of shared homologous genes between species, taking into account potential gene loss and gain which may have occurred in the species' evolutionary histories. [16]
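The dynamic-programming step can be sketched roughly as follows. This is a simplified illustration of collinear chaining, not the MCScan implementation itself: the anchor representation, the `max_gap` and `gap_penalty` parameters and the scoring rule are assumptions chosen for clarity, and only forward-oriented blocks are handled (an inverted block would need a second pass with one genome's order reversed).

```python
from typing import List, Tuple

# (gene index in genome A, gene index in genome B, homology score such as a BLAST bit score)
Anchor = Tuple[int, int, float]


def best_collinear_chain(anchors: List[Anchor], max_gap: int = 5,
                         gap_penalty: float = 1.0) -> Tuple[float, List[Anchor]]:
    """Return the highest-scoring chain of anchors that increases in both genomes."""
    if not anchors:
        return 0.0, []
    ordered = sorted(anchors)
    n = len(ordered)
    best = [a[2] for a in ordered]  # best chain score ending at each anchor
    prev = [-1] * n                 # back-pointers for traceback
    for i in range(n):
        ai, bi, si = ordered[i]
        for j in range(i):
            aj, bj, _ = ordered[j]
            da, db = ai - aj, bi - bj
            if da <= 0 or db <= 0:
                continue  # not collinear with anchor i
            if da > max_gap + 1 or db > max_gap + 1:
                continue  # too many intervening genes in one genome
            skipped = (da - 1) + (db - 1)  # intervening genes, e.g. from gene loss or gain
            candidate = best[j] + si - gap_penalty * skipped
            if candidate > best[i]:
                best[i] = candidate
                prev[i] = j
    end = max(range(n), key=lambda k: best[k])
    chain = []
    k = end
    while k != -1:
        chain.append(ordered[k])
        k = prev[k]
    chain.reverse()
    return best[end], chain


anchors = [(0, 0, 10.0), (1, 1, 10.0), (2, 5, 10.0),
           (3, 2, 10.0), (4, 3, 10.0), (7, 40, 10.0)]
score, chain = best_collinear_chain(anchors)
print(score, chain)
```

In this toy input the anchor (2, 5) is left out of the chain because following it would break collinearity with the later anchors, and (7, 40) is excluded because it lies too far away in genome B. The gap penalty is what allows a chain to tolerate a small number of lost or inserted genes.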
A microsatellite is a tract of repetitive DNA in which certain DNA motifs are repeated, typically 5–50 times. Microsatellites occur at thousands of locations within an organism's genome. They have a higher mutation rate than other areas of DNA leading to high genetic diversity. Microsatellites are often referred to as short tandem repeats (STRs) by forensic geneticists and in genetic genealogy, or as simple sequence repeats (SSRs) by plant geneticists.
Selfish genetic elements are genetic segments that can enhance their own transmission at the expense of other genes in the genome, even if this has no positive or a net negative effect on organismal fitness. Genomes have traditionally been viewed as cohesive units, with genes acting together to improve the fitness of the organism. However, when genes have some control over their own transmission, the rules can change, and so just like all social groups, genomes are vulnerable to selfish behaviour by their parts.
Polyploidy is a condition in which the cells of an organism have more than two complete sets of (homologous) chromosomes. Most species whose cells have nuclei (eukaryotes) are diploid, meaning they have two complete sets of chromosomes, one from each of two parents; each set contains the same number of chromosomes, and the chromosomes are joined in pairs of homologous chromosomes. However, some organisms are polyploid. Polyploidy is especially common in plants. Most eukaryotes have diploid somatic cells, but produce haploid gametes by meiosis. A monoploid has only one set of chromosomes, and the term is usually only applied to cells or organisms that are normally diploid. Males of bees and other Hymenoptera, for example, are monoploid. Unlike animals, plants and multicellular algae have life cycles with two alternating multicellular generations. The gametophyte generation is haploid, and produces gametes by mitosis; the sporophyte generation is diploid and produces spores by meiosis.
Chromosomal crossover, or crossing over, is the exchange of genetic material during sexual reproduction between two homologous chromosomes' non-sister chromatids that results in recombinant chromosomes. It is one of the final phases of genetic recombination, which occurs in the pachytene stage of prophase I of meiosis during a process called synapsis. Synapsis begins before the synaptonemal complex develops and is not completed until near the end of prophase I. Crossover usually occurs when matching regions on matching chromosomes break and then reconnect to the other chromosome.
Genetic recombination is the exchange of genetic material between different organisms which leads to production of offspring with combinations of traits that differ from those found in either parent. In eukaryotes, genetic recombination during meiosis can lead to a novel set of genetic information that can be further passed on from parents to offspring. Most recombination occurs naturally and can be classified into two types: (1) interchromosomal recombination, occurring through independent assortment of alleles whose loci are on different but homologous chromosomes; and (2) intrachromosomal recombination, occurring through crossing over.
Molecular evolution describes how inherited DNA and/or RNA change over evolutionary time, and the consequences of this for proteins and other components of cells and organisms. Molecular evolution is the basis of phylogenetic approaches to describing the tree of life. Molecular evolution overlaps with population genetics, especially on shorter timescales. Topics in molecular evolution include the origins of new genes, the genetic nature of complex traits, the genetic basis of adaptation and speciation, the evolution of development, and patterns and processes underlying genomic changes during evolution.
Comparative genomics is a branch of biological research that examines genome sequences across a spectrum of species, spanning from humans and mice to a diverse array of organisms from bacteria to chimpanzees. This large-scale holistic approach compares two or more genomes to discover the similarities and differences between the genomes and to study the biology of the individual genomes. Comparison of whole genome sequences provides a highly detailed view of how organisms are related to each other at the gene level. By comparing whole genome sequences, researchers gain insights into genetic relationships between organisms and study evolutionary changes. The major principle of comparative genomics is that common features of two organisms will often be encoded within the DNA that is evolutionarily conserved between them. Therefore, comparative genomics provides a powerful tool for studying evolutionary changes among organisms, helping to identify genes that are conserved or common among species, as well as genes that give each organism its unique characteristics. Moreover, these studies can be performed at different levels of the genomes to obtain multiple perspectives about the organisms.
An inversion is a chromosome rearrangement in which a segment of a chromosome becomes inverted within its original position. An inversion occurs when a chromosome undergoes two breaks within the chromosomal arm, and the segment between the two breaks inserts itself in the opposite direction in the same chromosome arm. The breakpoints of inversions often happen in regions of repetitive nucleotides, and the regions may be reused in other inversions. Chromosomal segments in inversions can be as small as 1 kilobase or as large as 100 megabases. The number of genes captured by an inversion can range from a handful of genes to hundreds of genes. Inversions can happen either through ectopic recombination between repetitive sequences, or through chromosomal breakage followed by non-homologous end joining.
Sequence homology is the biological homology between DNA, RNA, or protein sequences, defined in terms of shared ancestry in the evolutionary history of life. Two segments of DNA can have shared ancestry because of three phenomena: either a speciation event (orthologs), or a duplication event (paralogs), or else a horizontal gene transfer event (xenologs).
Gene mapping or genome mapping describes the methods used to identify the location of a gene on a chromosome and the distances between genes. Gene mapping can also describe the distances between different sites within a gene.
Paleopolyploidy is the result of genome duplications which occurred at least several million years ago (MYA). Such an event could either double the genome of a single species (autopolyploidy) or combine those of two species (allopolyploidy). Because of functional redundancy, genes are rapidly silenced or lost from the duplicated genomes. Most paleopolyploids, through evolutionary time, have lost their polyploid status through a process called diploidization, and are currently considered diploids, e.g., baker's yeast, Arabidopsis thaliana, and perhaps humans.
Copy number variation (CNV) is a phenomenon in which sections of the genome are repeated and the number of repeats in the genome varies between individuals. Copy number variation is a type of structural variation: specifically, it is a type of duplication or deletion event that affects a considerable number of base pairs. Approximately two-thirds of the entire human genome may be composed of repeats and 4.8–9.5% of the human genome can be classified as copy number variations. In mammals, copy number variations play an important role in generating necessary variation in the population as well as disease phenotype.
In genetics, HAPPY Mapping, first proposed by Paul H. Dear and Peter R. Cook in 1989, is a method used to study the linkage between two or more DNA sequences. According to the Single Molecule Genomics Group, it is "Mapping based on the analysis of approximately HAPloid DNA samples using the PolYmerase chain reaction". In genomics, HAPPY mapping can be applied to assess the synteny and orientation of various DNA sequences across a particular genome, that is, the generation of a "genomic" map.
Population genomics is the large-scale comparison of DNA sequences of populations. Population genomics is a neologism that is associated with population genetics. Population genomics studies genome-wide effects to improve our understanding of microevolution so that we may learn the phylogenetic history and demography of a population.
Gene redundancy is the existence of multiple genes in the genome of an organism that perform the same function. Gene redundancy can result from gene duplication. Such duplication events are responsible for many sets of paralogous genes. When an individual gene in such a set is disrupted by mutation or targeted knockout, there can be little effect on phenotype as a result of gene redundancy, whereas the effect is large for the knockout of a gene with only one copy. Gene knockout is a method used in some studies that aim to characterize the maintenance and fitness effects of functional overlap.
Stephen J. O'Brien is an American geneticist. He is known for his research contributions in comparative genomics, virology, genetic epidemiology, mammalian systematics and species conservation. He is a member of the National Academy of Sciences and a Foreign Member of the Russian Academy of Sciences, the author or co-author of over 850 scientific articles, and the editor of fourteen volumes.
The 2000s witnessed an explosion of genome sequencing and mapping in evolutionarily diverse species. While full genome sequencing of mammals is rapidly progressing, it is not yet possible to assemble and align orthologous whole chromosomal regions from more than a few species. Comparative maps built for domestic, laboratory and agricultural (cattle) animals have traditionally been used to understand the underlying basis of disease-related and healthy phenotypes.
Genome sequencing of endangered species is the application of Next Generation Sequencing (NGS) technologies in the field of conservation biology, with the aim of generating life history, demographic and phylogenetic data of relevance to the management of endangered wildlife.
Eukaryote hybrid genomes result from interspecific hybridization, where closely related species mate and produce offspring with admixed genomes. The advent of large-scale genomic sequencing has shown that hybridization is common, and that it may represent an important source of novel variation. Although most interspecific hybrids are sterile or less fit than their parents, some may survive and reproduce, enabling the transfer of adaptive variants across the species boundary, and even result in the formation of novel evolutionary lineages. There are two main variants of hybrid species genomes: allopolyploid, which have one full chromosome set from each parent species, and homoploid, which are a mosaic of the parent species genomes with no increase in chromosome number.
Conservation genomics is the use of genomic study to aid in the preservation and viability of diverse organisms and populations. Genomics can be used to classify or assess diversity, hybridization, and history, as well as to identify different and similar species. Genomics can evaluate how these measures relate to effective population size as well as other ideas under the umbrella of conservation genetics, and overall biological conservation. Genomic analysis can evaluate the extent to which alleles at certain loci interact with one another, revealing nuanced ways in which the genome may be intertwined.