Olduvai domain | |
---|---|
Identifiers | |
Symbol | Olduvai |
Pfam | PF06758 |
InterPro | IPR010630 |
SMART | SM01148 |
PROSITE | PS51316 |
The Olduvai domain, known until 2018 as DUF1220 (domain of unknown function 1220) and the NBPF repeat, [1] is a protein domain that shows a striking human lineage-specific (HLS) increase in copy number and appears to be involved in human brain evolution. [2] The protein domain has also been linked to several neurogenetic disorders such as schizophrenia (in reduced copies) and increased severity of autism (in increased copies). [3] In 2018, it was named by its discoverers after Olduvai Gorge in Tanzania, one of the most important archaeological sites for early humans, to reflect data indicating its role in human brain size and evolution. [1]
Olduvai domains form the core of NBPF genes, which first appeared in placental mammals and expanded rapidly in monkeys (simians) through duplication, reaching over 20 genes in humans. [3] In humans, Olduvai domains are often repeated dozens of times within these genes. The only other gene in which an Olduvai domain has been found is mammalian myomegalin, believed to be the origin of the NBPF genes via duplication. Myomegalin itself arose from a duplication of CDK5RAP2, and all of these genes have been implicated in the development of neurons.
Olduvai copy number is highest in humans (~289, with person-to-person variation), reduced in the other great apes (~125 copies in chimpanzees, ~99 in gorillas, ~92 in orangutans), further reduced in Old World monkeys (~35), single- or low-copy in non-primate mammals, and absent in non-mammals. [3] Consequently, the Olduvai domain shows the largest HLS increase in copy number of any protein-coding region relative to any other living species: roughly 160 more copies than in chimpanzees. The increase in the number of Olduvai copies as one moves from monkeys to apes and then to humans shows strong direct correlations with several brain-related phenotypes, including brain size, neuron number, gyrification index, and gray and white matter volumes. [4] [5] An independent study of 12 primate species found a robust association between Olduvai copy number and brain size and, more specifically, neocortex volume. [6]
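The comparison above amounts to simple subtraction. The following Python sketch reproduces it using only the approximate counts quoted in this paragraph; the dictionary and function names are illustrative and are not taken from any published analysis.

```python
# Approximate Olduvai copy numbers quoted in the text (estimates, not exact counts).
OLDUVAI_COPIES = {
    "human": 289,
    "chimpanzee": 125,
    "gorilla": 99,
    "orangutan": 92,
    "Old World monkey": 35,
}


def human_excess(species: str) -> int:
    """Return how many more Olduvai copies humans carry than `species`."""
    return OLDUVAI_COPIES["human"] - OLDUVAI_COPIES[species]


for name in OLDUVAI_COPIES:
    if name != "human":
        print(f"human vs {name}: +{human_excess(name)} copies")
```

With these rounded estimates the human–chimpanzee difference comes out at 164 copies, in line with the approximate figure of 160 given above.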
In the human genome, Olduvai sequences are located primarily on chromosome 1 in region 1q21.1-q21.2, with several copies also found at 1p36, 1p13.3, and 1p12. They are approximately 65 amino acids in length and are encoded by a two-exon doublet. Olduvai domains can be divided into six primary subtypes that appear in the following order in primate NBPF genes: CON1, CON2, HLS1, HLS2, HLS3, and CON3. [3] Sequences encoding Olduvai domains show rhythmicity, resonance and signs of positive selection, especially in primates, and are expressed in several human tissues including brain, where their expression is restricted to neurons. [2] The various HLS domains do not show any interactions, as suggested by nuclear magnetic resonance backbone chemical shift analyses. [7]
Research has found that the Olduvai domain has a role in the development of neurons. Specifically, it appears to function to increase the number of neural stem cells by prolonging the developmental period of neurons. When Olduvai copy number is reduced, neurons appear to mature faster and divide less. Conversely, when Olduvai copy number is increased, neurons appear to mature for longer and divide in higher numbers. [8] Consistent with this effect, introduction of the NBPF15 gene, encoding 6 Olduvai domains, in human neural stem cells promoted proliferation. [5]
In mouse transgenic experiments, when the single copy of Olduvai is removed from the mouse genome, the resulting “Olduvai-minus” mice produce fewer offspring and show significant hyperactivity. [9]
Olduvai copy number variation has recently been investigated in autism, which is a disorder associated with deletions and duplications of 1q21, yet the causative loci within such regions have not previously been identified. Such research has found that copy number of Olduvai subtype CON1, but not of other Olduvai subtypes, is linearly associated with increasing severity of social impairment in autism. [10] [11] [12] This evidence is relevant for current theories proposing that autism and psychosis are fundamentally related. The precise nature of this relationship is currently under debate, with alternative lines of argument suggesting that the two are diametrically opposed diseases, exist on a continuum, or exhibit a more nuanced relationship. [13]
Schizophrenia is a neurological condition in which there are issues in brain development. [14] In contrast with autism, copy number increase of Olduvai subtypes CON1 and HLS1 is associated with reduced severity of positive symptoms in schizophrenia. [15]
The dosage of the Olduvai protein domain increases along with brain size, a trend seen across primate evolution on the lineage leading to humans. [3] Targeted 1q21 array CGH investigation of the potential association between Olduvai and brain size found that Olduvai copy number decrease is associated with microcephaly in individuals with 1q21 CNVs. [4] Of all 1q21 sequences tested, Olduvai sequences were the only ones to show consistent correlation between copy number and brain size in both disease (micro/macrocephaly) and non-disease populations. In addition, in primates there is a significant correlation between Olduvai copy number and both brain size and brain cortical neuron number. [4]
A 2015 study found that Olduvai copy number is linearly correlated with increased cognitive function, as measured by total IQ and mathematical aptitude scores, a finding replicated in two independent groups from different countries. The study examined the Olduvai variants CON1 and CON2 specifically, noting that measurement of the very high copy number HLS1–3 variants had been challenging with the technologies then available. It found that those with a higher number of copies of CON2 had higher scores on the WISC IQ test and the Progressive Achievement Mathematics test. The strength of the association between CON2 and IQ was reported to be greater than that of any other single genetic candidate reported in any previous study. This effect was significantly more pronounced in males. The CON2 copy number of most of the males ranged from 26 to 33, with a mean of 29, and each additional copy was associated with an average IQ score increase of 3.3 points. CON1 number, on the other hand, was not found to have a significant association with IQ scores. [16]
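To make the reported linear association concrete, the sketch below multiplies a difference in CON2 copy number by the 3.3-point figure quoted above. It assumes the association is purely linear across the quoted male range of 26 to 33 copies; the function and constant names are hypothetical, and the result describes a group-level average, not a prediction for any individual.

```python
# Average IQ increase per additional CON2 copy, as reported for males in the text.
IQ_POINTS_PER_CON2_COPY = 3.3


def expected_iq_difference(copies_a: int, copies_b: int) -> float:
    """Average IQ difference between two CON2 copy numbers.

    Assumes a purely linear association (an illustrative simplification).
    """
    return round((copies_b - copies_a) * IQ_POINTS_PER_CON2_COPY, 1)


# One extra copy above the reported mean of 29.
print(expected_iq_difference(29, 30))
# Span of the range reported for most males (26 to 33 copies).
print(expected_iq_difference(26, 33))
```

Under this assumption, the seven-copy span of the reported range would correspond to an average difference of about 23 IQ points.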
Brain region associations were also studied. Higher CON1 and CON2 copy numbers were associated with greater volumes and areas of all four bilateral lobes of the brain studied. Most notably, right frontal lobe surface area showed the strongest association with both CON1 and CON2 copy number; this association was slightly stronger for CON2. There were no CON1 or CON2 associations with white matter volume or gyrification index. CON1 and CON2 copy numbers had previously been found to correlate with grey matter volume in another study. [4]
These volume and area increases in the grey matter of all cerebral lobes were found to significantly correlate with higher IQ scores. Notably, bilateral temporal surface area appeared to correlate with a progressive increase in IQ, with left temporal surface area being slightly more important. However, it was found that CON2's effects on IQ remained substantial even after eliminating bilateral temporal surface area, right frontal lobe surface area, and total grey matter volume as factors. A portion of CON2's association with IQ, however, was through its effects on bilateral temporal surface area. Notably, this contribution to IQ was larger than that of its effects on right frontal lobe surface area, despite the fact that it increased this area the most. It was concluded that the Olduvai domain appears to have a role in neural stem cell proliferation, since this proliferation seems to be the major contributor to lobe surface area while also explaining the effects of Olduvai dosage that could not be explained by brain region measurements. Corroborating this are stem cell cultures that have also shown Olduvai's proliferative effects on neuronal stem cells. However, Olduvai also had effects on cortical thickness that appeared to be the result of mature neuron cell divisions, corroborated by higher neuron numbers in primates being associated with Olduvai copy number. Additionally, studies have shown that cerebral size in primates is almost exclusively correlated with a linear addition of neurons, rather than neuronal size or density. [16]
CON2's effects on IQ were found to depend strongly on sex: no significant association was found in females. Among males, the IQ advantage of those with higher CON2 copy numbers over other males of the same age appeared largest at a mean age of 11 years, and the correlation then appeared to decrease with age. A proportional advantage was also present in younger individuals. This corroborated studies that have shown that brain growth in the brightest children, and children with autism, increases after birth and peaks at around age 11 or 12 before slowing down in adulthood. [16] In the second group, birth head circumference was not found to significantly affect IQ, further corroborating these studies. An earlier genetic analysis of the second cohort had ruled out any effect on IQ of the other genome-wide copy number variations its members carried, further suggesting a critical period of activity of CON1 and CON2. [16]
This association has important implications for understanding the interplay between cognitive function and autism phenotypes. [17] These findings also provide additional support for the involvement of Olduvai in a genomic trade-off model involving the human brain: the same key genes that have been major contributors to the evolutionary expansion of the human brain and human cognitive capacity may also, in different combinations, underlie psychiatric disorders such as autism and schizophrenia. [18]
Olduvai domains are one of the many genetic elements located in the 1q21.1 region, which has a high number of repeated elements and therefore a high tendency toward deletions and duplications. This has led to several conditions that involve this region being identified, including TAR syndrome and the more general classifications of 1q21.1 deletion syndrome and 1q21.1 duplication syndrome.
Studies of deletions and duplications in the 1q21.1 region have consistently revealed microcephaly in association with deletions and macrocephaly in association with duplications. [19] [20] [21]
Genome sequences indicate that the Olduvai protein domain first appeared as part of the myomegalin gene (PDE4DIP), located at 1q21 in humans, in mammals at least 200 million years ago. [3] Myomegalin is a paralog (duplicated relative) of CDK5RAP2, which encodes a centrosomal protein involved in the cell cycle, especially in neurons; CDK5RAP2 lacks Olduvai sequences but, when mutated, has been implicated in microcephaly. [22] [23] Orthologs of myomegalin can be seen in vertebrates as far back as bony fish, around 450 million years ago; however, the Olduvai domain is not clearly seen until the emergence of mammals. The first Olduvai domain located outside of myomegalin is seen approximately 100–150 million years ago, when the domain was included in a duplication and transposition event that created a new gene, NBPF1, which would later give rise to a family of duplicated NBPF genes. At least one NBPF gene has been found in Laurasiatherians, Euarchontoglires, and elephants (but not other Afrotherians), but not in Xenarthrans (the group that includes sloths). Several rodents, bats, and eulipotyphlans (the group that includes hedgehogs) were also found to have lost the gene. [3]
In 2009, it was proposed that the larger brain size conferred by a high number of Olduvai domain copies in humans carried an evolutionary advantage that led to the persistence and maintenance of Olduvai copies within this high range. At the same time, the Olduvai domains, like many other repetitive genetic elements, are highly susceptible to increases and decreases in number of copies, through duplications or deletions. Such changes are not always beneficial and the researchers found that a higher number of copies was associated with autism while a lower number was associated with schizophrenia. Since these disorders are fairly common among humans, it was proposed that this explained their prevalence. [21] This “genomic trade-off” model, in which Olduvai sequences can have beneficial or detrimental effects depending on how they vary, was elaborated on in more detail in a 2018 article that included one of the original authors, in light of new evidence in the intervening years. [17]
In 2012, a genetic explanation for the high instability and persistence of the Olduvai-containing regions was put forward: it was found that the HLS Olduvai domains had been affected by a known pericentric inversion (in which the region around a chromosome's centromere inverts) that occurred between 1p11.2 and 1q21.2 in the human lineage after the separation from chimpanzees. This was theorized to have contributed to their hyper-amplification specifically in humans because pairs of chromosomes in which one contains a pericentric inversion and the other does not (a form of heterozygosity) have difficulties in recombination. This can lead to non-allelic homologous recombination, in which deletions and duplications are much more likely to occur. This, combined with the fact that higher copies of Olduvai domains may have had an evolutionary advantage, could have resulted in the rapid duplication and persistence of Olduvai domains in humans. [3]
There are four human-specific NOTCH2NL genes: NOTCH2NLA, NOTCH2NLB, and NOTCH2NLC, located on 1q21.1, and NOTCH2NLR, located on 1p11.2. While chimpanzee and gorilla have copies of NOTCH2NL, none are functional. Immediately adjacent to, and downstream of, each of these four NOTCH paralogs is an NBPF gene with its Olduvai domains in the same orientation as its NOTCH2NL partner. This striking genomic arrangement suggests that each of the additional copies of NOTCH2NL that appeared in the human genome did not duplicate as a single gene but rather did so as a two-gene module composed of one NOTCH2NL gene and one NBPF gene. While the NOTCH2NL paralogs (and their NBPF partners) went from one gene to four in humans, Olduvai copies encoded by these NBPF genes underwent human-specific hyper-amplification, increasing from 13 copies (encoded by NBPF26 ) to 132 (i.e., adding 119 Olduvai copies encoded by NBPF10 , NBPF14 , and NBPF19 ). [24] If each NBPF/NOTCH2NL gene duplication involved an already expanded NBPF gene, Olduvai copy number would have dramatically increased, with each duplication event instantaneously adding 30–50 Olduvai copies. These striking Olduvai and NOTCH2NL copy number increases are thought to have occurred very recently in human evolution—e.g., within the past 1–3 million years, during the period when the human brain is thought to have undergone its most pronounced expansion. [24]
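The figures in this paragraph can be checked against one another with a few lines of arithmetic. The Python sketch below uses only numbers quoted above; treating the expansion from one NOTCH2NL/NBPF module to four as exactly three duplication events is a simplifying assumption made here for illustration, and all names are hypothetical.

```python
# Figures quoted in the text.
ANCESTRAL_COPIES = 13        # Olduvai copies encoded by NBPF26
ADDED_COPIES = 119           # copies encoded by NBPF10, NBPF14 and NBPF19
PER_EVENT_RANGE = (30, 50)   # copies added per duplication, as suggested in the text

# Assumption for illustration: one module became four via three duplications.
DUPLICATION_EVENTS = 3


def total_copies() -> int:
    """Olduvai copies encoded by the four NOTCH2NL-adjacent NBPF genes."""
    return ANCESTRAL_COPIES + ADDED_COPIES


def plausible_added_range(events: int = DUPLICATION_EVENTS) -> tuple:
    """Range of copies added if each event contributed 30 to 50 copies."""
    low, high = PER_EVENT_RANGE
    return events * low, events * high


low, high = plausible_added_range()
print(total_copies())
print(low <= ADDED_COPIES <= high)
```

Under that assumption, three events adding 30 to 50 copies each would contribute between 90 and 150 copies, a range that contains the 119 added copies reported.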
The paired Olduvai/NOTCH2NL sequences are also strikingly co-regulated, showing high co-expression in outer radial glial cells in the developing human cortex. Taken together, these findings suggest that human-specific Olduvai domains and adjacent NOTCH2NL genes may function in a coordinated, complementary fashion to promote neurogenesis and human brain expansion in a dosage-related manner. [24] Finally, the great majority of human-specific Olduvai sequences, including those adjacent to the three human NOTCH2NL genes, are found in the form of Olduvai triplets, a unique motif described in the following section.
The striking increase in Olduvai copy number in humans was primarily due to the tandem additions within several NBPF genes of a three-domain block, called the Olduvai triplet. These three variants were also found in gorilla and chimpanzee genomes but did not appear to be repeated in triplet form in any ape NBPF gene. [3] Based on this, the variants were given the names HLS1, HLS2, and HLS3, for human lineage-specific, and together they were named the HLS DUF1220 triplet and are now called the Olduvai triplet. [1] Hyper-amplification of the triplet sequences, occurring primarily on four human NBPF genes (NBPF10, NBPF14, NBPF19, and NBPF20) resulted in the addition of ~165 copies of Olduvai specifically to the human lineage since its divergence from the Pan genus (chimpanzees and bonobos). While three of these four genes are adjacent to NOTCH2NL genes, NBPF20 is not. All four of these expanded NBPF genes lie in the chromosome 1q21 region, which is known to be a duplication-rich hotspot of human genome instability and evolution. [25] [26]
The first Olduvai triplet of each NBPF gene in humans differs from the tandemly expanded triplets in the same gene [3] in that the latter contain a slightly different HLS1 subtype: in extended triplets, the first HLS1 exon was replaced with the first exon of the CON3 subtype via a recombination event. [27] This mechanism, involving the tandem addition of Olduvai triplets, has been responsible for producing virtually all the human-specific copies of Olduvai in the genome and remains highly active in existing human populations. Extended Olduvai triplets are found only in the human genome: humans have approximately 50 extended triplets (containing 150 Olduvai copies), while no extended triplets were found in other species. [3]
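The tandem-triplet arrangement can be pictured as a repeated three-element block between the conserved subtypes. The following Python sketch builds a deliberately simplified domain layout for a hypothetical NBPF gene, using the subtype order CON1, CON2, HLS1, HLS2, HLS3, CON3 given earlier; it ignores the HLS1 variant found in extended triplets and does not represent the structure of any specific gene.

```python
from typing import List

# The three-domain block that is tandemly repeated in expanded human NBPF genes.
TRIPLET = ("HLS1", "HLS2", "HLS3")


def simplified_domain_layout(n_triplets: int) -> List[str]:
    """Return a simplified subtype layout with `n_triplets` tandem triplets.

    Hypothetical model: one CON1, one CON2, the repeated triplets, one CON3.
    """
    return ["CON1", "CON2", *(TRIPLET * n_triplets), "CON3"]


def hls_copy_count(n_triplets: int) -> int:
    """Number of HLS-type Olduvai domains contributed by the triplets."""
    return len(TRIPLET) * n_triplets


print(simplified_domain_layout(2))
print(hls_copy_count(50))
```

In this model, the approximately 50 extended triplets quoted above account for 150 Olduvai copies, matching the figure in the text.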
As mentioned above, the great majority of human-specific Olduvai copies are found primarily in four NBPF genes and organized in a variable number of tandemly arranged three-domain blocks called Olduvai triplets. [1] [26] It has been shown that these human-specific Olduvai domains are post-translationally processed by the furin protease, with a cleavage site occurring once at each triplet. [28] These results indicate that all expanded human-specific NBPF genes encode proproteins consisting of many independent Olduvai triplet proteins, which are activated by furin processing. The findings suggest that the ultimate target of selection was a rapid increase in dosage of autonomously functioning Olduvai triplet proteins and that these proteins are the primary active agent underlying Olduvai’s role in humans.
The Olduvai protein domain was first identified in 2004 in a study of gene copy number differences between human and great ape species using genome-wide array comparative genomic hybridization (arrayCGH). [25] The study analyzed virtually all human genes and found 134 that showed HLS increases in copy number. One of these genes (known at the time as MGC8902 [cDNA IMAGE843276] and later as NBPF15) showed a dramatic copy number increase and encoded six Olduvai domains. [2] The domain remained unnamed as of that time and was given a Pfam placeholder name for domains of unknown function when entered into its database. [1]
The NBPF (neuroblastoma breakpoint family) gene family, which contains all the known Olduvai domains except the one found in myomegalin, was independently identified by Vandepoele et al. in 2005. A gene, which they named NBPF1, was found to lie at the breakpoint of, and to have been disrupted by, a chromosomal translocation at 1p36 in a boy with neuroblastoma reported by G. Laureys et al. in 1990. [29] The researchers noticed that a novel protein domain that seemed to match the Olduvai Pfam entry was present in multiple copies in this gene and in several other places on chromosome 1, which led them to establish 22 NBPF genes, and they named the domain the NBPF repeat. [30]
In 2018, Olduvai was renamed by its discoverers after Olduvai Gorge in Tanzania, one of the most important archaeological sites for early humans, to reflect data indicating its role in human brain size and evolution. [1]