An endogenous viral element (EVE) is a DNA sequence derived from a virus, and present within the germline of a non-viral organism. EVEs may be entire viral genomes (proviruses), or fragments of viral genomes. They arise when a viral DNA sequence becomes integrated into the genome of a germ cell that goes on to produce a viable organism. The newly established EVE can be inherited from one generation to the next as an allele in the host species, and may even reach fixation.
Endogenous retroviruses and other EVEs that occur as proviruses can potentially remain capable of producing infectious virus in their endogenous state. Replication of such 'active' endogenous viruses can lead to the proliferation of viral insertions in the germline. For most non-retroviral viruses, germline integration appears to be a rare, anomalous event, and the resulting EVEs are often only fragments of the parent virus genome. Such fragments are usually not capable of producing infectious virus, but may express protein or RNA and even cell surface receptors.
EVEs have been identified in animals, plants and fungi. [1] [2] [3] [4] In vertebrates, EVEs derived from retroviruses (endogenous retroviruses) are relatively common. Because retroviruses integrate into the nuclear genome of the host cell as an inherent part of their replication cycle, they are predisposed to enter the host germline. In addition, EVEs related to parvoviruses, filoviruses, bornaviruses and circoviruses have been identified in vertebrate genomes. In plant genomes, EVEs derived from pararetroviruses are relatively common. EVEs derived from other, non-retrotranscribing virus families, such as Geminiviridae, have also been identified in plants. Moreover, EVEs related to giant viruses (GEVEs) of the phylum Nucleocytoviricota (NCLDV), similar to Aureococcus anophagefferens virus (AaV), were found in 2019/2020. [5]
EVEs are traditionally identified by sequence similarity to known viruses. In 2021, it was demonstrated that the k-mer composition of endogenous RNA viruses resembles that of their exogenous counterparts. As a result, it is now possible to identify novel groups of endogenous RNA viruses whose exogenous relatives have become extinct. [6]
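The idea of comparing k-mer composition can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method of the cited study: the choice of k = 3, the use of cosine similarity, and the function names `kmer_profile` and `cosine_similarity` are all hypothetical.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_profile(seq, k=3):
    """Return normalised k-mer frequencies over the alphabet A, C, G, T.

    RNA input is accepted (U is treated as T). Windows containing any
    other symbol, such as N, are skipped.
    """
    seq = seq.upper().replace("U", "T")
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set("ACGT")
    )
    total = sum(counts.values())
    # Fixed ordering of all 4**k possible k-mers so profiles are comparable.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    return [counts[m] / total if total else 0.0 for m in kmers]


def cosine_similarity(p, q):
    """Cosine similarity between two equal-length frequency vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = sqrt(sum(a * a for a in p)) * sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


# Toy sequences (hypothetical, for illustration only).
candidate = kmer_profile("AUGGCUAGCUAGGAUCCGAUAG")
reference = kmer_profile("ATGGCTAGCTTGGATCCGATAG")
print(cosine_similarity(candidate, reference))
```

A candidate genomic region whose profile is close to that of known viral sequences, and distant from the host background, would be a lead for further analysis rather than proof of viral origin.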
EVEs are a rare source of retrospective information about ancient viruses. Many are derived from germline integration events that occurred millions of years ago, and can be viewed as viral fossils. Such ancient EVEs are an important component of paleovirological studies that address the long-term evolution of viruses. Identification of orthologous EVE insertions enables the calibration of long-term evolutionary timelines for viruses, based on the estimated time since divergence of the ortholog-containing host species groups. This approach has provided minimum ages ranging from 30 to 93 million years for the Parvoviridae, Filoviridae, Bornaviridae and Circoviridae families of viruses, [3] >100 million years in the Flaviviridae, [7] and 12 million years for the Lentivirus genus of the Retroviridae family. EVEs also facilitate the use of molecular clock-based approaches to obtain calibrations of viral evolution in deep time. [8] [9]
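The calibration logic can be illustrated with a short Python sketch. If the same orthologous insertion is found in several host species, the integration must predate their most recent common ancestor, so the minimum age is the largest pairwise divergence time among the carriers. The function name `minimum_eve_age`, the species labels, and the divergence times below are hypothetical placeholders, not published estimates.

```python
from itertools import combinations


def minimum_eve_age(carriers, divergence_mya):
    """Minimum age (million years) of an orthologous EVE insertion.

    carriers: names of species in which the orthologous insertion is found.
    divergence_mya: dict mapping frozenset({a, b}) to the estimated
        divergence time of species a and b, in millions of years.
    """
    carriers = sorted(set(carriers))
    if len(carriers) < 2:
        # A single carrier gives no orthology-based lower bound.
        return None
    return max(
        divergence_mya[frozenset(pair)]
        for pair in combinations(carriers, 2)
    )


# Placeholder values for illustration only (assumed, not real data).
divergence_mya = {
    frozenset({"species_A", "species_B"}): 10.0,
    frozenset({"species_A", "species_C"}): 40.0,
    frozenset({"species_B", "species_C"}): 40.0,
}

print(minimum_eve_age(["species_A", "species_B", "species_C"], divergence_mya))
```

The result is only a lower bound: the insertion may be considerably older than the deepest split among the species in which it has been detected.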
EVEs can sometimes provide a selective advantage to the individuals in which they are inserted. For example, some protect against infection with related viruses. [10] [11] In some mammal groups, including higher primates, retroviral envelope proteins have been exapted to produce a protein that is expressed in the placental syncytiotrophoblast, and is involved in fusion of the cytotrophoblast cells to form the syncytial layer of the placenta. In humans this protein is called syncytin, and is encoded by an endogenous retrovirus called ERVWE1 on chromosome 7. Remarkably, the capture of syncytin or syncytin-like genes has occurred independently, from different groups of endogenous retroviruses, in diverse mammalian lineages. Distinct syncytin-like genes have been identified in primates, rodents, lagomorphs, carnivores, and ungulates, with integration dates ranging from 10 to 85 million years ago. [12]
A provirus is a virus genome that is integrated into the DNA of a host cell. In the case of bacterial viruses (bacteriophages), proviruses are often referred to as prophages. However, proviruses are distinctly different from prophages and these terms should not be used interchangeably. Unlike prophages, proviruses do not excise themselves from the host genome when the host cell is stressed.
A retrovirus is a type of virus that inserts a DNA copy of its RNA genome into the DNA of a host cell that it invades, thus changing the genome of that cell. After invading a host cell's cytoplasm, the virus uses its own reverse transcriptase enzyme to produce DNA from its RNA genome, the reverse of the usual pattern, thus retro (backwards). The new DNA is then incorporated into the host cell genome by an integrase enzyme, at which point the retroviral DNA is referred to as a provirus. The host cell then treats the viral DNA as part of its own genome, transcribing and translating the viral genes along with the cell's own genes, producing the proteins required to assemble new copies of the virus. Many retroviruses cause serious diseases in humans, other mammals, and birds.
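The two steps described above, reverse transcription followed by integration, can be pictured with a deliberately simplified Python model. It treats sequences as plain strings and shows only the sense strand; it omits the formation of long terminal repeats, the complementary DNA strand, and target-site duplication. The function names `reverse_transcribe` and `integrate` and the toy sequences are assumptions made for illustration.

```python
def reverse_transcribe(rna):
    """Return the sense-strand DNA copy of an RNA genome (U replaced by T)."""
    rna = rna.upper()
    if set(rna) - set("ACGU"):
        raise ValueError("RNA may contain only A, C, G and U")
    return rna.replace("U", "T")


def integrate(host_dna, viral_dna, position):
    """Insert viral DNA into host DNA at a 0-based position.

    The inserted segment plays the role of the provirus in this toy model.
    """
    if not 0 <= position <= len(host_dna):
        raise ValueError("position outside host sequence")
    return host_dna[:position] + viral_dna + host_dna[position:]


# Toy sequences (hypothetical).
viral_dna = reverse_transcribe("AUGGCU")
print(integrate("GGGGCCCC", viral_dna, 4))
```

If such an insertion occurs in a germ cell, the modified host sequence is what would be passed to offspring, which is how an endogenous retrovirus begins.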
Retrotransposons are a type of genetic element (transposon) that copy and paste themselves into different genomic locations. They do so through an RNA transposition intermediate, which is converted back into DNA by reverse transcription.
Mouse mammary tumor virus (MMTV) is a milk-transmitted retrovirus, like the human T-lymphotropic viruses (HTLV), the human immunodeficiency viruses (HIV), and bovine leukemia virus (BLV). It belongs to the genus Betaretrovirus. MMTV was formerly known as Bittner virus, and previously as the "milk factor", referring to the extra-chromosomal vertical transmission of murine breast cancer by adoptive nursing, demonstrated in 1936 by John Joseph Bittner while working at the Jackson Laboratory in Bar Harbor, Maine. Bittner established the theory that a cancerous agent, or "milk factor", could be transmitted from cancerous mothers to young mice by a virus in the mother's milk. The majority of mammary tumors in mice are caused by mouse mammary tumor virus.
Gammaretrovirus is a genus in the Retroviridae family. Example species are the murine leukemia virus and the feline leukemia virus. They cause various sarcomas, leukemias and immune deficiencies in mammals, reptiles and birds.
Endogenous retroviruses (ERVs) are endogenous viral elements in the genome that closely resemble and can be derived from retroviruses. They are abundant in the genomes of jawed vertebrates, and they comprise up to 5–8% of the human genome.
Jaagsiekte sheep retrovirus (JSRV) is a betaretrovirus which is the causative agent of a contagious lung cancer in sheep, called ovine pulmonary adenocarcinoma.
The murine leukemia viruses are retroviruses named for their ability to cause cancer in murine (mouse) hosts. Some MLVs may infect other vertebrates. MLVs include both exogenous and endogenous viruses. Replicating MLVs have a positive sense, single-stranded RNA (ssRNA) genome that replicates through a DNA intermediate via the process of reverse transcription.
Group-specific antigen, or gag, is the polyprotein that contains the core structural proteins of an Ortervirus. It was named as such because scientists used to believe it was antigenic. It is now known that it makes up the inner shell, not the envelope exposed on the outside. It makes up all the structural units of viral conformation and provides a supportive framework for the mature virion.
Syncytin-1, also known as enverin, is a protein found in humans and other primates that is encoded by the ERVW-1 gene. Syncytin-1 is a cell-cell fusion protein whose function is best characterized in placental development. The placenta in turn aids in embryo attachment to the uterus and establishment of a nutrient supply.
HERV-R_7q21.2 provirus ancestral envelope (Env) polyprotein is a protein that in humans is encoded by the ERV3 gene.
Syncytin-2, also known as endogenous retrovirus group FRD member 1, is a protein that in humans is encoded by the ERVFRD-1 gene. This protein plays a key role in the implantation of human embryos in the womb.
Koala retrovirus (KoRV) is a retrovirus that is present in many populations of koalas. It has been implicated as the agent of koala immune deficiency syndrome (KIDS), an AIDS-like immunodeficiency that leaves infected koalas more susceptible to infectious disease and cancers. The virus is thought to be a recently introduced exogenous virus that is also integrating into the koala genome. Thus the virus can transmit both horizontally and vertically. The horizontal modes of transmission are not well defined but are thought to require close contact.
Bovine immunodeficiency virus (BIV) is a retrovirus belonging to the genus Lentivirus. It is similar to the human immunodeficiency virus (HIV) and infects cattle. The cells primarily infected are lymphocytes and monocytes/macrophages.
Paleovirology is the study of viruses that existed in the past but are now extinct. In general, viruses cannot leave behind physical fossils, therefore indirect evidence is used to reconstruct the past. For example, viruses can cause evolution of their hosts, and the signatures of that evolution can be found and interpreted in the present day. Also, some viral genetic fragments which were integrated into germline cells of an ancient organism have been passed down to our time as viral fossils, or endogenous viral elements (EVEs). EVEs that originate from the integration of retroviruses are known as endogenous retroviruses, or ERVs, and most viral fossils are ERVs. They may preserve genetic code from millions of years ago, hence the "fossil" terminology, although no one has detected a virus in mineral fossils. The most surprising viral fossils originate from non-retroviral DNA and RNA viruses.
Mason-Pfizer monkey virus (M-PMV), formerly Simian retrovirus (SRV), is a species of retrovirus that usually infects Asian macaques and causes a fatal immune deficiency in them. The ssRNA virus appears sporadically in mammary carcinoma of captive macaques at breeding facilities. Macaques are expected to be the natural host, but the prevalence of this virus in feral macaques remains unknown. M-PMV is transmitted naturally through virus-containing body fluids, via biting, scratching, grooming, and fighting. Cross-contaminated instruments or equipment (fomites) can also spread this virus among animals.
Human endogenous retrovirus K (HERV-K), or human teratocarcinoma-derived virus (HTDV), is a family of human endogenous retroviruses associated with malignant tumors of the testes. Phylogenetically, the HERV-K group belongs to the ERV2, or Class II, or Betaretrovirus-like supergroup. Over the past several years, it has been found that this group of ERVs plays an important role in embryogenesis, but its expression is silenced in most cell types in healthy adults. The HERV-K family, and particularly its subgroup HML-2, is the youngest and most transcriptionally active group, and hence is the best studied among ERVs. Reactivation or anomalous expression of HML-2 in adult tissues has been associated with various types of cancer and with neurodegenerative diseases such as amyotrophic lateral sclerosis (ALS). HERV-K is related to mouse mammary tumor virus. It exists in the human and cercopithecoid genomes. The human genome contains hundreds of copies of HERV-K, and many of them possess complete open reading frames (ORFs) that are transcribed and translated, especially in early embryogenesis and in malignancies. HERV-K is also found in apes and Old World monkeys. It is uncertain how long ago in primate evolution the full-length HERV-K proviruses now present in the human genome were created.
Human endogenous retrovirus-W (HERV-W) is the coding sequence for a protein that would normally be part of the envelope of one family of human endogenous retroviruses (HERVs).
Gibbon-ape leukemia virus (GaLV) is an oncogenic, type C retrovirus that has been isolated from neoplasms of primates, including the white-handed gibbon and woolly monkey. The virus was identified in 1971, during the epidemic of the late 1960s and early 1970s, as the etiological agent of hematopoietic neoplasms, leukemias, and immune deficiencies in gibbons. Epidemiological research into the origins of GaLV has developed two hypotheses for the emergence of the virus. These are cross-species transmission of a retrovirus present in a species of East Asian rodent or bat, and the inoculation or blood transfusion of an MbRV-related virus into captive gibbon populations housed at medical research institutions. The virus was subsequently identified in captive gibbon populations in Thailand, the US and Bermuda.
Suppressyn (SUPYN) is a protein that in humans is encoded by the ERVH48-1 gene.