HIV integration

Figure: HIV virion.

AIDS ("acquired immune deficiency syndrome") is caused by the human immunodeficiency virus (HIV). A person carrying the virus is said to have an "HIV infection". When infected semen, vaginal secretions, or blood come into contact with the mucous membranes or broken skin of an uninfected person, HIV may be transferred to that person ("horizontal transmission"), causing a new infection. HIV can also pass from an infected pregnant woman to her baby during pregnancy or delivery ("vertical transmission"), or via breastfeeding. A portion of infected individuals go on to develop clinically significant AIDS.

HIV is a retrovirus, one of a large and diverse family of RNA viruses that make a DNA copy of their RNA genome after infecting a host cell. An essential step in the replication cycle of HIV-1 and other retroviruses is the integration of this viral DNA into the host DNA. Transcription of the integrated viral DNA produces both the RNA genome of progeny virions and the template for translation of viral proteins.

Background Information: HIV-1 Integrase

Figure: Structural domains of the HIV-1 integrase.

The integration of HIV DNA into the host DNA is a critical step in the HIV life cycle. Understanding the integration process provides a framework for identifying multiple potential sites of therapeutic intervention for HIV infection and AIDS. The enzyme that HIV uses to insert the DNA version of its genome into the host cell DNA is called "integrase". HIV-1 integrase catalyzes the "cut-and-paste" action of clipping the host DNA and joining the proviral genome to the clipped ends. This protein, which is 288 amino acids in length, contains three "domains", in this order:[citation needed]

1. Amino (N)-terminal domain: Sometimes referred to as a "zinc finger", the N-terminal domain contains the conserved HHCC motif of His and Cys residues, which binds zinc. The function of the N-terminal domain is not completely clear, but it is thought to help integrase form multimers (stable assemblies of multiple integrase molecules).

2. Central catalytic domain (or "catalytic core"): The catalytic core contains the DDE catalytic triad of acidic amino acid residues, which coordinate a divalent metal ion (usually Mg2+ or Mn2+) to form the active catalytic site. In HIV-1 integrase, these residues are Asp64, Asp116, and Glu152. This domain is also well conserved in evolution.

The HIV-1 catalytic domain appears dimeric in solution and in crystal structures. The large surface area of the dimer interface suggests that it is biologically significant. The insertion sites on the two strands of target DNA are separated by 5 base pairs, which corresponds to approximately 15 Å in helical B-form DNA, implying that the functional unit of integrase should contain a pair of active sites with a similar spacing. However, the spacing of the active sites in the roughly spherical dimer does not match this: crystal structures indicate that the active sites in the dimer are separated by more than 30 Å when measured in a straight line through the protein, and by an even greater distance when measured around the circumference of the dimer. Assuming that the dimer interface is preserved in the functional integrase multimer, at least a tetramer of integrase must be required for the complete integration reaction to proceed.[citation needed]

3. Carboxy (C)-terminal domain: The C-terminal domain binds DNA non-specifically. Since the sites of integration into the target DNA are relatively non-specific, it is thought that this domain may interact in some fashion with the target DNA. Experiments with chimeric integrases indicate that recognition of the target site is controlled by the core domain. Cross-linking studies also suggest that the C-terminal domain interacts with a subterminal region just inside the very ends of the viral DNA.
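
The spacing argument made for the catalytic domain can be checked with simple arithmetic. The following is a rough sketch, not a structural calculation: it assumes the textbook B-form rise of about 3.4 Å per base pair (a value not given in the text above), which yields an axial distance in the same range as the approximate figure quoted, and compares it with the lower bound quoted for the dimer.

```python
# Rough geometry check for the active-site spacing argument.
# ASSUMPTION: canonical B-form DNA rise of ~3.4 Å per base pair
# (textbook value, not stated in the article text).
RISE_PER_BP_ANGSTROM = 3.4


def axial_distance(base_pairs: int, rise: float = RISE_PER_BP_ANGSTROM) -> float:
    """Distance along the helix axis spanned by `base_pairs` steps."""
    return base_pairs * rise


# 5 bp stagger between insertion sites, as stated in the text.
insertion_site_spacing = axial_distance(5)

# Lower bound for the active-site separation in the dimer, as quoted
# from crystal structures in the text.
dimer_active_site_spacing = 30.0

# True if a single dimer could place both active sites on the insertion sites.
dimer_fits = dimer_active_site_spacing <= insertion_site_spacing

print(f"insertion sites: about {insertion_site_spacing:.0f} Å apart")
print(f"dimer active sites: more than {dimer_active_site_spacing:.0f} Å apart")
print("single dimer spans both sites:", dimer_fits)
```

Under these assumptions the two active sites of one dimer sit too far apart to act on both insertion sites, which is the reasoning behind the proposal that at least a tetramer is needed.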

Figure: HIV integrase-binding domain (PDB 2b4j).

During the integration process, the HIV integrase enzyme performs two key catalytic reactions: first the 3’ processing of the HIV DNA, then the strand transfer of the HIV DNA into the host DNA. Integration of HIV DNA can occur in either dividing or resting cells, and the integrase enzyme can exist as a monomer, dimer, tetramer, and possibly even higher-order forms (such as octamers). Each HIV particle contains an estimated 40 to 100 copies of the integrase enzyme.

Integrase functions are unique to retroviruses; human cells have no need to cut and paste pieces of DNA into their genome. For this reason, integrase is a prime target for drug therapies against HIV infection and AIDS, since inhibiting it should not hamper the normal operations of human cells.

HIV Integration Mechanism

Figure: HIV genome integration.

HIV integration is the insertion of HIV genetic material into the genome of the infected cell. [1] The process of HIV integration involves six sequential steps:

Binding of HIV Integrase to HIV DNA

The first step of the integration process occurs in the cytoplasm of the host cell, following the completion of reverse transcription of the HIV RNA into cDNA. In this step, integrase (most likely in the dimer form) binds to each end of the newly formed HIV cDNA. The binding takes place at specific sequences in the long terminal repeat regions. The integrase-HIV DNA complex is part of an intracellular nucleoprotein particle known as the "preintegration complex" (PIC). This complex consists of linear HIV DNA, viral proteins, and host proteins. The viral proteins include integrase, nucleocapsid, matrix, viral protein R (Vpr), and reverse transcriptase. Several host proteins can also form part of this complex, although it is unclear whether some or all of them join the preintegration complex before nuclear transport.[citation needed]

Processing of the HIV DNA 3’ ends

In the second step of the integration process, which also takes place in the host cytoplasm, the integrase dimer cleaves the viral DNA at each 3’ end. This cleavage reaction removes the GT dinucleotide on the 3’ side of a conserved CA dinucleotide. Removing the dinucleotide from each viral DNA 3’ end generates a dinucleotide 5’ "overhang" and a reactive intermediate that carries a 3’-hydroxyl group. This 3’ processing step is the first of the two key catalytic reactions performed by the integrase enzyme, and it prepares the viral DNA for integration into the host DNA. In an alternative view of the DNA binding and 3’ processing reaction, the tetramer form of integrase (not the dimer) binds to the ends of the HIV DNA and then cleaves the 3’ ends.[citation needed]
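
The 3’ processing step can be pictured as a simple string operation on one viral DNA strand. The sketch below is only an illustration: the function name `three_prime_process` and the toy end sequence are hypothetical, and the model ignores the enzyme, the metal cofactor, and the complementary strand.

```python
from typing import Tuple


def three_prime_process(strand_5to3: str) -> Tuple[str, str]:
    """Remove the dinucleotide that follows the conserved CA at a 3' end.

    `strand_5to3` is one viral DNA strand written 5'->3'.
    Returns (processed strand ending in CA, released dinucleotide).
    """
    strand = strand_5to3.upper()
    # The conserved CA is expected immediately before the last two bases.
    if len(strand) < 4 or strand[-4:-2] != "CA":
        raise ValueError("expected a conserved CA two bases from the 3' end")
    return strand[:-2], strand[-2:]


# HYPOTHETICAL toy sequence for illustration; not a real LTR end.
toy_end = "ACTGGAAGCAGT"
processed, released = three_prime_process(toy_end)

print(processed)  # ACTGGAAGCA
print(released)   # GT
```

In this picture the processed strand ends in the A of the conserved CA, which is where the reactive 3’-hydroxyl group sits, while the two unpaired bases left on the complementary strand form the 5’ overhang.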

Translocation of HIV integrase to the host cell nucleus (Nuclear Translocation)

In the third step of the integration process, the preintegration complex is transported into the nucleus of the host cell, entering through one of the nuclear pore complexes.

Binding of the preintegration complex to the host DNA

Inside the nucleus, the host protein lens epithelium-derived growth factor/p75, commonly abbreviated as LEDGF/p75, binds to the preintegration complex and the host DNA. LEDGF/p75 serves as a tethering protein (or bridge) between the preintegration complex and the host DNA. The order in which LEDGF/p75, the host DNA, and the preintegration complex bind remains unclear. In one version, LEDGF/p75 binds first to the preintegration complex and then to the host DNA. Alternatively, LEDGF/p75 may bind first to the host DNA and then to the preintegration complex. Regardless of the order, it is believed that the presence of LEDGF/p75 brings the integrase dimers together to form a tetramer.[citation needed]

Transfer of HIV DNA into the host DNA (Strand Transfer)

The next step, the strand transfer reaction, takes place inside the host cell nucleus and inserts the HIV DNA into a selected region of the host DNA. The region of insertion contains a weakly conserved palindromic sequence. The reaction is initiated when HIV integrase catalyzes the attack of the HIV DNA 3’-hydroxyl groups on the host DNA. The attack occurs on opposite strands of the host DNA in a staggered fashion, typically 4-6 base pairs apart. This reaction leads to separation of the host DNA base pairs located between the staggered cuts, and to the joining of the HIV 3’-hydroxyl groups with the host DNA 5’ phosphate ends. At this point, the newly joined viral-host DNA region unfolds.[citation needed]

Repair of the gaps formed in the strand transfer process ("Gap Repair")

Figure: Reverse transcription.

Following the strand transfer process, the junctions between HIV DNA and host DNA contain unpaired regions of DNA, referred to as DNA "gaps". In addition, the two bases at the 5’ end of each viral DNA strand remain unpaired after the strand transfer. The insertion of the new HIV DNA and the gaps that flank the integration site are currently thought to induce a host cellular DNA damage response, but much of this mechanism remains speculative. The host DNA damage response is thought to be critical in the final step of integration, known as "gap repair", and may require at least three host enzymes: polymerase, nuclease, and ligase. In the first step of gap repair, it is thought that the polymerase enzyme(s) extend the host DNA on each end and thus fill in the gaps. Next, host nuclease activity may remove the 5’ dinucleotide "flaps" on the HIV DNA. Lastly, it is thought that DNA ligase enzymes join the remaining unbound segments of the HIV and host DNA strands. This mechanism has not yet been fully validated experimentally and remains an area under investigation. The gap repair process completes the integration of the HIV DNA into the host DNA, with the fully integrated HIV DNA now referred to as "proviral DNA".[citation needed]
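
A consequence of staggered joining followed by gap filling, usually described for retroviral integration although not spelled out in the text above, is that the short host segment between the two joins ends up copied on both sides of the provirus. The toy model below sketches this on a single top-strand string. The function `integrate`, the host sequence, and the lowercase provirus are all hypothetical, and the default stagger of 5 follows the spacing given earlier for HIV-1.

```python
def integrate(host: str, provirus: str, start: int, stagger: int = 5) -> str:
    """Top-strand sequence after strand transfer and gap repair (toy model).

    `start` is the index of the first base of the `stagger`-long host
    segment lying between the two staggered joins. Gap filling copies
    that segment, so it flanks the provirus on both sides.
    """
    if not 0 <= start <= len(host) - stagger:
        raise ValueError("target segment must lie within the host sequence")
    target = host[start:start + stagger]
    return host[:start] + target + provirus + target + host[start + stagger:]


# HYPOTHETICAL sequences for illustration only.
host = "AAAAGATCCTTTT"
provirus = "tgaccgtaca"  # lowercase marks viral DNA
result = integrate(host, provirus, start=4)

print(result)  # AAAAGATCCtgaccgtacaGATCCTTTT
```

The integrated product is longer than host plus provirus by exactly the stagger length, which reflects the bases added when the gaps are filled.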

Role of splicing in HIV-1

A 2015 study suggested that HIV-1 prefers to integrate into highly spliced genes, or genes with more introns. [2] This preference depends on the host chromatin-binding protein LEDGF/p75, which interacts with many splicing factors. [2] Another study showed that the preference of HIV-1 for highly spliced genes also depends on a second host factor, CPSF6, a polyadenylation factor. [3] Cancer genes with a high number of introns are highly targeted by HIV-1. [2]

References

  1. Smith JA, Nunnari G, Preuss M, Pomerantz RJ, Daniel R (2007). "Pentoxifylline suppresses transduction by HIV-1-based vectors". Intervirology. 50 (5): 377–86. doi:10.1159/000109752. PMID 17938572. S2CID 34357038.
  2. Singh PK, Plumb MR, Ferris AL, Iben JR, Wu X, Fadel HJ, Luke BT, Esnault C, Poeschla EM (2015). "LEDGF/p75 interacts with mRNA splicing factors and targets HIV-1 integration to highly spliced genes". Genes Dev. 29 (21): 2287–97. doi:10.1101/gad.267609.115. PMC 4647561. PMID 26545813.
  3. Sowd GA, Serrao E, Wang H, Wang W, Fadel HJ, Poeschla EM, et al. (2016). "A critical role for alternative polyadenylation factor CPSF6 in targeting HIV-1 integration to transcriptionally active chromatin". Proc Natl Acad Sci U S A. 113 (8): E1054–63. Bibcode:2016PNAS..113E1054S. doi:10.1073/pnas.1524213113. PMC 4776470. PMID 26858452.