The integrated form of HIV-1, also known as the provirus, is approximately 9.8 kilobases in length.(1) Both ends of the provirus are flanked by repeated sequences known as the long terminal repeats (LTRs). The genes of HIV are located in the central region of the proviral DNA and encode at least nine proteins (Figure 1).(2) These proteins are divided into three classes:
The major structural proteins, Gag, Pol, and Env
The regulatory proteins, Tat and Rev
The accessory proteins, Vpu, Vpr, Vif, and Nef
The first part of this chapter reviews the individual viral proteins and their functions. The second part discusses factors regulating the transcription and processing of viral mRNA.
| Viral Proteins and Their Functions|
| Structural Proteins|
The gag gene gives rise to the 55-kilodalton (kD) Gag precursor protein, also called p55, which is expressed from the unspliced viral mRNA. During translation, the N terminus of p55 is myristoylated,(3) triggering its association with the cytoplasmic aspect of cell membranes. The membrane-associated Gag polyprotein recruits two copies of the viral genomic RNA along with other viral and cellular proteins, which triggers the budding of the viral particle from the surface of an infected cell. After budding, p55 is cleaved by the virally encoded protease(4) (a product of the pol gene) during the process of viral maturation into four smaller proteins designated MA (matrix [p17]), CA (capsid [p24]), NC (nucleocapsid [p9]), and p6.(4)
The MA polypeptide is derived from the N-terminal, myristoylated end of p55. Most MA molecules remain attached to the inner surface of the virion lipid bilayer, stabilizing the particle. A subset of MA is recruited inside the deeper layers of the virion where it becomes part of the complex which escorts the viral DNA to the nucleus.(5) These MA molecules facilitate the nuclear transport of the viral genome because a karyophilic signal on MA is recognized by the cellular nuclear import machinery. This phenomenon allows HIV to infect nondividing cells, an unusual property for a retrovirus.(6)
The p24 (CA) protein forms the conical core of viral particles. Cyclophilin A has been demonstrated to interact with the p24 region of p55 leading to its incorporation into HIV particles.(7,8) The interaction between Gag and cyclophilin A is essential because the disruption of this interaction by cyclosporine A inhibits viral replication.(9)
The NC region of Gag is responsible for specifically recognizing the so-called packaging signal of HIV.(10) The packaging signal consists of four stem loop structures located near the 5' end of the viral RNA, and is sufficient to mediate the incorporation of a heterologous RNA into HIV-1 virions.(11) NC binds to the packaging signal through interactions mediated by two zinc-finger motifs. NC also facilitates reverse transcription.(12)
The p6 polypeptide region mediates interactions between p55 Gag and the accessory protein Vpr, leading to the incorporation of Vpr into assembling virions.(13) The p6 region also contains a so-called late domain which is required for the efficient release of budding virions from an infected cell.
| Gag-Pol Precursor|
The viral protease (Pro), integrase (IN), RNase H, and reverse transcriptase (RT) are always expressed within the context of a Gag-Pol fusion protein.(14) The Gag-Pol precursor (p160) is generated by a ribosomal frameshifting event, which is triggered by a specific cis-acting RNA motif(15) (a heptanucleotide sequence followed by a short stem loop in the distal region of the Gag RNA). When ribosomes encounter this motif, they shift approximately 5% of the time to the pol reading frame without interrupting translation. The frequency of ribosomal frameshifting explains why the Gag and Gag-Pol precursors are produced at a ratio of approximately 20:1.
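The link between the 5% frameshifting frequency and the roughly 20:1 ratio can be checked with a short calculation. The sketch below assumes that every ribosome that does not frameshift yields Gag and every ribosome that does yields Gag-Pol; the function name is illustrative only.

```python
def gag_to_gagpol_ratio(frameshift_fraction: float) -> float:
    """Return the Gag : Gag-Pol ratio implied by a given frameshifting frequency.

    Assumes each translation event yields Gag (no frameshift) or Gag-Pol (frameshift).
    """
    if not 0.0 < frameshift_fraction < 1.0:
        raise ValueError("frameshift_fraction must lie strictly between 0 and 1")
    return (1.0 - frameshift_fraction) / frameshift_fraction


# A 5% frameshifting frequency, as stated in the text.
ratio = gag_to_gagpol_ratio(0.05)
print(f"Gag : Gag-Pol is about {ratio:.0f} : 1")
```

Under this assumption the ratio comes out at 19:1, which is consistent with the approximate figure of 20:1.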
During viral maturation, the virally encoded protease cleaves the Pol polypeptide away from Gag and further digests it to separate the protease (p10), RT (p50), RNase H (p15), and integrase (p31) activities. These cleavages do not all occur efficiently; for example, roughly 50% of the RT protein remains linked to RNase H as a single polypeptide (p65).
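The nominal sizes quoted for the cleavage products can be cross-checked against the sizes of their precursors. The sketch below treats the number in each protein name (p17, p24, and so on) as a mass in kD, which is only an approximation because these names reflect apparent sizes; the dictionary names are illustrative.

```python
# Nominal sizes in kD, taken from the protein names used in the text.
GAG_PRODUCTS_KD = {"MA": 17, "CA": 24, "NC": 9, "p6": 6}
POL_PRODUCTS_KD = {"protease": 10, "RT": 50, "RNase H": 15, "integrase": 31}

gag_total = sum(GAG_PRODUCTS_KD.values())   # compare with the Gag precursor, p55
pol_total = sum(POL_PRODUCTS_KD.values())
gag_pol_total = gag_total + pol_total       # compare with the Gag-Pol precursor, p160
rt_rnase_h = POL_PRODUCTS_KD["RT"] + POL_PRODUCTS_KD["RNase H"]  # uncleaved form, p65

print(f"Gag products sum to {gag_total} kD (precursor: 55 kD)")
print(f"Gag + Pol products sum to {gag_pol_total} kD (precursor: 160 kD)")
print(f"RT + RNase H sum to {rt_rnase_h} kD (observed as p65)")
```

The sums (56, 162, and 65 kD) agree with the precursor sizes to within a few kD, which is as close as nominal sizes allow.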
The HIV-1 protease is an aspartyl protease(16) that acts as a dimer. Protease activity is required for cleavage of the Gag and Gag-Pol polyprotein precursors during virion maturation as described previously. The three-dimensional structure of the protease dimer has been determined.(17,18) Knowledge of this structure has led to a class of drugs directed toward inhibiting the HIV protease function. These antiviral compounds have greatly improved treatment for HIV-infected individuals.
The pol gene encodes reverse transcriptase. Pol has RNA-dependent and DNA-dependent polymerase activities. During the process of reverse transcription, the polymerase makes a double-stranded DNA copy of the dimer of single-stranded genomic RNA present in the virion. RNase H removes the original RNA template from the first DNA strand, allowing synthesis of the complementary strand of DNA. Viral DNA can be completely synthesized within 6 hours after viral entry, although the DNA may remain unintegrated for prolonged periods.(19) Many cis-acting elements in the viral RNA are required for the generation of viral DNA. For example, the TAR element, a small RNA stem-loop structure located at the 5' end of viral RNAs and containing the binding site for Tat, is required for the initiation of reverse transcription.(20) The predominant functional species of the polymerase is a heterodimer of p65 and p50. All of the pol gene products can be found within the capsid of free HIV-1 virions. Because the polymerase does not contain a proof-reading activity, replication is error-prone and introduces several point mutations into each new copy of the viral genome. The crystal structure of HIV-1 RT has been determined.(21)
| Integrase (IN)|
The IN protein mediates the insertion of the HIV proviral DNA into the genomic DNA of an infected cell. This process is mediated by three distinct functions of IN.(22) First, an exonuclease activity trims two nucleotides from each 3' end of the linear viral DNA duplex. Then, a double-stranded endonuclease activity cleaves the host DNA at the integration site. Finally, a ligase activity generates a single covalent linkage at each end of the proviral DNA. It is believed that cellular enzymes then repair the integration site. No exogenous energy source, such as ATP, is required for this reaction. The accessibility of the chromosomal DNA within chromatin, rather than specific DNA sequences, seems to influence the choice of integration sites.(23) Sites of DNA kinking within chromatin are thus "hot-spots" for integration, at least in vitro.(24) It is possible to promote integration within specific DNA regions by fusing integrase to sequence-specific DNA binding proteins.(25) Preferential integration into regions of open, transcriptionally active, chromatin may facilitate the expression of the provirus. Viral genes are not efficiently expressed from nonintegrated proviral DNA.(26)
The 160 kD Env (gp160) is expressed from singly spliced mRNA. First synthesized in the endoplasmic reticulum, Env migrates through the Golgi complex, where it undergoes glycosylation with the addition of 25 to 30 complex N-linked carbohydrate side chains at asparagine residues. Env glycosylation is required for infectivity.(27) A cellular protease cleaves gp160 to generate gp41 and gp120. The gp41 moiety contains the transmembrane domain of Env, while gp120 is held on the surface of the infected cell and of the virion through noncovalent interactions with gp41. Env exists as a trimer on the surface of infected cells and virions.(28)
Interactions between HIV and the viral receptor, CD4, are mediated through specific domains of gp120.(29) The structure of gp120 has recently been determined.(30) The gp120 moiety has nine highly conserved intrachain disulfide bonds. Also present in gp120 are five hypervariable regions, designated V1 through V5, whose amino acid sequences can vary greatly among HIV-1 isolates. One such region, called the V3 loop, is not involved in CD4 binding, but is rather an important determinant of the preferential tropism of HIV-1 for either T lymphoid cell lines or primary macrophages.(31) Sequences within the V3 loop interact with the HIV co-receptors CXCR4 and CCR5, which belong to the family of chemokine receptors and partially determine the susceptibility of cell types to given viral strains.(32,33) The V3 loop is also the principal target for neutralizing antibodies that block HIV-1 infectivity.(34) The gp120 moiety also interacts with the protein DC-SIGN, which is expressed on the cell surface of dendritic cells. Interaction with DC-SIGN increases the efficiency of infection of CD4-positive T cells.(35) Further, it is believed that DC-SIGN can facilitate mucosal transmission by transporting HIV to lymphoid tissues. The gp41 moiety contains an N-terminal fusogenic domain that mediates the fusion of the viral and cellular membranes, thereby allowing the delivery of the virion's inner components into the cytoplasm of the newly infected cell.(36) A new class of antiviral therapeutics, which prevent membrane fusion, is showing promise in clinical trials.
| Regulatory Proteins|
Tat is a transcriptional transactivator that is essential for HIV-1 replication.(37) The 72 and 101 amino acid long forms of Tat are expressed by early fully spliced mRNAs or late incompletely spliced HIV mRNAs, respectively. Both forms function as transcriptional activators and are found within the nuclei and nucleoli of infected cells. Tat is an RNA binding protein, unlike conventional transcription factors that interact with DNA.(38,39) Tat binds to a short stem-loop structure, known as the transactivation response element (TAR), that is located at the 5' terminus of HIV RNAs. Tat binding occurs in conjunction with cellular proteins that contribute to the effects of Tat. The binding of Tat to TAR activates transcription from the HIV LTR at least 1000-fold.
The mechanism of Tat function has recently been elucidated. Tat acts principally to promote the elongation phase of HIV-1 transcription, so that full-length transcripts can be produced.(40,41) In the absence of Tat expression, HIV generates primarily short (<100 nucleotides) transcripts. Stimulation of polymerase elongation is accomplished by the recruitment of a serine kinase which phosphorylates the carboxyl-terminal domain (CTD) of RNA polymerase II. This kinase, which is known as CDK9, is part of a complex which binds directly to Tat.(42) Tat function requires a cellular co-factor, known as Cyclin T, which facilitates the recognition of the TAR loop region by the Cyclin T-Tat complex.(43) The cellular uptake of Tat released by infected cells has been observed,(41) although the impact of this phenomenon on pathogenesis is unknown. Tat has been shown to activate the expression of a number of cellular genes including tumor necrosis factor beta(44) and transforming growth factor beta,(45) and to downregulate the expression of other cellular genes including bcl-2(46) and the chemokine, MIP-1 alpha.(47)
Rev is a 13-kD sequence-specific RNA binding protein.(48) Produced from fully spliced mRNAs, Rev acts to induce the transition from the early to the late phase of HIV gene expression.(49) Rev, which is encoded by two exons, accumulates within the nuclei and nucleoli of infected cells. Rev binds to a 240-base region of complex RNA secondary structure, called the Rev response element (RRE), that lies within the second intron of HIV.(50) Rev binds to a "bubble" within a double-stranded RNA helix containing a non-Watson-Crick G-G basepair.(51) This structure, known as the Rev high affinity binding site, is located in a region of the RRE known as stem loop 2.
The binding of Rev to the RRE facilitates the export of unspliced and incompletely spliced viral RNAs from the nucleus to the cytoplasm. Normally, RNAs that contain introns (ie, unspliced or incompletely spliced RNA) are retained in the nucleus. High levels of Rev expression can lead to the export of so much intron-containing viral RNA that the amount of RNA available for complete splicing is decreased, which, in turn, reduces the levels of Rev expression. Therefore, this ability of Rev to decrease the rate of splicing of viral RNA generates a negative feedback loop whereby Rev expression levels are tightly regulated.(52)
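The self-limiting behavior of such a feedback loop can be illustrated with a toy numerical model. The sketch below is not a measured model of HIV: the function name, the saturating export term, and all parameter values (`k_syn`, `k_deg`, `half_max`) are arbitrary assumptions chosen only to show that Rev levels settle to the same value whether they start low or high.

```python
def simulate_rev(rev0, steps=2000, dt=0.01, k_syn=1.0, k_deg=0.5, half_max=1.0):
    """Toy Euler simulation of Rev negative feedback (arbitrary units).

    The fraction of viral RNA exported before splicing rises with Rev, so the
    fraction left to become fully spliced (the source of new Rev) falls.
    """
    rev = rev0
    for _ in range(steps):
        exported = rev / (half_max + rev)      # hypothetical saturating export term
        fully_spliced = 1.0 - exported         # RNA still available for complete splicing
        rev += dt * (k_syn * fully_spliced - k_deg * rev)
    return rev


low_start = simulate_rev(0.0)    # no Rev at the start
high_start = simulate_rev(5.0)   # excess Rev at the start
print(f"final Rev from low start:  {low_start:.3f}")
print(f"final Rev from high start: {high_start:.3f}")
```

With these assumed parameters both runs converge on the same steady-state level, which is the qualitative behavior expected of a negative feedback loop.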
Rev has been shown to contain at least three functional domains.(53) An arginine-rich RNA binding domain mediates interactions with the RRE. A multimerization domain is required for Rev to function.(54) Rev is believed to exist as a homo-tetramer in solution.(55) Rev also contains an effector domain, which is a specific nuclear export signal (NES).(56,57) The export of the viral RNA by Rev is through a pathway typically used by the small nuclear RNAs (snRNAs) and the ribosomal 5S RNA rather than the normal pathway for cellular mRNAs.(57) Rev export is mediated through interactions with the NES receptor known as CRM1. NES mutants of Rev are dominant negative.(53) Inhibition is caused by the formation of non-functional multimers between NES-mutant and wild type Rev monomers.(58) Rev is absolutely required for HIV-1 replication: proviruses that lack Rev function are transcriptionally active but do not express viral late genes and thus do not produce virions.
| Accessory Proteins|
In addition to the gag, pol, and env genes contained in all retroviruses, and the tat and rev regulatory genes, HIV-1 contains four additional genes: nef, vif, vpr, and vpu, encoding the so-called accessory proteins. HIV-2 does not contain vpu, but instead harbors another gene, vpx. The accessory proteins are not absolutely required for viral replication in all in vitro systems, but represent critical virulence factors in vivo. Nef is expressed from a multiply spliced mRNA and is therefore Rev independent. In contrast, Vpr, Vpu, and Vif are the products of incompletely spliced mRNAs, and thus are expressed only during the late, Rev-dependent phase of infection. Most of the small accessory proteins of HIV have multiple functions as described below.
Nef (an acronym for negative factor) is a 27-kD myristoylated protein that is encoded by a single exon that extends into the 3' LTR. Nef, an early gene of HIV, is the first viral protein to accumulate to detectable levels in a cell following HIV-1 infection.(49) Its name is a consequence of early reports claiming that Nef down-regulated transcriptional activity of the HIV-1 LTR. It is no longer believed, however, that Nef has a direct effect on HIV gene expression. Nef has been shown to have multiple activities, including the downregulation of the cell surface expression of CD4, the perturbation of T cell activation, and the stimulation of HIV infectivity.
Nef acts post-translationally to decrease the cell-surface expression of CD4, the primary receptor for HIV.(59) Nef increases the rate of CD4 endocytosis and lysosomal degradation.(60) The cytoplasmic tail of CD4, and in particular a dileucine repeat sequence contained in its membrane proximal region, is key for the effect of Nef on CD4.(60) CD4 downregulation appears to be advantageous to viral production because an excess of CD4 on the cell surface has been found to inhibit Env incorporation and virion budding.(61,62) Nef also down-regulates the cell surface expression of Class I MHC, albeit to a lesser degree.(63) The downregulation of Class I MHC decreases the efficiency of the killing of HIV infected cells by cytotoxic T cells.
Nef perturbs T cell activation. Studies in the Jurkat T cell line indicated that Nef expression has a negative effect on induction of the transcription factor NF-kappa B and on IL-2 expression.(64) In contrast, results obtained in Nef transgenic mice revealed that Nef led to elevated T cell signaling.(65) The expression of a CD8-Nef chimeric molecule in Jurkat cells had either positive or negative effects depending on the cellular localization of the hybrid Nef molecule.(66) When the CD8-Nef protein accumulated in the cytoplasm, there was a block in normal signaling through the T cell receptor. When the CD8-Nef chimera was expressed at high levels on the cell surface, however, spontaneous activation followed by apoptosis was detected. Together, these observations suggest that Nef can exert pleiotropic effects on T cell activation depending on the context of expression. Consistent with this model, Nef has been found to associate with several different cellular kinases that are present in helper T lymphocytes.
Nef also stimulates the infectivity of HIV virions.(67) HIV-1 particles produced in the presence of Nef can be up to ten times more infectious than virions produced in the absence of Nef. Nef is packaged into virions, where it is cleaved by the viral protease during virion maturation.(68) The importance of this event, however, is not clear. Virions produced in the absence of Nef are less efficient for proviral DNA synthesis, although Nef does not appear to influence directly the process of reverse transcription.(69) The downregulation of CD4 and the effect on virion infectivity by Nef are genetically distinct as demonstrated by certain mutations that affect only one of these two activities.(70)
There is compelling genetic evidence that the Nef protein of simian immunodeficiency virus is absolutely required for high-titer growth and the typical development of disease in adult animals.(71) It is possible, however, for Nef-defective mutants of SIV to cause disease in newborn animals.(72) Further, Nef-defective virions do cause an AIDS-like disease in infected animals although onset is delayed.(73)
The Vpr protein is incorporated into viral particles. Approximately 100 copies of Vpr are associated with each virion.(74) Incorporation of Vpr into virions is mediated through specific interactions with the carboxyl-terminal region of p55 Gag,(19) which corresponds to p6 in the proteolytically processed protein.
Vpr plays a role in the ability of HIV to infect nondividing cells by facilitating the nuclear localization of the preintegration complex (PIC).(75) Vpr is present in the PIC. However, rather than tethering additional nuclear localization signals to the PIC, Vpr may act as a nucleocytoplasmic transport factor by directly tethering the viral genome to the nuclear pore. Consistent with this model, Vpr expressed in cells is found associated with the nuclear pore and can be biochemically demonstrated to bind to components of the nuclear pore complex.(76) Vpr can also block cell division.(77) Cells expressing Vpr accumulate in the G2 phase of the cell cycle.(78) The expression of Vpr has been shown to prevent the activation of the p34cdc2/cyclin B complex, which is a regulator of the cell cycle important for entry into mitosis.(79,80) Accordingly, expression of a constitutively active mutant of p34cdc2 prevents the Vpr-induced accumulation of cells in the G2 phase of the cell cycle.
Vpr has also been shown to interact with the cellular protein uracil-DNA glycosylase (UNG).(81) The biological consequences of this phenomenon have yet to be determined. Another enzyme that acts on deoxyuridine triphosphate (dUTP), deoxyuridine triphosphatase (dUTPase), is expressed by two lentiviruses that do not contain a vpr gene: equine infectious anemia virus and feline immunodeficiency virus. It is believed that the dUTPase depletes the dUTP within the cell, thus preventing the deleterious consequences of dUTP incorporation into viral DNA.(82)
The 16-kD Vpu polypeptide is an integral membrane phosphoprotein that is primarily localized in the internal membranes of the cell.(83) Vpu is expressed from the mRNA that also encodes env. Vpu is translated from this mRNA at levels tenfold lower than that of Env because the Vpu translation initiation codon is not efficient.(84) The two functions of Vpu, the down-modulation of CD4 and the enhancement of virion release, can be genetically separated.(85)
In HIV-infected cells, complexes form between the viral receptor, CD4, and the viral envelope protein in the endoplasmic reticulum, trapping both proteins within this compartment. The formation of intracellular Env-CD4 complexes thus interferes with virion assembly. Vpu liberates the viral envelope by triggering the ubiquitin-mediated degradation of CD4 molecules complexed with Env.(86)
Vpu also increases the release of HIV from the surface of an infected cell. In the absence of Vpu, large numbers of virions can be seen attached to the surface of infected cells.(87)
Vif is a 23-kD polypeptide that is essential for the replication of HIV in peripheral blood lymphocytes, macrophages, and certain cell lines.(88) In most cell lines, Vif is not required, suggesting that these cells may express a protein that can complement Vif function. These cell lines are called permissive for Vif mutants of HIV. Virions generated in permissive cells can infect nonpermissive cells but the virus subsequently produced is noninfectious.
Complementation studies indicate that it is possible to restore the infectivity of HIV Vif mutants by expression of Vif in producer cells but not in target cells.(89) These results indicate that Vif must be present during virion assembly. Vif is incorporated into virions of HIV.(90) This phenomenon, however, might be nonspecific because Vif is also incorporated into heterologous retroviruses such as murine leukemia viruses.(91) Studies producing HIV from heterokaryons generated by the fusion of permissive and non-permissive cells revealed that non-permissive cells contain a naturally occurring antiviral factor that is overcome by Vif.(92) Further support for a model that Vif is counteracting an antiviral cellular factor comes from the observation that Vif proteins from different lentiviruses are species specific.(93) For instance, HIV Vif can modulate the infectivity of HIV-2 and SIV in human cells while SIV Vif protein does not function in human cells. This observation suggests that cellular factors, rather than viral components, are the target of Vif action. Vif-defective HIV strains can enter cells but cannot efficiently synthesize the proviral DNA.(89) It is not clear whether the Vif defect affects reverse transcription per se, viral uncoating, or the overall stability of the viral nucleoprotein complex. Vif mutant virions have improperly packed nucleoprotein cores as revealed by electron microscopic analyses.(94)
| Regulation of HIV Gene Expression|
The regulation of HIV gene expression is accomplished by a combination of both cellular and viral factors. HIV gene expression is regulated at both the transcriptional and post-transcriptional levels. The HIV genes can be divided into the early genes and the late genes.(68,69) The early genes, tat, rev, and nef, are expressed in a Rev-independent manner. The mRNAs encoding the late genes, gag, pol, env, vpr, vpu, and vif require Rev in order to be cytoplasmically localized and expressed.
| Transcription of the Proviral Genome|
HIV transcription is mediated by a single promoter in the 5' LTR. Expression from the 5' LTR generates a 9-kb primary transcript that has the potential to encode all nine HIV genes. The primary transcript is roughly 600 bases shorter than the provirus. The primary transcript can be spliced into one of more than 30 mRNA species(95) or packaged without further modification into virion particles (to serve as the viral RNA genome).
The LTRs are composed of three subregions designated U3, R, and U5.(96) These regions are named because of their location within the primary transcript of HIV. The U3 region (for unique 3' sequence) is approximately 450-basepairs (bp) in length and is located at the 5' end of each LTR. The U3 region contains most of the cis-acting DNA elements, which are the binding sites for cellular transcription factors. The central region of each LTR contains the 100-bp R (for repeated sequence) region. Transcription begins at the first base of the R region and polyadenylation occurs immediately after the last base of R. The U5 region (for unique 5' sequence) is 180-bp in length and contains the Tat binding site and packaging sequences of HIV. The 3' end of U5 is defined by the location of a lysyl tRNA binding site. The lysyl tRNA acts as a primer for reverse transcription.
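The subregion lengths above also account for the size difference between the provirus and its primary transcript. The sketch below assumes that the transcript runs from the first base of R in the 5' LTR to the last base of R in the 3' LTR, so it lacks the 5' U3 and the 3' U5; it ignores the poly(A) tail and uses the approximate lengths given in the text.

```python
# Approximate lengths from the text, in bases.
U3_LEN, R_LEN, U5_LEN = 450, 100, 180
PROVIRUS_LEN = 9800  # "approximately 9.8 kilobases"

ltr_len = U3_LEN + R_LEN + U5_LEN

# The transcript starts at R of the 5' LTR (skipping the 5' U3)
# and ends at R of the 3' LTR (skipping the 3' U5).
missing_from_transcript = U3_LEN + U5_LEN
transcript_len = PROVIRUS_LEN - missing_from_transcript

print(f"Each LTR is about {ltr_len} bp long")
print(f"The transcript is about {missing_from_transcript} bases shorter than the provirus")
print(f"Estimated primary transcript length: {transcript_len} bases")
```

The result, a transcript about 630 bases shorter than the provirus and a little over 9 kb long, matches the figures of roughly 600 bases and 9 kb quoted for the primary transcript.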
| Regulation of Transcription|
The LTR of HIV contains DNA binding sites for several cellular transcription factors. Key among the DNA binding sites required for the activation of the transcription of the HIV provirus are those for the NF-kappa B family of transcription factors.(97) Two adjacent NF-kappa B sites are present in the U3 region of the HIV-1 LTR. The NF-kappa B protein allows the virus to be responsive to the activation state of the infected T cell. Stimulation of the T cell receptor (TCR) causes the inactive form of NF-kappa B, localized in the cytoplasm, to be translocated into the nucleus where it induces the expression of a series of T cell activation-specific genes. NF-kappa B and subsequent activation of HIV transcription can also be induced by the cytokines tumor necrosis factor alpha (TNF alpha)(98) and interleukin-1 (IL-1).(99) The HIV LTR also contains binding sites for the constitutive transcription factors SP-1, Lef, and Ets, along with binding sites for the inducible transcription factors NF-AT and AP-1.(100,101) Lef and NF-AT are both T cell-specific factors. The SP-1 binding sites are essential for the function of the HIV promoter.
The initial activation of the HIV LTR is a consequence of inducible and constitutive cellular transcription factors. Activation of the LTR by cellular transcription factors leads primarily to the generation of short transcripts.(40) These short transcripts are caused in part by an element located just downstream from the site of the initiation of transcription, known as the inducer of short transcripts (IST).(102) Some complete transcripts, however, are generated and allow the production of the Tat protein. The Tat protein then interacts with the TAR element to greatly increase the levels of transcription of viral RNAs. The Tat protein thus plays a key role in the activation and maintenance of high levels of transcription from the proviral DNA.
| mRNA Splicing and Cellular Localization|
The primary HIV-1 transcript contains multiple splice donors (5' splice sites) and splice acceptors (3' splice sites), which can be processed to yield more than 30 alternative mRNAs.(95) Many of the mRNAs are polycistronic; ie, they contain the open reading frame of more than one protein. The polycistronic mRNAs typically express a single gene product. Open reading frame choice is governed by the efficiency of the initiation codon and the proximity of the initiation codon to the 5' end of the mRNA.(103) HIV-1 mRNAs fall into three size classes (see Figure
Unspliced RNA. The unspliced 9-kb primary transcript can be expressed to generate the Gag and Gag-Pol precursor proteins or be packaged into virions to serve as the genomic RNA.
Incompletely spliced RNA. These mRNAs use the splice donor site located nearest the 5' end of the HIV RNA genome in combination with any of the splice acceptors located in the central region of the virus. These RNAs can potentially express Env, Vif, Vpu, Vpr, and the single-exon form of Tat. These heterogeneous mRNAs are 4- to 5-kb long and retain the second intron of HIV.
Fully spliced RNA. These mRNAs have spliced out both introns of HIV and have the potential to express Rev, Nef, and the two-exon form of Tat. These heterogeneous mRNAs do not require the expression of the Rev protein.
Normally, intron-containing RNAs must be completely spliced before they can exit the nucleus. This regulation is essential because it prevents the translation of intronic sequences contained in partially spliced mRNAs. The Rev protein binds to viral RNAs that retain intron sequences, and directs their export from the nucleus. This export allows the unconventional viral RNAs to bypass the normal "check point" of RNA splicing. The fully spliced viral mRNAs exit the nucleus by using the export pathway followed by the majority of cellular mRNAs. Threshold levels of Rev are necessary for exporting intron-containing HIV mRNAs, explaining why those encode the viral late gene products. In contrast, the proteins encoded by the fully spliced mRNAs, Nef, Tat, and Rev, can be produced immediately, and are thus early viral gene products.
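The relationship between splicing class, Rev dependence, and early or late expression can be summarized as a small lookup structure. The sketch below is an illustrative encoding of the three mRNA classes described above; the class and function names are hypothetical, and Rev dependence is derived from the rule that any retained intron requires Rev for export.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MrnaClass:
    name: str
    introns_retained: int  # 2 = unspliced, 1 = incompletely spliced, 0 = fully spliced
    products: tuple

    @property
    def needs_rev(self) -> bool:
        # Intron-containing RNAs stay in the nucleus unless Rev exports them.
        return self.introns_retained > 0


MRNA_CLASSES = (
    MrnaClass("unspliced", 2, ("Gag", "Gag-Pol")),
    MrnaClass("incompletely spliced", 1, ("Env", "Vif", "Vpu", "Vpr", "Tat (one exon)")),
    MrnaClass("fully spliced", 0, ("Rev", "Nef", "Tat (two exons)")),
)


def products_by_phase(classes=MRNA_CLASSES):
    """Group protein products into early (Rev-independent) and late (Rev-dependent)."""
    phases = {"early": [], "late": []}
    for mrna in classes:
        phases["late" if mrna.needs_rev else "early"].extend(mrna.products)
    return phases


phases = products_by_phase()
for phase, products in phases.items():
    print(f"{phase}: {', '.join(products)}")
```

Grouping by Rev dependence reproduces the early set (Rev, Nef, and two-exon Tat) and places the structural proteins and the remaining accessory proteins in the late set.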
Geijtenbeek TB, Kwon DS, Torensma R, van Vliet SJ, van Duijnhoven GC, Middel J, Cornelissen IL, Nottet HS, KewalRamani VN, Littman DR, Figdor CG, van Kooyk Y. DC-SIGN, a dendritic cell-specific HIV-1-binding protein that enhances trans-infection of T cells. Cell. 2000;100:587-97.