CDV3 (gene)

CDV3
Identifiers
Aliases: CDV3, H41, CDV3 homolog
External IDs: HomoloGene: 133862; GeneCards: CDV3
Gene location (human)
Chromosome: 3 [1]
Band: 3q22.1
Start: 133,573,730 bp [1]
End: 133,590,261 bp [1]
Location (UCSC): Chr 3: 133.57 – 133.59 Mb
PubMed search: [2] [3]

Protein CDV3 homolog, also known as carnitine deficiency-associated gene expressed in ventricle 3, is a protein that in humans is encoded by the CDV3 gene.

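CDV3 has been proposed as a prognostic biomarker for hepatocellular carcinoma. [4] It has also been considered as a potential target for gene therapy. The alias H41 is shared with a histone H4 gene of the slime mold Physarum polycephalum, [5] which should not be confused with human CDV3. Related gene families include plasma proteins and predicted intracellular proteins. [6]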

Gene

Aliases

The CDV3 protein is also commonly known as tyrosine-phosphorylated protein 36 (TPP36). TPP36 isoforms have been found to be substrates of Abl tyrosine kinase. [7]

Locus

The CDV3 gene is on chromosome 3 (3q22.1).

Chromosome location of CDV3 from NCBI Gene.

Exons

The listed number of exons in CDV3 varies between genetic databases. The number of exons also varies with the isoform in question, with most transcript isoforms having 5 exons. [8]

Span

The exons of the longest transcript isoform of the human CDV3 gene span 16,711 bp. [9]

Transcripts

Isoforms

CDV3 has seven reported isoforms, [8] and more are added to databases as they are discovered. The isoforms currently listed are designated a through f.

Protein

Molecular weight: 27.3 kDa

Protein length: 258 aa

Isoelectric point: 5.89 [10]

Motifs

A SAPS analysis [11] of the human CDV3 protein sequence found one uncharged cluster segment at residues 28-75. No high-scoring hydrophobic segments were found. One high-scoring transmembrane segment was found at residues 28-55. CDV3 was also found to have significant maximal spacing at residues 27-76.

Repeats

The following repetitive structures were found for the protein.

Aligned matching blocks:

[45-52]   AGAAGGGA
[66-73]   AGAAGPGA

with superset:

  [32-36]   AGAAG
  [45-49]   AGAAG
  [66-70]   AGAAG

______________________________

[134-137]   MEKS
[213-216]   MEKS

______________________________

Simple tandem repeat:

[31-43]   AAGAA_GSAGGSSG
[44-54]   AAGAAGGGAGA

Predicted Motifs

PROSITE found several potential motifs in CDV3. [12]  

Motif | Predicted Site (Amino Acid Position)
Alanine-rich region | 28-77
Glycine-rich region | 33-72
Predicted protein kinase C (PKC) phosphorylation sites | 25-27, 107-109, 178-180, 179-181, 201-203, 207-209
Casein kinase II phosphorylation sites | 79-82, 107-110, 207-210
Tyrosine kinase phosphorylation site | 237-244
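
Short phosphorylation motifs like those listed above are typically reported by matching consensus patterns against the protein sequence. The following is a minimal sketch in Python, assuming the standard PROSITE consensus patterns for protein kinase C sites ([ST]-x-[RK]), casein kinase II sites ([ST]-x(2)-[DE]), and tyrosine kinase sites ([RK]-x(2)-[DE]-x(3)-Y or [RK]-x(3)-[DE]-x(2)-Y), rewritten as regular expressions. The `scan` helper and the toy peptide are hypothetical illustrations; the peptide is not the CDV3 sequence, and this is not the procedure PROSITE itself runs.

```python
import re

# PROSITE-style consensus patterns rewritten as regular expressions.
PATTERNS = {
    "PKC phosphorylation site": r"[ST].[RK]",
    "Casein kinase II phosphorylation site": r"[ST].{2}[DE]",
    "Tyrosine kinase phosphorylation site": r"[RK].{2}[DE].{3}Y|[RK].{3}[DE].{2}Y",
}


def scan(sequence):
    """Return {motif name: [(start, end), ...]} using 1-based inclusive positions."""
    hits = {}
    for name, pattern in PATTERNS.items():
        # A lookahead lets overlapping sites (such as 178-180 and 179-181) be reported.
        regex = re.compile(f"(?=({pattern}))")
        hits[name] = [
            (m.start() + 1, m.start() + len(m.group(1)))
            for m in regex.finditer(sequence)
        ]
    return hits


# Hypothetical toy peptide for illustration only; NOT the CDV3 sequence.
toy = "MASTKRAGSAEDKLLEAAAYG"
for name, sites in scan(toy).items():
    print(name, sites)
```

Because such short patterns occur frequently by chance, hits of this kind are predictions rather than confirmed modification sites.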

Predicted Secondary Structure

Conceptual translation of the longest CDV3 isoform annotated with CDV3's predicted secondary structure and conserved amino acids.

The figure was developed using the programs JPred, CFSSP, and GOR4. Most of the CDV3 structure is predicted to consist of alpha helices and random coil.

Predicted 3D Structure

The 3D structure of CDV3 was predicted by submitting the amino acid sequence to the Zhang Lab's I-TASSER program.

Predicted CDV3 3D structure from I-TASSER.

Gene regulation

Promoter

Six promoters are currently predicted on the basis of supporting transcripts. The following promoters were found using Genomatix. Promoter GXP_141972 was chosen for further analysis because of its large number of supporting transcripts, and it was found to be conserved in 14 of 14 orthologous loci.

Promoter Name | Coordinates | Size (bp) | # of Supporting Transcripts
GXP_141970 | 133585623 - 133586723 | 1101 | 1
GXP_141972 | 133572563 - 133574180 | 1618 | 12
GXP_141973 | 133587952 - 133589052 | 1101 | 1
GXP_6749779 | 133573434 - 133574748 | 1315 | 13
GXP_7542845 | 133569573 - 133574748 | 1101 | *
GXP_7542846 | 133583006 - 133584149 | 1144 | *

*No transcript assigned.
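
The promoter sizes listed above appear to follow from the coordinates as an inclusive interval length, end - start + 1. The following is a minimal sketch in Python that recomputes them, assuming the coordinates are 1-based and inclusive (an assumption; the source does not state the convention). The function names are illustrative, and the values are copied from the table.

```python
# Rows copied from the promoter table: (name, start, end, listed size in bp).
PROMOTERS = [
    ("GXP_141970", 133585623, 133586723, 1101),
    ("GXP_141972", 133572563, 133574180, 1618),
    ("GXP_141973", 133587952, 133589052, 1101),
    ("GXP_6749779", 133573434, 133574748, 1315),
    ("GXP_7542845", 133569573, 133574748, 1101),
    ("GXP_7542846", 133583006, 133584149, 1144),
]


def inclusive_length(start, end):
    """Length of a 1-based, inclusive coordinate interval."""
    return end - start + 1


def size_mismatches(rows):
    """Names of rows whose listed size differs from end - start + 1."""
    return [
        name
        for name, start, end, size in rows
        if inclusive_length(start, end) != size
    ]


print(size_mismatches(PROMOTERS))
```

With the values as listed, every row except GXP_7542845 is consistent with this convention. For that row the coordinates and the listed size do not agree, so one of the two may have been transcribed incorrectly.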

Expression patterns

CDV3 is ubiquitously expressed, at relatively high levels, in all human tissues examined. Expression is higher in certain diseases.

Gene profile

Experiments measuring CDV3 expression show differing patterns across tissues; overall, the gene is expressed ubiquitously in all tissue types, with higher expression in tissues of the immune system and in skeletal muscle. [8]

HPA RNA-seq Normal Tissue Expression from NCBI Gene entry on CDV3.

The expression of CDV3 generally decreases throughout fetal development, but expression levels remain high.

Tissue-specific circular RNA induction during human fetal development from NCBI Gene entry on CDV3.
RNA sequencing of total RNA from 20 human tissues from NCBI Gene entry on CDV3.
Illumina bodyMap2 transcriptome from NCBI Gene entry on CDV3.

Protein Level Regulation

A conceptual translation was made from NCBI reference sequence NM_017548.4. Amino acids conserved in at least 70% of vertebrate orthologous proteins are bolded (seen in the section below).

A conceptual translation showing predicted sites of CDV3 protein regulation.

Evolution

Orthologs

The following orthologs were found through the NCBI database. [8] The date of divergence between each species and Homo sapiens was determined using TimeTree. Sequence identity and similarity were found using BLAST.

Genus and Species | Common Name | Taxonomic Group | Date of Divergence (Median Time, MYA) | Accession Number | Sequence Length (aa) | Sequence Identity (%) | Sequence Similarity (%)
Homo sapiens | Human | Mammalia | 0 | Q9UKY7 | 258 | 100 | 100
Macaca mulatta | Rhesus macaque | Mammalia | 28.1 | AFH33110 | 257 | 98 | 98
Callithrix jacchus | Common marmoset | Mammalia | 42.6 | JAB08658 | 257 | 98 | 98
Castor canadensis | American beaver | Mammalia | 88 | JAV41819 | 265 | 89 | 89
Mus musculus | House mouse | Mammalia | 89.8 | Q4VAA2.2 | 281 | 73 | 79
Lonchura striata domestica | Society finch | Bird | 320 | OWK55384 | 248 | 62 | 74
Xenopus laevis | African clawed frog | Amphibia | 353 | NP_001080515 | 240 | 58 | 73
Electrophorus electricus | Electric eel | Fish | 432 | XP_026860127 | 230 | 55 | 74
Oryzias melastigma | Marine medaka | Fish | 432 | XP_024136300 | 230 | 50 | 63
Danio rerio | Zebrafish | Fish | 432 | NP_997886 | 236 | 48 | 59
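
The ortholog table suggests that sequence identity to the human protein falls as divergence time increases. The following is a minimal sketch in Python that checks this trend on the tabulated values, assuming the divergence dates are in millions of years and the identities are percentages (the units are an assumption; the table does not state them). The function name is illustrative.

```python
# (species, median divergence from human, identity to human) copied from the table.
ORTHOLOGS = [
    ("Homo sapiens", 0, 100),
    ("Macaca mulatta", 28.1, 98),
    ("Callithrix jacchus", 42.6, 98),
    ("Castor canadensis", 88, 89),
    ("Mus musculus", 89.8, 73),
    ("Lonchura striata domestica", 320, 62),
    ("Xenopus laevis", 353, 58),
    ("Electrophorus electricus", 432, 55),
    ("Oryzias melastigma", 432, 50),
    ("Danio rerio", 432, 48),
]


def identity_declines_with_divergence(rows):
    """True if identity never increases as divergence time increases."""
    # Ties in divergence time are ordered by descending identity so that
    # species sharing one split (the three fish) do not break the check.
    ordered = sorted(rows, key=lambda row: (row[1], -row[2]))
    identities = [row[2] for row in ordered]
    return all(a >= b for a, b in zip(identities, identities[1:]))


print(identity_declines_with_divergence(ORTHOLOGS))
```

A check of this kind only describes the ten species listed; it does not by itself estimate a rate of evolution.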

Paralogs

No human paralogs of CDV3 were found in the GeneCards and GenesLikeMe databases of the Weizmann Institute of Science. A Google search turned up no other relevant sources.

Phylogenetic tree

A phylogenetic tree was developed from the species listed in the table above using "One Click Mode" on Phylogeny.fr.

Phylogenetic tree of species with CDV3 orthologs using Phylogeny.fr

Interacting proteins

Interacting Protein | Sources Supporting the Interaction | Function | Common Tissues
MYC | IntAct [13], mentha [14] | Family of regulator genes and proto-oncogenes; code for transcription factors; persistently expressed in cancer | Uterus, cervix, leukemia, carcinoma
EWSR1 | mentha [14], BioGRID [15] | EWS RNA-binding protein 1; EWS protein function is not fully understood | Brain, lymph, placenta, carcinoma, colon, cervix, liver, ubiquitous
RBM3 | mentha [14], BioGRID [15] | RNA binding motif (RNP1, RNA recognition motif) protein 3 | Placenta, carcinoma, T-cell, cervix, liver, colon
U2AF2 | mentha [14], BioGRID [15] | U2 small nuclear RNA auxiliary factor 2; necessary for splicing; non-snRNP protein | Lymph, carcinoma, colon, lymphoblast, cervix, T-cell, liver
ELAVL1 | mentha [14], BioGRID [15] | ELAV like RNA binding protein 1; stabilizes ARE-containing mRNAs; associated with several diseases and cancer | Intestine, cervix, lymphoblast, carcinoma, T-cell, colon, brain, muscle, thymus, ubiquitous
Pr55 (Gag) | mentha [14], BioGRID [15], NCBI [16] | Many diverse functions such as assembly and virion maturation; vital to HIV life cycle; cellular biotinylated CDV3 mouse homolog was found to be incorporated into this particle |
PIAS2 | NCBI [8] | Encodes inhibitor of activated STAT family; aids in sumoylation of target proteins |

Clinical significance

As noted earlier in the article, CDV3 has been found to be expressed in patients with various cancers and HIV, and it has been found to interact with Pr55 (Gag) of the HIV retrovirus. Without further expression studies, it is hard to determine how CDV3 levels change with disease state or what role the gene plays in these illnesses.


References

  1. GRCh38: Ensembl release 89: ENSG00000091527 - Ensembl, May 2017
  2. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  3. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. Xiao H, Zhou B, Jiang N, Cai Y, Liu X, Shi Z, Li M, Du C (June 2018). "The potential value of CDV3 in the prognosis evaluation in Hepatocellular carcinoma". Genes & Diseases. 5 (2): 167–171. doi:10.1016/j.gendis.2018.01.003. PMC 6147043. PMID 30258946.
  5. "H41 - Histone H4 - Physarum polycephalum (Slime mold) - H41 gene & protein". www.uniprot.org.
  6. "Tissue expression of CDV3". The Human Protein Atlas.
  7. Tsuchiya K, Kawano Y, Kojima T, Nagata K, Takao T, Okada M, Shinohara H, Maki K, Toyama-Sorimachi N, Miyasaka N, Watanabe M, Karasuyama H (February 2003). "Molecular cloning and characterization of TPP36 and its isoform TPP32, novel substrates of Abl tyrosine kinase". FEBS Letters. 537 (1–3): 203–9. doi:10.1016/S0014-5793(03)00127-3. PMID 12606058.
  8. "CDV3 CDV3 homolog [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2019-05-17.
  9. "Nucleotide Links for Gene (Select 55573) - Nucleotide - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2019-05-17.
  10. Kozlowski LP (October 2016). "IPC - Isoelectric Point Calculator". Biology Direct. 11 (1): 55. doi:10.1186/s13062-016-0159-9. PMC 5075173. PMID 27769290.
  11. "SAPS < Sequence Statistics < EMBL-EBI". www.ebi.ac.uk. Retrieved 2019-05-17.
  12. "Figure 4—figure supplement 2. Additional analysis of BioID hits". doi:10.7554/elife.20882.011.
  13. "interaction_id:EBI-3962281". IntAct.
  14. "Results - mentha: the interactome browser". mentha.uniroma2.it. Retrieved 2019-05-18.
  15. "CDV3 Result Summary | BioGRID". thebiogrid.org. Retrieved 2019-05-18.
  16. "CDV3 CDV3 homolog [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2019-05-18.