WD Repeat and Coiled Coil Containing Protein

WDCP
Identifiers
Aliases: WDCP, C2orf44, PP384, WD repeat and coiled coil containing, MMAP
External IDs: OMIM: 616234; MGI: 3040699; HomoloGene: 49822; GeneCards: WDCP
Orthologs

| Species | Human | Mouse |
|---|---|---|
| Entrez | | |
| Ensembl | | |
| UniProt | | |
| RefSeq (mRNA) | NM_025203, NM_001142319 | NM_001170858, NM_173416 |
| RefSeq (protein) | NP_001135791, NP_079479 | NP_001164329, NP_775592 |
| Location (UCSC) | Chr 2: 24.03 – 24.05 Mb | Chr 12: 4.89 – 4.91 Mb |
| PubMed search | [3] | [4] |

WD repeat and coiled-coil-containing protein (WDCP) is a protein which in humans is encoded by the WDCP gene. The function of the protein is not completely understood, but WDCP has been identified in a fusion protein with anaplastic lymphoma kinase (ALK) found in colorectal cancer. [5] WDCP has also been identified in the MRN complex, which processes double-strand breaks in DNA. [6]

Gene

In humans, WDCP is located on the minus strand of chromosome 2 at locus 2p23.3. The gene is 20,235 bp long, spanning positions 24,029,340 – 24,047,575. WDCP lies between the MFSD2B and FKBP1B genes. [7] The gene contains 4 exons, which are detailed in the table below. [8]

| Exon number | Length (bp) | Start and end positions |
|---|---|---|
| Exon 1 | 78 | 24047393-24047343 |
| Exon 2 | 1836 | 24039512-24037677 |
| Exon 3 | 118 | 24032946-24032829 |
| Exon 4 | 1816 | 24031162-24029347 |

Table 1. Exons of WDCP and their various lengths.
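
As a quick consistency check on Table 1, the sketch below recomputes each exon's length from its listed coordinates, assuming 1-based inclusive positions (so length = |start − end| + 1). The function names are illustrative only. With the values as listed, exons 2 to 4 agree with their stated lengths, while exon 1 spans 51 positions against a listed length of 78; the sketch only flags the mismatch and does not determine which value is off.

```python
# Exon rows as listed in Table 1: (listed length, start, end).
# WDCP is on the minus strand, so start > end.
EXONS = {
    1: (78, 24047393, 24047343),
    2: (1836, 24039512, 24037677),
    3: (118, 24032946, 24032829),
    4: (1816, 24031162, 24029347),
}


def span_length(start: int, end: int) -> int:
    """Inclusive length of a 1-based coordinate interval, in either orientation."""
    return abs(start - end) + 1


def check_exons(exons: dict) -> dict:
    """Return {exon number: (listed length, computed length)} for rows that disagree."""
    mismatches = {}
    for number, (listed, start, end) in exons.items():
        computed = span_length(start, end)
        if computed != listed:
            mismatches[number] = (listed, computed)
    return mismatches


if __name__ == "__main__":
    for number, (listed, start, end) in EXONS.items():
        print(f"Exon {number}: listed {listed} bp, computed {span_length(start, end)} bp")
    print("Mismatching rows:", check_exons(EXONS))
```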

Common aliases of the gene include chromosome 2 open reading frame 44 (C2orf44), MMAP, and PP384. [9]

mRNA

WDCP isoform 1 is encoded by the mRNA WD repeat and coiled-coil containing, transcript variant 1. The full transcript is 18,045 bp long and is transcribed from positions 24,029,347 to 24,047,391 of the WDCP gene. [9] The coding DNA sequence is 3,848 nucleotides long. The 5' UTR contains 7,897 nucleotides, and the 3' UTR contains 1,597 nucleotides.

In addition to transcript variant 1, there are two known transcript variants of WDCP: transcript variant 2 and transcript variant X1. Information about the two transcripts is given in the table below. [8]

Transcript VariantAccession Number [9] Alternative Splicing PatternTranscript Length (bp)5' UTR Length3' UTR Length
WDCP Transcript Variant 2 [10] NP_001142319Removal of Exon 3186979321776
WDCP Transcript Variant X1 [11] XM_017005029Removal of Exon 420399689

Table 2. Transcript Variants of WDCP with their alternative splicing pattern in comparison to WDCP transcript variant 1.

Protein

Primary sequence

WDCP protein isoform 1 is 721 amino acids in length. Its molecular weight is 79 kDa and the theoretical isoelectric point is 6.2. [12] The protein sequence for WDCP Protein Isoform 1 is shown below. [13]

  1 MELGKGKLLR TGLNALHQAV HPIHGLAWTD GNQVVLTDLR LHSGEVKFGD SKVIGQFECV
 61 CGLSWAPPVA DDTPVLLAVQ HEKHVTVWQL CPSPMESSKW LTSQTCEIRG SLPILPQGCV
121 WHPKCAILTV LTAQDVSIFP NVHSDDSQVK ADINTQGRIH CACWTQDGLR LVVAVGSSLH
181 SYIWDSAQKT LHRCSSCLVF DVDSHVCSIT ATVDSQVAIA TELPLDKICG LNASETFNIP
241 PNSKDMTPYA LPVIGEVRSM DKEATDSETN SEVSVSSSYL EPLDLTHIHF NQHKSEGNSL
301 ICLRKKDYLT GTGQDSSHLV LVTFKKAVTM TRKVTIPGIL VPDLIAFNLK AHVVAVASNT
361 CNIILIYSVI PSSVPNIQQI RLENTERPKG ICFLTDQLLL ILVGKQKLTD TTFLPSSKSD
421 QYAISLIVRE IMLEEEPSIT SGESQTTYST FSAPLNKANR KKLIESLSPD FCHQNKGLLL
481 TVNTSSQNGR PGRTLIKEIQ SPLSSICDGS IALDAEPVTQ PASLPRHSST PDHTSTLEPP
541 RLPQRKNLQS EKETYQLSKE VEILSRNLVE MQRCLSELTN RLHNGKKSSS VYPLSQDLPY
601 VHIIYQKPYY LGPVVEKRAV LLCDGKLRLS TVQQTFGLSL IEMLHDSHWI LLSADSEGFI
661 PLTFTATQEI IIRDGSLSRS DVFRDSFSHS PGAVSSLKVF TGLAAPSLDT TGCCNHVDGM
721 A

Figure 1. Protein sequence of WDCP protein isoform 1.
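
The sequence in Figure 1 is laid out in the numbered, ten-residue-block style used by protein flat files. A minimal sketch of turning such a block into a plain string and looking up a residue by its 1-based position is shown below; the helper names are hypothetical, and only the first row of Figure 1 is included to keep the example short.

```python
import re


def parse_numbered_sequence(block: str) -> str:
    """Drop position numbers and whitespace from a numbered sequence block."""
    return "".join(re.findall(r"[A-Za-z]+", block)).upper()


def residue_at(sequence: str, position: int, start: int = 1) -> str:
    """Return the residue at a 1-based position; `start` is the position of sequence[0]."""
    return sequence[position - start]


# First row of Figure 1 (residues 1-60 of WDCP isoform 1).
first_row = "1 MELGKGKLLR TGLNALHQAV HPIHGLAWTD GNQVVLTDLR LHSGEVKFGD SKVIGQFECV"
seq = parse_numbered_sequence(first_row)

if __name__ == "__main__":
    print(len(seq), residue_at(seq, 1), residue_at(seq, 47))
```

Applying the same parser to all thirteen rows of Figure 1 would give the full isoform 1 sequence, which can then be used to look up any residue number quoted in the article.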

Compositional analysis of WDCP Isoform 1 shows no extremely high or low levels of particular amino acids. The protein contains no positive, negative, or mixed charged clusters. [14]

There are two additional isoforms of WDCP, as seen in the table below.

| Isoform name | Accession no. | Length (aa) | Molecular weight (kDa) [12] | Isoelectric point [12] |
|---|---|---|---|---|
| WDCP Protein Isoform 2 [15] | NP_001135791 | 622 | 69 | 6.5 |
| PREDICTED: WDCP Protein Isoform X1 [16] | XP_016860518 | 617 | 68 | 6.4 |

Table 3. WDCP protein isoforms and their properties.

Secondary structure

The secondary structure of WDCP Protein Isoform 1 consists of 47 random coils (429 residues, 59.5%), 19 alpha-helices (160 residues, 22.19%), and 31 extended strands (132 residues, 18.31%). [17]
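
The percentages above follow directly from the residue counts. The short sketch below reproduces them; the variable names are illustrative, and the counts are the ones quoted in the preceding paragraph.

```python
# Residue counts per predicted secondary-structure class, as quoted in the text.
counts = {
    "random coil": 429,
    "alpha helix": 160,
    "extended strand": 132,
}

total = sum(counts.values())  # should equal the isoform 1 length
percent = {name: round(100 * n / total, 2) for name, n in counts.items()}

if __name__ == "__main__":
    print("total residues:", total)
    for name, value in percent.items():
        print(f"{name}: {value}%")
```

The three classes sum to 721 residues, the full length of isoform 1, so every residue is assigned to exactly one class.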

Tertiary and quaternary structure

There are two predicted disulfide bonds in WDCP, one between cysteine residues 574 and 623, and the other between cysteine residues 713 and 714. [18]
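
A disulfide bond can only form between cysteines, so a simple sanity check is to confirm that the four quoted positions are cysteine residues. The sketch below does this against the C-terminal part of the isoform 1 sequence from Figure 1 (residues 541 to 721); the helper names are hypothetical, and the check says nothing about whether the bonds actually form.

```python
# Residues 541-721 of WDCP isoform 1, copied from Figure 1.
SEGMENT_START = 541
SEGMENT = (
    "RLPQRKNLQSEKETYQLSKEVEILSRNLVEMQRCLSELTNRLHNGKKSSSVYPLSQDLPY"
    "VHIIYQKPYYLGPVVEKRAVLLCDGKLRLSTVQQTFGLSLIEMLHDSHWILLSADSEGFI"
    "PLTFTATQEIIIRDGSLSRSDVFRDSFSHSPGAVSSLKVFTGLAAPSLDTTGCCNHVDGM"
    "A"
)

# Predicted disulfide pairs quoted in the text.
PREDICTED_PAIRS = [(574, 623), (713, 714)]


def residue_at(position: int) -> str:
    """Residue at a 1-based protein position that falls inside SEGMENT."""
    return SEGMENT[position - SEGMENT_START]


def cysteine_positions() -> list:
    """All 1-based positions of cysteine within SEGMENT."""
    return [SEGMENT_START + i for i, aa in enumerate(SEGMENT) if aa == "C"]


def pairs_are_cysteines(pairs) -> bool:
    """True if both residues of every pair are cysteines."""
    return all(residue_at(a) == "C" and residue_at(b) == "C" for a, b in pairs)


if __name__ == "__main__":
    print("cysteines in segment:", cysteine_positions())
    print("all predicted pairs are Cys-Cys:", pairs_are_cysteines(PREDICTED_PAIRS))
```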

Domains and motifs

WDCP protein domains include two tryptophan-aspartic acid repeat sites, multiple phosphorylation sites, and a domain that interacts with the hemopoietic cell kinase. [19]

Tissue expression

Across various tissue types, WDCP shows increased mRNA expression in white blood cells (3.0 RPKM), thymus (3.6 RPKM), lymph nodes, bone marrow, and testes. [9] WDCP exhibits increased protein expression in endocrine tissues, as well as the kidney and urinary bladder. [20] Across the tissues in the GTEx database, WDCP expression is highest in Epstein-Barr virus-transformed lymphocytes and lowest in the pancreas. [21] NCBI GEO records indicate that overall WDCP expression falls in the 65th-70th percentile relative to the Universal Human Reference RNA. [22]

In fetal tissue, WDCP mRNA expression is highest in the lung at 17 weeks (3.75 RPKM), the heart at 10 weeks (3.5 RPKM), and the intestine at 11 weeks (3.0 RPKM). By 17 weeks, WDCP expression in the intestine drops from 3.0 RPKM to 0.75 RPKM. The fetal kidney at 20 weeks exhibits the lowest WDCP expression, at 0.5 RPKM. [9]

Regulation of expression

Epigenetic

WDCP does not have any CpG islands associated with its promoter. WDCP has relatively low levels of H3K27ac, but higher levels of H3K4me1 and H3K4me3 across various cell types, including HeLa, HUVEC, and leukemia cell lines. [8]

Transcriptional

The GeneHancer promoter for WDCP is listed as GH02J024045. The transcription factor binding sites associated with this promoter and confirmed with a ChIP signal include HNF4A, CEBPB, EGR1, FOS, ETS1, and E2F6. [8] The binding sites for FOS, EGR1, and ETS1 are located in a DNase hypersensitive site.

Post-transcriptional

There are two transcript variants of WDCP detailed in the table in the mRNA section.

Translational and mRNA stability

The predicted mRNA secondary structure of the WDCP transcript contains a high number of stem-loops in both UTRs. The 5' UTR region closest to the start codon contains about 22 predicted loops. Stem-loops in the 5' UTR near the start codon could indicate lower levels of expression. [23] There are 108 predicted loops in the 3' UTR. [24] There are no known miRNA targets in the 3' UTR.

Post-translational modifications

Figure 3. WDCP conceptual translation. Annotations include repeat sites, known post-translational modification sites, and protein-protein interaction sites

WDCP Isoform 1 contains the following post-translational modifications:

Glycation is the addition of a sugar molecule to an amino acid and is associated with pathologies including renal failure and diabetes. [25] Glycation is predicted to occur at lysine residues: 5, 7, 83, 189, 244, 262, 294, 325, 389, 405, 407, 461, 552, and 617.

N-terminal acetylation is the addition of an acetyl group at the starting methionine residue. It is usually associated with metabolism-related pathways. WDCP has one confirmed acetylation site at the starting methionine residue. [26]

Phosphorylation is the addition of a phosphate group to amino acids. It is mainly associated with cellular signaling pathways and can instigate tumor development. Serine, threonine, and tyrosine phosphorylation sites were identified at 27 residues at a NetPhos threshold of 0.9. [27] Phosphorylation was detected at:

Possible kinases that interact with WDCP include Casein kinase 1, Casein kinase 2, cAMP, cGMP, P38MAPK, DNAPK, Protein kinase A, and Protein kinase C. [27]

SUMOylation is the addition of a small ubiquitin-like modifier to lysine residues in proteins. SUMOylation sites in WDCP include lysine residues 47, 152, 298, 310, and 709, with lysine residues 47 and 152 having the highest probability of SUMOylation. [28] SUMOylation can affect protein-protein interactions and protein ubiquitination. [29]

Palmitoylation is the addition of a fatty acid chain to cysteine residues. There is one confirmed site of palmitoylation at cysteine residue 714. [30]

GalNAc O-glycosylation is the addition of a sugar molecule to a serine or threonine residue, which possibly increases structural stability. [31] Some of these residues overlap with phosphorylation sites, indicating that these residues can switch between phosphorylation and O-glycosylation. [32] [33] These sites were detected at:

N-glycosylation is the addition of a sugar molecule to an asparagine residue. Asparagine residue 483 is the only detected N-glycosylation site in WDCP. [34]
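
N-glycosylation generally requires the sequon Asn-X-Ser/Thr, where X is any residue except proline. The sketch below scans residues 481 to 540 of isoform 1 (copied from Figure 1) for that pattern. It is a simple motif search with hypothetical names, not the NetNGlyc method cited above: a sequon match is necessary but not sufficient for glycosylation, and only this one segment is examined.

```python
import re

# Residues 481-540 of WDCP isoform 1, copied from Figure 1.
SEGMENT_START = 481
SEGMENT = "TVNTSSQNGRPGRTLIKEIQSPLSSICDGSIALDAEPVTQPASLPRHSSTPDHTSTLEPP"

# Asn, then any residue except Pro, then Ser or Thr.
# The lookahead keeps overlapping candidates from hiding each other.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(segment: str, start: int) -> list:
    """Return 1-based positions of asparagines that begin an N-X-S/T sequon."""
    return [start + match.start() for match in SEQUON.finditer(segment)]


if __name__ == "__main__":
    print(find_sequons(SEGMENT, SEGMENT_START))
```

Within this segment the only match is asparagine 483 (N-T-S); asparagine 488 is followed by G-R and does not fit the pattern.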

No sites were predicted for amidation, C-linked mannosylation, GPI modification, non-classical protein secretion, transmembrane helices or regions, arginine and lysine cleavage, lipoprotein signals, tyrosine sulfation, or twin-arginine signal peptides. [35]

Subcellular localization

WDCP Isoform 1 has no transmembrane domains, actin-binding motifs, ER retention motifs, or Golgi transport signals. The protein is predicted to be most likely located in the nucleus, with a reliability score of 47.8%, and to have a 30.4% chance of being located in the cytoplasm. [36] [37] Predictions for close orthologs of WDCP Isoform 1 give similar results, with the nucleus as the most likely location. [37] In addition, there are two predicted nuclear localization sequences in WDCP, starting at residues 401 and 581. [38]

Immunostaining of WDCP has shown localization in the nucleoli of osteosarcoma cells, as well as the cytoplasm of kidney cells. [39]

Function

The function of WDCP is currently not well understood, but its increased expression in the bone marrow and thymus suggests a possible role in immune function and development. Its location in the nucleus, relation to the MRN complex, abundance of phosphorylation sites, and associations with various cancers could indicate a role in cell growth regulation or a proto-oncogenic function.

Interacting proteins

WDCP is known to interact with HCK: a proline-rich region of WDCP binds to the Src homology 3 domain of HCK. As mentioned above, WDCP has been found in a fusion with ALK. This fusion changes the structure of ALK, which results in constitutive signaling. [40]

Studies in human embryonic kidney cells have confirmed interactions between WDCP and RuvB-like proteins 1 and 2, which belong to a family of AAA proteins associated with ATPase activity, as well as C1q and tumor necrosis factor related protein 2 and DYNLT1. [41] [42] [43]

Based on the transcription factor binding sites listed in the transcriptional regulation section, WDCP could have possible interactions with the following transcription factors:

Clinical significance

Studies have linked WDCP to various cancers, including colorectal cancer, leukemia, and osteosarcoma. WDCP levels are higher in colorectal cancer metastases than in the primary tumor. [51] GEO records show elevated levels of WDCP in leukemia cell lines, which are regulated by imatinib, a drug used to treat chronic myelogenous leukemia. [52] This pattern is also seen in HeLa cell lines treated with Casiopeinas, small molecules with an active Cu2+ center that allows the molecule to bind to tumors and induce apoptosis. [53] [54]

Homology

Figure 4. WDCP evolutionary rate graph

There are no paralogs of WDCP, but orthologs of this gene were found in primates, rodents, reptiles, birds, fish, amphibians, echinoderms, and possibly fungi. There are no orthologs in prokaryotes or plants. There were no organisms with proteins containing homologous domains. [55]

Figure 4 shows the rate of evolution of WDCP in comparison to that of the fibrinogen alpha chain (NCBI: NP_068657) and cytochrome c (NCBI: NP_061820). WDCP evolves faster than cytochrome c but slower than the fibrinogen alpha chain.

While there are some sequences in WDCP that are conserved (which can be seen in the conceptual translation), there are very few known conserved domains among the various orthologs. There is one conserved glycation site detected through a multiple sequence alignment, lysine 389. [56] The table below shows a list of orthologs, the evolutionary date of divergence between the organism and humans, and the % identity between WDCP Isoform 1 and the orthologous protein sequence.

| Organism | Accession number [55] | Date of divergence (MYA) [57] | % ID (compared to Homo sapiens) |
|---|---|---|---|
| Chimpanzee | XP_001143574 | 6 | 99 |
| Rhesus monkey | NP_001181022.2 | 29 | 95 |
| Mouse | NP_001164329.1 | 89 | 65 |
| Common wall lizard | XP_028578194 | 318 | 51 |
| Crested ibis | XP_009465592 | 324 | 51 |
| Central bearded dragon | XP_020643724 | 318 | 49 |
| African clawed frog | NP_001090236 | 352 | 51 |
| Barn owl | KFV58830 | 318 | 49 |
| Whale shark | XP_020374495 | 465 | 48 |
| Zebrafish | NP_001013552 | 433 | 46 |
| Tropical clawed frog | XP_017949975 | 352 | 40 |
| Black-legged tick | XP_022781485.1 | 736 | 29 |
| Octopus | XP_029634208 | 736 | 27 |

Table 4. Organisms with a WDCP orthologous protein.
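
To relate the identity values in Table 4 to divergence time, the sketch below converts each percent identity into an uncorrected sequence divergence (100 − % ID) and orders the orthologs by divergence date. How the curve in Figure 4 was actually computed is not stated in the article, so this is only a crude proxy under that assumption: it applies no correction for multiple substitutions at the same site, and the function names are illustrative.

```python
# (organism, divergence from humans in MYA, % identity to human WDCP isoform 1),
# values read from Table 4.
ORTHOLOGS = [
    ("Chimpanzee", 6, 99),
    ("Rhesus monkey", 29, 95),
    ("Mouse", 89, 65),
    ("Common wall lizard", 318, 51),
    ("Crested ibis", 324, 51),
    ("Central bearded dragon", 318, 49),
    ("African clawed frog", 352, 51),
    ("Barn owl", 318, 49),
    ("Whale shark", 465, 48),
    ("Zebrafish", 433, 46),
    ("Tropical clawed frog", 352, 40),
    ("Black-legged tick", 736, 29),
    ("Octopus", 736, 27),
]


def uncorrected_divergence(percent_identity: float) -> float:
    """Percent of aligned positions that differ (no multiple-hit correction)."""
    return 100.0 - percent_identity


def divergence_table(orthologs):
    """Rows of (organism, MYA, % divergence), sorted by date and then divergence."""
    rows = [(name, mya, uncorrected_divergence(pid)) for name, mya, pid in orthologs]
    return sorted(rows, key=lambda row: (row[1], row[2]))


if __name__ == "__main__":
    for name, mya, divergence in divergence_table(ORTHOLOGS):
        print(f"{name:24s} {mya:4d} MYA  {divergence:5.1f}% divergent")
```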

Notes

  1. GRCh38: Ensembl release 89: ENSG00000163026 - Ensembl, May 2017
  2. GRCm38: Ensembl release 89: ENSMUSG00000051721 - Ensembl, May 2017
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. Yakirevich E, Resnick MB, Mangray S, Wheeler M, Jackson CL, Lombardo KA, et al. (August 2016). "Oncogenic ALK Fusion in Rare and Aggressive Subtype of Colorectal Adenocarcinoma as a Potential Therapeutic Target". Clinical Cancer Research. 22 (15): 3831–40. doi:10.1158/1078-0432.ccr-15-3000. PMID 26933125.
  6. "WDCP Gene". GeneCards. GeneCards Suite. Retrieved 28 April 2020.
  7. "Gene: WDCP (ENSG00000163026) - Summary - Homo sapiens - Ensembl genome browser 89". may2017.archive.ensembl.org. Retrieved 3 May 2020.
  8. "UCSC Genome Browser Entry on WDCP gene in humans". UCSC Genome Browser. University of California, Santa Cruz. Retrieved 19 April 2020.
  9. "WDCP WD repeat and coiled coil containing [ Homo sapiens (human) ]". National Center for Biotechnology Information. NIH. Retrieved 29 April 2020.
  10. "WD repeat and coiled-coil-containing transcript variant (Homo sapiens)". National Center for Biotechnology Information. National Institutes of Health. May 2020. Retrieved 3 May 2020.
  11. "PREDICTED: Homo sapiens WD repeat and coiled coil containing (WDCP), transcript variant X1, mRNA". National Center for Biotechnology Information. National Institute of Health. 2 March 2020. Retrieved 3 May 2020.
  12. "Compute pI/MW tool". ExPASy. SIB Bioinformatics Resource Portal. Retrieved 3 May 2020.
  13. "WD repeat and coiled-coil-containing protein isoform 1 [Homo sapiens]". National Center for Biotechnology Information. National Institute of Health. Retrieved 3 May 2020.
  14. "SAPS < Sequence Statistics < EMBL-EBI". www.ebi.ac.uk. EMBL-EBI. Retrieved 3 May 2020.
  15. "WD repeat and coiled-coil containing protein isoform 2". National Center for Biotechnology Information. National Institute of Health. Retrieved 3 May 2020.
  16. "WD repeat and coiled-coil-containing protein isoform X1 [Homo sapiens]". National Center for Biotechnology Information. National Institute of Health. Retrieved 3 May 2020.
  17. "abi GOR IV Protein Secondary Structure Prediction Method". Rhone-Alpes Bioinformatic Pole Gerland Site. Prabi-Gerland. Retrieved 3 May 2020.
  18. "DISULFIND - Cysteines Disulfide Bonding State and Connectivity Predictor". disulfind.dsi.unifi.it. Retrieved 3 May 2020.
  19. "WD repeat and coiled-coil-containing protein isoform 1 [Homo sapiens]". National Center for Biotechnology Information. National Institute of Health. Retrieved 19 April 2020.
  20. Human Protein Atlas Entry on WDCP
  21. "GTEx Expression for WDCP". GTEx Portal. Retrieved 3 May 2020.
  22. "GDS3113 Records on WDCP Expression in Various Normal Tissues / 116986". www.ncbi.nlm.nih.gov. Retrieved 3 May 2020.
  23. Lamping E, Niimi M, Cannon RD (July 2013). "Small, synthetic, GC-rich mRNA stem-loop modules 5' proximal to the AUG start-codon predictably tune gene expression in yeast". Microbial Cell Factories. 12 (1): 74. doi:10.1186/1475-2859-12-74. PMC 3765126. PMID 23895661.
  24. "Displaying 15/20May03-15-15-53/20May03-15-15-53_1.jpg". unafold.rna.albany.edu. SUNY Albany. Retrieved 3 May 2020.
  25. Wautier JL, Schmidt AM (2004). "Protein Glycation". Circulation Research. 95 (3): 233–238. doi:10.1161/01.res.0000137876.28454.64.
  26. "Terminus". ExPASy. SIB Bioinformatics Portal. Retrieved 3 May 2020.
  27. "NetPhos 3.1 Server". DTU Bioinformatics. Department of Bio and Health Informatics. Retrieved 3 May 2020.
  28. "SUMOplot Analysis Program". Abcepta. Retrieved 3 May 2020.
  29. Wilkinson KA, Henley JM (May 2010). "Mechanisms, regulation and consequences of protein SUMOylation". The Biochemical Journal. 428 (2): 133–45. doi:10.1042/BJ20100158. PMC 3310159. PMID 20462400.
  30. "CSS-Palm. Prediction of Palmitoylation Site". CSS-Palm. Prediction of Palmitoylation Site. The Cuckoo WorkGroup. Retrieved 3 May 2020.
  31. Van den Steen P, Rudd PM, Dwek RA, Opdenakker G (1 January 1998). "Concepts and principles of O-linked glycosylation". Critical Reviews in Biochemistry and Molecular Biology. 33 (3): 151–208. doi:10.1080/10409239891204198. PMID 9673446.
  32. "NetOGlyc 4.0 Server". DTU Bioinformatics. Department of Bio and Health Informatics. Retrieved 3 May 2020.
  33. "YinOYang 1.2". DTU Bioinformatics. Department of Bio and Health Informatics. Retrieved 3 May 2020.
  34. "NetNGlyc". DTU Bioinformatics. Department of Bio and Health Informatics. Retrieved 3 May 2020.
  35. "Proteomics". exPASy. SIB Bioinformatics Resource Portal. Retrieved 3 May 2020.
  36. "SOSUI". SOSUI.
  37. "PSORT II Prediction". ExPASy. SIB Bioinformatics Portal. Retrieved 3 May 2020.
  38. Kosugi, Shunichi. "cNLS Mapper". cNLS Mapper. Archived from the original on 22 November 2021. Retrieved 3 May 2020.
  39. "C2orf44 Antibody (PA5-59410)". www.thermofisher.com. Retrieved 29 April 2020.
  40. Yokoyama N, Miller WT (January 2015). "Molecular characterization of WDCP, a novel fusion partner for the anaplastic lymphoma tyrosine kinase ALK". Biomedical Reports. 3 (1): 9–13. doi:10.3892/br.2014.374. PMC 4251150. PMID 25469238.
  41. Ewing RM, Chu P, Elisma F, Li H, Taylor P, Climie S, et al. (2007). "Large-scale mapping of human protein-protein interactions by mass spectrometry". Molecular Systems Biology. 3: 89. doi:10.1038/msb4100134. PMC 1847948. PMID 17353931.
  42. Huttlin EL, Bruckner RJ, Paulo JA, Cannon JR, Ting L, Baltier K, et al. (May 2017). "Architecture of the human interactome defines protein communities and disease networks". Nature. 545 (7655): 505–509. Bibcode:2017Natur.545..505H. doi:10.1038/nature22366. PMC 5531611. PMID 28514442.
  43. Cloutier P, Poitras C, Durand M, Hekmat O, Fiola-Masson É, Bouchard A, et al. (May 2017). "R2TP/Prefoldin-like component RUVBL1/RUVBL2 directly interacts with ZNHIT2 to regulate assembly of U5 small nuclear ribonucleoprotein". Nature Communications. 8: 15615. Bibcode:2017NatCo...815615C. doi:10.1038/ncomms15615. PMC 5460035. PMID 28561026.
  44. Zhang B, Wang J, Wang X, Zhu J, Liu Q, Shi Z, et al. (September 2014). "Proteogenomic characterization of human colon and rectal cancer". Nature. 513 (7518): 382–7. Bibcode:2014Natur.513..382.. doi:10.1038/nature13438. PMC 4249766. PMID 25043054.
  45. "CEBPB CCAAT enhancer binding protein beta [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. National Institutes of Health.
  46. "EGR1 early growth response 1 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. National Institutes of Health.
  47. Knapska E, Kaczmarek L (November 2004). "A gene for neuronal plasticity in the mammalian brain: Zif268/Egr-1/NGFI-A/Krox-24/TIS8/ZENK?". Progress in Neurobiology. 74 (4): 183–211. doi:10.1016/j.pneurobio.2004.05.007. PMID 15556287. S2CID 39251786.
  48. Wang ZQ, Liang J, Schellander K, Wagner EF, Grigoriadis AE (December 1995). "c-fos-induced osteosarcoma formation in transgenic mice: cooperativity with c-jun and the role of endogenous c-fos". Cancer Research. 55 (24): 6244–51. PMID 8521421.
  49. Gallant S, Gilkeson G (2006). "ETS transcription factors and regulation of immunity". Archivum Immunologiae et Therapiae Experimentalis. 54 (3): 149–63. doi:10.1007/s00005-006-0017-z. PMID 16652219. S2CID 10512011.
  50. Trimarchi JM, Fairchild B, Wen J, Lees JA (13 February 2001). "The E2F6 transcription factor is a component of the mammalian Bmi1-containing polycomb complex". Proceedings of the National Academy of Sciences. 98 (4): 1519–1524. Bibcode:2001PNAS...98.1519T. doi:10.1073/pnas.98.4.1519. ISSN 0027-8424. PMC 29289. PMID 11171983.
  51. "GDS1780 Records of WDCP: Colorectal cancer progression: polysomal mRNA profiles". NCBI GEO. National Institute of Health. Retrieved 3 May 2020.
  52. "GDS3048 Record on WDCP". NCBI GEO. National Institute of Health. Retrieved 29 April 2020.
  53. "GDS4665 Records on WDCP: HeLa cell line response to chemotherapeutic Casiopeinas". NCBI GEO. National Institute of Health. Retrieved 3 May 2020.
  54. Mejia C, Ruiz-Azuara L (December 2008). "Casiopeinas IIgly and IIIia induce apoptosis in medulloblastoma cells". Pathology & Oncology Research. 14 (4): 467–72. doi:10.1007/s12253-008-9060-x. PMID 18521723. S2CID 28722785.
  55. "Basic Local Alignment Search Tool". National Center for Biotechnology Information. National Institute of Health. Retrieved 31 March 2020.
  56. "Clustal Omega < Multiple Sequence Alignment < EMBL-EBI". www.ebi.ac.uk. Retrieved 3 May 2020.
  57. "TimeTree". TimeTree: The Timescale of Life. Temple University Institute for Genomics and Evolutionary Medicine Center of Biodiversity. Retrieved 3 May 2020.
