C2orf74

Chromosomal location of C2orf74. Image made using the NCBI Genome Decoration Page. [1]

C2orf74, also known as LOC339804, is a protein-coding gene located on the short arm of chromosome 2 at band 15 (2p15). [2] The gene spans 19,713 base pairs. [2] C2orf74 has orthologs in 135 different species, [2] primarily placental mammals and some marsupials.

The protein encoded by the C2orf74 gene has two isoforms, the longer of which (isoform 1) is 187 amino acids in length. [3] This protein has been linked to the development of autoimmune disorders such as ankylosing spondylitis [4] and to diseases affecting the colon. [5] [6] [7]

Gene

C2orf74 is located on the plus strand at 2p15 in humans. [2] It spans 19,713 base pairs, beginning at position 61,145,116 and ending at position 61,164,828, and includes 8 exons. [2] Other genes in its neighborhood include KIAA1841, LOC105374759, LOC105374758, LOC339803, AHSA2P, USP34, and SNORA70B. [2]
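The stated gene length follows directly from the two coordinates if both endpoints are counted. The sketch below assumes 1-based, inclusive coordinates (the usual convention on NCBI gene pages); the function name is illustrative only.

```python
def span_length(start: int, end: int) -> int:
    """Length of a genomic interval given 1-based, inclusive coordinates."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    # Both endpoints belong to the interval, hence the + 1.
    return end - start + 1


# Coordinates quoted in the text for C2orf74 on chromosome 2.
C2ORF74_START = 61_145_116
C2ORF74_END = 61_164_828

gene_length = span_length(C2ORF74_START, C2ORF74_END)
print(gene_length)
```

With the coordinates given above, the result agrees with the 19,713 bp figure quoted for the gene.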

Transcripts

Transcript variants

C2orf74 has 6 validated mRNA products created via alternative splicing that give rise to two different isoforms. [2] An extended version of Isoform 1 has also been sequenced utilizing a 5' in frame start codon, though this protein product is not formally acknowledged as a separate isoform by NCBI. [8]

C2orf74 Transcript variants
| Name | mRNA accession number | Transcript length | Number of exons | Protein length | Isoform |
|---|---|---|---|---|---|
| Transcript variant 1 | NM_001143959.4 | 1097 bp | 5 | 187 aa | 1 |
| Transcript variant 2 | NM_001143960.3 | 851 bp | 4 | 115 aa | 2 |
| Transcript variant 3 | NM_001316317.2 | 737 bp | 3 | 115 aa | 2 |
| Transcript variant 4 | NM_001367069.1 | 1002 bp | 5 | 115 aa | 2 |
| Transcript variant 5 | NM_001367070.1 | 1124 bp | 6 | 115 aa | 2 |
| Transcript variant 6 | NM_001367071.1 | 973 bp | 5 | 115 aa | 2 |
| Transcript variant 1 extension | A8MZ97 | 1097 bp | 5 | 194 aa | 1+ |

The above table is a compilation of the transcript variants of C2orf74 acknowledged on the C2orf74 gene page of NCBI.
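The many-to-one relationship between transcript variants and isoforms in the table can be made explicit with a small grouping step. This is only an illustrative sketch: the tuple layout and function name are assumptions, and only the six validated variants are included.

```python
from collections import defaultdict

# (variant, accession, transcript length in bp, exons, protein length in aa, isoform)
# Values copied from the transcript variant table; the record layout is an assumption.
TRANSCRIPTS = [
    ("variant 1", "NM_001143959.4", 1097, 5, 187, "1"),
    ("variant 2", "NM_001143960.3", 851, 4, 115, "2"),
    ("variant 3", "NM_001316317.2", 737, 3, 115, "2"),
    ("variant 4", "NM_001367069.1", 1002, 5, 115, "2"),
    ("variant 5", "NM_001367070.1", 1124, 6, 115, "2"),
    ("variant 6", "NM_001367071.1", 973, 5, 115, "2"),
]


def variants_by_isoform(rows):
    """Group transcript variant names by the isoform they encode."""
    groups = defaultdict(list)
    for variant, _accession, _bp, _exons, _aa, isoform in rows:
        groups[isoform].append(variant)
    return dict(groups)


for isoform, variants in variants_by_isoform(TRANSCRIPTS).items():
    print(f"isoform {isoform}: {', '.join(variants)}")
```

Grouping this way shows that a single transcript encodes isoform 1, while five distinct transcripts all encode the same 115 aa isoform 2.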

Proteins

There are two known isoforms of the C2orf74-encoded protein. Isoform 1 is derived from transcript variant 1 and is 187 amino acids in length. [3] There is a putative N-terminal extension of this isoform that uses an upstream start codon and adds 7 amino acids to the start of isoform 1, bringing the length of the protein to 194 amino acids. [8] Isoform 2 is derived from any one of transcript variants 2, 3, 4, 5, or 6. [2] It is produced from an alternative promoter and features a different 5' UTR and a shorter N-terminal end that excludes the first 3 exons, which encode the N-terminal end of isoform 1. The result is a shorter protein, 115 amino acids in length, that lacks the highly conserved transmembrane domain found at the N-terminal end of isoform 1. [9]

C2orf74 protein isoforms
| Name | Transcript variant | Peptide length | Domains present |
|---|---|---|---|
| Isoform 1 extension | 1 extension | 194 aa | TMEM, DUF |
| Isoform 1 | 1 | 187 aa | TMEM, DUF |
| Isoform 2 | 2, 3, 4, 5, 6 | 115 aa | DUF |

Figure: conceptual translation of isoform 1 of C2orf74, made using SixFrame. [10] Exon boundaries are depicted in blue font. The 5' UTR of this transcript is shown to have an upstream in-frame stop codon (red) and an upstream in-frame start codon (green). The putative N-terminal extension is depicted in light gray. The N-terminal transmembrane domain is highlighted in lavender. Regions conserved among orthologs are highlighted in cyan, while regions prone to deletion are highlighted in gray. Phosphorylation sites are highlighted in red with the phosphorylated amino acid underlined. Significant SNPs are highlighted in pink, with a key to the right detailing the type of change and the reason for inclusion. Polyadenylation signals in the 3' UTR are highlighted in orange.

Isoform 1

Isoform 1 of the C2orf74 protein has a calculated molecular weight of approximately 21 kDa and a pI of 5.74. [11] [12] It does not display any unusual amino acid composition, cysteine spacing, number of multiplets, or periodicity. [13] This isoform has a putative 7 aa N-terminal extension. [8] It contains a 21 aa transmembrane region beginning at position 7. [3]
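As a rough plausibility check on the quoted molecular weight, protein mass can be approximated from length alone. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb rather than the method used by the cited tools, which compute mass from the actual sequence.

```python
# Rule-of-thumb average mass of one amino acid residue in a protein, in daltons.
# This constant is an assumption for illustration, not a value from the article.
AVERAGE_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues: int, avg_residue_da: float = AVERAGE_RESIDUE_MASS_DA) -> float:
    """Estimate protein mass in kDa from the number of residues."""
    return n_residues * avg_residue_da / 1000.0


for label, length in [("isoform 1", 187), ("isoform 1 extension", 194), ("isoform 2", 115)]:
    print(f"{label}: {length} aa, roughly {approx_mass_kda(length):.1f} kDa")
```

For the 187 aa isoform 1 this estimate lands between 20 and 21 kDa, which is in line with the approximately 21 kDa reported above.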

Domains

The transmembrane region begins 7 amino acids from the N-terminal end of the protein, and ends at the 29th amino acid in humans. This region has been identified by NCBI, [3] as well as being supported by biochemical analysis. The biochemical qualities characterizing this region as a transmembrane region include a neutral charge cluster and a high-scoring hydrophobic segment, as well as alpha-helical secondary structure. [13] [14] This region is also highly conserved among all orthologs, indicating it as a region of functional significance. [15]
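A "high-scoring hydrophobic segment" of the kind described above is typically found with a sliding-window hydropathy scan. The sketch below uses the Kyte-Doolittle scale with a 19-residue window, which is a swapped-in, widely used approach and not necessarily the scoring used by the cited tools. The sequence `TOY_SEQUENCE` is invented for illustration and is not the C2orf74 sequence.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 19) -> list:
    """Mean Kyte-Doolittle score for every full window along the sequence."""
    if len(sequence) < window:
        raise ValueError("sequence is shorter than the window")
    return [
        sum(KYTE_DOOLITTLE[residue] for residue in sequence[i:i + window]) / window
        for i in range(len(sequence) - window + 1)
    ]


def most_hydrophobic_window(sequence: str, window: int = 19):
    """Return (start, end, mean score) of the best window, 1-based inclusive."""
    profile = hydropathy_profile(sequence, window)
    best = max(range(len(profile)), key=profile.__getitem__)
    return best + 1, best + window, profile[best]


# Hypothetical sequence: a short polar head, a hydrophobic stretch, a polar tail.
TOY_SEQUENCE = "MSDEKR" + "LLIVALLVAIFLLGIVLLA" + "KRDESGNQTPEDKS"

start, end, score = most_hydrophobic_window(TOY_SEQUENCE)
print(f"best window: residues {start}-{end}, mean hydropathy {score:.2f}")
```

A mean score above roughly 1.6 over a 19-residue window is the threshold Kyte and Doolittle suggested for candidate membrane-spanning segments; prediction servers combine such scans with additional evidence such as charge distribution and predicted helicity.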

The region downstream of the transmembrane region is classified as a domain of unknown function (DUF) under pfam15484. [3] Approximately 52% of this portion of the protein is predicted to be disordered, which limits confidence in any prediction of domain function. [16] However, the C-terminal end is highly conserved among all orthologs. [15]

Antibody staining results from The Human Protein Atlas. Immunocytochemical staining is reported as showing localization to the centrosome (green, frame A). Other images using the same antibody, as well as immunohistochemical results, show strong presence of the protein in the cytoplasm (green, frame B; dark brown, frame C).

Structure

C2orf74 isoform 1 is predicted to be dominated by helical secondary structure, with only short regions predicted to adopt beta-sheet conformations. [14] Tertiary structure predictions tend to show a globular DUF at the end of a helical transmembrane domain. [16] [18] Structural predictions of isoform 2, which includes only the DUF, also appear to be strictly globular in conformation. [16] [18]

Subcellular localization

The presence of a transmembrane domain suggests that isoform 1 of the C2orf74 product resides within a cellular membrane. Analysis of likely subcellular localization among orthologs indicates that the C2orf74 product is most likely found in the nuclear membrane, mitochondria, or endoplasmic reticulum. [19] Immunocytochemical imaging shows C2orf74 localized to the centrosome, while immunohistochemical imaging shows it concentrated in the cytoplasm. [17]

Gene level regulation

Promoter

Bird's-eye view of the coding region of the C2orf74 gene and the promoters in the region. Red boxes represent C2orf74 exons, while blue arrows represent promoter regions.

C2orf74 has 3 possible promoters that produce complete protein isoforms. Isoform 1 could be made by either GXP_6040264 or GXP_2056207, though GXP_6040264 shows the most promise, as it has a higher number of CAGE tags (249) than GXP_2056207 (133), and is conserved among several orthologs. Isoform 2 is made by the promoter GXP_649849. [20]

GXP_6040264 contains over 300 transcription factor binding sites, with a fork head domain factor (V$FKHD), a bromodomain and PHD domain transcription factor (V$BPTF), and a sex/testes determining and related HMG box factor (V$SORY) being the most conserved regions among mammals. [20]

Expression

C2orf74 is expressed at minimal levels in several cell types. Due to the low levels of expression, meaningful trends in localization are difficult to discern. [2] In situ hybridization of C2orf74 and some RNA sequencing assays indicate potential for localization in the cerebellum. [2] [21] Microarray data from NCBI GEO indicates lower levels of C2orf74 expression in individuals with colorectal tumors such as adenomas or cancerous colorectal tumors when compared to normal mucosa or tumors of non-colorectal origin such as carcinomas. [22]

Transcript level regulation

Predicted 3D structure of the human 3' UTR. Stem-loops are colored red, yellow, green, cyan, blue, purple, and magenta. Potential miRNA binding sites are labelled in light pink, and polyadenylation sites are labelled in orange.

The 5' UTR of transcript variant 1 is 232 bp in length and features an upstream in-frame stop codon as well as an upstream in-frame start codon. [9] If used, this start codon would add a 7 aa N-terminal extension to isoform 1. [8] Structural prediction for the 5' UTR of transcript variant 1 shows 2 hairpin structures. The 5' UTR of transcript variants 2 through 6 differs from that of transcript variant 1. The 5' UTR also varies greatly between orthologs, suggesting that it may not be a region of great importance for regulation.

The 3' UTR is shared by all human transcript variants, though it does not show significant conservation among mammalian species. It is 301 bp in length and contains two polyadenylation signals, at 981 bp and 1071 bp of transcript variant 1. [9] It also contains two partially conserved miRNA binding sites, at 73 bp (hsa-mir-241) and 270 bp (hsa-miR-23), [23] though neither of the miRNAs predicted to bind appears to be present in the human transcriptome. [24] The human 3' UTR is predicted to be rich in stem-loop structures.
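The lengths quoted for transcript variant 1 can be cross-checked against each other. The sketch below assumes that the coding sequence includes the stop codon and that the polyadenylation signal positions are 1-based coordinates on the full transcript; the function names are illustrative.

```python
FIVE_PRIME_UTR_BP = 232      # from the text
PROTEIN_LENGTH_AA = 187      # isoform 1
THREE_PRIME_UTR_BP = 301     # from the text
TRANSCRIPT_BP = 1097         # transcript variant 1
POLY_A_SIGNALS = (981, 1071) # assumed to be transcript coordinates


def cds_length_bp(protein_length_aa: int) -> int:
    """Coding sequence length, counting three bases per residue plus the stop codon."""
    return (protein_length_aa + 1) * 3


def three_prime_utr_start(five_prime_utr_bp: int, cds_bp: int) -> int:
    """1-based transcript position of the first base of the 3' UTR."""
    return five_prime_utr_bp + cds_bp + 1


def to_utr_offset(transcript_position: int, utr_start: int) -> int:
    """Convert a transcript coordinate into a 1-based offset within the 3' UTR."""
    return transcript_position - utr_start + 1


cds_bp = cds_length_bp(PROTEIN_LENGTH_AA)
utr_start = three_prime_utr_start(FIVE_PRIME_UTR_BP, cds_bp)

print("sum of parts:", FIVE_PRIME_UTR_BP + cds_bp + THREE_PRIME_UTR_BP)
for position in POLY_A_SIGNALS:
    print(f"signal at {position} bp lies {to_utr_offset(position, utr_start)} bp into the 3' UTR")
```

Under these assumptions the three segments add up to the full transcript length, and both polyadenylation signals fall inside the 3' UTR.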

Protein level regulation

C2orf74 is predicted to have 4 CK2 phosphorylation sites and 3 PKC phosphorylation sites. [25] CK2 and PKC phosphorylation sites are common among many orthologs. Myristoylation sites are also common among C2orf74 orthologs, though they are less conserved. [26]
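Predictions of this kind usually come from scanning the sequence for short consensus patterns. The sketch below uses the PROSITE-style patterns for CK2 sites ([ST]-x(2)-[DE]) and PKC sites ([ST]-x-[RK]) as regular expressions. The sequence `TOY_PEPTIDE` is invented for illustration and is not the C2orf74 sequence, so the hits shown say nothing about the real protein.

```python
import re

# Consensus patterns written as lookaheads so that overlapping sites are all reported.
MOTIFS = {
    "CK2": re.compile(r"(?=[ST]..[DE])"),  # [ST]-x(2)-[DE]
    "PKC": re.compile(r"(?=[ST].[RK])"),   # [ST]-x-[RK]
}


def find_phospho_sites(sequence: str) -> dict:
    """Return 1-based positions of the candidate phosphorylated S/T for each motif."""
    return {
        name: [match.start() + 1 for match in pattern.finditer(sequence)]
        for name, pattern in MOTIFS.items()
    }


# Hypothetical peptide used only to demonstrate the scan.
TOY_PEPTIDE = "MASGKETLDDSARTPEE"

for kinase, positions in find_phospho_sites(TOY_PEPTIDE).items():
    print(kinase, positions)
```

Because these motifs are only three or four residues long, they match frequently by chance, which is one reason conservation of a site across orthologs is treated as supporting evidence.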

Significance of Phosphorylation sites

CK2

Casein kinase 2 (CK2) is a serine/threonine-specific protein kinase that plays a significant role in cell signaling pathways related to the cell cycle, regulation, and development. The predicted CK2 sites may implicate C2orf74 as a member of an intracellular phosphorylation cascade governing cell development, which could help explain its association with conditions such as cancer and autoimmunity.

PKC

Protein kinase C is a family of protein kinases that are serine and threonine specific and play a role in regulating a broad range of cellular functions, particularly those involving phosphorylation cascades. As with CK2, C2orf74's association with PKC may implicate it as a signaling molecule involved in a phosphorylation cascade. This may provide context as to the nature of C2orf74's relationship to autoimmune disease and cancer.

Homology

Orthologs

C2orf74 first appeared in mammals and is found in animals as distantly related to humans as marsupials. [27] The table below highlights 20 selected orthologs from various mammalian clades, arranged by date of divergence from the human lineage. Red tiles indicate high similarity to the human sequence and blue tiles indicate low similarity. In general, the more recently a species diverged from the human lineage, the more similar its sequence is to the human one. Notable exceptions, however, include the galago, mouse, and manatee.

C2orf74 orthologs .png

Rate of Evolution

The figure below shows the evolutionary history of C2orf74 in more detail by comparing its divergence rate with those of cytochrome c and fibrinogen alpha. Fibrinogen alpha serves here as a standard example of a rapidly changing protein, and C2orf74 diverges even faster, so C2orf74 appears to be evolving quite quickly.

Figure 4: Rate of evolution comparison between C2orf74, Cytochrome C, and Fibrinogen Alpha. C2orf74 appears to evolve even faster than Fibrinogen alpha, which serves as a standard for rapidly evolving genes.
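The article does not state how divergence was quantified for this comparison. One common choice, shown here purely as an assumed illustration, is the Poisson-corrected distance d = -ln(1 - p), where p is the fraction of aligned positions that differ between two sequences. The correction accounts for multiple substitutions at the same site.

```python
import math


def corrected_distance(percent_identity: float) -> float:
    """Poisson-corrected distance from percent identity between two aligned proteins."""
    if not 0.0 < percent_identity <= 100.0:
        raise ValueError("percent identity must be in the interval (0, 100]")
    p = 1.0 - percent_identity / 100.0  # fraction of differing positions
    return -math.log(1.0 - p)


# Hypothetical identities, chosen only to show how the correction grows.
for identity in (95.0, 80.0, 50.0):
    print(f"{identity:.0f}% identity gives d = {corrected_distance(identity):.3f}")
```

Plotting such distances against divergence time for several proteins gives one line per protein, and a steeper slope indicates faster sequence change.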

Protein interactions

Predicted interacting proteins

Three proteins have been predicted to interact with C2orf74: POT1, SMAGP, and SRPK1.

POT1

POT1 is a telomere end-binding protein. It is not yet clear how this relates to the predicted function of C2orf74, given previous research and predictions of subcellular localization.

SMAGP

SMAGP is a small transmembrane and glycosylated protein. [28] Association with SMAGP is plausible given the predicted localization of both proteins to the nuclear membrane. It is possible that C2orf74 and SMAGP form part of a protein complex involved in intracellular signaling pathways.

SRPK1

SRPK1 is a protein kinase localized to the nucleus and cytoplasm. Association with SRPK1 is also plausible for C2orf74, given the subcellular localization of both proteins and their implication in phosphorylation.

Clinical significance

Disease association

Bowel disease

Several studies have linked C2orf74 to bowel disease. Two separate studies have identified C2orf74 as a potential susceptibility locus for Crohn's disease. [6] [7] Furthermore, various studies reported in NCBI GEO show differential expression of C2orf74 in benign and cancerous colorectal tumor tissues. [29]

Left: Microarray data from NCBI GEO showing a decreased level of C2orf74 expression in colorectal cancer cells, regardless of whether they were positive or negative for CD133 (a proposed biomarker for cancer), but not in other cell types such as carcinoma-associated fibroblasts. Right: Microarray data from NCBI GEO showing a decreased level of C2orf74 expression in colorectal adenomas, but not in normal mucosa. Note that adenomas are benign tumors that arise from normal mucosa, making the difference in C2orf74 expression relevant.

Autoimmune disease

Aside from Crohn's disease, C2orf74 has also been found to be a susceptibility locus for ankylosing spondylitis [4] and, more generally, for other unspecified autoimmune conditions. [5] The SNP believed to play a role in C2orf74's relationship to ankylosing spondylitis is found within the coding region of the gene and is marked in the conceptual translation in the Proteins section above. [6]

Mutations (SNPs of interest)

At amino acid position 36 there is a missense SNP at which the residue may be either tyrosine (Tyr, Y) or aspartate (Asp, D). This SNP, which is associated with ankylosing spondylitis, is found at 319 bp on transcript variant 1. [4]

References

  1. "Genome Decoration Page". www.ncbi.nlm.nih.gov. Retrieved 2020-12-19.
  2. "C2orf74 chromosome 2 open reading frame 74 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-09-30.
  3. "uncharacterized protein C2orf74 isoform 1 [Homo sapiens] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-09-30.
  4. Wang, Mengmeng; Xin, Lihong; Cai, Guoqi; Zhang, Xu; Yang, Xiao; Li, Xiaona; Xia, Qing; Wang, Li; Xu, Shengqian; Xu, Jianhua; Shuai, Zongwen (2017-05-11). "Pathogenic variants screening in seventeen candidate genes on 2p15 for association with ankylosing spondylitis in a Han Chinese population". PLOS ONE. 12 (5): e0177080. Bibcode:2017PLoSO..1277080W. doi:10.1371/journal.pone.0177080. ISSN 1932-6203. PMC 5426703. PMID 28493913.
  5. Gabrielsen, Ingvild S. M.; Amundsen, Silja Svanstrøm; Helgeland, Hanna; Flåm, Siri Tennebø; Hatinoor, Nimo; Holm, Kristian; Viken, Marte K.; Lie, Benedicte A. (2016-07-15). "Genetic risk variants for autoimmune diseases that influence gene expression in thymus". Human Molecular Genetics. 25 (14): 3117–3124. doi:10.1093/hmg/ddw152. ISSN 0964-6906. PMID 27199374.
  6. Franke, Andre; McGovern, Dermot P. B.; Barrett, Jeffrey C.; Wang, Kai; Radford-Smith, Graham L.; Ahmad, Tariq; Lees, Charlie W.; Balschun, Tobias; Lee, James; Roberts, Rebecca; Anderson, Carl A. (December 2010). "Genome-wide meta-analysis increases to 71 the number of confirmed Crohn's disease susceptibility loci". Nature Genetics. 42 (12): 1118–1125. doi:10.1038/ng.717. ISSN 1546-1718. PMC 3299551. PMID 21102463.
  7. Kenny, Eimear E.; Pe'er, Itsik; Karban, Amir; Ozelius, Laurie; Mitchell, Adele A.; Ng, Sok Meng; Erazo, Monica; Ostrer, Harry; Abraham, Clara; Abreu, Maria T.; Atzmon, Gil (2012). "A genome-wide scan of Ashkenazi Jewish Crohn's disease suggests novel susceptibility loci". PLOS Genetics. 8 (3): e1002559. doi:10.1371/journal.pgen.1002559. ISSN 1553-7404. PMC 3297573. PMID 22412388.
  8. "RecName: Full=Uncharacterized protein C2orf74 - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-10-25.
  9. "Homo sapiens chromosome 2 open reading frame 74 (C2orf74), transcript variant 1, mRNA". 2020-09-16.
  10. "Six-Frame Translation". www.bioline.com. Retrieved 2020-12-19.
  11. "C2orf74 Gene - GeneCards | CB074 Protein | CB074 Antibody". www.genecards.org. Retrieved 2020-12-19.
  12. "ExPASy - Compute pI/Mw tool". web.expasy.org. Retrieved 2020-12-19.
  13. "SAPS < Sequence Statistics < EMBL-EBI". www.ebi.ac.uk. Retrieved 2020-12-19.
  14. "Bioinformatics Toolkit". toolkit.tuebingen.mpg.de. Retrieved 2020-12-19.
  15. "Clustal Omega < Multiple Sequence Alignment < EMBL-EBI". www.ebi.ac.uk. Retrieved 2020-12-19.
  16. "PHYRE2 Protein Fold Recognition Server". www.sbg.bio.ic.ac.uk. Retrieved 2020-12-19.
  17. "The Human Protein Atlas". www.proteinatlas.org. Retrieved 2020-12-19.
  18. "I-TASSER server for protein structure and function prediction". zhanglab.ccmb.med.umich.edu. Retrieved 2020-12-19.
  19. "PSORT II Prediction". psort.hgc.jp. Retrieved 2020-12-19.
  20. "Genomatix" (in German). Archived from the original on 2001-02-24. Retrieved 2020-12-19.
  21. "Brain Map - brain-map.org". portal.brain-map.org. Retrieved 2020-12-19.
  22. "Home - GEO DataSets - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-12-19.
  23. "TargetScanHuman 7.2". www.targetscan.org. Retrieved 2020-12-19.
  24. "miRDB - MicroRNA Target Prediction Database". www.mirdb.org. Retrieved 2020-12-19.
  25. "Motif Scan". myhits.sib.swiss. Retrieved 2020-12-19.
  26. "ExPASy - Myristoylation tool". web.expasy.org. Retrieved 2020-12-19.
  27. "TimeTree :: The Timescale of Life". www.timetree.org. Retrieved 2020-12-19.
  28. "SMAGP Gene - GeneCards | SMAGP Protein | SMAGP Antibody". www.genecards.org. Retrieved 2020-12-19.
  29. "GEO Profile Links for Gene (Select 339804) - GEO Profiles - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-12-19.