C20orf96

C20orf96 (chromosome 20 open reading frame 96) is a protein-coding gene in humans. [1] It encodes uncharacterized protein C20orf96, which is predicted to localize to the nucleus. The function of the gene and the biological processes it takes part in are not yet well understood.

Gene aliases

C20orf96 is also known by the alias DJ1103G7.2. Orthologs found in other organisms are known as C20orf96 homolog isoform X [Species].

Gene

Location

C20orf96 is located on the short arm of chromosome 20 at 20p13, spanning base pairs 270,863 to 290,778 on the complementary strand. The encoded protein is a member of the DUF4618 superfamily, with the domain of unknown function (DUF) spanning amino acids 104 to 363. [1] Neighboring genes are DEFB129, DEFB132, ZCCHC3, and NRSN2-AS1.

Location of C20orf96 as well as neighboring genes, as given by NCBI Gene

Expression

According to AceView, the gene is expressed at 2.2 times the level of an average gene, and its sequence has been observed in the brain, testis, uterus, kidney, thymus, breast, kidney tumor, and 84 other tissues. [2] Binding sites for the transcription factors AREB6, GATA-1, GATA-2, GATA-3, ATF6, c-Myc, Max, CHOP-10, AMLa, and C/EBPalpha have been reported for C20orf96. [3] Two proteins, TYW5 and ALOXE3, interact with C20orf96. TYW5 is encoded by a protein-coding gene and is involved in tRNA modification in the nucleus. [4] ALOXE3 is also encoded by a protein-coding gene and is associated mainly with prostaglandin 2 biosynthesis. [5]

mRNA

Alternative splicing

C20orf96 has five alternatively spliced transcript variants, all of which code for similar proteins. This article describes variant 1, whose exon sizes are listed below.

Exon    Exon size (bp)
1       18
2       48
3       117
4       117
5       159
6       99
7       156
8       102
9       171
10      117
11      60
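The exon sizes can be totaled and compared with other lengths quoted in this article. The following is a minimal Python sketch, not part of the original annotation: the function name and dictionary keys are hypothetical, and it assumes the genomic coordinates are inclusive and that the coding sequence needs one stop codon in addition to the 363 codons of the protein.

```python
# Exon sizes (bp) for variant 1, copied from the exon table in this article.
EXON_SIZES_BP = [18, 48, 117, 117, 159, 99, 156, 102, 171, 117, 60]

# Genomic coordinates and protein length quoted elsewhere in the article.
GENE_START, GENE_END = 270_863, 290_778
PROTEIN_LENGTH_AA = 363


def summarize_transcript(exon_sizes, gene_start, gene_end, protein_length):
    """Return basic length figures derived from the exon table.

    Hypothetical helper for illustration only.
    """
    exonic_bp = sum(exon_sizes)
    # Assumption: coordinates are inclusive, so add 1 to the difference.
    genomic_span_bp = gene_end - gene_start + 1
    # Assumption: three bases per codon, plus one stop codon.
    min_coding_bp = 3 * (protein_length + 1)
    return {
        "exon_count": len(exon_sizes),
        "exonic_bp": exonic_bp,
        "genomic_span_bp": genomic_span_bp,
        "min_coding_bp": min_coding_bp,
        "exonic_fraction": exonic_bp / genomic_span_bp,
    }


summary = summarize_transcript(
    EXON_SIZES_BP, GENE_START, GENE_END, PROTEIN_LENGTH_AA
)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Under these assumptions the listed exons total 1,164 bp within a genomic span of 19,916 bp, while a 363-residue protein needs at least 1,092 bp of coding sequence. The table does not say how the remaining exonic bases are divided, so the sketch makes no attempt to assign them to untranslated regions.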

Protein

General properties

The protein encoded by C20orf96 is 363 amino acids long. Its predicted molecular weight is 42.9 kDa, and its predicted isoelectric point is 8.99. [6] [7]

Composition

The amino acid composition of this protein is broadly similar to that of other proteins. Four residues deviate from typical levels: glutamic acid and glutamine are slightly enriched, alanine is slightly depleted, and glycine is markedly low. In the listing below, + marks a residue above the typical level, - marks one below it, and -- marks one well below it.

A- : 13( 3.6%)
C  :  4( 1.1%)
D  : 15( 4.1%)
E+ : 38(10.5%)
F  : 10( 2.8%)
G--:  4( 1.1%)
H  :  9( 2.5%)
I  : 17( 4.7%)
K  : 34( 9.4%)
L  : 40(11.0%)
M  : 13( 3.6%)
N  : 10( 2.8%)
P  : 19( 5.2%)
Q+ : 35( 9.6%)
R  : 24( 6.6%)
S  : 25( 6.9%)
T  : 21( 5.8%)
V  : 22( 6.1%)
W  :  4( 1.1%)
Y  :  6( 1.7%)

C20orf96 is close to neutral overall, with 58 basic residues (K and R) against 53 acidic residues (D and E), and it has no positive, negative, or mixed charge clusters.
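The figures in this section can be checked directly from the residue counts. The sketch below is illustrative only and is not the SAPS method: the function names are hypothetical, the average residue masses are standard textbook values supplied by the example rather than by the source, and the charge estimate simply counts K and R against D and E, ignoring histidine, the termini, and pH.

```python
# Residue counts copied from the composition listing in this article.
COUNTS = {
    "A": 13, "C": 4, "D": 15, "E": 38, "F": 10,
    "G": 4, "H": 9, "I": 17, "K": 34, "L": 40,
    "M": 13, "N": 10, "P": 19, "Q": 35, "R": 24,
    "S": 25, "T": 21, "V": 22, "W": 4, "Y": 6,
}

# Assumption: standard average residue masses in daltons
# (free amino acid minus one water); not taken from the source.
RESIDUE_MASS = {
    "A": 71.0788, "C": 103.1388, "D": 115.0886, "E": 129.1155,
    "F": 147.1766, "G": 57.0519, "H": 137.1411, "I": 113.1594,
    "K": 128.1741, "L": 113.1594, "M": 131.1926, "N": 114.1038,
    "P": 97.1167, "Q": 128.1307, "R": 156.1875, "S": 87.0782,
    "T": 101.1051, "V": 99.1326, "W": 186.2132, "Y": 163.1760,
}
WATER_MASS = 18.015  # one water for the free N- and C-termini


def percentages(counts):
    """Return each residue's share of the sequence, in percent."""
    total = sum(counts.values())
    return {aa: 100.0 * n / total for aa, n in counts.items()}


def net_charge_by_count(counts):
    """Crude charge estimate: basic (K, R) minus acidic (D, E) residues."""
    return (counts["K"] + counts["R"]) - (counts["D"] + counts["E"])


def molecular_weight(counts):
    """Sum of average residue masses plus one water, in daltons."""
    return sum(n * RESIDUE_MASS[aa] for aa, n in counts.items()) + WATER_MASS


total_residues = sum(COUNTS.values())
percent = percentages(COUNTS)
mw_da = molecular_weight(COUNTS)

print(f"length: {total_residues} residues")
for aa in sorted(COUNTS):
    print(f"{aa}: {COUNTS[aa]:>2} ({percent[aa]:.1f}%)")
print(f"net charge by count: {net_charge_by_count(COUNTS):+d}")
print(f"molecular weight: {mw_da / 1000:.1f} kDa")
```

With these inputs the counts sum to 363 residues, the percentages match the listing, the simple count gives a net charge of +5, and the mass comes to about 42.9 kDa, in line with the value quoted under General properties.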

Post-translational modification

NetPhos predicts 17 phosphorylation sites across the 363 amino acids: 10 on serine, 4 on threonine, and 3 on tyrosine. [8] No signal peptide or transmembrane sequence was detected.

Secondary structure

Most of the predicted structure consists of alpha-helices, with short coils at both ends. Near the end of the sequence there is also a short beta-strand of four amino acids.

Phosphorylation potential for C20orf96

Tertiary structure

There are no known crystal structures for this protein. The tertiary structure is predicted to consist mainly of coiled-coil regions.

Homology

C20orf96 is found in many eukaryotes. Orthologs have been found across much of the kingdom Animalia, with the lineage tracing back to the phylum Chordata. No paralogs of C20orf96 have been found.

Mutations

The most prevalent mutations of C20orf96 in humans are M1V, Q6K, T13I, T13S, M94I, S117G, S117T, S117N, L142V, D159N, M215T, D238E, R251C, R251G, Q263E, R279C, E304G, I305F, M313I, R328W, R342I, and L353V. Each mutation is written as the reference amino acid, the residue position, and the substituted amino acid. For example, R251C denotes arginine at position 251 replaced by cysteine.
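The notation above can be read mechanically. The following is a minimal Python sketch, with hypothetical function names, that splits each code into its reference residue, position, and substituted residue, and renders it with three-letter amino acid abbreviations. It handles only simple single-residue substitutions of the kind listed here.

```python
import re

# Mutation codes copied from the list in this article.
MUTATIONS = [
    "M1V", "Q6K", "T13I", "T13S", "M94I", "S117G", "S117T", "S117N",
    "L142V", "D159N", "M215T", "D238E", "R251C", "R251G", "Q263E",
    "R279C", "E304G", "I305F", "M313I", "R328W", "R342I", "L353V",
]

# Standard one-letter to three-letter amino acid abbreviations.
ONE_TO_THREE = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe",
    "G": "Gly", "H": "His", "I": "Ile", "K": "Lys", "L": "Leu",
    "M": "Met", "N": "Asn", "P": "Pro", "Q": "Gln", "R": "Arg",
    "S": "Ser", "T": "Thr", "V": "Val", "W": "Trp", "Y": "Tyr",
}

# One reference letter, a position, one substituted letter.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(code):
    """Split a code such as 'R251C' into ('R', 251, 'C')."""
    match = SUBSTITUTION.match(code)
    if match is None:
        raise ValueError(f"not a simple substitution: {code!r}")
    ref, position, alt = match.groups()
    if ref not in ONE_TO_THREE or alt not in ONE_TO_THREE:
        raise ValueError(f"unknown amino acid letter in {code!r}")
    return ref, int(position), alt


def describe(code):
    """Render a code with three-letter abbreviations, e.g. 'Arg251Cys'."""
    ref, position, alt = parse_substitution(code)
    return f"{ONE_TO_THREE[ref]}{position}{ONE_TO_THREE[alt]}"


parsed = [parse_substitution(code) for code in MUTATIONS]
for code in MUTATIONS:
    print(f"{code} -> {describe(code)}")
```

Parsing the codes this way also allows a simple consistency check, for instance confirming that every position falls within the 363-residue length of the protein.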

Connection to diseases

C20orf96 has been linked to Moyamoya disease, with a reported 4.5-fold increase in the odds of the disease. [9] It is also associated with lymph node metastasis in colorectal cancer. [10] A study also identified C20orf96 as a hub gene, defined as a gene ranked in the top 20 for connectivity, that shows signs of positive selection. [11]

References

  1. "C20orf96 chromosome 20 open reading frame 96 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2016-05-09.
  2. Thierry-Mieg, Danielle; Thierry-Mieg, Jean. "AceView: Gene:C20orf96, a comprehensive annotation of human, mouse and worm genes with mRNAs or ESTs". AceView. www.ncbi.nlm.nih.gov. Retrieved 2016-05-09.
  3. "www.genecards.org/cgi-bin/carddisp.pl?gene=C20orf96". www.genecards.org. Retrieved 2016-05-09.
  4. "www.genecards.org/cgi-bin/carddisp.pl?gene=tyw5". www.genecards.org. Retrieved 2016-05-09.
  5. "www.genecards.org/cgi-bin/carddisp.pl?gene=ALOXE3&keywords=aloxe3". www.genecards.org. Retrieved 2016-05-09.
  6. "SAPS". SDSC Biology Workbench. Retrieved 2016-05-08.
  7. "IP". SDSC Biology Workbench. Retrieved 2016-05-08.
  8. "NetPhos 2.0 Server". www.cbs.dtu.dk. Retrieved 2016-05-08.
  9. Sigdel, T., Shoemaker, L., Chen, R., Li, L., Butte, A., Sarwal, M. and Steinberg, G. (2013). Immune response profiling identifies autoantibodies specific to Moyamoya patients. Orphanet Journal of Rare Diseases, 8(1), p.45.
  10. Sasaki, K., Oga, A., Furuya, T., Chochi, Y., Ito, H., Adachi, J., Uchiyama, T., Kawauchi, S. and Nakao, M. (2011). DNA copy number aberrations associated with the clinicopathological features of colorectal cancers: Identification of genomic biomarkers by array-based comparative genomic hybridization. Oncology Reports, 25, pp.1603-1611.
  11. Konopka, G., Friedrich, T., Davis-Turak, J., Winden, K., Oldham, M., Gao, F., Chen, L., Wang, G., Luo, R., Preuss, T. and Geschwind, D. (2012). Human-Specific Transcriptional Networks in the Brain. Neuron, 75(4), pp.601-617.