TEX55 identifiers | Value |
---|---|
Aliases | TEX55, TSCPA, chromosome 3 open reading frame 30, testis expressed 55, C3orf30 |
External IDs | MGI: 1921913; HomoloGene: 17614; GeneCards: TEX55 |
Testis expressed 55 (TEX55) is a human protein encoded by the TEX55 gene, also known as C3orf30 (chromosome 3 open reading frame 30), which is located on the forward strand of human chromosome 3 at 3q13.32. [5] [6] TEX55 (accession number: NM_152539.3) is also known as testis-specific conserved, cAMP-dependent type II PK anchoring protein (TSCPA) and as uncharacterized protein C3orf30. [6]
The TEX55 gene is 13,893 bp long and spans base pairs 119,146,151 to 119,160,042. [6] It is flanked by the genes for immunoglobulin superfamily member 11 and uroplakin 1B. [5]
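
As a quick sanity check on the coordinates quoted above, the span length can be computed directly. This is only a sketch: it assumes the coordinates are 1-based and inclusive, which the text does not state, and the helper name `span_length` is purely illustrative. Under that assumption the result is within a couple of bases of the quoted 13,893 bp; the small difference may reflect a different annotation version or counting convention.

```python
# Coordinates as quoted in the text; the 1-based inclusive convention is an assumption.
START = 119_146_151
END = 119_160_042


def span_length(start: int, end: int, inclusive: bool = True) -> int:
    """Return the number of bases between two genomic coordinates."""
    return end - start + (1 if inclusive else 0)


inclusive_len = span_length(START, END)
exclusive_len = span_length(START, END, inclusive=False)

print(f"inclusive: {inclusive_len} bp, exclusive: {exclusive_len} bp")
```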
The promoter region of TEX55 has multiple binding sites for SRY box-6 and for SOX/SRY (sex/testis-determining and related HMG box) transcription factors, as well as an X-linked zinc finger binding site. This suggests that factors encoded on the sex chromosomes may play a role in TEX55 expression and post-translational modification. [7]
TEX55 has orthologs in many mammals, including bats, dolphins, and aardvarks. [8] According to BLAST, the TEX55 protein cannot be found outside the clade Mammalia. [8] The most distant ortholog found using BLAST is in the aardvark, whose lineage is thought to have diverged from the human lineage an estimated 105 million years ago (MYA). [9] However, according to GeneCards, distant orthologs have also been found in chickens, lizards (Anolis carolinensis), and zebrafish. [6]
The mRNA of TEX55 is 1,800 base pairs long and has three exons. [6] According to GeneCards, the TEX55 mRNA has three theoretical splice forms, but only the one containing all three exons has been studied and characterized. [6] The 5′ UTR of the mRNA contains an RFX1 binding site at a stem-loop structure just upstream of the start codon; RFX1 binding at this site is thought to activate transcription. [7] [10]
The protein translated from the TEX55 mRNA is 536 amino acids (AA) long, has a predicted molecular weight of 60 kD and an isoelectric point of 5.51, and is highly conserved at the C-terminus. [11] Compared with the reference protein set swp23s.q, TEX55 has a slightly high proportion of glutamine and a slightly low proportion of leucine. [11] Multiple sequence alignment of TEX55 and 20 mammalian orthologs shows that 28 residues, concentrated in the C-terminus, are conserved across all of the proteins. [8] [12] The highly conserved residues are outlined in the conceptual translation and multiple sequence analysis. Through function-region analysis, researchers found that this protein may act as an anchoring protein for cAMP-dependent type II PK, and might be an A-kinase anchoring protein. [13] [14]
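
The predicted molecular weight quoted above is roughly what a back-of-the-envelope estimate gives. The sketch below multiplies the residue count by an average residue mass of about 110 Da, which is a common rule of thumb and an assumption here, not the method used by the cited tool; the function name `estimate_mass_kda` is illustrative only.

```python
# Rule-of-thumb average mass of one amino acid residue in a protein (assumption).
AVG_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues: int, avg_mass_da: float = AVG_RESIDUE_MASS_DA) -> float:
    """Approximate a protein's mass in kDa from its length in residues."""
    return n_residues * avg_mass_da / 1000.0


# TEX55 length as stated in the text.
tex55_kda = estimate_mass_kda(536)
print(f"Estimated mass: {tex55_kda:.1f} kDa")
```

A sequence-based calculation would sum the actual residue masses rather than use an average, so this estimate is only expected to land near the reported value of about 60 kD.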
Analysis of the sumoylation sites indicates that Lys 14 has a high probability of being sumoylated. [17] The TEX55 protein has a high number of potential phosphorylation and O-glycosylation sites. [18] [19]
All secondary structure prediction analyses indicate that the C-terminus of TEX55 has a high probability of forming an alpha-helix and that the protein contains few or no beta-sheets. Secondary structure analysis tools predict that the majority of the TEX55 protein consists of coiled regions and alpha-helices.
A tertiary structure model of TEX55 was generated using Phyre2. The highly conserved C-terminus was predicted to contain a 30-residue alpha-helix with relatively high confidence (82.3%). This highly conserved, high-confidence alpha-helix is colored red in the 3D structure image of TEX55. The overall tertiary structure of TEX55 is globular.
TEX55 has two motifs according to GeneCards: EF-hand calcium binding domain 10 and uroplakin 1B, both of which are found in the middle of the protein. [6] Uroplakin 1B is known to regulate cell development, activation, growth, and motility. [20] This could explain why abnormalities in TEX55 expression lead to sperm with altered morphology. [13] [21]
Analysis of the predicted cellular localization of TEX55 and its orthologs indicates that the protein is most likely located in the nucleus. The table below lists the orthologs and the probability of finding each protein in the specified cellular location. [22]
Organism | Nucleus | Cytoplasm | Cytoskeletal | Golgi | Mitochondria | Plasma Membrane |
---|---|---|---|---|---|---|
Human | 43.5% | 34.8% | 13.0% | 0% | 8.7% | 0% |
Vampire Bat | 60.9% | 17.4% | 13.0% | 4.3% | 0% | 4.3% |
Tree Shrew | 82.6% | 17.4% | 0% | 0% | 8.7% | 0% |
Cat | 73.9% | 17.4% | 0% | 0% | 8.7% | 0% |
Southern White Rhino | 65.2% | 17.4% | 4.3% | 0% | 8.7% | 4.3% |
Lemur | 56.5% | 30.4% | 13.0% | 0% | 0% | 0% |
Beluga Whale | 52.2% | 26.1% | 13.0% | 0% | 4.3% | 4.3% |
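
The table above can be summarized programmatically. The sketch below transcribes the percentages as printed, picks the highest-scoring compartment for each organism, and flags any row whose percentages do not add up to roughly 100%. The variable and function names are illustrative, and the 0.5-point tolerance for rounding is an assumption.

```python
COMPARTMENTS = [
    "Nucleus", "Cytoplasm", "Cytoskeletal",
    "Golgi", "Mitochondria", "Plasma Membrane",
]

# Percentages transcribed from the table, in the column order above.
PREDICTIONS = {
    "Human": [43.5, 34.8, 13.0, 0.0, 8.7, 0.0],
    "Vampire Bat": [60.9, 17.4, 13.0, 4.3, 0.0, 4.3],
    "Tree Shrew": [82.6, 17.4, 0.0, 0.0, 8.7, 0.0],
    "Cat": [73.9, 17.4, 0.0, 0.0, 8.7, 0.0],
    "Southern White Rhino": [65.2, 17.4, 4.3, 0.0, 8.7, 4.3],
    "Lemur": [56.5, 30.4, 13.0, 0.0, 0.0, 0.0],
    "Beluga Whale": [52.2, 26.1, 13.0, 0.0, 4.3, 4.3],
}


def top_compartment(values):
    """Return the name of the compartment with the highest percentage."""
    best_index = max(range(len(values)), key=lambda i: values[i])
    return COMPARTMENTS[best_index]


top = {org: top_compartment(vals) for org, vals in PREDICTIONS.items()}
row_sums = {org: round(sum(vals), 1) for org, vals in PREDICTIONS.items()}

# Rows whose total differs from 100% by more than the rounding tolerance.
suspect = [org for org, total in row_sums.items() if abs(total - 100.0) > 0.5]

for org in PREDICTIONS:
    print(f"{org}: top = {top[org]}, row total = {row_sums[org]}%")
print("Rows not summing to ~100%:", suspect)
```

Every organism's top prediction is the nucleus, consistent with the text. The Tree Shrew row, as transcribed here, sums to more than 100%, so one of its entries may have been recorded incorrectly in the source table.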
TEX55 mRNA is expressed in most tissues of the human body, from the brain to the prostate. [5] However, according to NCBI, the protein itself has been shown to be produced mainly in the testis of mammals. [5] Analysis by the Human Protein Atlas indicates that the TEX55 protein can be found not only in the testis, but also in the bronchus, fallopian tubes, and endometrium. [23]
Because the protein product of TEX55 is produced mainly in the testis of mammals, researchers believe that it plays a role in spermatogenesis. [13] Individuals with cryptorchidism and Sertoli-cell-only syndrome, both of which are associated with sterility, have been shown not to produce this protein in their testes. [13] Microarray analysis of individuals with teratozoospermia, a condition in which ~96% of sperm have altered morphology, indicates that TEX55 expression is reduced by ~20%. [21] [24] In studies of mice, TEX55 protein products have been detected starting at 38 days of age and then remain upregulated for at least 6 months. [13]
Interferon-inducible GTPase 5 also known as immunity-related GTPase cinema 1 (IRGC1) is an enzyme that in humans is coded by the IRGC gene. It is predicted to behave like other proteins in the p47-GTPase-like and IRG families. It is most expressed in the testis.
C5orf34 is a protein that in humans is encoded by the C5orf34 gene (5p12).
BEND2 is a protein that in humans is encoded by the BEND2 gene. It is also found in other vertebrates, including mammals, birds, and reptiles. The expression of BEND2 in Homo sapiens is regulated and occurs at high levels in the skeletal muscle tissue of the male testis and in the bone marrow. The presence of the BEN domains in the BEND2 protein indicates that this protein may be involved in chromatin modification and regulation.
C17orf98 is a protein which in humans is coded by the gene c17orf98. The protein is derived from Homo sapiens chromosome 17. The C17orf98 gene consists of a 6,302 base sequence. Its mRNA has three exons and no alternative splice sites. The protein has 154 amino acids, with no abnormal amino acid levels. C17orf98 has a domain of unknown function (DUF4542) and is 17.6kDa in weight. C17orf98 does not belong to any other families nor does it have any isoforms. The protein has orthologs with high percent similarity in mammals and reptiles. The protein has additional distantly related orthologs across the metazoan kingdom, culminating with the sponge family.
Chromosome 21 Open Reading Frame 58 (C21orf58) is a protein that in humans is encoded by the C21orf58 gene.
Chromosome 16 open reading frame 46 is a protein of yet to be determined function in Homo sapiens. It is encoded by the C16orf46 gene with NCBI accession number of NM_001100873. It is a protein-coding gene with an overlapping locus.
FAM71E1, also known as Family With Sequence Similarity 71 Member E1, is a protein that in humans is encoded by the FAM71E1 gene. It is thought to be ubiquitously expressed at low levels throughout the body, and it is conserved in vertebrates, particularly mammals and some reptiles. The protein is localized to the nucleus and can be exported to the cytoplasm.
Chromosome 9 open reading frame 43 is a protein that in humans is encoded by the C9orf43 gene. The gene is also known as MGC17358 and LOC257169. C9orf43 contains DUF 4647 and a polyglutamine repeat region although protein function is not well understood.
Chromosome 19 open reading frame 44 is a protein that in humans is encoded by the C19orf44 gene. C19orf44 is an uncharacterized protein with an unknown function in humans. C19orf44 is non-limiting implying that the protein exists in other species besides human. The protein contains one domain of unknown function (DUF) that is highly conserved throughout its orthologs. This protein is most highly expressed in the testis and ovary, but also has significant expression in the thyroid and parathyroid. Other names for this protein include: LOC84167.
Cilia- and flagella-associated protein 299 (CFAP299), is a protein that in humans is encoded by the CFAP299 gene. CFAP299 is predicted to play a role in spermatogenesis and cell apoptosis.
FAM71E2, also known as Family With Sequence Similarity 71 Member E2, is a protein that, in humans, is encoded by the FAM71E2 gene. Aliases include C19orf16, Protein FAM71E2, Chromosome 19 open reading frame 16, and Putative Protein FAM71E2. The gene is primarily conserved in mammals, but it is also conserved in two reptile species.
Chromosome 1 open reading frame 167 (C1orf167) is a protein which in humans is encoded by the C1orf167 gene. The NCBI accession number is NP_001010881. The protein is 1468 amino acids in length with a molecular weight of 162.42 kDa. The mRNA sequence was found to be 4689 base pairs in length.
Single-pass membrane and coiled-coil domain-containing protein 3 is a protein that is encoded in humans by the SMCO3 gene.
Chromosome 1 open reading frame 185, also known as C1orf185, is a protein that in humans is encoded by the C1orf185 gene. In humans, C1orf185 is a lowly expressed protein that has been found to be occasionally expressed in the circulatory system.
Chromosome 1 Open Reading Frame 94, or C1orf94, is a protein that in humans is encoded by the C1orf94 gene. The function of this protein is still poorly understood.
Protein FAM89A is a protein which in humans is encoded by the FAM89A gene. It is also known as chromosome 1 open reading frame 153 (C1orf153). Highest FAM89A gene expression is observed in the placenta and adipose tissue. Though its function is largely unknown, FAM89A is found to be differentially expressed in response to interleukin exposure, and it is implicated in immune response pathways and various pathologies such as atherosclerosis and glioma cell expression.
TMEM275 is a protein that in humans is encoded by the TMEM275 gene. TMEM275 has two, highly-conserved, helical trans-membrane regions. It is predicted to reside within the plasma membrane or the endoplasmic reticulum's membrane.
Chromosome 12 Open Reading Frame 50 (C12orf50) is a protein-encoding gene which in humans encodes for the C12orf50 protein. The accession id for this gene is NM_152589. The location of C12orf50 is 12q21.32. It covers 55.42 kb, from 88429231 to 88373811, on the reverse strand. Some of the neighboring genes to C12orf50 are RPS4XP15, LOC107984542, and C12orf29. RPS4XP15 is upstream C12orf50 and is on the same strand. LOC107984542 and C12orf29 are both downstream. LOC107984542 is on the opposite strand while C12orf29 is on the same strand. C12orf50 has six isoforms. This page is focusing on isoform X1. C12orf50 isoform X1 is 1711 nucleotides long and has a protein with a length of 414 aa.
Transmembrane epididymal protein 1 is a transmembrane protein encoded by the TEDDM1 gene. TEDDM1 is also commonly known as TMEM45C and encodes a 273 amino acid protein that contains six alpha-helical transmembrane regions. The protein contains a 118 amino acid domain belonging to a family of unknown function. While the exact function of TEDDM1 is not understood, it is predicted to be an integral component of the plasma membrane.
C10orf53 is a protein that in humans is encoded by the C10orf53 gene. The gene is located on the positive strand of the DNA and is 30,611 nucleotides in length. The protein is 157 amino acids and the gene has 3 exons. C10orf53 orthologs are found in mammals, birds, reptiles, amphibians, fish, and invertebrates. It is primarily expressed in the testes and at very low levels in the cerebellum, liver, placenta, and trachea.