CXorf49

CXorf49 is a protein that in humans is encoded by the gene chromosome X open reading frame 49 (CXorf49).

Gene

Image: the exact location of CXorf49 on the minus strand of the X chromosome.

The CXorf49 gene has one alias, CXorf49B. [1] The UniProt accession A8MYA2 also refers to the protein encoded by CXorf49 or CXorf49B. [2]

CXorf49 is located on the X chromosome at Xq13.1. The gene is 3912 base pairs long and has 6 exons. [3] CXorf49 has one protein-coding transcript. [4]

Protein

The protein has 514 amino acids and a molecular mass of 54.4 kDa. [5] The isoelectric point is 9.3. Compared to other human proteins, CXorf49 is glycine- and proline-rich but has lower levels of asparagine, isoleucine, tyrosine and threonine (Statistical Analysis of Protein Sequences, SAPS [6]).
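As a rough illustration of what a composition statement like the one above rests on, the sketch below computes the percentage of each residue in a sequence. It is not the SAPS method, which also tests whether a composition is statistically unusual against a reference set; it only counts residues. The function name `residue_percentages` and the toy peptide are hypothetical, and the peptide is not the CXorf49 sequence.

```python
from collections import Counter


def residue_percentages(sequence):
    """Return the percentage of each amino acid (one-letter code) in a sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


# Hypothetical toy peptide for illustration only, NOT the CXorf49 sequence.
toy_peptide = "GPGPRRGPKA"
percentages = residue_percentages(toy_peptide)

# Print residues from most to least frequent.
for aa, pct in sorted(percentages.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{aa}: {pct:.1f} %")
```

To apply this to the real protein, the sequence would first have to be retrieved from one of the cited databases; the percentages would then still need a reference proteome for comparison before any residue could be called enriched or depleted.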

Domains

Image: the protein with the domain of unknown function.

The domain of unknown function, DUF4641, spans almost the entire protein. It is 433 amino acids long, from amino acid 80 to amino acid 512. [7] DUF4641 is part of pfam15483. [8] The domain is proline- and arginine-rich but has lower levels of isoleucine, tyrosine and threonine than other human proteins (Statistical Analysis of Protein Sequences, SAPS [6]). DUF4641 also has unusual spacing between lysine residues and between positively charged amino acids (SAPS [6]).
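The domain length follows from counting both end positions of the range 80 to 512. The short sketch below checks that arithmetic and works out how much of the 514-residue protein the domain covers. The helper name `inclusive_length` is hypothetical; the numbers are the ones given in the text.

```python
def inclusive_length(start, end):
    """Length of a range that includes both its first and last position."""
    return end - start + 1


PROTEIN_LENGTH = 514               # amino acids, as stated for CXorf49
DOMAIN_START, DOMAIN_END = 80, 512  # DUF4641 boundaries from the text

domain_length = inclusive_length(DOMAIN_START, DOMAIN_END)
coverage = 100.0 * domain_length / PROTEIN_LENGTH

print(domain_length)               # 433
print(f"{coverage:.1f} %")         # 84.2 %
```

Under these figures the domain accounts for roughly 84 % of the protein, which is consistent with the description of it as spanning almost the entire sequence.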

Post-translational modifications

CXorf49 is predicted to have several post-translational modification sites. These include sites for N-terminal acetylation (NetAcet 1.0 [9]), glycation of ε amino groups of lysines (NetGlycate 1.0 [10]), mucin-type GalNAc O-glycosylation (NetOGlyc 4.0 [11]), phosphorylation (NetPhos 2.0 [12]), sumoylation (SUMOplot Analysis Program [13]) and O-β-GlcNAc attachment (YinOYang 1.2 [14]).

Subcellular localization

The CXorf49 protein has been predicted to be located in the cell nucleus (PSORT II [15] ).

Expression

Promoter region

The promoter region of CXorf49 is located between base pairs 71718051 and 71718785 on the minus strand of the X chromosome and is 735 bp long (Genomatix's ElDorado program [16]). Sites for Y-box binding factor are among the most frequent transcription factor binding sites in the promoter region.

Expression

Though expression of CXorf49 is very low in human cells, it is somewhat higher in connective tissues, testis and uterus (NCBI-Unigene [17]).

Interactions

The protein CXorf49 has not yet been shown to interact with other proteins (PSICQUIC [18] ).

CXorf49 was found to be among a small group of proteins in the HL-60 cell proteome that were most prone to form 4-hydroxy-2-nonenal (HNE) adducts upon exposure to nontoxic (10 μM) HNE concentrations, along with heat shock 60 kDa protein 1. [19]

Homology

A BLAST [20] search finds no orthologs of CXorf49 in single-celled organisms, fungi or plants whose genomes have been sequenced. Among multicellular organisms, orthologs are found in mammals. The table below shows a selection of the mammalian orthologs, listed by time of divergence from human.

| Genus and species name | Common name | Accession number | Sequence length | Identity to human protein |
|---|---|---|---|---|
| Pan troglodytes | Chimpanzee | XP_001137982 | 514 aa | 98 % |
| Callithrix jacchus | Common marmoset | XP_008987719 | 487 aa | 65 % |
| Galeopterus variegatus | Malayan flying lemur | XP_008574823 | 525 aa | 54 % |
| Tupaia chinensis | Chinese tree shrew | XP_006168003 | 527 aa | 35 % |
| Chinchilla lanigera | Long-tailed chinchilla | XP_013358263 | 307 aa | 49 % |
| Mus musculus | House mouse | NP_081944 | 513 aa | 36 % |
| Canis lupus familiaris | Dog | XP_850392 | 526 aa | 54 % |
| Odobenus rosmarus divergens | Pacific walrus | XP_012422579 | 530 aa | 51 % |
| Mustela putorius furo | Ferret | XP_004777306 | 544 aa | 50 % |
| Lipotes vexillifer | Chinese river dolphin | XP_007452050 | 529 aa | 45 % |
| Ovis aries | Sheep | XP_004022229 | 536 aa | 45 % |
| Capra hircus | Goat | XP_005700711 | 538 aa | 44 % |
| Myotis lucifugus | Little brown bat | XP_006083036 | 500 aa | 42 % |
| Myotis davidii | David's myotis | XP_006759573 | 495 aa | 42 % |
| Bos taurus | Cattle | NP_001092664 | 534 aa | 42 % |
| Equus asinus | Donkey | XP_014707878 | 723 aa | 42 % |
| Trichechus manatus latirostris | Florida manatee | XP_012415455 | 505 aa | 44 % |
| Dasypus novemcinctus | Nine-banded armadillo | XP_004475873 | 497 aa | 44 % |
| Orycteropus afer afer | Aardvark | XP_007957133 | 477 aa | 38 % |

Phylogeny

The CXorf49 orthologs span about 105.0 million years of divergence, from the aardvark ortholog, the most distant one listed, to the human protein.

Image: phylogenetic tree, made with ClustalW on SDSC Biology Workbench, showing how CXorf49 in human (Hsa), chimpanzee (Ptro), Malayan flying lemur (Gava), sheep (Ovari), Pacific walrus (Ord), aardvark (Oafaf), Chinese tree shrew (Tuchi) and house mouse (Mmus) has diverged over time.

References

  1. "Homo sapiens chromosome X open reading frame 49 (CXorf49), mRNA - Nucleotide - NCBI". Ncbi.nlm.nih.gov. 2015-09-28. Retrieved 2016-04-28.
  2. "RecName: Full=Uncharacterized protein CXorf49 - Protein - NCBI". Ncbi.nlm.nih.gov. 2015-09-28. Retrieved 2016-04-28.
  3. "CXorf49 chromosome X open reading frame 49 [Homo sapiens (human)] - Gene - NCBI". Ncbi.nlm.nih.gov. Retrieved 2016-04-28.
  4. "Gene & protein Summary: cxorf49". Ebi.ac.uk. Retrieved 2016-04-28.
  5. "CXorf49 Gene(Protein Coding) Chromosome X Open Reading Frame 49". GeneCards. Retrieved 2016-04-28.
  6. "SDSC Biology Workbench". seqtool.sdsc.edu. Archived from the original on 2003-08-11. Retrieved 2016-05-06.
  7. "uncharacterized protein CXorf49 [Homo sapiens] - Protein - NCBI". Ncbi.nlm.nih.gov. 2015-09-28. Retrieved 2016-04-28.
  8. "NCBI CDD Conserved Protein Domain DUF4641". www.ncbi.nlm.nih.gov. Retrieved 2016-05-06.
  9. "NetAcet 1.0 Server". Cbs.dtu.dk. Retrieved 2016-04-28.
  10. "NetGlycate 1.0 Server". Cbs.dtu.dk. Retrieved 2016-04-28.
  11. "NetOGlyc 4.0 Server". Cbs.dtu.dk. 2013-05-15. Retrieved 2016-04-28.
  12. "NetPhos 2.0 Server". Cbs.dtu.dk. Retrieved 2016-04-28.
  13. "SUMOplot Analysis Program". Abgent. Retrieved 2016-04-28.
  14. "YinOYang 1.2 Server". Cbs.dtu.dk. Retrieved 2016-04-28.
  15. "PSORT II". http://psort.hgc.jp/cgi-bin/runpsort.pl
  16. "Genomatix's ElDorado". Archived from the original on 2021-04-03. Retrieved 2016-05-06.
  17. "EST Profile - Hs.632817". Ncbi.nlm.nih.gov. Retrieved 2016-04-28.
  18. "PSICQUIC". Archived from the original on 2014-12-17.
  19. Arcaro, Alessia; Daga, Martina; Cetrangolo, Giovanni Paolo; Ciamporcero, Eric Stefano; Lepore, Alessio; Pizzimenti, Stefania; Petrella, Claudia; Graf, Maria; Uchida, Koji; Mamone, Gianfranco; Ferranti, Pasquale; Ames, Paul R. J.; Palumbo, Giuseppe; Barrera, Giuseppina; Gentile, Fabrizio (2015). "Generation of Adducts of 4-Hydroxy-2-nonenal with Heat Shock 60 kDa Protein 1 in Human Promyelocytic HL-60 and Monocytic THP-1 Cell Lines". Oxidative Medicine and Cellular Longevity. 2015: 296146. doi:10.1155/2015/296146. PMC 4452872. PMID 26078803.
  20. Protein BLAST