METTL26

Identifiers
Aliases: METTL26, C16orf13, JFP2, chromosome 16 open reading frame 13, methyltransferase like 26
External IDs: MGI: 1915597; HomoloGene: 16917; GeneCards: METTL26
RefSeq (mRNA), mouse: NM_026686
RefSeq (protein), mouse: NP_080962
Location (UCSC): human Chr 16: 0.63 – 0.64 Mb; mouse Chr 17: 26.09 – 26.1 Mb
PubMed search: [3] [4]

METTL26, previously designated C16orf13, is a protein-coding gene that encodes methyltransferase-like protein 26, also known as JFP2. [5] Although the function of this gene is unknown, expression data indicate that it is expressed at high levels in several cancerous tissues. [6] [7] Underexpression of the gene has also been linked to disease in humans. [8]

Gene

Exon breakdown of C16orf13 transcript variant 1. The AdoMet-MTase domain is also included in the diagram.

METTL26 is located on the short arm of chromosome 16 in humans; its former name, C16orf13, stands for chromosome 16 open reading frame 13. [9] There are five transcript variants of this gene, named 1, 2, 3, 4, and 7. The longest cDNA transcript (transcript variant 1) contains 854 base pairs. [10] This transcript is composed of six exons, all of which contribute to the protein's main conserved domain, which belongs to the methyltransferase superfamily. [11] The primary transcript of this gene is 1,919 base pairs long. [12]

Species distribution

Dot plot showing sequence similarity at the 5' and 3' ends of the human gene and its chimpanzee ortholog.

Using the Dotlet program, a dot plot was constructed comparing the human gene with its chimpanzee ortholog.

The plot indicates sequence conservation at the beginning and end of the gene, suggesting that the 5' and 3' untranslated regions (UTRs) are conserved between the two species.

This similarity in the 5' UTR and 3' UTR does not extend beyond mammals: a dot plot of the human gene against that of a distantly related species, such as Xenopus tropicalis, shows almost no similarity.
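The idea behind a dot plot can be illustrated with a minimal Python sketch. It is a simplification and not Dotlet's actual method: it marks a dot only where two windows match exactly, whereas Dotlet scores windows and applies an adjustable threshold. The function names, the window size, and the toy sequences are assumptions made for illustration.

```python
def dot_plot(seq_a, seq_b, window=3):
    """Return the set of (i, j) positions where the length-`window`
    substrings starting at seq_a[i] and seq_b[j] are identical."""
    hits = set()
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            if seq_a[i:i + window] == seq_b[j:j + window]:
                hits.add((i, j))
    return hits


def render(seq_a, seq_b, hits):
    """Draw the hits as text: one row per position in seq_a,
    one column per position in seq_b, '*' for a hit."""
    rows = []
    for i in range(len(seq_a)):
        rows.append("".join("*" if (i, j) in hits else "."
                            for j in range(len(seq_b))))
    return "\n".join(rows)


if __name__ == "__main__":
    # Toy sequences (hypothetical, not the real gene sequences).
    a = "ACGTACGGTT"
    b = "ACGTTCGGTT"
    print(render(a, b, dot_plot(a, b)))
```

Conserved regions appear as runs of dots along the main diagonal, so similarity confined to the two ends of a gene shows up as diagonal segments in the top-left and bottom-right corners of the plot.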

A multiple sequence alignment conducted using the SDSC Biology Workbench [13] reveals little sequence similarity in the upstream region of the gene among species more distantly related than primates. Near the transcription start site of the human C16orf13 gene, there is high conservation among the primates for which upstream data were available, specifically the human, orangutan, and rhesus monkey C16orf13 orthologs. High sequence similarity among primates is evident throughout the promoter region, the 5' UTR, and the C16orf13 gene itself.

The table below shows selected gene orthologs for C16orf13 transcript variant 1. These data were collected from NCBI BLAST. [14]

| Species | Organism common name | Gene common name | NCBI accession number | Sequence identity | Expected value | Sequence length (bp) | Time since split from humans, MYA (data from TimeTree.org) |
|---|---|---|---|---|---|---|---|
| Homo sapiens | Human | C16orf13 | NM_032366.3 | 100% | 0 | 854 | 0 |
| Pan troglodytes | Chimpanzee | LOC467858 | NM_032366.3 | 98% | 0 | 784 | 6.4 |
| Canis lupus familiaris | Dog | C6H16orf13 | XM_547214.3 | 88% | 0 | 865 | 94.4 |
| Mus musculus | Mouse | 0610011F06Rik | NM_026686.2 | 86% | 0 | 825 | 92.4 |
| Xenopus (Silurana) tropicalis | Western clawed frog | c16orf13 | NM_001039734.1 | BLAST search found no significant similarity | BLAST search found no significant similarity | 993 | 371.2 |

Tissue distribution

The human expression profile from NCBI UniGene suggests that this gene is expressed in many different tissues throughout the body. [15] Such widespread expression is characteristic of a “housekeeping gene,” one that is important in all cells regardless of tissue. The highest levels of expression appear to be in the adrenal gland, lung, and parathyroid, [15] although the gene is also expressed at high levels in many other tissues. The few tissues in which the gene is not expressed have no obvious features in common. The expression data therefore give no clues to a specific function, beyond suggesting that the gene has a “housekeeping” role in nearly all cells.

Gene neighborhood

Approximate location of the C16orf13 gene on chromosome 16
Gene neighborhood of C16orf13

The C16orf13 gene is located near the end of chromosome 16, a position that may leave it susceptible to deletion mutations.

The genes surrounding C16orf13 include hypothetical protein LOC100287175 and LOC100138285 to the right, and RAB40C and WFIKKN1 to the left. C16orf13 is located on the minus strand, along with LOC100138285. The other surrounding genes are oriented in the opposite direction, on the plus strand. The gene neighborhood is represented in the schematic, originally from NCBI Gene.

Protein

The protein that this gene codes for is known as UPF0585, where UPF denotes an uncharacterized protein family, that is, one whose function is unknown. There are five isoforms of this protein, corresponding to the five splice variants of the gene. [16] The isoforms are named a, b, c, d, and g. [16] As mentioned above, the conserved domain detected in a BLAST search of the amino acid sequence belongs to the methyltransferase superfamily.

Conservation

A multiple sequence alignment conducted using the protein tools in the SDSC Biology Workbench [13] reveals some sequence similarity among distantly related protein orthologs, as far back as archaea, in the region corresponding to the methyltransferase domain. This methyltransferase superfamily portion of the protein is more highly conserved among the more closely related orthologs, which nonetheless span a diverse array of species.

Species distribution

C16orf13 has homologs in many species, including distant orthologs in fungi and plants. [17] [18] There are no known paralogs of this protein. [19] [20] The gene and its protein are very highly conserved in primates and other mammals, particularly in the functional methyltransferase domain.

The table below shows selected protein orthologs for C16orf13 transcript variant 1. These data were collected from NCBI BLAST.

| Species | Organism common name | Protein common name | NCBI accession number | Sequence identity | Expected value | Sequence length (aa) | Time since split from humans, MYA (data from TimeTree.org) |
|---|---|---|---|---|---|---|---|
| Homo sapiens | Human | UPF0585, isoform a | NP_115742.3 | 100% | 0 | 204 | 0 |
| Pan troglodytes | Chimpanzee | LOC467858 | XP_001154838.1 | 98% | 1E-150 | 204 | 6.4 |
| Canis lupus familiaris | Dog | LOC490093 | XP_547214.3 | 91% | 4E-141 | 204 | 94.4 |
| Mus musculus | Mouse | 0610011F06Rik | NP_080962.1 | 87% | 5E-134 | 204 | 92.4 |
| Xenopus (Silurana) tropicalis | Western clawed frog | UPF0585 protein C16orf13 homolog | NP_001034823.2 | 58% | 1E-82 | 203 | 371.2 |
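The sequence identity column reports the percentage of aligned positions at which two sequences carry the same residue. The Python sketch below shows one common way to compute such a value from a pairwise alignment, dividing matches by the number of alignment columns, gaps included. The function name and the toy aligned sequences are hypothetical, and the values in the table come from BLAST itself, not from this code.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two aligned sequences of equal length.

    Identity = matching columns / total alignment columns * 100.
    Columns where both sequences have a gap are ignored; a gap
    opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # gap-only column (can occur in slices of an MSA)
        columns += 1
        if a == b:
            matches += 1
    return 100.0 * matches / columns if columns else 0.0


if __name__ == "__main__":
    # Toy alignment (hypothetical residues, not the real proteins).
    print(round(percent_identity("MKT-LV", "MKTALI"), 1))
```

Other conventions exist, for example excluding all gapped columns from the denominator, so identities reported by different tools are not always directly comparable.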

Predicted properties

Predicted secondary structure of the protein based on results obtained using CHOFAS, GOR4, and PELE programs on SDSC Biology Workbench. The most confident areas (those that appeared in multiple programs) are boxed in addition to highlighted. The dashes in the sequences represent exon splice sites.

Protein secondary structure can be predicted with algorithms that estimate where alpha helices and beta sheets occur within the protein. An analysis of the protein structure was conducted using the CHOFAS, GOR4, and PELE algorithms in the SDSC Biology Workbench. [21] The analyses were combined and included in the adjacent diagram. Only structures that appeared in more than one output were included.
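The rule of keeping only structures that appear in more than one output can be sketched as a per-residue vote. The Python example below assumes each program's output has been reduced to a string with one letter per residue (H for helix, E for strand, C for coil); that encoding, the function name, and the toy predictions are assumptions for illustration and do not reproduce the actual CHOFAS, GOR4, or PELE output formats.

```python
from collections import Counter


def consensus_structure(predictions, min_votes=2, unknown="-"):
    """Combine per-residue predictions from several programs.

    A state is kept at a position only if at least `min_votes`
    predictions agree on it; otherwise `unknown` is written.
    """
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("all predictions must have the same length")
    consensus = []
    for column in zip(*predictions):
        # Most frequent state at this residue and how many programs chose it.
        state, votes = Counter(column).most_common(1)[0]
        consensus.append(state if votes >= min_votes else unknown)
    return "".join(consensus)


if __name__ == "__main__":
    # Toy predictions (hypothetical), one string per program.
    print(consensus_structure(["HHHEC", "HHCEC", "HCEEH"]))
```

With three programs and a threshold of two votes, at most one state can qualify at each position, so no tie-breaking is needed; with more programs the threshold would need to be chosen with ties in mind.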

Interactions

There are few known interactions for this protein. No interactions were found in the GeneCards database [9] or in the MINT database. [22] A STRING search returned two genes. [23] Both associations, however, fall in the gene neighborhood evidence category, which does not necessarily mean that the genes interact in any meaningful way, or that they are even expressed at the same time. There is currently no strong evidence for interactions with this protein.

Disease linkage

Data from microarray experiments have linked overexpression of this gene to cancer in various tissues, particularly breast and gastric cancer. Underexpression of the gene has also been linked to disease, particularly connective tissue disease, nutritional and metabolic disorders, and digestive disorders. The canSAR Workbench database contains microarray data that may link over- or underexpression of the C16orf13 gene to various carcinomas. [24]

References

  1. GRCh38: Ensembl release 89: ENSG00000130731. Ensembl, May 2017.
  2. GRCm38: Ensembl release 89: ENSMUSG00000025731. Ensembl, May 2017.
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. "C16orf13 - UPF0585 protein C16orf13 - human protein (Identifiers)". Nextprot.org. Retrieved 2012-05-18.
  6. "Breast Cancer Database". Itb.cnr.it. Retrieved 2012-05-18.
  7. Oh JH, Yang JO, Hahn Y, Kim MR, Byun SS, Jeon YJ, Kim JM, Song KS, Noh SM, Kim S, Yoo HS, Kim YS, Kim NS (December 2005). "Transcriptome analysis of human gastric cancer". Mamm. Genome. 16 (12): 942–54. doi:10.1007/s00335-005-0075-2. PMID 16341674. S2CID 69278.
  8. "C16orf13 Disease Atlas". NextBio. Retrieved 2012-05-18. [permanent dead link]
  9. GeneCards Human Gene Database. "C16orf13 Gene - GeneCards | CP013 Protein | CP013 Antibody". GeneCards. Retrieved 2012-05-18.
  10. "Homo sapiens chromosome 16 open reading frame 13 (C16orf13), transcrip - Nucleotide - NCBI". Ncbi.nlm.nih.gov. 2012-04-04. Retrieved 2012-05-18.
  11. "Homo sapiens chromosome 16 open reading frame 13 (C16orf13), transcrip - Nucleotide - NCBI". Ncbi.nlm.nih.gov. 2012-04-04. Retrieved 2012-05-18.
  12. "Homo sapiens chromosome 16, GRCh37.p5 Primary Assembly - Nucleotide - NCBI". Ncbi.nlm.nih.gov. 2012-04-04. Retrieved 2012-05-18.
  13. "SDSC Biology Workbench". Workbench.sdsc.edu. Retrieved 2012-05-18.
  14. "BLAST: Basic Local Alignment Search Tool".
  15. "EST Profile - Hs.239500". Ncbi.nlm.nih.gov. Retrieved 2012-05-18. [permanent dead link]
  16. "C16orf13 chromosome 16 open reading frame 13 [Homo sapiens] - Gene - NCBI". Ncbi.nlm.nih.gov. Retrieved 2012-05-18.
  17. GeneCards Human Gene Database. "C16orf13 Gene - GeneCards | CP013 Protein | CP013 Antibody". GeneCards. Retrieved 2012-05-18.
  18. "Ensembl genome browser 67: Homo sapiens - Orthologues - Gene: C16orf13 (ENSG00000130731)". Useast.ensembl.org. Retrieved 2012-05-18.
  19. GeneCards Human Gene Database. "C16orf13 Gene - GeneCards | CP013 Protein | CP013 Antibody". GeneCards. Retrieved 2012-05-18.
  20. "Ensembl genome browser 67: Homo sapiens - Comparative Genomics - Gene: C16orf13 (ENSG00000130731)". Useast.ensembl.org. Retrieved 2012-05-18.
  21. Chou PY; Fasman GD (2006). "Advances in Enzymology and Related Areas of Molecular Biology". Advances in Enzymology - and Related Areas of Molecular Biology. pp. 45–148. doi:10.1002/9780470122921.ch2. ISBN 9780470122921. PMID 364941. [permanent dead link]
  22. "HomoMINT database". Mint.bio.uniroma2.it. Retrieved 2012-05-18. [permanent dead link]
  23. "STRING: functional protein association networks". String-db.org. Retrieved 2012-05-18.
  24. "Gene Q96S19 | Protein METTL26 - Gene expression | canSAR Black".