MFSD6L

Identifiers
Aliases: MFSD6L, major facilitator superfamily domain containing 6 like
External IDs: MGI: 2384904; HomoloGene: 17128; GeneCards: MFSD6L; OMA: MFSD6L - orthologs
Orthologs (human; mouse)
RefSeq (mRNA): NM_152599; NM_146004
RefSeq (protein): NP_689812; NP_666116
Location (UCSC): Chr 17: 8.8 – 8.8 Mb; Chr 11: 68.45 – 68.45 Mb
PubMed search: [3]; [4]
Predicted Tertiary Structure of the MFSD6L Protein

Major facilitator superfamily domain containing 6 like (MFSD6L) is a protein that in humans is encoded by the MFSD6L gene. [5] MFSD6L is a transmembrane protein belonging to the major facilitator superfamily (MFS), whose members use chemiosmotic gradients to facilitate the transport of small solutes across cell membranes.

Gene

A figure depicting the location (17p13.1) of the Homo sapiens MFSD6L gene on chromosome 17.

In the human genome, the MFSD6L gene is located on chromosome 17 at 17p13.1. [5] The DNA sequence encoding the polypeptide spans 2,256 bases, from position 8,797,110 bp to 8,799,365 bp. [6] The gene lies on the minus strand. [5] MFSD6L has one alias, FLJ35773. [5]
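The quoted length is consistent with the coordinates when both endpoints are counted. A minimal Python check, where the function name is illustrative and not part of any database API:

```python
def span_length(start_bp: int, end_bp: int) -> int:
    """Length of a closed genomic interval; both endpoints are included."""
    return end_bp - start_bp + 1


# Coordinates quoted above for MFSD6L on human chromosome 17.
mfsd6l_length = span_length(8_797_110, 8_799_365)
print(mfsd6l_length)  # 2256
```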

The coding sequence consists of a single exon. [6]

The tumor suppressor gene TP53 was also found within the gene neighborhood of MFSD6L at 17p13.1. [7]

mRNA Transcript

No other isoforms of MFSD6L have been found, which is consistent with the presence of only one exon in the MFSD6L coding sequence.

Protein

The MFSD6L protein consists of 586 amino acids and has a precursor molecular weight of approximately 64 kDa. [5] After post-translational modifications such as glycosylation, the molecular weight of the mature protein increases to 72 kDa. Leucine is more abundant in MFSD6L than in most other human proteins, and this enrichment is also present in the MFSD6L proteins of the house mouse and chimpanzee. [8] The protein has an isoelectric point (pI) of 8.87. [9]
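As a rough sanity check on the precursor weight, a protein's mass can be approximated from its residue count. The sketch below assumes the common rule-of-thumb average of about 110 Da per residue, which is an approximation and not the method used by the cited tools; the function and constant names are illustrative.

```python
AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average residue mass (assumption)


def estimate_mass_kda(n_residues: int, avg_mass_da: float = AVG_RESIDUE_MASS_DA) -> float:
    """Approximate protein mass in kDa from the number of residues."""
    return n_residues * avg_mass_da / 1000.0


precursor_kda = estimate_mass_kda(586)  # 586 residues, as stated above
added_by_modification_kda = 72 - 64     # mature minus precursor weight, from the text

print(f"estimated precursor mass: {precursor_kda:.1f} kDa")
print(f"mass added by modifications: {added_by_modification_kda} kDa")
```

The estimate lands close to the quoted 64 kDa, and the difference between the quoted mature and precursor weights is 8 kDa.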

MFSD6L predicted tertiary structure with greyed area signifying cell membrane region

The peptide sequence contains 11 transmembrane regions that cross the plasma membrane. There are also two MFS regions, starting at amino acids 28 and 368. [10]

The secondary structure of the MFSD6L protein is predicted to contain 16 alpha helices and 3 beta sheets. [11] The large number of alpha helices can be attributed to the protein being a transmembrane solute transporter, since alpha helices are usually the parts of a protein's structure that are positioned within the cell membrane.
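Membrane-spanning helices are typically stretches of hydrophobic residues, which is why simple hydropathy scans can flag candidate transmembrane segments. The sketch below uses the classic Kyte-Doolittle scale with a 19-residue window and a threshold of 1.6. It is an illustration only, not the tool behind the counts above, and the toy sequence is invented rather than taken from MFSD6L.

```python
# Kyte-Doolittle hydropathy values per amino acid (one-letter codes).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 19) -> list:
    """Mean hydropathy of every full-length window along the sequence."""
    scores = [KD[aa] for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_starts(seq: str, window: int = 19, threshold: float = 1.6) -> list:
    """1-based start positions of windows whose mean hydropathy meets the threshold."""
    profile = hydropathy_profile(seq, window)
    return [i + 1 for i, mean in enumerate(profile) if mean >= threshold]


# Invented toy sequence: a 19-residue hydrophobic core flanked by charged residues.
toy = "DEKR" * 3 + "LIVALLIVAFLLIVGAILV" + "DEKR" * 3
print(candidate_starts(toy))
```

Dedicated predictors refine this idea with trained models and topology rules, so a window scan like this should be read as a first-pass filter only.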

Within the tertiary structure, there was a disulfide bond predicted between the two cysteines at the 29th and 311th amino acids. [12]

Expression and regulation

Gene level

Tissue expression by array profiling of MFSD6L gene

Only one promoter region, spanning 1,107 bp, was found for MFSD6L using the Genomatix Gene2Promoter database. [13] Several transcription factor binding sites were found in the part of the promoter closest to the start of the 5' UTR of the MFSD6L gene. A binding site of note was the site for the p53 tumor suppressor protein. [13]

The MFSD6L gene was found to be highly expressed in the pancreas, salivary glands, and the thyroid. [14]

In-situ hybridization shows that, within the tissues where it is expressed, the MFSD6L gene is expressed particularly in glandular cells.

Expression of MFSD6L was found to be upregulated as a result of glucose starvation. [15]

In-situ hybridization of MFSD6L protein in colorectal tissue. Highest abundance found within glandular cells of the colorectal tissue.

Transcript level

Secondary structure of MFSD6L mRNA 5' UTR

Because the MFSD6L gene contains a single exon and no introns, the MFSD6L mRNA is not spliced. [10] Translation of the MFSD6L protein initiates at the end of the 5' UTR, which comprises the first 245 nucleotides of the mRNA. Stem-loop regions conserved across mammalian orthologs suggest possible miRNA binding sites.

Protein level

The DeepLoc tool predicts that the MFSD6L protein localizes to the cell membrane. [16] This is consistent with its predicted role as a solute symporter, like other MFS proteins. The first 28 amino acids of the translated MFSD6L protein contain the signal peptide. [17]

Additionally, N-glycosylation sites were predicted at amino acids 110, 129, and 224 of the protein sequence. [16] A serine phosphorylation site at amino acid 429 was also predicted, and it is supported by its presence in other mammalian orthologs. [18]
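N-linked glycosylation occurs at the consensus sequon N-X-S/T, where X is any residue except proline. The sketch below scans a sequence for that motif. It is a simplified stand-in for a dedicated predictor, since a sequon is necessary but not sufficient for glycosylation, and the toy sequence is invented rather than taken from MFSD6L.

```python
import re

# N-X-S/T sequon, X being any residue except proline.
# The lookahead keeps matches zero-width after N, so overlapping sequons are all found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein: str) -> list:
    """Return the 1-based position of the asparagine in each N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


# Invented toy sequence: N at 8 is followed by proline and is skipped.
toy = "MKNATLLNPSGGNNSTQ"
print(find_sequons(toy))  # [3, 13, 14]
```

Running the same scan on the real protein sequence would list every sequon, which a predictor then narrows down to the sites likely to be modified.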

Evolution

Paralogs

The MFSD6 protein was found to be the only paralog to the human MFSD6L protein. [5]

Orthologs

Through BLAST sequence analysis, the MFSD6L protein was found to have orthologs in many mammalian species, especially among primates and flying foxes. [19] Some orthologs were found in the Reptilia and Amphibia classes, although fewer than in the Mammalia. Among fish, significantly more orthologs were found in ray-finned fishes than in cartilaginous fishes. The sea lamprey, a jawless fish, was also found to have an ortholog.

There were also multiple orthologs found amongst invertebrates, such as echinoderms and mollusks.

No significant orthologs of the MFSD6L protein were found amongst insects; however, orthologs were found in bacteria. Specifically, members of the genus Anaerolinea, which contains thermophilic bacteria, were found to have orthologs of the human protein because their MFS regions match the MFS regions found in the human protein. The following table shows some examples of orthologs of the human MFSD6L.

Genus and Species | Common Name | Taxonomic Group | Median Time since Divergence (MYA) | Accession Number (from NCBI) | Sequence Length (aa) | Sequence Identity (%) | Sequence Similarity (%)
Homo sapiens | Human | Primates | 0 | NP_689812.3 | 586 | 100 | 100
Mus musculus | House Mouse | Rodentia | 89 | NP_666116.1 | 586 | 68 | 77
Monodon monoceros | Narwhal | Artiodactyla | 94 | XP_029068193.1 | 595 | 73 | 81
Molossus molossus | Velvety Free-tailed Bat | Chiroptera | 94 | XP_036125248.1 | 600 | 72 | 79
Dermochelys coriacea | Leatherback Sea Turtle | Testudines | 318 | XP_038228264.1 | 653 | 45 | 59
Terrapene carolina triunguis | Three-toed Box Turtle | Testudines | 318 | XP_024071382.1 | 654 | 44 | 57
Patagioenas fasciata monilis | Band-tailed Pigeon | Columbiformes | 318 | OPJ90083.1 | 655 | 43 | 55
Dromaius novaehollandiae | Emu | Casuariiformes | 318 | XP_025976398.1 | 628 | 42 | 55
Microcaecilia unicolor | Tiny Cayenne Caecilian | Gymnophiona | 351.7 | XP_030063970.1 | 652 | 44 | 58
Xenopus tropicalis | Western-clawed Frog | Anura | 351.7 | XP_002937042.2 | 611 | 42 | 56
Bufo bufo | Common Toad | Anura | 351.7 | XP_040293432.1 | 615 | 39 | 54
Scleropages formosus | Asian Arowana | Osteoglossiformes | 433 | NP_001003586.1 | 585 | 40 | 55
Danio rerio | Zebrafish | Cypriniformes | 433 | XP_018612492.2 | 542 | 33 | 49
Callorhinchus milii | Australian Ghostshark | Chimaeriformes | 465 | XP_042198386.1 | 630 | 41 | 57
Petromyzon marinus | Sea Lamprey | Petromyzontiformes | 599 | XP_032823230.1 | 735 | 26 | 38
Anneissia japonica | Sea Lily | Comatulida | 627 | XP_033125701.1 | 616 | 28 | 47
Gigantopelta aegis | Deep Sea Snail | Neomphalida | 736 | XP_041362795.1 | 637 | 27 | 46
Crassostrea gigas | Pacific Oyster | Ostreida | 736 | XP_011445242.2 | 615 | 25 | 44
Octopus sinensis | East Asian Common Octopus | Octopoda | 736 | XP_029655326 | 634 | 24 | 43
Tetranychus urticae | Red Spider Mite | Trombidiformes | 736 | XP_015795313.1 | 872 | 14 | 25
Anaerolineales bacterium | Anaerolineales bacterium | Anaerolineales | 4090 | MBN1451601.1 | 389 | 19 | 34

Homologous domains

The main homologous domains found within the MFSD6L protein are the MFS regions. Because the MFS includes a large number of solute transporter proteins, many MFS proteins share these homologous MFS domains.

Most distant homologs

Through BLAST sequence analysis, the most distant homologs were found in organisms of the phylum Cnidaria, which mainly consists of jellyfish, sea anemones, and corals. [19] A BLAST search in Porifera, a phylum that diverged earlier, revealed no homologous MFSD6L protein.

Predicted emergence date

Given the MFSD6L protein's presence in Cnidaria and absence in Porifera, the estimated emergence date of the MFSD6L gene lies between 687 and 777 MYA, the divergence dates reported by TimeTree. [20] From the corrected % divergence chart and the corrected % divergence calculated for the Homo sapiens MFSD6 paralog, the estimated date of emergence of the MFSD6L protein was found to be around 736 MYA.
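The article does not state how the corrected % divergence was computed. One common choice is the Poisson correction, which converts the observed fraction of differing sites into an estimate that accounts for multiple substitutions at the same site. The sketch below assumes that formula and applies it to a few identity values from the ortholog table; the function name is illustrative.

```python
import math


def corrected_divergence(identity_pct: float) -> float:
    """Poisson-corrected % divergence from % identity (assumed formula).

    observed = fraction of differing sites; corrected = -ln(1 - observed).
    Undefined for 0% identity, where the logarithm diverges.
    """
    observed = 1.0 - identity_pct / 100.0
    return -math.log(1.0 - observed) * 100.0


# (species, median divergence time in MYA, % identity), copied from the ortholog table.
orthologs = [
    ("Mus musculus", 89, 68),
    ("Dermochelys coriacea", 318, 45),
    ("Danio rerio", 433, 33),
    ("Petromyzon marinus", 599, 26),
]

for name, mya, identity in orthologs:
    value = corrected_divergence(identity)
    print(f"{name}: {mya} MYA, corrected divergence {value:.1f}%")
```

Plotting corrected divergence against divergence time for the orthologs gives an approximately linear trend, and a paralog's corrected divergence can then be read against that line to estimate when the duplication occurred.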

Secondary structure of the MFSD6L mRNA 3' UTR

Interacting proteins

The MFSD6L protein was not found to have any experimentally-verified protein-protein interactions. [21]

Function

The polypeptide sequence contains many transmembrane regions, identifying the MFSD6L protein as a transmembrane protein that transports solutes across the plasma membrane of a cell. Tertiary structure prediction tools suggest that the structure of the MFSD6L protein is similar to that of 1PV6A, a β-galactoside symporter that uses proton gradients to transport solutes. [22] [23] MFSD6L could therefore function as a sugar symporter. This is further supported by the upregulation of MFSD6L expression under glucose starvation.

Tertiary structure of the MFSD6L protein with the serine phosphorylation site and signal sequence marked

Clinical significance

Disease association

One disease associated with the MFSD6L gene is tetralogy of Fallot, a combination of four congenital heart defects that can cause low oxygenation of the blood. [5] The low oxygenation results from a ventricular septal defect that allows oxygenated and deoxygenated blood to mix in the left ventricle of the heart.

The MFSD6L gene was also identified as a candidate gene for pediatric cataract. [24]

A corrected % divergence chart of the MFSD6L protein. Cytochrome C and Fibrinogen Alpha Chain are also added for comparison.

Mutations

Various SNPs were found within the coding sequence of MFSD6L, as shown below.

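Amino Acid Position | mRNA Position | Original Nucleotide | SNP | Original Amino Acid | Variant Amino Acid | Codon Position
328 | 1212 | T | T | Ser [S] | Leu [L] | 2
399 | 1424 | G | A | Gly [G] | Ser [S] | 1
406 | 1446 | C | T | Ser [S] | Leu [L] | 2
571 | 1941 | C | T | Thr [T] | Ile [I] | 2
574 | 1949 | G | A | Asp [D] | Asn [N] | 1
575 | 1952 | T | G | Trp [W] | Gly [G] | 1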
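The mRNA positions in the table above follow from the amino acid position and the position within the codon once the first nucleotide of the coding sequence is fixed. In the sketch below the coding-sequence start of 230 is inferred from the listed positions; it is an assumption for illustration, not a value stated in the article, and the function name is hypothetical.

```python
CDS_START = 230  # assumed mRNA position of the first coding nucleotide (inferred from the table)


def mrna_position(aa_position: int, codon_position: int, cds_start: int = CDS_START) -> int:
    """mRNA coordinate of a nucleotide given its codon (1-based) and slot in the codon (1-3)."""
    return cds_start + 3 * (aa_position - 1) + (codon_position - 1)


# (amino acid position, codon position, mRNA position listed in the table)
rows = [
    (328, 2, 1212),
    (399, 1, 1424),
    (406, 2, 1446),
    (571, 2, 1941),
    (574, 1, 1949),
    (575, 1, 1952),
]

for aa, slot, listed in rows:
    computed = mrna_position(aa, slot)
    print(aa, slot, listed, computed, computed == listed)
```

Under that assumed start, every listed mRNA position is reproduced, which suggests the table was built against a single transcript record.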

References

  1. GRCh38: Ensembl release 89: ENSG00000185156. Ensembl, May 2017.
  2. GRCm38: Ensembl release 89: ENSMUSG00000048329. Ensembl, May 2017.
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. "MFSD6L Gene". www.genecards.org. Retrieved 2021-10-01.
  6. "MFSD6L major facilitator superfamily domain containing 6 like [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2021-10-24.
  7. "TP53 Gene". www.genecards.org. Retrieved 2021-12-15.
  8. "SAPS < Sequence Statistics < EMBL-EBI". www.ebi.ac.uk. Retrieved 2021-12-14.
  9. "ExPASy - Compute pI/Mw tool". web.expasy.org. Retrieved 2021-12-14.
  10. "Homo sapiens major facilitator superfamily domain containing 6 like (MFSD6L), mRNA". 2021-06-27.
  11. "PredictProtein - Protein Sequence Analysis, Prediction of Structural and Functional Features". predictprotein.org. Retrieved 2021-12-14.
  12. "DiANNA". clavius.bc.edu. Archived from the original on 2022-07-24. Retrieved 2021-12-14.
  13. "Genomatix Software Suite". Genomatix. Archived from the original on 2012-01-14. Retrieved 2021-12-14.
  14. "GDS3113 / 224438". www.ncbi.nlm.nih.gov. Retrieved 2021-12-15.
  15. Weldai, Lydia (2018-04-16). "Do Major Facilitator Superfamily Domain Containing Proteins Respond to Glucose Starvation?". Digitala Vetenskapliga Arkivet.
  16. "Services". www.healthtech.dtu.dk. Retrieved 2021-12-16.
  17. "Protter - interactive protein feature visualization". wlab.ethz.ch. Retrieved 2021-12-16.
  18. "PhosphoSitePlus". www.phosphosite.org. Retrieved 2021-12-16.
  19. "BLAST: Basic Local Alignment Search Tool". blast.ncbi.nlm.nih.gov. Retrieved 2021-12-17.
  20. "TimeTree :: The Timescale of Life". www.timetree.org. Retrieved 2021-12-17.
  21. "MFSD6L protein (human) - STRING interaction network". string-db.org. Retrieved 2021-12-17.
  22. "I-TASSER server for protein structure and function prediction". zhanggroup.org. Retrieved 2021-12-17.
  23. "lacY - Lactose permease - Escherichia coli (strain K12) - lacY gene & protein". www.uniprot.org. Retrieved 2021-12-17.
  24. Aldahmesh, Mohammed A.; Khan, Arif O.; Mohamed, Jawahir Y.; Hijazi, Hadia; Al-Owain, Mohammed; Alswaid, Abdulrahman; Alkuraya, Fowzan S. (December 2012). "Genomic analysis of pediatric cataract in Saudi Arabia reveals novel candidate disease genes". Genetics in Medicine. 14 (12): 955–962. doi:10.1038/gim.2012.86. ISSN 1530-0366. PMID 22935719. S2CID 45088616.