FAM203B

Family with Sequence Similarity 203, Member B (FAM203B) is a protein encoded by the FAM203B gene (8q24.3) in humans. [1] [2] While FAM203B is only found in humans and possibly non-human primates, its paralog, FAM203A, [3] is highly conserved. [4] The FAM203B protein contains two conserved domains of unknown function, DUF383 and DUF384, [4] and no transmembrane domains. [5] This protein has no known function yet, although the homolog of FAM203A in Caenorhabditis elegans (Y54H5A.2) is thought to help regulate the actin cytoskeleton. [6]

HGH1
Identifiers
Aliases: HGH1, BRP16, BRP16L, C8orf30A, C8orf30B, FAM203A, FAM203B, HGH1 homolog
External IDs: MGI: 1930628; HomoloGene: 48742; GeneCards: HGH1
Orthologs
Species | Human | Mouse
Entrez
Ensembl
UniProt
RefSeq (mRNA) | NM_016458 | NM_021555
RefSeq (protein) | NP_057542 | NP_067530
Location (UCSC) | Chr 8: 144.14 – 144.14 Mb | Chr 15: 76.25 – 76.26 Mb
PubMed search | [9] | [10]

Gene

FAM203B is located on the positive DNA strand of the long arm of chromosome 8 at locus 24.3 (8q24.3), spanning 76,368,898 - 76,371,411 in the human genome. The mRNA transcript is 2,402 bp long, and the human gene has 6 predicted exons. [2] [11] There are no known isoforms.


FAM203B mRNA tissue expression levels.

Gene Neighborhood

The pseudogene TSSK5P2 is located on the negative strand opposite FAM203B (145,440,975 - 145,443,775), [13] while LOC377711 is located immediately downstream on the positive strand (145,448,755 - 145,485,896). [14] FAM203A, MROH1, and SCXB are located upstream of FAM203B. [11] [15]

Gene Expression

Expression Profile: mRNA expression has been localized in many tissue types (immune, nervous, muscle, internal, secretory, and reproductive) in similar quantities and may therefore be ubiquitous. [12]

Promoter: The predicted promoter region of FAM203B is located between 145,437,380 and 145,438,015 on Chromosome 8 and has a length of 636 bp. [16]
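The stated promoter length follows from treating both coordinates as inclusive. A minimal sketch of that arithmetic, assuming 1-based inclusive genomic coordinates (the function name is illustrative and not taken from any cited tool):

```python
def inclusive_length(start: int, end: int) -> int:
    """Number of bases in a region whose start and end are both included."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Promoter coordinates as given in the text (chromosome 8).
promoter_start = 145_437_380
promoter_end = 145_438_015

print(inclusive_length(promoter_start, promoter_end))  # 636
```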


Protein

The function of FAM203B is not currently understood. The FAM203B protein has 390 amino acids, [1] a molecular weight of 42.1 kDa, [5] and an isoelectric point of 4.56. [17]

Structure

FAM203B contains two domains of unknown function: DUF383 (residues 110–288) and DUF384 (residues 292–349). [1] The protein is alanine-, proline-, and leucine-rich, but poor in serine, asparagine, threonine, isoleucine, lysine, and phenylalanine. The following internal repeats can be found in the primary sequence: LPFL (26-29, 245–248), ELAP (70-73), GRAL (54-57, 111–114), and LAADPGL (88-94, 99–105). There are no positive, negative, mixed charge, or hydrophobic clusters; no transmembrane domains; and no clusters of amino acid multiplets. [5] The secondary structure prediction generated by the Phyre 2.0 bioinformatic server shows only α-helices, almost all of which have high confidence values. The overall confidence value of the model is 99.5%. [18]

Post-Translational Modifications

There are at least six predicted phosphorylation sites in FAM203B: S17, S153, Y167, T223, S259, and S320. [19] The FAM203B protein is also predicted to localize to the cytoplasm. [20]

Protein Interactions

There are many possible transcription factor binding sites in the FAM203B promoter. Below is a table of the best possibilities, which have high confidence values, evolutionary conservation, and/or multiple possible binding sites in the promoter. [16]

Table of Possible Transcription Factor Binding Sites in Predicted FAM203B Promoter: [16]

Transcription Factor | Start | End | Strand | Sequence
Winged-helix transcription factor IL-2 enhancer binding factor, forkhead box K2 | 6 | 22 | - | gacaggacAACAcaggg
Hypermethylated in Cancer 1 | 49 | 61 | + | ccgTGCCagcctg
Zinc finger transcription factor ZBP-89 | 94 | 116 | + | tggccactCCCCcattcagccct
Kidney-enriched kruppel-like factor, KLF15 | 142 | 158 | + | gagccGGGGcgcgggcc
Transcription factor II B recognition element | 149 | 155 | - | ccgCGCC
Glial cells missing homolog 1, chorion-specific transcription factor GCMα | 159 | 173 | + | tcagaCCCTcagggc
Transcription factor AP-2α | 161 | 175 | - | gggcCCTGagggtct
Smad4 transcription factor involved in TGFβ signaling | 245 | 255 | - | gtaGTCTcggc
Nuclear factor 1 | 278 | 298 | - | gatTTGGccgcctgccgcgtc
ZF5 POZ domain zinc finger, zinc finger protein 161 | 295 | 309 | + | aatCGCGccgggcct
Smad3 transcription factor involved in TGFβ signaling | 365 | 375 | - | ggcGTCTggcc
Myeloid zinc finger protein MZF1 | 384 | 394 | - | gcGGGGagtta
X-linked zinc finger protein | 397 | 407 | + | gcGGCCtggcc
Myeloid zinc finger protein MZF1 | 406 | 416 | - | gaGGGGagggg
Core promoter-binding protein with 5 kruppel-type zinc fingers | 423 | 445 | + | ccggtcCCGCcccttgagcccag
X gene core promoter element 1 | 424 | 434 | - | ggGCGGgaccg
Zinc finger and BTB domain-containing 7A | 479 | 501 | - | cgcaaCCCCgcccaccagaggag
Kruppel-like factor 7 | 483 | 499 | + | tctggtgGGCGgggttg
Erythroid kruppel-like factor | 533 | 549 | + | ggcaccggtcGGGTggc
Hypermethylated in cancer 1 | 541 | 553 | - | tgcTGCCacccga
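The start and end positions in the table can be sanity-checked against the listed sequences, since an inclusive span from start to end should contain exactly as many bases as the sequence shown. A minimal sketch using four rows from the table, with shortened factor names; the tuple layout and function names are illustrative assumptions, not the output format of the promoter analysis tool:

```python
# (factor, start, end, strand, sequence) for a few rows of the table.
SITES = [
    ("Forkhead box K2", 6, 22, "-", "gacaggacAACAcaggg"),
    ("Hypermethylated in Cancer 1", 49, 61, "+", "ccgTGCCagcctg"),
    ("KLF15", 142, 158, "+", "gagccGGGGcgcgggcc"),
    ("Smad4", 245, 255, "-", "gtaGTCTcggc"),
]


def site_length(start: int, end: int) -> int:
    """Length of a binding site with inclusive start and end positions."""
    return end - start + 1


def is_consistent(site: tuple[str, int, int, str, str]) -> bool:
    """True when the coordinate span matches the sequence length."""
    _, start, end, _, sequence = site
    return site_length(start, end) == len(sequence)


for site in SITES:
    print(site[0], "consistent" if is_consistent(site) else "MISMATCH")
```

The same check can be extended to every row, which is a quick way to catch a start or end position that was misread.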

There are several other proteins that may interact directly with the FAM203B protein including C1orf112, HEATR3, MRTO4, BYSL, GINS1, DKC1, TXNDC12, PWP2, IMP4, and NIP7. [21]

Homology and Evolution

FAM203A: Paralog

FAM203A is 99% identical to FAM203B with only one amino acid difference (E264Q) due to a point mutation (G857C). [1] [15] [22] This indicates that the duplication event that produced FAM203B 242,266 bp downstream [11] from FAM203A occurred very recently in evolutionary history. The FAM203A protein is highly conserved and has orthologs in primates, rodents, ungulates, marsupials, amphibians, fish, fungi, plants, and at least one monotreme, one reptile, and one hemichordate. [4] [23]
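The identity figure can be reproduced from the numbers above. The sketch below assumes an ungapped alignment over all 390 residues with a single mismatch; the function names are illustrative. It also shows that, in the standard genetic code, a G-to-C change at the first codon position turns either glutamate codon (GAA, GAG) into a glutamine codon (CAA, CAG). Whether nucleotide 857 of the mRNA is that first codon position is not verified here.

```python
GLU_CODONS = {"GAA", "GAG"}  # glutamate (E) in the standard genetic code
GLN_CODONS = {"CAA", "CAG"}  # glutamine (Q) in the standard genetic code


def percent_identity(aligned_length: int, mismatches: int) -> float:
    """Percent of identical positions in an ungapped alignment."""
    if aligned_length <= 0 or not 0 <= mismatches <= aligned_length:
        raise ValueError("invalid alignment length or mismatch count")
    return 100.0 * (aligned_length - mismatches) / aligned_length


def g_to_c_at_first_position(codon: str) -> str:
    """Apply a G-to-C substitution at the first base of a codon."""
    if not codon.startswith("G"):
        raise ValueError("codon does not start with G")
    return "C" + codon[1:]


# One substitution (E264Q) across 390 aligned residues.
print(round(percent_identity(390, 1), 2))  # 99.74

for codon in sorted(GLU_CODONS):
    print(codon, "->", g_to_c_at_first_position(codon))
```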

Orthologs and Homologs

Table of FAM203B Paralog and Homologs:

Scientific Name | Common Name | Divergence from Humans (MYA) [24] | NCBI Protein Accession | Gene Name | Protein Length | Sequence Similarity
Homo sapiens | Human | 0.0 | NP_057542 | FAM203A | 390 | 100%
Macaca mulatta | Rhesus macaque | 29.2 | XM_001090013 | BRP16L | 396 | 94%
Pan troglodytes | Chimpanzee | 6.3 | XP_520011 | FAM203A | 395 | 98%
Mus musculus | Mouse | 92.3 | NP_067530 | FAM203A | 393 | 86%
Sus scrofa | Wild boar | 94.2 | XP_003125495 | FAM203A-like | 406 | 85%
Monodelphis domestica | Gray short-tailed opossum | 162.6 | XP_003340757 | FAM203A-like | 483 | 78%
Columba livia | Rock dove | 296.0 | EMC87403 | BRP16 (partial) | 194 | 64%
Danio rerio | Zebrafish | 400.1 | NP_001002522 | FAM203A | 377 | 70%
Xenopus tropicalis | Western clawed frog | 371.2 | AAI60980 | LOC100145412 | 377 | 70%
Xenopus tropicalis | Western clawed frog | 371.2 | NP_001007916 | FAM203A | 359 | 68%
Strongylocentrotus purpuratus | Purple sea urchin | 742.9 | XP_793139 | FAM203A-like | 372 | 62%
Anolis carolinensis | Carolina anole | 301.7 | XP_003228921 | BRP16L | 286 | 57%
Saccoglossus kowalevskii | Acorn worm | 661.2 | XP_002739897 | BRP16L | 362 | 61%
Danio rerio | Zebrafish | 400.1 | XP_002665502 | BRP16L | 181 | 57%
Saccharomyces cerevisiae | Budding yeast | 1369.0 | NP_011703 | Hgh1p | 394 | 52%
Arabidopsis thaliana | Thale cress | 1369.0 | NP_172882 | Armadillo/beta-catenin-like repeats-containing | 339 | 49%

There is one ortholog of FAM203B, brain protein 16-like (BRP16L), in Macaca mulatta, [4] [23] although no other primates appear to have orthologous proteins. There are two possible explanations for this anomaly: (1) the DNA of other primates has not been sequenced thoroughly in the genomic region of the FAM203B ortholog, or (2) FAM203B is the result of a gene duplication event unique to humans, meaning that BRP16L in M. mulatta resulted from an earlier duplication event unique to that species. The second explanation is supported by the following evidence:

  1. Like M. mulatta, Danio rerio has both a FAM203A gene and a BRP16L gene. The large amount of time since the divergence of the M. mulatta and D. rerio lineages suggests that these BRP16L genes are the result of separate duplication events.
  2. The BRP16L protein in D. rerio has a significant 3’ truncation compared to the M. mulatta protein, further supporting the hypothesis that these proteins evolved separately. [22] [25] [26]
  3. If the BRP16L genes in M. mulatta and D. rerio are the result of separate duplication events, then it is also possible that FAM203B and BRP16L in M. mulatta are the result of separate duplication events.
  4. BRP16 (brain protein 16) is an alias of FAM203A, and BRP16L (brain protein 16-like) is an alias of FAM203B. A gene named BRP16L simply means that the gene is related to FAM203A but not necessarily to FAM203B.
  5. FAM203A and FAM203B are located in the telomeric region of chromosome 8, an area of chromosomes that frequently experiences recombination events.

However, because FAM203A and FAM203B are so similar, it is difficult to determine whether a given protein is an ortholog or simply a homolog.


Phylogeny

The phylogenetic tree of FAM203B and its homologs matches with the overall divergence of the respective lineages. [22] [24]

Conserved Domains, Motifs, and Residues

  1. ARM (armadillo/beta-catenin-like repeats-containing): Found in two homologs (FAM203A in Danio rerio and At1g14300 in Arabidopsis thaliana) and overlaps slightly with the beginning of the DUF383 domain. Related to the HEAT domain, consists of a 40-amino-acid tandemly repeated sequence motif, and is thought to mediate protein-protein interactions. Several eukaryotic genes contain ARM domains including armadillo in Drosophila melanogaster, beta-catenin, plakoglobin, and adenomatous polyposis coli in mammals. [4]
  2. DUF383: Domain of unknown function 383
  3. DUF384: Domain of unknown function 384
A schematic representation of the conserved domains in the FAM203B protein.

Every ortholog and homolog of FAM203B has a DUF383 domain and a DUF384 domain (except Anolis carolinensis, which is missing DUF384 due to a large 3' truncation [23] [27] ). There is significant variation among mammals, marsupials, and monotremes as to where the DUF383 domain begins, whereas this variation is smaller in reptiles, amphibians, fish, invertebrates, plants, and fungi. Additionally, the DUF383 domain ends at the same location for all homologs, while the DUF384 domain starts and ends at roughly the same location in all homologs. There is high homology in the DUF384 domain (292..349) and in the DUF383 domain (154..288), and several amino acids are completely conserved in vertebrates, invertebrates, plants, and fungi, which include Arg190, Gly219, Asn226, Lys273, and Lys338. Other highly conserved amino acids include Asn87, Lys88, Arg216, and Phe229. [4] [22]

References

  1. "Predicted: protein FAM203B [Homo sapiens]". NCBI Protein. Retrieved 5 February 2013.
  2. "Predicted: Homo sapiens family with sequence similarity 203, member B (FAM203B), mRNA". NCBI Nucleotide. 2012-10-30. Retrieved 5 February 2013.
  3. "FAM203 family". NextProt Beta. Retrieved 5 February 2013.
  4. "HomoloGene: 48742, gene conserved in Eukaryota". NCBI HomoloGene. Retrieved 18 January 2013.
  5. Brendel, Volker. "SAPS (Statistical Analysis of PS)".
  6. Fievet BT, Rodriguez J, Naganathan S, Lee C, Zeiser E, Ishidate T, Shirayama M, Grill S, Ahringer J (January 2013). "Systematic genetic interaction screens uncover cell polarity regulators and functional redundancy". Nature Cell Biology. 15 (1): 103–12. doi:10.1038/ncb2639. PMC 3836181. PMID 23242217.
  7. GRCh38: Ensembl release 89: ENSG00000235173 - Ensembl, May 2017
  8. GRCm38: Ensembl release 89: ENSMUSG00000022554 - Ensembl, May 2017
  9. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  10. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  11. "FAM203B family with sequence similarity 203, member B [Homo sapiens (human)]". NCBI Gene. Retrieved 5 February 2013.
  12. "FAM203B Gene". Weizmann Institute of Science. Retrieved 9 May 2013.
  13. "TSSK5P2 testis-specific serine kinase 5 pseudogene 2 [Homo sapiens (human)]". NCBI Gene. Retrieved 10 May 2013.
  14. "LOC377711 HEAT repeat-containing protein 7A-like [Homo sapiens (human)]". NCBI Gene. Retrieved 10 May 2013.
  15. "FAM203A family with sequence similarity 203, member A [Homo sapiens (human)]". NCBI Gene. Retrieved 10 May 2013.
  16. "Genomatix El Dorado". Retrieved 7 April 2013.
  17. Toldo, Luca. "PI (Isoelectric Point Determination)".
  18. Kelley LA, Sternberg MJ (2009). "Protein structure prediction on the Web: a case study using the Phyre server" (PDF). Nature Protocols. 4 (3): 363–71. doi:10.1038/nprot.2009.2. hdl:10044/1/18157. PMID 19247286. S2CID 12497300.
  19. Blom N, Gammeltoft S, Brunak S (December 1999). "Sequence and structure-based prediction of eukaryotic protein phosphorylation sites". Journal of Molecular Biology. 294 (5): 1351–62. doi:10.1006/jmbi.1999.3310. PMID 10600390.
  20. Horton, Paul. "PSORT II".
  21. "C8orf30B Predicted Functional Partners". STRING: functional protein association networks. Retrieved 9 May 2013.
  22. Higgins DG, Bleasby AJ, Fuchs R (April 1992). "CLUSTAL V: improved software for multiple sequence alignment". Computer Applications in the Biosciences. 8 (2): 189–91. doi:10.1093/bioinformatics/8.2.189. PMID 1591615.
  23. "BLAST: Basic Local Alignment Search Tool". NCBI BLAST. Retrieved 5 February 2013.
  24. Hedges SB, Dudley J, Kumar S (December 2006). "TimeTree: a public knowledge-base of divergence times among organisms". Bioinformatics. 22 (23): 2971–2. doi:10.1093/bioinformatics/btl505. PMID 17021158.
  25. "Predicted: brain protein 16-like [Macaca mulatta]". NCBI Protein. Retrieved 5 February 2013.
  26. "Predicted: brain protein 16-like [Danio rerio]". NCBI Protein. Retrieved 5 February 2013.
  27. "Predicted: brain protein 16-like [Anolis carolinensis]". NCBI Protein. Retrieved 10 May 2013.