FAM227A

FAM227A is a protein that in humans is encoded by the FAM227A gene. The protein is predicted to localize to the nucleus. [1] FAM227A is most highly expressed in the fallopian tube, testis, and pituitary gland. Orthologs are present in mammals, birds, and reptiles, and sequence alignments indicate that FAM227A is a rapidly evolving gene. [2]

Gene

FAM227A is found on chromosome 22 at 22q13.1. It is flanked by the gene LOC105373031 on the left and CBY1 on the right. The gene spans 78,510 base pairs and contains 21 exons. There are currently no aliases for FAM227A. [3]

mRNA

There are two transcript variants of FAM227A. Variant 1, NM_001013647.1, is the shorter transcript but encodes the longer protein isoform. It is 2,948 base pairs long and includes the first 17 exons. Variant 2, NM_001291030.1, is 10,362 base pairs long. It uses an alternate splice site and therefore starts translation at a different start codon than variant 1. Its 5′ region is relatively short, but its 3′ region is very long. [4]

Conceptual Translation of FAM227A

Protein

Visual representation of FAM227A. The brackets represent motifs, the grey diamonds represent predicted leucine-rich nuclear export signals, and the red diamonds represent predicted phosphorylation sites.

The primary protein isoform of FAM227A is isoform 1 (accession number NP_001013669.1), which is 570 amino acids long. There are 9 isoforms. The molecular weight is 66 kDa, [4] and the isoelectric point is 9.6. [6] Compared to other human proteins, FAM227A has less glycine and more hydrophobic and positively charged amino acids. [7] The protein is predicted to localize to the nucleus. Three predicted nuclear localization signals are HKKK at position 129 (pat4), KKK at position 130 (pat4), and PKKTKIK at position 410 (pat7). [1] An FWWh region, where h signifies a hydrophobic residue, spans amino acids 135-296 in Homo sapiens. This region is found in eukaryotes. Its function is still unknown. [3] Motifs in FAM227A include KRK, SGK, and RRE.
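The pat4 and pat7 labels attached to the nuclear signals above are pattern classes reported by the PSORT tool. As a rough illustration only, the Python sketch below checks short peptide strings against simplified versions of the two rules: pat4 is taken to mean four consecutive basic residues (K or R), or three basic residues plus one H or P; pat7 is taken to mean a P followed within three residues by a four-residue segment containing at least three basic residues. The function names and the simplified rules are assumptions made for this example and are not the PSORT implementation.

```python
# Simplified, assumed versions of the PSORT pat4 / pat7 pattern rules.
BASIC = set("KR")  # lysine and arginine


def is_pat4(window: str) -> bool:
    """Return True if a 4-residue window fits the simplified pat4 rule."""
    if len(window) != 4:
        return False
    basic = sum(res in BASIC for res in window)
    his_pro = sum(res in "HP" for res in window)
    # Four basic residues, or three basic residues plus one H or P.
    return basic == 4 or (basic == 3 and his_pro == 1)


def is_pat7(window: str) -> bool:
    """Return True if a 7-residue window fits the simplified pat7 rule."""
    if len(window) != 7 or window[0] != "P":
        return False
    # The basic segment may begin 1 to 3 residues after the proline.
    for start in range(1, 4):
        segment = window[start:start + 4]
        if sum(res in BASIC for res in segment) >= 3:
            return True
    return False


# Motif strings taken from the paragraph above.
for motif in ("HKKK", "PKKTKIK"):
    print(motif, "pat4:", is_pat4(motif), "pat7:", is_pat7(motif))
```

Under these simplified rules, HKKK satisfies pat4 and PKKTKIK satisfies pat7. The three-residue string KKK is too short to test on its own without the surrounding sequence.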

Secondary Structure

The secondary structure is predicted to consist mainly of alpha helices, with some beta pleated sheets. [8]

Secondary Structure Prediction

Post-Translational Modification

Phosphorylation is the only predicted post-translational modification. There are three experimentally determined phosphorylation sites at Y343, S348, and S349. [4]

Evolutionary History of FAM227A. This gene appears to evolve rapidly when compared to cytochrome c and fibrinogen.

Expression

FAM227A has been experimentally determined to be highly expressed in the testis, epididymis, pituitary gland, and fallopian tubes. Expression is not ubiquitous: the level of expression varies across tissue types. [11]

Function

Currently, the function of FAM227A has not been characterized.

Interacting Proteins

Currently, no proteins are predicted to interact with FAM227A.

Subcellular Localization

FAM227A is predicted to be located in the nuclear region of the cell. This prediction is consistent across species. [1]

Species | Nuclear | Mitochondrial | Cytoplasmic | Vacuolar | Extracellular
Homo sapiens | 43.5% | 17.4% | 17.4% | 8.7% | 4.3%
Nomascus leucogenys | 39.1% | 21.7% | 17.4% | 8.7% | 8.7%
Marmota marmota | 60.9% | N/A | 26.1% | 4.3% | N/A
Camelus bactrianus | 52.2% | 21.7% | 8.7% | 8.7% | 8.7%
Thamnophis sirtalis | 69.6% | 8.7% | 8.7% | N/A | N/A
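The statement that the nuclear prediction is consistent across species can be checked directly against the percentages in the table. The Python sketch below is only an illustration: it copies the table values by hand, treats N/A as missing, and reports the highest-scoring compartment for each species. The variable and function names are hypothetical.

```python
COMPARTMENTS = ["Nuclear", "Mitochondrial", "Cytoplasmic", "Vacuolar", "Extracellular"]

# Percentages copied from the table above; None stands for "N/A".
PREDICTIONS = {
    "Homo sapiens": [43.5, 17.4, 17.4, 8.7, 4.3],
    "Nomascus leucogenys": [39.1, 21.7, 17.4, 8.7, 8.7],
    "Marmota marmota": [60.9, None, 26.1, 4.3, None],
    "Camelus bactrianus": [52.2, 21.7, 8.7, 8.7, 8.7],
    "Thamnophis sirtalis": [69.6, 8.7, 8.7, None, None],
}


def top_compartment(scores):
    """Return the compartment with the highest score, ignoring missing values."""
    ranked = [(score, name) for score, name in zip(scores, COMPARTMENTS) if score is not None]
    return max(ranked)[1]


for species, scores in PREDICTIONS.items():
    print(f"{species}: {top_compartment(scores)}")
```

For every species listed, the nuclear compartment has the highest percentage, which matches the statement in the text.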

Homology

Paralogs: FAM227B

Orthologs: FAM227A is present mainly in mammals but also in species of reptiles and birds. The most distantly related ortholog is in Xenopus tropicalis, the western clawed frog. Comparing sequence identity with the dates of divergence suggests that the gene has evolved very rapidly. [2]

Order | Genus and Species | Common Name | Date of Divergence (million years ago) | Accession # | Sequence Identity to Humans
Primates | Homo sapiens | Human | 0 | NP_001013669.1 | 100%
Primates | Nomascus leucogenys | Northern White-Cheeked Gibbon | 19.43 | XP_003264802.1 | 96%
Scandentia | Tupaia chinensis | Chinese Tree Shrew | 85 | XP_006155440.1 | 72%
Rodentia | Jaculus jaculus | Lesser Egyptian Jerboa | 88 | XP_012804178.1 | 61%
Carnivora | Ailuropoda melanoleuca | Giant Panda | 94 | XP_002914627.1 | 70%
Perissodactyla | Equus asinus | Donkey | 94 | XP_014706772.1 | 69%
Cetartiodactyla | Orcinus orca | Killer Whale | 94 | XP_012392035.1 | 63%
Soricomorpha | Condylura cristata | Star-Nosed Mole | 94 | XP_012590687.1 | 62%
Chiroptera | Eptesicus fuscus | Big Brown Bat | 94 | XP_008143135.1 | 61%
Cingulata | Dasypus novemcinctus | Nine-Banded Armadillo | 102 | XP_004447922.1 | 70%
Sirenia | Trichechus manatus latirostris | West Indian Manatee | 102 | XP_004374098.1 | 59%
Tinamiformes | Tinamus guttatus | White-Throated Tinamou | 320 | XP_010218404.1 | 51%
Testudines | Pelodiscus sinensis | Chinese Softshell Turtle | 320 | XP_006119021.1 | 38%
Anura | Xenopus tropicalis | Western Clawed Frog | 353 | XP_002933807.2 | 34%
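The claim of rapid evolution rests on comparing sequence divergence with divergence time. The Python sketch below illustrates one common way to make that comparison for a few rows of the table. It assumes the divergence dates are in millions of years and applies a Poisson-style correction for multiple substitutions, m = -100 ln(identity fraction). The source does not state which method lies behind its evolutionary history graph, so both the correction and the helper name are assumptions for illustration.

```python
import math

# (species, assumed divergence date in millions of years, % identity to human),
# copied from selected rows of the table above.
ORTHOLOGS = [
    ("Nomascus leucogenys", 19.43, 96),
    ("Tupaia chinensis", 85, 72),
    ("Ailuropoda melanoleuca", 94, 70),
    ("Tinamus guttatus", 320, 51),
    ("Xenopus tropicalis", 353, 34),
]


def corrected_divergence(identity_percent: float) -> float:
    """Poisson-corrected divergence (in percent) from percent sequence identity."""
    return -100.0 * math.log(identity_percent / 100.0)


for species, mya, identity in ORTHOLOGS:
    observed = 100 - identity                 # uncorrected % divergence
    corrected = corrected_divergence(identity)
    rate = corrected / mya                    # corrected divergence per million years
    print(f"{species}: observed {observed}%, corrected {corrected:.1f}, per million years {rate:.2f}")
```

Plotting the corrected divergence against divergence date for FAM227A alongside the same quantity for reference proteins such as cytochrome c and fibrinogen would give the kind of comparison the article describes; values for those reference proteins are not given in the source and are not included here.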

Clinical Significance

In 2016, a study performed an association analysis of 31,203 markers on chromosome 22 to examine gene-smoking interactions in blood pressure. Chromosome 22 was chosen based on data collected over three clinical visits in the Framingham Heart Study. [12] In 2013, researchers investigated three clusters of SNPs thought to be linked to prostate cancer in Arab populations. The study found that the deletion region on chromosome 22q13, where FAM227A is located, can also be linked to breast and colorectal cancer in humans in addition to prostate cancer. [13] Another study suggests that the location of FAM227A may be linked to SOX10, a central regulator involved in the maturation of neural crest derivatives. In that study, a deletion that included FAM227A was linked to lung abnormality, atrial septal defect, small size for gestational age, and sensorineural hearing loss. [14]

References

  1. "PSORT: Protein Subcellular Localization Prediction Tool". www.genscript.com. Retrieved 2017-04-27.
  2. "BLAST: Basic Local Alignment Search Tool". blast.ncbi.nlm.nih.gov. Retrieved 2017-04-27.
  3. "FAM227A family with sequence similarity 227 member A [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2017-04-27.
  4. "Homo sapiens family with sequence similarity 227 member A (FAM227A), t - Nucleotide - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2017-04-27.
  5. "Homo sapiens family with sequence similarity 227 member A (FAM227A), t - Nucleotide - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2017-05-05.
  6. Program by Dr. Luca Toldo, developed at http://www.embl-heidelberg.de. Changed by Bjoern Kindler to print also the lowest found net charge. Available at EMBL WWW Gateway to Isoelectric Point Service, http://www.embl-heidelberg.de/cgi/pi-wrapper.pl (archived 2008-10-26 at https://web.archive.org/web/20081026062821/http://www.embl-heidelberg.de/cgi/pi-wrapper.pl; retrieved 2014-05-10). Contact: Toldo@embl-heidelberg.de, Bjoern.Kindler@embl-heidelberg.de
  7. Algorithm citation: Brendel, V., Bucher, P., Nourbakhsh, I.R., Blaisdell, B.E. & Karlin, S. (1992). "Methods and algorithms for statistical analysis of protein sequences". Proc. Natl. Acad. Sci. U.S.A. 89, 2002-2006. Program citation: Volker Brendel, Department of Mathematics, Stanford University, Stanford CA 94305, U.S.A., modified; any errors are due to the modification.
  8. Protein structure prediction on the web: a case study using the Phyre server. Kelley LA and Sternberg MJE. Nature Protocols 4, 363-371 (2009)
  9. "Phyre 2 Results for FAM227A_Whole_Protein". www.sbg.bio.ic.ac.uk. Retrieved 2017-05-04.
  10. "BLAST: Basic Local Alignment Search Tool". blast.ncbi.nlm.nih.gov. Retrieved 2017-05-07.
  11. "Tissue expression of FAM227A - Summary - The Human Protein Atlas". www.proteinatlas.org. Retrieved 2017-04-27.
  12. Basson, J., Sung, Y. J., de las Fuentes, L., Schwander, K. L., Vazquez, A., & Rao, D. C. (2016). Three Approaches to Modeling Gene‐Environment Interactions in Longitudinal Family Data: Gene‐Smoking Interactions in Blood Pressure. Genetic epidemiology, 40(1), 73-80.
  13. Shan, J., Al-Rumaihi, K., Rabah, D., Al-Bozom, I., Kizhakayil, D., Farhat, K., ... & Khalak, H. G. (2013). Genome scan study of prostate cancer in Arabs: identification of three genomic regions with multiple prostate cancer susceptibility loci in Tunisians. Journal of translational medicine, 11(1), 121.
  14. Jelena, B., Christina, L., Eric, V., & Fabiola, Q. R. (2014). Phenotypic variability in Waardenburg syndrome resulting from a 22q12.3‐q13.1 microdeletion involving SOX10. American Journal of Medical Genetics Part A, 164(6), 1512-1519.