FAM227A is a protein that in humans is encoded by the FAM227A gene. The protein is predicted to localize to the nuclear region of the cell. [1] FAM227A is most highly expressed in the fallopian tube, testis, and pituitary gland. Orthologs are present in mammals, birds, and reptiles, and sequence alignments indicate that FAM227A is a rapidly evolving gene. [2]
FAM227A is found on chromosome 22 at the location 22q13.1. It is flanked by the gene LOC105373031 on the left and CBY1 on the right. The gene is 78,510 base pairs long with 21 exons. There are currently no aliases for FAM227A. [3]
There are two transcript variants of FAM227A. The first, NM_001013647.1, is the shorter transcript but encodes the longer protein isoform. It is 2,948 base pairs long and includes the first 17 exons. The second, NM_001291030.1, is 10,362 base pairs long. It uses an alternate splice site and begins translation at a different start codon than variant 1. Its 5' region is relatively short, but its 3' region is very long. [4]
The primary sequence for FAM227A is isoform 1, with accession number NP_001013669.1. It is 570 amino acids long. There are 9 isoforms. The molecular weight is 66 kDa, [4] and the isoelectric point is 9.6. [6] Compared to other human proteins, FAM227A has less abundant glycine and more abundant hydrophobic and positively charged amino acids. [7] The protein is predicted to be in the nuclear region of the cell. Three predicted nuclear localization signals are HKKK at 129 (pat4), KKK at 130 (pat4), and PKKTKIK at 410 (pat7). [1] An FWWh region, where h signifies a hydrophobic residue, runs from amino acids 135-296 in Homo sapiens. Most eukaryotic proteins contain this sequence. The function of this region is still unknown. [3] Motifs in FAM227A include KRK, SGK, and RRE.
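The pattern labels pat4 and pat7 can be illustrated with a short scan. The following is a minimal sketch, assuming the labels follow the definitions commonly given for the PSORT II predictor: pat4 is a 4-residue window of four K/R residues, or three K/R plus one H or P, and pat7 is a P followed within three residues by a 4-residue segment containing at least three K/R. The function names and the toy sequence are hypothetical; the toy sequence is not the FAM227A sequence, and this is not the cited tool's implementation.

```python
BASIC = set("KR")  # basic residues counted by both patterns


def find_pat4(seq):
    """Return (1-based position, motif) for every 4-residue window that is
    all K/R, or three K/R plus exactly one H or P (assumed pat4 rule)."""
    hits = []
    for i in range(len(seq) - 3):
        window = seq[i:i + 4]
        n_basic = sum(aa in BASIC for aa in window)
        n_hp = sum(aa in "HP" for aa in window)
        if n_basic == 4 or (n_basic == 3 and n_hp == 1):
            hits.append((i + 1, window))
    return hits


def find_pat7(seq):
    """Return (1-based position, 7-residue motif) for every P that is followed
    within three residues by a 4-residue segment with at least three K/R
    (assumed pat7 rule)."""
    hits = []
    for i, aa in enumerate(seq):
        if aa != "P":
            continue
        for offset in (1, 2, 3):  # basic segment may start 1-3 residues after P
            segment = seq[i + offset:i + offset + 4]
            if len(segment) == 4 and sum(r in BASIC for r in segment) >= 3:
                hits.append((i + 1, seq[i:i + 7]))
                break  # report each P at most once
    return hits


# Hypothetical toy sequence used only to exercise the two scans.
toy = "MASHKKKLTGPKKTKIKQE"
print("pat4 hits:", find_pat4(toy))
print("pat7 hits:", find_pat7(toy))
```

Running the same scans on the actual NP_001013669.1 sequence would be the way to compare against the positions reported above; any differences would most likely reflect the rule assumptions made here.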
The secondary structure is predicted to consist mainly of alpha helices, with some beta-pleated sheets. [8]
Phosphorylation is the only predicted post-translational modification. There are three experimentally determined phosphorylation sites at Y343, S348, and S349. [4]
FAM227A has been experimentally determined to be highly expressed in the testis, epididymis, pituitary gland, and fallopian tubes. The protein is not predicted to be ubiquitously expressed, as expression levels vary across tissue types. [11]
Currently, the function of FAM227A has not been characterized.
Currently, no proteins are predicted to interact with FAM227A.
FAM227A is predicted to be located in the nuclear region of the cell. This prediction is consistent across species. [1]
Species | Nuclear | Mitochondrial | Cytoplasmic | Vacuolar | Extracellular |
---|---|---|---|---|---|
Homo sapiens | 43.5% | 17.4% | 17.4% | 8.7% | 4.3% |
Nomascus leucogenys | 39.1% | 21.7% | 17.4% | 8.7% | 8.7% |
Marmota marmota | 60.9% | N/A | 26.1% | 4.3% | N/A |
Camelus bactrianus | 52.2% | 21.7% | 8.7% | 8.7% | 8.7% |
Thamnophis sirtalis | 69.6% | 8.7% | 8.7% | N/A | N/A |
Paralogs: FAM227B
Orthologs: FAM227A is present mainly in mammals but also in species of reptiles and birds. The most distantly related ortholog is found in Xenopus tropicalis, the western clawed frog, an amphibian. Based on the dates of divergence for FAM227A, the gene has evolved very rapidly. [2]
Order | Genus and Species | Common Name | Date of Divergence (million years ago) | Accession # | Sequence Identity to Human |
---|---|---|---|---|---|
Primates | Homo sapiens | Human | 0 | NP_001013669.1 | 100% |
Primates | Nomascus leucogenys | Northern White-Cheeked Gibbon | 19.43 | XP_003264802.1 | 96% |
Scandentia | Tupaia chinensis | Chinese Tree Shrew | 85 | XP_006155440.1 | 72% |
Rodentia | Jaculus jaculus | Lesser Egyptian Jerboa | 88 | XP_012804178.1 | 61% |
Carnivora | Ailuropoda melanoleuca | Giant Panda | 94 | XP_002914627.1 | 70% |
Perissodactyla | Equus asinus | Donkey | 94 | XP_014706772.1 | 69% |
Cetartiodactyla | Orcinus orca | Killer Whale | 94 | XP_012392035.1 | 63% |
Soricomorpha | Condylura cristata | Star-Nosed Mole | 94 | XP_012590687.1 | 62% |
Chiroptera | Eptesicus fuscus | Big Brown Bat | 94 | XP_008143135.1 | 61% |
Cingulata | Dasypus novemcinctus | Nine-Banded Armadillo | 102 | XP_004447922.1 | 70% |
Sirenia | Trichechus manatus latirostris | West Indian Manatee | 102 | XP_004374098.1 | 59% |
Tinamiformes | Tinamus guttatus | White-Throated Tinamou | 320 | XP_010218404.1 | 51% |
Testudines | Pelodiscus sinensis | Chinese Softshell Turtle | 320 | XP_006119021.1 | 38% |
Anura | Xenopus tropicalis | Western Clawed Frog | 353 | XP_002933807.2 | 34% |
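The table's identity and divergence values can be turned into a rough rate estimate. The sketch below is one possible approach, not the method used by the cited source: it assumes the divergence dates are in millions of years, applies a Poisson correction (distance = -ln(identity)) to account for multiple substitutions at the same site, and fits a line through the origin by least squares. The function names are hypothetical.

```python
import math

# (common name, divergence from humans in millions of years, % identity to human)
# Values copied from the ortholog table above.
ORTHOLOGS = [
    ("Human", 0.0, 100),
    ("Northern White-Cheeked Gibbon", 19.43, 96),
    ("Chinese Tree Shrew", 85.0, 72),
    ("Lesser Egyptian Jerboa", 88.0, 61),
    ("Giant Panda", 94.0, 70),
    ("Donkey", 94.0, 69),
    ("Killer Whale", 94.0, 63),
    ("Star-Nosed Mole", 94.0, 62),
    ("Big Brown Bat", 94.0, 61),
    ("Nine-Banded Armadillo", 102.0, 70),
    ("West Indian Manatee", 102.0, 59),
    ("White-Throated Tinamou", 320.0, 51),
    ("Chinese Softshell Turtle", 320.0, 38),
    ("Western Clawed Frog", 353.0, 34),
]


def corrected_distance(identity_percent):
    """Poisson-corrected pairwise distance (expected substitutions per site).
    This correction is an assumption made for illustration."""
    return -math.log(identity_percent / 100.0)


def rate_through_origin(points):
    """Least-squares slope of distance vs. time for a line forced through (0, 0)."""
    numerator = sum(t * m for t, m in points)
    denominator = sum(t * t for t, _ in points)
    return numerator / denominator


points = [(t, corrected_distance(pid)) for _, t, pid in ORTHOLOGS]
rate = rate_through_origin(points)

for (name, t, pid), (_, m) in zip(ORTHOLOGS, points):
    print(f"{name:32s} {t:7.2f} MYA  identity {pid:3d}%  corrected distance {m:.3f}")
print(f"Fitted pairwise rate: {rate:.5f} substitutions per site per million years")
```

A claim of rapid evolution is usually judged by comparing such a slope against the same fit for reference proteins across the same species. That comparison is not attempted here because the source gives no reference data.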
In 2016, a study performed an association analysis of 31,203 markers on chromosome 22 to determine whether high blood pressure and smoking were correlated. Chromosome 22 was chosen based on data collected from three clinical visits in the Framingham Heart Study. [12] In 2013, researchers investigated three clusters of SNPs thought to be linked to prostate cancer in Arab populations. The study found that the deletion region on chromosome 22q13, where FAM227A is located, can also be linked to breast and colorectal cancer in humans, in addition to prostate cancer. [13] Another study suggests that the location of FAM227A may be linked to a central regulator, SOX10, which is involved in the maturation of neural crest derivatives. In this study, deletion of FAM227A was linked to lung abnormality, atrial septal defect, small size for gestational age, and sensorineural hearing loss. [14]