FAM167A | |
---|---|
Identifiers | |
Aliases | FAM167A, C8orf13, D8S265, family with sequence similarity 167 member A, DIORA-1 |
External IDs | OMIM: 610085; MGI: 3606565; HomoloGene: 14243; GeneCards: FAM167A |
Family with sequence similarity 167, member A is a protein that in humans is encoded by the FAM167A gene, located on chromosome 8. [5] FAM167A and its paralog, FAM167B, are protein-coding genes that contain the conserved domain DUF3259, a domain of unknown function. [6] FAM167A has many orthologs in which this domain is highly conserved.
On chromosome 8, FAM167A lies between C8orf12 (antisense) and BLK (antisense). [7] The gene maps to 8p23-22 and spans positions 11,278,972 to 11,332,224, a total of 53,253 base pairs. The promoter spans positions 11,324,145 to 11,324,476 on the negative strand, so its first base pair is at position 11,324,476. No human isoforms have been found.
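The coordinate arithmetic in this paragraph can be checked with a short sketch. It assumes the positions are 1-based and inclusive, which is what the stated 53,253 bp total implies; the function names are illustrative and do not come from any genome browser API.

```python
def span_length(start: int, end: int) -> int:
    """Number of bases in a closed (1-based, inclusive) interval."""
    lo, hi = sorted((start, end))
    return hi - lo + 1


def first_base(start: int, end: int, strand: str) -> int:
    """5' coordinate of a feature: the larger coordinate on the minus strand."""
    lo, hi = sorted((start, end))
    return hi if strand == "-" else lo


# Coordinates quoted in the text above.
gene_len = span_length(11_278_972, 11_332_224)
promoter_len = span_length(11_324_145, 11_324_476)
promoter_start = first_base(11_324_145, 11_324_476, "-")

print(gene_len, promoter_len, promoter_start)
```

Because the promoter is on the negative strand, its first base is the higher of the two coordinates, which is why the text gives 11,324,476 as the starting position.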
Family with sequence similarity 167, member A is also known as FAM167A, C8orf13, or D8S265. [8]
FAM167A has one paralog, FAM167B, also known as C1orf90. [9] FAM167B is located at 1p35.1 on the plus strand, encodes a protein of 163 amino acids, and also contains DUF3259. [10]
FAM167A has orthologs in 82 organisms and is conserved in chimpanzee, dog, cow, mouse, chicken, rat, frog, and zebrafish. [11] [12]
Species | Species Common Name | NCBI Accession Number (Protein) | Amino Acid Length | Protein Identity | Divergence date from Humans (million years ago) |
---|---|---|---|---|---|
Homo sapiens | Human | NP_444509 | 214 | 100% | 0 |
Pan troglodytes | Chimpanzee | XP_001139122 | 214 | 99% | 6.3 |
Macaca fascicularis | Macaque | XP_005562638.1 | 214 | 96% | 29 |
Heterocephalus glaber | Naked mole rat | XP_004848509 | 214 | 84% | 92.3 |
Felis catus | Cat | XP_003984890 | 209 | 80% | 94.2 |
Equus caballus | Horse | XP_001497968 | 203 | 80% | 94.2 |
Alligator sinensis | Chinese alligator | XP_006028215 | 211 | 70% | 296 |
Anolis carolinensis | Carolina anole | XP_003227984 | 215 | 64% | 296 |
Danio rerio | Zebrafish | NP_1020721 | 204 | 59% | 400.1 |
Latimeria chalumnae | African coelacanth | XP_05994570 | 148 | 43% | 414.9 |
Ciona intestinalis | Sea squirt | XP_002123421 | 255 | 27% | 722.5 |
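As shown in the table above, FAM167A is conserved across orthologs with a wide range of divergence dates. Protein identity to the human sequence falls as divergence time increases, from 99% in chimpanzee to 27% in sea squirt, which is the pattern expected for a protein that has evolved along with the species lineage.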
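The trend in the table can be verified with a minimal sketch that sorts the orthologs by divergence date and checks that protein identity never increases. The values are copied from the table; the function name is illustrative.

```python
# (common name, protein identity in %, divergence from humans in million years)
orthologs = [
    ("Human", 100, 0.0),
    ("Chimpanzee", 99, 6.3),
    ("Macaque", 96, 29.0),
    ("Naked mole rat", 84, 92.3),
    ("Cat", 80, 94.2),
    ("Horse", 80, 94.2),
    ("Chinese alligator", 70, 296.0),
    ("Carolina anole", 64, 296.0),
    ("Zebrafish", 59, 400.1),
    ("African coelacanth", 43, 414.9),
    ("Sea squirt", 27, 722.5),
]


def identity_never_increases(rows):
    """True if identity is non-increasing once rows are ordered by divergence date."""
    # Ties in divergence date are ordered by descending identity so that
    # species with the same date cannot produce a spurious increase.
    ordered = sorted(rows, key=lambda r: (r[2], -r[1]))
    identities = [r[1] for r in ordered]
    return all(a >= b for a, b in zip(identities, identities[1:]))


print(identity_never_increases(orthologs))
```

This only tests the ordering of the tabulated values; it says nothing about rates of change, which would need the underlying alignments.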
The FAM167A protein is 214 amino acids in length. In humans, it has a molecular weight of 24.2 kDa and an isoelectric point of 5.887. [13] The mouse and chicken orthologs have molecular weights within ±0.5 kDa and isoelectric points within ±0.6 of the human values.
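As a rough plausibility check on the reported mass, the sketch below multiplies the chain length by an average residue mass of about 110 Da. That figure is a common rule of thumb and an assumption here, not a value from the cited source; an exact mass would require the actual amino acid composition.

```python
# Rule-of-thumb average mass of one amino acid residue (assumption, not from the article).
AVG_RESIDUE_MASS_DA = 110


def estimate_mass_kda(n_residues: int, avg_residue_da: float = AVG_RESIDUE_MASS_DA) -> float:
    """Approximate protein mass in kDa from chain length alone."""
    return n_residues * avg_residue_da / 1000


estimate = estimate_mass_kda(214)   # length of human FAM167A
reported = 24.2                     # kDa, as stated in the text

print(f"estimated {estimate:.1f} kDa vs reported {reported} kDa")
```

The estimate lands within 1 kDa of the reported value, so the stated length and mass are at least mutually consistent.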
According to AceView, the FAM167A gene contains 13 introns. The gene is "well expressed", at 1.2 times the level of the average gene. Transcription produces 9 different mRNAs: 8 alternatively spliced variants and 1 unspliced form. Four of the encoded proteins, including 2 isoforms, are considered good, while the remaining five are partial or not good proteins. [14]
FAM167A contains a leucine zipper, indicated by the four heptad leucine repeat regions reported by SAPS. The leucine zipper lies within the DUF3259 domain. Secondary-structure predictions for FAM167A consist mostly of alpha helices and coiled coils, which is consistent with the presence of a coiled-coil domain. The prediction programs run through PELE generally agree that the C-terminal end of DUF3259 is a region of potential beta sheets and coiled coils, and the eight PELE outputs show some consensus on the overall secondary structure of the protein. No transmembrane domains are predicted in FAM167A.
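To illustrate what a heptad leucine repeat looks like at the sequence level, here is a minimal sketch that scans for the classic leucine-zipper pattern of four leucines spaced seven residues apart (L-x6-L-x6-L-x6-L). It is not the algorithm SAPS uses, and the sequence below is a made-up toy example, not the FAM167A sequence.

```python
import re

# Four leucines, each followed by six arbitrary residues before the next leucine.
# Wrapping the pattern in a lookahead lets overlapping occurrences be reported.
ZIPPER = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")


def find_leucine_zippers(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched segment) for each motif occurrence."""
    return [(m.start() + 1, m.group(1)) for m in ZIPPER.finditer(sequence.upper())]


# Hypothetical toy sequence for illustration only.
toy = "MK" + "LAEEQKR" * 3 + "LGS"
print(find_leucine_zippers(toy))
```

A pattern match of this kind is only suggestive: many sequences match by chance, so a real analysis would also check that the region is predicted to be helical.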
The MINT, STRING, and IntAct tools linked from GeneCards agree that FAM167A interacts with BANK1 and with BLK. [15] These proteins are already known to interact with FAM167A in the development of several diseases, such as Sjögren's syndrome and systemic sclerosis. For both BANK1 and BLK, published literature supports a possible interaction with FAM167A in disease development.
No glycosylation sites were found using the tools on Expasy.org. Both the human and mouse proteins have one predicted serine phosphorylation site, at amino acid 147, and two predicted tyrosine phosphorylation sites, at amino acids 159 and 170. Phosphorylation sites serve various regulatory functions, such as enzyme inhibition, protein-protein interactions, and protein degradation.
Microarrays show that FAM167A expression varies in response to cancers, but no information about the exact function of FAM167A can be drawn from these data. FAM167A is expressed at uniformly low levels in all tissue types throughout the body. [16] In mouse, expression is higher in the skin, B cells, and spleen, and similarly low in all other cell types. [17]
SNPs in the region between FAM167A and the BLK gene have been associated with the development of Sjögren's syndrome in a Han Chinese population [18] as well as in a Scandinavian population. [19] The FAM167A-BLK region has also been linked to systemic sclerosis in a study comparing functional variants in the C8orf13-BLK locus in a Caucasian population. The results of the study confirm the C8orf13-BLK locus as a systemic sclerosis risk locus; the strongest effects were observed in the interaction between that locus and BANK1. [20]