Family with Sequence Similarity 166, member B, or FAM166B, is an uncharacterized protein in humans that is encoded by the FAM166B gene.
The FAM166B gene is located on the short arm of chromosome 9 at 9p13.3 on the minus strand. [1] The genomic sequence spans 2,069 base pairs from 35563899 to 35561830. Gene neighbors are RUSC2, RPS29P17, and TESK1.
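The stated span follows directly from the two coordinates. A minimal Python sketch of the arithmetic is given here; the names `GENE_START`, `GENE_END`, and `span_bp` are illustrative and do not come from any cited database. Because the gene lies on the minus strand, the start coordinate is the larger of the two.

```python
# Coordinates as given in the text (chromosome 9, minus strand).
GENE_START = 35_563_899  # larger value: the gene reads toward lower positions
GENE_END = 35_561_830


def span_bp(start: int, end: int, inclusive: bool = False) -> int:
    """Distance between two genomic coordinates, in base pairs."""
    distance = abs(start - end)
    # Counting both endpoint bases adds one to the simple difference.
    return distance + 1 if inclusive else distance


print(span_bp(GENE_START, GENE_END))                  # 2069
print(span_bp(GENE_START, GENE_END, inclusive=True))  # 2070
```

The 2,069 bp figure corresponds to the simple difference of the coordinates; counting both endpoint bases would give 2,070.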
FAM166B is expressed 0.5 times higher than average in humans. [2] FAM166B is highly expressed in the adrenal gland, fallopian tube, and respiratory epithelial tissues. It is weakly to moderately expressed in skeletal muscle and heart muscle. [3] [4] [5]
FAM166B is predicted to have a promoter that spans 680 bp and includes the 5' UTR. [6]
In humans, FAM166B has 10 transcript variants, which are all spliced. [2] FAM166B transcript variant 1 is 1,092 bp in length and contains 6 total exons. The accession number for this variant is NM_001164310. [7]
The amino acid sequence is 275 amino acids in length and contains three DUF2475 regions. [8] These regions are located at amino acids 15 to 80, 174 to 234, and 234 to 261. The predicted molecular weight is 30.6 kDa, and the predicted isoelectric point is 8.414. [9] At 12.4%, its proline content is higher than is typical for human proteins. The protein has a negatively charged region from residues 141 to 172. [10]
1 MAVASTFIPGLNPQNPHYIPGYTGHCPLLRFSVGQTYGQVTGQLLRGPPGLAWPPVHRTLLPPIRPPRSP
71 EVPRESLPVRRGQERLSSSMIPGYTGFVPRAQFIFAKNCSQVWAEALSDFTHLHEKQGSEELPKEAKGRK
141 DTEKDQVPEPEGQLEEPTLEVVEQASPYSMDDRDPRKFFMSGFTGYVPCARFLFGSSFPVLTNQALQEFG
211 QKHSPGSAQDPKHLPPLPRTYPQNLGLLPNYGGYVPGYKFQFGHTFGHLTHDALGLSTFQKQLLA
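The length, proline content, and negatively charged region described above can be checked against the sequence itself. The sketch below is a rough illustration, not the method used by the cited tools: the helper names are hypothetical, and the charge count is a crude approximation that treats D and E as -1 and K and R as +1 while ignoring histidine and the termini.

```python
# FAM166B protein sequence as listed above, with position numbers removed.
FAM166B = (
    "MAVASTFIPGLNPQNPHYIPGYTGHCPLLRFSVGQTYGQVTGQLLRGPPGLAWPPVHRTLLPPIRPPRSP"
    "EVPRESLPVRRGQERLSSSMIPGYTGFVPRAQFIFAKNCSQVWAEALSDFTHLHEKQGSEELPKEAKGRK"
    "DTEKDQVPEPEGQLEEPTLEVVEQASPYSMDDRDPRKFFMSGFTGYVPCARFLFGSSFPVLTNQALQEFG"
    "QKHSPGSAQDPKHLPPLPRTYPQNLGLLPNYGGYVPGYKFQFGHTFGHLTHDALGLSTFQKQLLA"
)


def residue_percent(seq: str, residue: str) -> float:
    """Percentage of positions in seq occupied by the given residue."""
    return 100.0 * seq.count(residue) / len(seq)


def region(seq: str, start: int, end: int) -> str:
    """Subsequence for 1-based, inclusive residue positions."""
    return seq[start - 1:end]


def net_charge(seq: str) -> int:
    """Crude net charge: K/R count as +1, D/E as -1 (assumption, see lead-in)."""
    positive = sum(seq.count(r) for r in "KR")
    negative = sum(seq.count(r) for r in "DE")
    return positive - negative


print(len(FAM166B))                                # 275
print(round(residue_percent(FAM166B, "P"), 1))     # 12.4
print(net_charge(region(FAM166B, 141, 172)))       # -10
```

Under these simplifying assumptions, residues 141 to 172 carry eleven acidic residues against a single basic one, which is consistent with the region being described as negatively charged.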
FAM166B is predicted to have 12 phosphorylation sites, 3 sumoylation sites, and 1 acetylation site. [11] [12] [13] FAM166B has no predicted signal peptide sequences. [14]
FAM166B is predicted to be composed mostly of coils, with short interspersed regions of alpha helices and beta sheets. [15] There are no predicted transmembrane domains, and this is consistent across orthologs. [16]
The intracellular location of FAM166B is unknown. [17] [18] The average hydrophobicity of the protein is -0.519272, which suggests that it is a soluble protein. [16]
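An average hydrophobicity of this kind is typically the mean of per-residue hydropathy values, with negative averages indicating a hydrophilic, likely soluble protein. The sketch below assumes the Kyte-Doolittle scale; the text does not say which scale the cited tool uses, so the result for the full sequence may not reproduce -0.519272 exactly. The function name `average_hydropathy` is illustrative.

```python
# Kyte-Doolittle hydropathy index per residue (assumed scale, see lead-in).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def average_hydropathy(seq: str) -> float:
    """Mean Kyte-Doolittle hydropathy over all residues of seq."""
    if not seq:
        raise ValueError("sequence is empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq.upper()) / len(seq)


# Residues 141-150 of FAM166B, the start of the negatively charged region.
fragment = "DTEKDQVPEP"
print(average_hydropathy(fragment))  # negative: a strongly hydrophilic stretch
```

Passing the full 275-residue sequence to the same function gives a whole-protein average comparable in kind to the value quoted above.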
FAM166B has orthologs in mammals, birds, reptiles, fish, and some invertebrates. The table below lists FAM166B orthologs that were found using BLAST. [19] It shows the diversity of species with FAM166B orthologs, in descending order of sequence identity to the human protein.
Scientific Name | Common Name | Protein Accession Number | Sequence Length (aa) | Identity | Similarity |
---|---|---|---|---|---|
Homo sapiens | Human | NP_001157782.1 | 275 | | |
Camelus ferus | Bactrian Camel | XP_006187942.1 | 279 | 83% | 87% |
Bos taurus | Domestic Cow | XP_005210130.1 | 303 | 81% | 87% |
Felis catus | Domestic Cat | XP_003995654.1 | 280 | 81% | 86% |
Tursiops truncatus | Common Bottlenose Dolphin | XP_004312901.1 | 274 | 80% | 86% |
Elephantulus edwardii | Cape Elephant Shrew | XP_006887001.1 | 275 | 80% | 86% |
Mus musculus | Mouse | XP_006538075.1 | 273 | 75% | 82% |
Dasypus novemcinctus | Nine-banded armadillo | XP_004457403.1 | 286 | 75% | 82% |
Equus caballus | Horse | XP_001914783.2 | 300 | 75% | 80% |
Monodelphis domestica | Gray short-tailed opossum | XP_007498869.1 | 292 | 57% | 68% |
Chelonia mydas | Green Turtle | XP_007055984.1 | 251 | 47% | 62% |
Tinamus guttatus | White-throated Tinamu | XP_010220019.1 | 307 | 45% | 55% |
Python bivittatus | Burmese Python | XP_007427141.1 | 340 | 42% | 56% |
Xenopus tropicalis | Western Clawed Frog | NP_001106452.1 | 306 | 42% | 55% |
Anolis carolinensis | Carolina Anole | XP_003228316.1 | 314 | 40% | 53% |
Danio rerio | Zebrafish | NP_001076489.2 | 299 | 39% | 52% |
Poecilia formosa | Amazon Molly | XP_007566989.1 | 262 | 34% | 50% |
Ciona intestinalis | Vase Tunicate | XP_002129379.1 | 330 | 30% | 42% |
Picoides pubescens | Downy Woodpecker | XP_009895146.1 | 296 | 26% | 41% |
Hydra vulgaris | Freshwater Polyp | XP_002162128.1 | 282 | 26% | 41% |
Saccoglossus kowalevskii | Acorn Worm | XP_002739701.1 | 332 | 24% | 41% |
Strongylocentrotus purpuratus | Purple Sea Urchin | XP_786484.3 | 333 | 24% | 40% |
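The identity column reports the share of aligned positions at which an ortholog carries the same residue as the human protein. As a rough sketch of that calculation, assuming identity is counted over all alignment columns including gaps (as in BLAST's "Identities" line), the hypothetical helper below works on two rows of an existing pairwise alignment. The toy sequences are invented for illustration and are not real ortholog data. Similarity additionally counts conservative substitutions using a scoring matrix and is not covered here.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns where both rows hold the same residue.

    Inputs are the two rows of a pairwise alignment: equal length,
    with '-' marking gaps. Gap columns never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Toy alignment (hypothetical): one substitution and one gap in ten columns.
human = "MAVASTFIPG"
other = "MAVSST-IPG"
print(round(percent_identity(human, other), 1))  # 80.0
```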
FAM166B has one paralog, FAM166A, which is 317 aa long and shares 25% identity with FAM166B. [20] The accession number for FAM166A is NP_001001710.
Currently, FAM166B is not associated with any human disease or condition. Although it lies within spastic paraplegia 46, a locus on chromosome 9 that is linked to the autosomal-recessive disease hereditary spastic paraplegia (HSP), FAM166B was determined not to be the gene responsible for the disease because of its frequency in the population controls. [21] FAM166B was also excluded from a patent seeking genes that predict prognosis in classic Hodgkin's lymphoma (cHL). [22]