Fam221b

Identifiers
Aliases: 4930412F15Rik, family with sequence similarity 221 member B
External IDs: HomoloGene: 52184
Orthologs
Species | Human | Mouse
RefSeq (mRNA) | NM_175517 | n/a
RefSeq (protein) | NP_780726 | n/a
Location (UCSC) | n/a | n/a
PubMed search | [1] | n/a

FAM221B is a protein that in humans is encoded by the FAM221B gene [2]. It is also known by the alias C9orf128, is expressed at low levels, and is defined by 17 GenBank accessions [3]. It is predicted to act as a transcription factor.

Gene

Locus

FAM221B can be found around the end of the short arm of human chromosome 9.

Figure: General position of FAM221B on human chromosome 9 (marked by red line).
Figure: Gene neighborhood of FAM221B.

Expression patterns

FAM221B is expressed at low levels in human and mouse tissues. Expression is highest in germ cells and germ cell tissues, most notably the testes. This testis-enriched expression is more pronounced in Mus musculus than in Homo sapiens [4] [5] [6] [7]. Mature beta cells express FAM221B at higher levels than fetal beta cells do [8].

mRNA

Alternative splicing and isoforms

FAM221B has five transcript variants: the putative sequence, isoform X1 [9], isoform X2 [10], isoform X3 [11], and isoform X4. Isoform X4 is not found in humans but occurs in various other primates.

Exons

Figure: Various FAM221B spliced isoforms and their exons as shown in AceView.

The putative sequence of FAM221B contains six exons. A seventh, alternative exon brings the total for the gene to seven.

Protein

Figure: PHYRE2 prediction for FAM221B secondary structure; the alpha-helix regions with the strongest supporting evidence are circled in red.
Figure: Predicted phosphorylation sites annotated on the FAM221B transcript, determined by analysis of the outputs of several programs.

General characteristics

The putative FAM221B protein is 402 amino acids long and has a molecular weight of 45.4 kilodaltons. Four amino acids occur at unusual frequencies: histidine, cysteine, glutamic acid, and tyrosine. Compared with typical proteins, FAM221B contains histidine at a much higher frequency (6.0% of residues), cysteine at a slightly higher frequency (4.7%), glutamic acid at a slightly higher frequency (11.4%), and tyrosine at a slightly lower frequency (1.0%) [15]. The isoelectric point of FAM221B is 5.264, indicating that it is an acidic protein that carries a net negative charge at normal physiological pH (7.4) [15]. Computational prediction strongly supports localization of FAM221B to the nucleus [16].
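The composition figures above can be turned into approximate residue counts with simple arithmetic. The following is a minimal sketch: the length, mass, and percentages come from the text, while the function names and the toy sequence are illustrative assumptions (the actual FAM221B sequence is not reproduced here).

```python
from collections import Counter

LENGTH_AA = 402   # protein length, from the text
MASS_KDA = 45.4   # molecular weight, from the text
# Reported percent composition, from the text (one-letter codes).
REPORTED_PERCENT = {"H": 6.0, "C": 4.7, "E": 11.4, "Y": 1.0}


def approx_counts(length, percent_by_residue):
    """Convert percent composition into approximate residue counts."""
    return {aa: round(length * pct / 100) for aa, pct in percent_by_residue.items()}


def composition_percent(sequence):
    """Percent composition of each residue in a one-letter protein sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts.values())
    return {aa: 100 * n / total for aa, n in counts.items()}


counts = approx_counts(LENGTH_AA, REPORTED_PERCENT)
mean_residue_mass_da = MASS_KDA * 1000 / LENGTH_AA

print(counts)                          # approximate number of H, C, E, Y residues
print(round(mean_residue_mass_da, 1))  # average mass per residue in daltons
print(composition_percent("HHEA"))     # toy sequence, not FAM221B
```

Under these figures the protein would contain roughly 24 histidines, 19 cysteines, 46 glutamic acids, and 4 tyrosines, and the average residue mass of about 113 Da is close to the commonly used rule of thumb of roughly 110 Da per residue.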

Compositional features

FAM221B is predicted to have two distinct alpha helices in its secondary structure [17] [18] [19]. Secondary-structure prediction programs also predict beta sheets, but those predictions are less consistent across programs than the two alpha helices.

Post-translational modifications

FAM221B is predicted to have a high number of phosphorylation sites.

Protein interactions

There is evidence that FAM221B interacts with the proteins Autophagy related 13 (KIAA0652), RB1-inducible coiled-coil 1 (RB1CC1), and Ephrin-B3 (EFNB3) [20]. These proteins are predicted to be localized in the nucleus at the same confidence level as FAM221B.

Homology and evolution

FAM221B is conserved in Eutheria. However, both orthologous and paralogous transcripts predating ancestral Boroeutheria can be found.

Paralogs

One paralog exists for FAM221B in humans: FAM221A [21]. The ancestral gene of FAM221A and FAM221B is predicted to have diverged in prokaryotes.

Gene name | Accession number | Sequence length (aa) | Sequence identity to human protein | Sequence similarity to human protein | Notes
FAM221A | NP_954587.2 | 402 | 28% | 46% | Exists in other organisms

Orthologs

Genus and species | Common name | Divergence from human lineage (MYA) | Accession number | Sequence length (aa) | Sequence identity to human protein | Sequence similarity to human protein
Rhinopithecus roxellana | Golden Snub-nosed Monkey | 29.1 | XP_010374448.1 | 402 | 92% | 95%
Saimiri boliviensis | Black-capped Squirrel Monkey | 43.1 | XP_003943837.1 | 402 | 92% | 93%
Tupaia chinensis | Chinese Tree Shrew | 85.9 | XP_006143215.1 | 518 | 78% | 88%
Cavia porcellus | Guinea Pig | 90.9 | XP_003470749.1 | 415 | 72% | 83%
Odobenus rosmarus divergens | Pacific Walrus | 97.5 | XP_004392324.1 | 398 | 72% | 81%
Orcinus orca | Killer Whale | 97.5 | XP_004271469.1 | 410 | 70% | 78%
Felis catus | Domestic Cat | 97.5 | XP_006939339.1 | 429 | 68% | 75%
Loxodonta africana | African Bush Elephant | 105 | XP_003407335.1 | 414 | 66% | 78%
Ornithorhynchus anatinus | Platypus | 179.2 | XP_007656406.1 | 262 | 65% | 75%
Anolis carolinensis | Carolina Anole | 320.5 | XP_008122390.1 | 550 | 63% | 69%
Thamnophis sirtalis | Common Garter Snake | 320.5 | XP_013924342.1 | 411 | 62% | 74%
Lepisosteus oculatus | Spotted Gar | 429.6 | XP_015222126.1 | 272 | 62% | 74%
Callorhinchus milii | Australian Ghostshark | 482.9 | XP_007895354.1 | 326 | 58% | 76%
Strongylocentrotus purpuratus | Purple Sea Urchin | 747.8 | XP_781628.1 | 409 | 57% | 73%
Crassostrea gigas | Pacific Oyster | 847 | EKC20817.1 | 420 | 56% | 70%
Clonorchis sinensis | Chinese Liver Fluke | 847 | GAA48218.1 | 359 | 42% | 55%
Nematostella vectensis | Starlet Sea Anemone | 936 | XP_001628705.1 | 244 | 42% | 57%
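
The ortholog table can be checked for the expected trend that sequence identity falls as divergence time grows. The following is a minimal sketch using the values transcribed from the table, keyed by accession number; the variable and function names are illustrative assumptions.

```python
# (accession, divergence from human lineage in MYA, % identity, % similarity)
# Values transcribed from the ortholog table above.
ORTHOLOGS = [
    ("XP_010374448.1", 29.1, 92, 95),
    ("XP_003943837.1", 43.1, 92, 93),
    ("XP_006143215.1", 85.9, 78, 88),
    ("XP_003470749.1", 90.9, 72, 83),
    ("XP_004392324.1", 97.5, 72, 81),
    ("XP_004271469.1", 97.5, 70, 78),
    ("XP_006939339.1", 97.5, 68, 75),
    ("XP_003407335.1", 105.0, 66, 78),
    ("XP_007656406.1", 179.2, 65, 75),
    ("XP_008122390.1", 320.5, 63, 69),
    ("XP_013924342.1", 320.5, 62, 74),
    ("XP_015222126.1", 429.6, 62, 74),
    ("XP_007895354.1", 482.9, 58, 76),
    ("XP_781628.1", 747.8, 57, 73),
    ("EKC20817.1", 847.0, 56, 70),
    ("GAA48218.1", 847.0, 42, 55),
    ("XP_001628705.1", 936.0, 42, 57),
]


def is_non_increasing(values):
    """True if every value is less than or equal to the one before it."""
    return all(a >= b for a, b in zip(values, values[1:]))


# sorted() is stable, so rows with equal divergence keep their table order.
by_age = sorted(ORTHOLOGS, key=lambda row: row[1])
identity_trend = is_non_increasing([row[2] for row in by_age])
similarity_trend = is_non_increasing([row[3] for row in by_age])

print("identity never rises with divergence:", identity_trend)
print("similarity never rises with divergence:", similarity_trend)
```

With the table's own row order for tied divergence times, identity never rises as divergence increases, whereas similarity does not follow a strictly ordered trend (for example, the elephant row has higher similarity than the cat row).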

Homologous domains

There are three conserved domains within FAM221B: DUF4475 super family [22], PRCC super family [23], and Caprin-1_C [24]. DUF4475 is the most conserved domain of the three.

Clinical significance

FAM221B has been linked to mutations in the RNA component of RNase MRP, which cause the pleiotropic human disease cartilage–hair hypoplasia. In addition, patients with acute lymphoblastic leukemia often carry genetic alterations in the short arm of human chromosome 9, and two non-synonymous amino acid variations in FAM221B are consistently associated with the disease: histidine is substituted for an arginine at position 345, and a leucine is substituted for a phenylalanine at position 277 of the protein.

References

  1. "Human PubMed Reference". National Center for Biotechnology Information, U.S. National Library of Medicine.
  2. "FAM221B Gene - GeneCards".
  3. "AceView entry on FAM221B".
  4. "GEO GDS 3113 entry on FAM221B in Homo sapiens".
  5. "GEO GDS 3142 entry on FAM221B in Mus musculus".
  6. "BioGPS entry on FAM221B in Homo sapiens".
  7. "BioGPS entry on FAM221B in Mus musculus".
  8. "Markers for mature beta-cells and methods of using the same".
  9. "FAM221B Homo sapiens Isoform X1".
  10. "FAM221B Homo sapiens Isoform X2".
  11. "FAM221B Homo sapiens Isoform X3".
  12. "PHYRE2 secondary structure prediction for FAM221B". [permanent dead link]
  13. "MyHits motif scan for post-translational modifications".
  14. "NetPhos 2.0 phosphorylation site predictor".
  15. "General protein characteristics from SDSC Biology WorkBench SAPS tool". [permanent dead link]
  16. "PSORT II predictions on FAM221B".
  17. "LOMETS prediction for FAM221B". [permanent dead link]
  18. "MUSTER prediction for FAM221B". [permanent dead link]
  19. "SWISS-model prediction and constructor for FAM221B". Archived from the original on 2 February 2017. Retrieved 10 May 2016.
  20. "BioGrid summary for protein interactions for FAM221B".
  21. "FAM221A Gene - GeneCards".
  22. "NCBI entry on DUF 4475 super family". [permanent dead link]
  23. "NCBI entry on PRCC super family". [permanent dead link]
  24. "NCBI entry on Caprin-1_C". [permanent dead link]
