Fam78b

FAM78B
Identifiers
Aliases: FAM78B, Fam78b, family with sequence similarity 78 member B
External IDs: MGI: 2443050; HomoloGene: 18451; GeneCards: FAM78B
Orthologs
Species: human; mouse
Ensembl: human ENSG00000188859 [1]; mouse ENSMUSG00000060568 [2]
RefSeq (mRNA): human NM_001017961, NM_001320302; mouse NM_001160261, NM_001160262, NM_175461
RefSeq (protein): human NP_001017961, NP_001307231; mouse NP_001153733, NP_001153734, NP_780670
Location (UCSC): human Chr 1: 166.06 – 166.17 Mb; mouse Chr 1: 166.83 – 166.92 Mb
PubMed search: human [3]; mouse [4]

Family with sequence similarity 78 member B (FAM78B) is a protein of unknown function that in humans is encoded by the FAM78B gene (1q24.1). Orthologous genes and predicted proteins are found in vertebrates and several invertebrates, but not in arthropods. The protein sequence contains a nuclear localization signal, and the mRNA sequence contains a miRNA target region.

Evolutionary analysis

Homology

FAM78B has one paralog, FAM78A, and is conserved across many species. Orthologs are found throughout the vertebrates and in several invertebrates, including the Pacific oyster and the liver fluke, but not in arthropods. Its paralog, FAM78A, is conserved in additional invertebrates such as tunicates, worms, and leeches; these sequences make up the distant homologs of FAM78B. The table below lists FAM78B orthologs with percent identity and time since divergence relative to the human FAM78B gene or protein. [5]

Figure: Evolutionary history of FAM78B comparing percent identity versus time (millions of years)
| Genus and species | Common name | Date of divergence from human lineage (millions of years ago) | Accession number | Sequence length (bp or aa) | Protein/mRNA identity (%) | Query cover (%) |
|---|---|---|---|---|---|---|
| Crassostrea gigas | Pacific oyster | 782.7 | EKC28338.1 | 267 | 48 | 79 |
| Clonorchis sinensis | Liver fluke | 792.4 | GAA32739.1 | 295 | 33 | 84 |
| Xiphophorus maculatus | Platyfish | 400.1 | XP_005802428.1 | 285 | 80 | 100 |
| Takifugu rubripes | Pufferfish | 400.1 | XP_003975800.1 | 261 | 80 | 100 |
| Danio rerio | Zebrafish | 400.1 | XP_001338241.2 | 285 | 80 | 100 |
| Lepisosteus oculatus | Spotted gar | 400.1 | XP_006635127.1 | 261 | 90 | 100 |
| Xenopus (Silurana) tropicalis | Western clawed frog | 371.2 | XP_002933818.1 | 263 | 93 | 100 |
| Alligator mississippiensis | Alligator | 296 | XP_006276782.1 | 261 | 92 | 100 |
| Anolis carolinensis | Anole lizard | 296 | XP_003219922.1 | 261 | 90 | 100 |
| Gallus gallus | Chicken | 296 | XP_001232254.1 | 261 | 90 | 100 |
| Pseudopodoces humilis | Groundpecker | 296 | XP_005528713.1 | 261 | 90 | 100 |
| Columba livia | Rock dove | 296 | XP_005499081.1 | 238 | 86 | 93 |
| Pelodiscus sinensis | Soft shell turtle | 296 | XP_006128883.1 | 216 | 94 | 72 |
| Trichechus manatus latirostris | Manatee | 98.7 | XP_004385839.1 | 242 | 78 | 100 |
| Camelus ferus | Camel | 94.2 | XP_006194022.1 | 261 | 99 | 100 |
| Mus musculus | Mouse | 92.3 | NP_780670.2 | 262 | 98 | 100 |
| Microtus ochrogaster | Prairie vole | 92.3 | XP_005370139.1 | 261 | 99 | 100 |
| Loxodonta africana | Elephant | 98.7 | XP_003415081.1 | 999 | 99 | 98 |
| Macaca fascicularis | Crab eating macaque | 42.6 | EHH50788.1 | 233 | 89 | 100 |
| Pan paniscus | Bonobo | 42.6 | XP_003824811.1 | 243 | 88 | 100 |

Structure

Gene

The FAM78B gene is located on the negative strand of chromosome 1 at 1q24.1 and spans the chromosomal locus 166039271-166135909, a total of 96,638 base pairs. Its transcript has 2 exons and yields an mRNA of 1,481 bp. [6] In humans, the two exons are separated by 95,243 bp of intronic sequence. [7] The gene is highly conserved in vertebrates and is also found in the Pacific oyster and the liver fluke, but not in arthropods.
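The locus size quoted above follows from simple coordinate arithmetic. The sketch below recomputes it and estimates what share of the locus is intronic; the variable and function names are illustrative, and the calculation assumes the span is taken as the plain difference between the two endpoints.

```python
# Coordinates and sizes as quoted in the text; names are illustrative.
LOCUS_START = 166_039_271
LOCUS_END = 166_135_909
INTRON_BP = 95_243


def locus_span(start: int, end: int) -> int:
    """Distance in base pairs between two chromosomal coordinates."""
    return end - start


span = locus_span(LOCUS_START, LOCUS_END)
intron_fraction = INTRON_BP / span  # share of the locus that is intronic

print(f"span: {span:,} bp")
print(f"intronic share: {intron_fraction:.1%}")
```

Under these assumptions the span reproduces the 96,638 bp figure, and nearly all of the locus is intronic.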

mRNA

One isoform has been identified in humans. It is composed of two exons that form an mRNA of 1,481 bp. [8]

Protein

The FAM78B protein has a calculated molecular weight of 30 kDa, is relatively rich in tryptophan (W), has a more strongly conserved C-terminal region, contains both alpha helices and beta strands, and is predicted to localize to the nucleus. [9]

General properties

The FAM78B protein consists of 254 amino acids and has a predicted molecular weight of 30 kDa and an isoelectric point of 9.6. It is histidine poor, and its C terminus is highly conserved among its orthologs. [10] The most highly conserved residues are ISDSDG at aa 104-110, WLVA at aa 171-175, VDP---L--R at aa 199-208, and the C terminus, especially NADQVLMW at aa 240-247.
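Conserved motifs written in this style can be located in a sequence with a regular expression. The sketch below assumes that each "-" in a motif such as VDP---L--R stands for any single residue; the helper names and the toy peptide are hypothetical, and the peptide is not the FAM78B sequence.

```python
import re


def motif_to_regex(motif: str) -> re.Pattern:
    """Compile a gapped motif, treating '-' as 'any single residue' (an assumption)."""
    return re.compile("".join("." if c == "-" else re.escape(c) for c in motif))


def find_motif(sequence: str, motif: str) -> list[tuple[int, int]]:
    """Return 1-based, inclusive (start, end) positions of each non-overlapping match."""
    pattern = motif_to_regex(motif)
    return [(m.start() + 1, m.end()) for m in pattern.finditer(sequence)]


# Hypothetical toy peptide for illustration only, NOT the FAM78B sequence.
toy_peptide = "MKVDPAGTLSERNADQVLMW"

print(find_motif(toy_peptide, "VDP---L--R"))
print(find_motif(toy_peptide, "NADQVLMW"))
```

Reporting 1-based inclusive positions keeps the output in the same convention as the amino acid ranges given in the text.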

Conservation

The amino acid sequence of FAM78B is highly conserved in mammals, with around 86% to 100% sequence similarity. Birds, frogs, and lizards also show a high degree of similarity to the human FAM78B sequence, between 76% and 83%. Fish have between 56% and 66% sequence similarity. The C-terminal end is the most highly conserved region across ortholog-containing species, from mammals to the Pacific oyster. [4]

Regulation

mRNA level

In Homo sapiens, the mRNA has one miRNA binding site, targeted by miR-24, with the sequence CUGAGCCA. The site lies at the 3' end of the mRNA, at nucleotides 88-95 after the stop codon (bp 167,091,390-167,091,397 on chromosome 1). A stem loop at nucleotides 155-172 of the 3' end of the mRNA matches the miRNA site. [11]
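A site of this kind can be located by a plain substring search over the sequence that follows the stop codon. The sketch below is illustrative only: the toy sequence is hypothetical rather than the real FAM78B 3' end, positions are counted 1-based from the first base after the stop codon (an assumed convention), and the reverse complement is shown simply as the strand that would base-pair with the site.

```python
SITE = "CUGAGCCA"  # target site sequence quoted in the text

RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq: str) -> str:
    """Return the RNA strand (written 5'->3') that base-pairs with seq."""
    return "".join(RNA_PAIRS[base] for base in reversed(seq))


def find_sites(utr: str, site: str = SITE) -> list[tuple[int, int]]:
    """Return 1-based, inclusive (start, end) positions of every occurrence of site."""
    hits = []
    start = utr.find(site)
    while start != -1:
        hits.append((start + 1, start + len(site)))
        start = utr.find(site, start + 1)  # continue past this hit, allowing overlaps
    return hits


# Hypothetical toy sequence for illustration only, NOT the FAM78B 3' end.
toy_utr = "AAGGUCUGAGCCAUUUCG"

print(find_sites(toy_utr))
print(reverse_complement_rna(SITE))
```

An 8-nucleotide site occupies eight consecutive positions, which is consistent with the 88-95 range given above.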

Protein level

The protein contains a conserved nuclear localization signal (RPKR) at aa 248-252.

Expression

FAM78B is expressed in most tissues [12] and is highly expressed in regions of the brain. [13]

Clinical relevance

Three single nucleotide polymorphisms (SNPs) in FAM78B show a statistically significant association with chronic kidney disease: two located in the intron (rs2116519 and rs4074897) and one located in the 5' UTR (rs987131).

References

  1. GRCh38: Ensembl release 89: ENSG00000188859 - Ensembl, May 2017
  2. GRCm38: Ensembl release 89: ENSMUSG00000060568 - Ensembl, May 2017
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. "SDSC Biology Workbench".
  6. "NCBI Gene".
  7. "UCSC BLAT".
  8. "NCBI Nucleotide".
  9. Nakai and Horton. "PSORTII".
  10. "SAPS Biology Workbench".
  11. "mFold".
  12. "Gene Atlas".
  13. "NCBI GEO Profiles".
