Zinc finger protein 821, also known as ZNF821, is a protein encoded by the ZNF821 gene. The gene is located on chromosome 16 and is expressed highly in the testes, moderately in the brain, and at low levels in 23 other tissues. The encoded protein is 412 amino acids long and contains two C2H2-type zinc finger motifs and an STPR domain, named for its 23-amino-acid peptide repeat.
ZNF821 is located at 16q22.2 on the minus strand. It comprises 35,657 bases, spanning from base 71,893,583 to base 71,929,239. ZNF821 has 8 exons and lies in the same neighborhood as four other genes: ATXN1L, IST1, PKD1L3, and AP1G1. [5]
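The stated gene length follows from the two coordinates when both endpoints are counted. A minimal Python check is sketched below; the helper name `span_length` is illustrative and does not come from any genome-browser API.

```python
def span_length(start: int, end: int) -> int:
    """Length in bases of a closed, 1-based interval [start, end]."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


# Coordinates quoted in the text for ZNF821 on chromosome 16.
ZNF821_START = 71_893_583
ZNF821_END = 71_929_239

print(span_length(ZNF821_START, ZNF821_END))  # 35657
```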
Transcription of ZNF821 is driven by the promoter GXP_9784938, which is 539 bases long and located from base 71,884,046 to base 71,884,585. The promoter region begins 404 base pairs upstream of the transcription start site. Several transcription factors with scores greater than 0.9 are predicted to regulate ZNF821 expression.
Transcription Factor | Abbreviation | Binding Site | Strand |
---|---|---|---|
Myeloid zinc finger protein | MZF1 | GTGGGGATCCG | + |
Cyclin D binding myb-like transcription factor | DMTF | CACCCCTAGGCCCGA | - |
Early growth response 2 | EGRF | GAGAGGGGGTGCCTGCGGC | + |
Human acute myelogenous leukemia factors | HAML | AGCTGTGGTTGGGGG | + |
C2H2 zinc finger transcription factors 2 | ZF02 | GACTTGAGCTACCACCCCATTCT | - |
Hypoxia-response elements | HIFF | CCATCCCACCGCAAATGTGCAGGTC | - |
Ets variant 1 | ETSF | CACGTCCAGGAAGGTCTGGGG | + |
Homeobox transcription factor Nanog | HOXF | ACCCGGGAATGGGCGAGGC | + |
GLIS family zinc finger 3 | GLIF | CGCTCCGCCCCCCAAGG | - |
GTF2I-like repeat 4 of GTF3 | GUCE | CGGGATTGGGC | + |
zinc finger protein with KRAB and SCAN domains 12 | ZF07 | GGAGCCCCTCCTCTCCA | + |
Myc associated zinc finger protein | MAZF | GTCTCGGGGAGAGGAGTCCGGGGCGGGTGTT | - |
Zinc finger and BTB domain containing 14 | VF5F | CGGTCCGCGCGCGGCCC | + , - |
Transcription factor II B (TFIIB) recognition element | TF2B | CCGCGCC | - |
Zinc finger protein 37 alpha | ZF37 | CCTCCCCCT | - |
E2F transcription factor 1 | E2FF | CGCGCGAGGGCGGCGGG | - |
Cas-interacting zinc finger | CIZF | GTAGAAAAAGG | - |
Sma- and Mad-related proteins | SMAD | TCTGTCTGTCT | + |
SRY (sex determining region Y)-box 6 | SORY | CAGACAGACAGACGACAACCGAAACAGGCAG | - |
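Sites listed on opposite strands can be compared by taking the reverse complement of one of them. The sketch below assumes, which the table does not state, that every binding site is written 5' to 3' on its own strand. Under that assumption, the SMAD site on the plus strand and the SORY site on the minus strand would cover the same stretch of promoter.

```python
# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Binding sites copied from the table above.
smad_plus = "TCTGTCTGTCT"
sory_minus = "CAGACAGACAGACGACAACCGAAACAGGCAG"

smad_as_minus = reverse_complement(smad_plus)
print(smad_as_minus)                # AGACAGACAGA
print(smad_as_minus in sory_minus)  # True
```

This shows only that the two listed strings are compatible with an overlap; it says nothing about whether either factor actually binds there.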
ZNF821 is most highly expressed in the testes, at almost 2.5 times the level seen in the brain, the tissue with the next-highest expression. Expression in the brain occurs primarily during fetal development, with lower levels of expression in the cerebellum. Most other tissues show low levels of expression.
ZNF821 has 7 transcript variants and 4 isoforms. [6] Variant 1, isoform 1 is the second-longest transcript but the most abundant of all the variants and isoforms. Variant 2 is longer but contains one fewer exon. Variant 1, isoform 1 is 1,987 bases long, with a 415-base 5' UTR and a 433-base 3' UTR.
Variant | Isoform | Length (bases) | Number of Exons |
---|---|---|---|
1 | 1 | 1987 | 8 |
2 | 1 | 2005 | 7 |
3 | 2 | 1894 | 7 |
4 | 2 | 1879 | 6 |
5 | 3 | 1853 | 7 |
6 | 4 | 1959 | 8 |
7 | 4 | 1722 | 7 |
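The statements about transcript lengths and exon counts can be checked directly against the table. The following sketch hard-codes the table rows; the `Transcript` type and variable names are illustrative only.

```python
from typing import NamedTuple


class Transcript(NamedTuple):
    variant: int
    isoform: int
    length: int  # transcript length in bases
    exons: int


# Rows copied from the table above.
TRANSCRIPTS = [
    Transcript(1, 1, 1987, 8),
    Transcript(2, 1, 2005, 7),
    Transcript(3, 2, 1894, 7),
    Transcript(4, 2, 1879, 6),
    Transcript(5, 3, 1853, 7),
    Transcript(6, 4, 1959, 8),
    Transcript(7, 4, 1722, 7),
]

# Rank transcripts from longest to shortest.
by_length = sorted(TRANSCRIPTS, key=lambda t: t.length, reverse=True)
rank_of_variant_1 = [t.variant for t in by_length].index(1) + 1
distinct_isoforms = {t.isoform for t in TRANSCRIPTS}

print(by_length[0].variant)    # 2 (the longest transcript)
print(rank_of_variant_1)       # 2 (variant 1 is second longest)
print(len(distinct_isoforms))  # 4
```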
The protein encoded by the ZNF821 gene is 412 amino acids long, with a calculated molecular weight of approximately 47 kDa and a predicted isoelectric point of 6.14. Compared with the rest of the human proteome, it is depleted in isoleucine and tyrosine residues and enriched in arginine residues. [7]
The ZNF821 protein contains two C2H2 zinc finger motifs (spanning amino acids 120–140 and 152–172, respectively) and an STPR (one-score-and-three-amino acid peptide repeat) domain (spanning amino acids 223–314) containing a bipartite nuclear localization signal. This STPR domain is a double-stranded DNA-binding domain with traits similar to those of the silkworm FMBP-1 STPR domain and is thought to be responsible for the nuclear localization of the ZNF821 protein. [8] The secondary structure of the ZNF821 protein is composed of several alpha helices along with two small regions of beta sheet. [9] [10] [11] [12] The tertiary structure of the ZNF821 protein leaves the zinc fingers exposed, presumably for DNA binding. [13]
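Both zinc finger spans quoted above are 21 residues long, which is consistent with the shortest form of the commonly cited C2H2 spacing C-X2-4-C-X12-H-X3-5-H (1 + 2 + 1 + 12 + 1 + 3 + 1 = 21). The sketch below scans a sequence for that simplified pattern with a regular expression. The pattern is a textbook simplification and the test sequence is a made-up toy peptide, not the ZNF821 sequence; both are assumptions introduced for illustration.

```python
import re

# Simplified C2H2 spacing: Cys, 2-4 residues, Cys, 12 residues, His, 3-5 residues, His.
C2H2_PATTERN = re.compile(r"C.{2,4}C.{12}H.{3,5}H")


def find_c2h2_motifs(protein: str) -> list[tuple[int, int, str]]:
    """Return (start, end, motif) for each non-overlapping match.

    Positions are 1-based and inclusive, as in the residue ranges quoted in the text.
    """
    return [
        (m.start() + 1, m.end(), m.group())
        for m in C2H2_PATTERN.finditer(protein.upper())
    ]


# Hypothetical toy peptide containing one classical C2H2 finger.
toy_peptide = "MAAYKCPECGKSFSQKSNLQKHQRTHTGEKP"

for start, end, motif in find_c2h2_motifs(toy_peptide):
    print(start, end, motif, len(motif))
# 6 26 CPECGKSFSQKSNLQKHQRTH 21
```

Because `finditer` reports only non-overlapping matches, a dedicated domain-annotation tool remains the better choice for real sequences.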
The ZNF821 protein binds DNA, which makes it highly likely to be localized to the nucleus. Bipartite nuclear localization sequences are also predicted from Lys280 to Arg297, Lys304 to Leu320, and Lys338 to Arg354. An analysis of subcellular localization in both close and distant orthologs predicted a >99% chance of nuclear localization for all orthologs. [14]
The ZNF821 protein is predicted to be modified post-translationally at several positions. Across both close and distant orthologous sequences, two phosphorylation sites are conserved: the serine at position 2 and the threonine at position 7. [15] Several sources also predict further phosphorylation sites at the serine at position 254 and the tyrosine at position 279. [16] [15]
Several proteins have been shown to interact with the ZNF821 protein, many of them relating to transcriptional regulation. [17]
Abbreviation | Protein Name | Identification Method | Function |
---|---|---|---|
ATM | ATM Serine/Threonine kinase | Two Hybrid Array | Activates checkpoint signaling upon sensing DNA damage |
CCDC85B | Coiled-coil domain-containing protein 85B | Two Hybrid Pooling | Transcriptional Repressor |
SMARCA2 | Probable global transcription activator SNF2L2 | Two Hybrid Pooling | Transcriptional activation and repression by chromatin remodeling |
CDCA7L | Cell division cycle-associated 7-like protein | Two Hybrid Array | Transcriptional Repressor |
PIM2 | Serine/threonine-protein kinase Pim-2 | Two Hybrid Array | Proto-oncogene |
DVL3 | Segment polarity protein dishevelled homolog DVL-3 | Two Hybrid Array | Cell signal transduction |
RUNDC3A | RUN domain-containing protein 3A | Two Hybrid Array | Effector of RAP2A |
FXR1 | Fragile X mental retardation syndrome-related protein 1 | Two Hybrid Pooling | RNA-binding protein |
ZNF821 has no paralogs in humans. [5]
Orthologs of ZNF821 are found across vertebrates, but no orthologs of the full protein have been found in invertebrates. The zinc finger motifs are conserved into invertebrates, whereas the STPR domain is present only in mammals.
Genus | Species | Common Name | Taxonomy | Date of Divergence (MYA) | Sequence Length (AA) | Sequence Identity (%) |
---|---|---|---|---|---|---|
Homo | sapiens | Human | Primates | 0.00 | 412 | 100.00 |
Pongo | abelii | Sumatran Orangutan | Primates | 15.20 | 412 | 99.76 |
Lacerta | agilis | Sand lizard | Squamata | 318.00 | 429 | 76.33 |
Chrysemys | picta bellii | Western Painted turtle | Testudines | 318.00 | 411 | 87.65 |
Gopherus | evgoodei | Sinaloan desert tortoise | Testudines | 318.00 | 413 | 86.44 |
Antrostomus | carolinensis | Chuck-will's-widow | Caprimulgiformes | 318.00 | 411 | 86.20 |
Apteryx | rowi | Okarito kiwi | Apterygiformes | 318.00 | 482 | 87.41 |
Leptosomus | discolor | Cuckoo roller | Leptosomiformes | 318.00 | 410 | 85.68 |
Rhinatrema | bivittatum | Two-lined caecilian | Gymnophiona | 351.70 | 409 | 76.76 |
Geotrypetes | seraphini | Gaboon caecilian | Gymnophiona | 351.70 | 378 | 75.26 |
Bufo | bufo | Common toad | Anura | 351.70 | 395 | 66.75 |
Xenopus | laevis | African clawed frog | Anura | 351.70 | 398 | 69.64 |
Pygocentrus | nattereri | Red-bellied piranha | Characiformes | 433.00 | 456 | 48.58 |
Cyprinus | carpio | Common Carp | Cypriniformes | 433.00 | 441 | 49.56 |
Polypterus | senegalus | Senegal bichir | Polypteriformes | 433.00 | 539 | 54.59 |
Erpetoichthys | calabaricus | Reedfish | Polypteriformes | 433.00 | 471 | 49.58 |
Paramormyrops | kingsleyae | Old Calabar mormyrid | Osteoglossiformes | 433.00 | 443 | 53.63 |
Carcharodon | carcharias | Great white shark | Lamniformes | 465.00 | 360 | 42.07 |
The relative rate of divergence is slow when compared to the rates of two reference proteins, Cytochrome c and Fibrinogen alpha, but increases to slightly faster than Cytochrome c as the date of divergence gets closer to the present.
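One common way to compare divergence across such distances is to correct the observed fraction of differing residues for multiple substitutions at the same site, for example with the Poisson correction d = -ln(1 - p). The source does not say which correction was used, so the sketch below is an assumption. It applies the Poisson correction to a few rows of the ortholog table and divides by the divergence date to give a rough per-million-year figure.

```python
import math


def poisson_corrected_distance(identity_percent: float) -> float:
    """Poisson-corrected distance d = -ln(1 - p), where p is the fraction of differing residues."""
    p = 1.0 - identity_percent / 100.0
    if not 0.0 <= p < 1.0:
        raise ValueError("identity must be in the range (0, 100]")
    return -math.log(1.0 - p)


# (common name, divergence date in MYA, sequence identity in %) from the table above.
ORTHOLOGS = [
    ("Sumatran orangutan", 15.2, 99.76),
    ("Western painted turtle", 318.0, 87.65),
    ("Common toad", 351.7, 66.75),
    ("Common carp", 433.0, 49.56),
    ("Great white shark", 465.0, 42.07),
]

for name, mya, identity in ORTHOLOGS:
    d = poisson_corrected_distance(identity)
    # Distance per million years is only a crude summary of the divergence rate.
    print(f"{name}: d = {d:.4f}, d per MY = {d / mya:.6f}")
```

Plotting d against divergence date for ZNF821 and for the reference proteins would allow the comparison described above, but the reference data are not given here.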
ZNF821 has been associated with several diseases and conditions. It has been implicated in craniosynostosis through interactions with the transcription factor BCL11B, in which it affects the charges on the arginine-3, arginine-5, and lysine-3 residues and thereby increases their conformational flexibility. [18] It has also been identified as a possible biomarker for methamphetamine-associated psychosis (MAP) via RNA degradation. [19] In breast cancer, ZNF821 belongs to a DNA-repair subnetwork and was found to be dysregulated among patients. [20] Finally, one study showed an increase in ZNF821 methylation over time in Parkinson's disease patients who did not receive L-dopa/entacapone, a group that gives a clearer view of changes due only to Parkinson's pathophysiology. [21]
EVI5L is a protein that in humans is encoded by the EVI5L gene. EVI5L is a member of the Ras superfamily of monomeric guanine nucleotide-binding (G) proteins, and functions as a GTPase-activating protein (GAP) with a broad specificity. Measurement of in vitro Rab-GAP activity has shown that EVI5L has significant Rab2A- and Rab10-GAP activity.
Zinc finger protein 226 is a protein that in humans is encoded by the ZNF226 gene.
Zinc finger protein 684 is a protein that in humans is encoded by the ZNF684 gene.
The coiled-coil domain containing 142 (CCDC142) is a gene which in humans encodes the CCDC142 protein. The CCDC142 gene is located on chromosome 2, spans 4339 base pairs and contains 9 exons. The gene codes for the coiled-coil domain containing protein 142 (CCDC142), whose function is not yet well understood. There are two known isoforms of CCDC142. CCDC142 proteins produced from these transcripts range in size from 665 to 743 amino acids and contain signals suggesting protein movement between the cytosol and nucleus. Homologous CCDC142 genes are found in many animals, including vertebrates and invertebrates, but not in fungi, plants, protists, archaea, or bacteria. The protein contains a coiled-coil domain and a RINT1_TIP1 motif located within the coiled-coil domain.
BEND2 is a protein that in humans is encoded by the BEND2 gene. It is also found in other vertebrates, including mammals, birds, and reptiles. The expression of BEND2 in Homo sapiens is regulated and occurs at high levels in the skeletal muscle tissue of the male testis and in the bone marrow. The presence of the BEN domains in the BEND2 protein indicates that this protein may be involved in chromatin modification and regulation.
Chromosome 21 Open Reading Frame 58 (C21orf58) is a protein that in humans is encoded by the C21orf58 gene.
Chromosome 9 open reading frame 43 is a protein that in humans is encoded by the C9orf43 gene. The gene is also known as MGC17358 and LOC257169. C9orf43 contains DUF 4647 and a polyglutamine repeat region although protein function is not well understood.
Chromosome 9 open reading frame 50 is a protein that in humans is encoded by the C9orf50 gene. C9orf50 has one other known alias, FLJ35803. In humans the gene coding sequence is 10,051 base pairs long, transcribing an mRNA of 1,624 bases that encodes a 431 amino acid protein.
LOC101928193 is a protein which in humans is encoded by the LOC101928193 gene. There are no known aliases for this gene or protein. Similar copies of this gene, called orthologs, are known to exist in several different species across mammals, amphibians, fish, mollusks, cnidarians, fungi, and bacteria. The human LOC101928193 gene is located on the long (q) arm of chromosome 9 with a cytogenetic location at 9q34.2. The molecular location of the gene is from base pair 133,189,767 to base pair 133,192,979 on chromosome 9, for an mRNA length of 3213 nucleotides. The gene and protein are not yet well understood by the scientific community, but there are data on its genetic makeup and expression. The LOC101928193 protein is targeted to the cytoplasm and has the highest level of expression in the thyroid, ovary, skin, and testes in humans.
ZNF337, also known as zinc finger protein 337, is a protein that in humans is encoded by the ZNF337 gene. The ZNF337 gene is located on human chromosome 20 (20p11.21) and contains 6 exons. Its mRNA is 4,237 base pairs long, and the protein contains 751 amino acids. In addition, alternative splicing results in multiple transcript variants. The ZNF337 gene encodes a zinc finger domain-containing protein; however, this gene and its protein are not yet well understood by the scientific community. The gene has been proposed to participate in processes such as the regulation of transcription (DNA-dependent), and its proteins are expected to have molecular functions such as DNA binding, metal ion binding, and zinc ion binding, and to be localized in various subcellular locations. While there are no commonly associated or known aliases, an important paralog of this gene is ZNF875.
SMIM19, also known as small integral membrane protein 19, is a gene that encodes the SMIM19 protein. SMIM19 is a confirmed single-pass transmembrane protein passing from outside to inside, 5' to 3' respectively. SMIM19 shows ubiquitous high-to-medium expression across varied tissues and organs. Its validated function remains under review because of uncertainty about its subcellular localization. However, all proteins reported to interact with SMIM19 are associated with the endoplasmic reticulum (ER), suggesting that SMIM19 is also ER-associated.
Coiled-coil domain containing 190 (CCDC190), also known as C1orf110 (chromosome 1 open reading frame 110) and MGC48998, is a protein-coding gene widely expressed in vertebrates. RNA-seq gene expression profiles show that this gene is selectively expressed in organs of the human body such as the lung, brain, and heart. The expression product of C1orf110 is usually called coiled-coil domain-containing protein 190 and is 302 amino acids long. The name likely derives from the coiled-coil domain found from position 14 to 72. Alternative splicing in humans produces at least 6 mRNA splice variants and 3 isoforms of the protein.
C11orf98 is a protein-encoding gene on chromosome 11 in humans of unknown function. It is otherwise known as c11orf48. The gene spans the chromosomal locus from 62,662,817-62,665,210. There are 4 exons. It spans across 2,394 base pairs of DNA and produces an mRNA that is 646 base pairs long.
Chromosome 12 open reading frame 50 (C12orf50) is a protein-encoding gene which in humans encodes the C12orf50 protein. The accession ID for this gene is NM_152589. The location of C12orf50 is 12q21.32. It covers 55.42 kb, from 88429231 to 88373811, on the reverse strand. Some of the neighboring genes of C12orf50 are RPS4XP15, LOC107984542, and C12orf29. RPS4XP15 is upstream of C12orf50 and is on the same strand. LOC107984542 and C12orf29 are both downstream. LOC107984542 is on the opposite strand, while C12orf29 is on the same strand. C12orf50 has six isoforms. This page focuses on isoform X1. C12orf50 isoform X1 is 1711 nucleotides long and encodes a protein 414 amino acids long.
Zinc Finger Protein 548 (ZNF548) is a human protein encoded by the ZNF548 gene which is located on chromosome 19. It is found in the nucleus and is hypothesized to play a role in the regulation of transcription by RNA Polymerase II. It belongs to the Krüppel C2H2-type zinc-finger protein family as it contains many zinc-finger repeats.
KIAA1143 is an uncharacterized protein in humans that is encoded by the KIAA1143 gene. It may play a role in cell growth mechanisms and in the regulation or creation of cytoskeletal structure. The gene is located on chromosome 3 on the minus strand.
Transmembrane protein 61 (TMEM61) is a protein that is encoded by the TMEM61 gene in humans. The gene is located on chromosome 1 in humans and is highly expressed in the intestinal regions, predominantly the kidney, adrenal gland and pituitary tissues. Unlike other transmembrane proteins in the region, the protein does not promote cancer growth. However, when TMEM61 is inhibited by secondary factors, normal activity in the kidney is restricted. The human protein has many orthologs and has been prevalent on Earth for millions of years.
ZNF839, or zinc finger protein 839, is a protein which in humans is encoded by the ZNF839 gene. The gene is located on the long arm of chromosome 14. Zinc finger protein 839 is speculated to play a role in the humoral immune response to cancer as a renal carcinoma antigen (NY-REN-50), because NY-REN-50 was found to be overexpressed in cancer patients, especially those with renal carcinoma. Zinc finger protein 839 also plays a role in transcription regulation by metal-ion binding, since it binds to DNA via C2H2-type zinc finger repeats.
Zinc Finger Protein 62, also known as "ZNF62," "ZNF755," or "ZET," is a protein that in humans is encoded by the ZFP62 gene. ZFP62 is part of the C2H2 Zinc Finger family of genes.
ZNF730 or zinc finger protein 730 is a protein which in humans is encoded by the ZNF730 gene. It is located on the short arm of chromosome 19. Zinc finger protein 730 is speculated to play a role in transcriptional regulation in acute myeloid leukemia and endometrial cancer. This is because ZNF730 was found to be expressed in higher levels in endometrial cancerous tumor samples and has been reported as a core binding factor in acute myeloid leukemia. Zinc Finger protein 730 is a C2H2-type zinc finger protein containing a β/β/α structure, held in place by a Zinc ion. The C2H2-type protein motifs can regulate transcription by recognizing and binding to DNA sequences.