Chromosome 19 open reading frame 47 is a protein that in humans is encoded by the C19orf47 gene. Aliases include FLJ36888, DKFZp686P05129, and LOC126526. [1]
Homo sapiens C19orf47 is located in cytogenetic band 19q13.2. It covers 28.98 kilobases from 40,854,420 to 40,825,543 on the minus strand. [2] The gene has 8 exons in the isoform 1 precursor, the last of which is the longest and comprises over half of the mRNA transcript. [1]
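The span can be checked directly from the coordinates quoted above. The following is a minimal sketch, assuming the two positions are 1-based and inclusive (an assumption, since the coordinate convention is not stated here); the helper name `span_bp` is hypothetical. With the coordinates as quoted, the result is about 28.88 kb, slightly below the 28.98 kb figure, so the two values may come from different annotation releases.

```python
# Coordinates as quoted in the text (chromosome 19, minus strand).
START = 40_854_420  # higher coordinate: a minus-strand gene begins here
END = 40_825_543    # lower coordinate: the gene ends here


def span_bp(a: int, b: int) -> int:
    """Length in base pairs of the closed interval between two coordinates.

    Assumes 1-based, inclusive coordinates; the order of a and b does not
    matter, so the same function works for plus- and minus-strand genes.
    """
    return abs(a - b) + 1


length = span_bp(START, END)
print(f"{length} bp = {length / 1000:.2f} kb")
```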
Transcription of Homo sapiens C19orf47 produces 13 different mRNAs: 12 alternatively spliced variants and 1 unspliced form. The isoforms and the proteins they encode are shown in the table below. Homo sapiens C19orf47 is broadly expressed across tissues, including the heart and testes.
Isoforms of C19orf47.
Isoform number | Nucleotide Accession | mRNA length (bp) | Protein Accession | Protein Length (aa) |
NM1 | NM_001256440.1 | 2104 | NP_001243369.1 | 422 |
NM2 | NM_001256441.2 | 3611 | NP_001243370.1 | 385 |
X1 | XM_017026291.2 | 3608 | XP_016881780.1 | 384 |
X2 | XM_005258520.3 | 3625 | XP_005258577.1 | 421 |
X3 | XM_017026292.3 | 1407 | XP_016881781.1 | 366 |
X4 | XM_017026293.3 | 1404 | XP_016881782.1 | 365 |
X5 | XM_047438175.1 | 1421 | XP_047294131.1 | 402 |
X6 | XM_024451364.2 | 3507 | XP_024307132.1 | 344 |
X7 | XM_047438176.1 | 3504 | XP_047294132.1 | 343 |
X8 | XM_024451365.2 | 4638 | XP_024307133.1 | 385 |
X8 | XM_047438177.1 | 4671 | XP_047294133.1 | 385 |
X9 | XM_011526460.3 | 3524 | XP_011524762.1 | 381 |
X10 | XM_047438178.1 | 3668 | XP_047294134.1 | 355 |
X11 | XM_047438179.1 | 3241 | XP_047294135.1 | 281 |
The C19orf47 isoform 1 precursor encodes a 422 amino acid protein. The protein is located in the nucleus, including the nucleoplasm.
The following proteins have predicted interactions with C19orf47.
Proteins predicted to interact with C19orf47 in humans, with additional notes.
Abbreviated Name | Full Name | Additional Notes |
PARK2 | Parkin RBR E3 Ubiquitin Protein Ligase | Component of multiprotein E3 ubiquitin ligase complex. Mutations are known to cause Parkinson’s disease. |
NSP3 | Non-structural protein 3 | SARS-CoV-2 protein |
ORF14 | Open reading frame 14 | SARS-CoV-2 protein |
MYC | V-Myc Avian Myelocytomatosis Viral Oncogene Homolog | Proto-oncogene; forms a heterodimer with the related transcription factor MAX. |
DDX39B | DExD-Box Helicase 39B | RNA-dependent ATPase that mediates ATP hydrolysis during mRNA splicing. |
C17orf85 | Nuclear Cap-Binding Protein Subunit 3 | Associates with NCBP1/CBP80 to form an alternative cap-binding complex (CBC) which plays a key role in mRNA export. |
NXF1 | Nuclear RNA Export Factor 1 | Member of a family of nuclear RNA export factor genes. |
THOC2 | THO Complex 2 | Multiprotein complex binds specifically to spliced mRNAs to facilitate mRNA export. |
YWHAQ | Tyrosine 3-Monooxygenase/Tryptophan 5-Monooxygenase Activation Protein, Theta Polypeptide | Mediates signal transduction by binding to phosphoserine-containing proteins. |
C19orf47 is found in organisms including mammals, reptiles, amphibians, fish, insects, and plants. [3]
Current orthologs of human C19orf47. Sequence identity and similarity are shown.
C19orf47 | Genus, Species | Common Name | Taxonomic Group | Date of Divergence (MYA) | Accession Number | Sequence Length (aa) | Identity | Similarity |
Mammalia | Homo sapiens | Human | Primates | 0 | NP_001243369.1 | 422 | 100.0% | 100.0% |
 | Mus musculus | Mouse | Rodentia | 87 | XP_036009244.1 | 397 | 75.5% | 80.2% |
 | Castor canadensis | American Beaver | Rodentia | 87 | XP_020022531.1 | 382 | 71.3% | 74.7% |
Reptilia | Gopherus flavomarginatus | Bolson Tortoise | Testudines | 319 | XP_050784538.1 | 386 | 63.3% | 72.9% |
 | Dermochelys coriacea | Leatherback Sea Turtle | Testudines | 319 | XP_038238045.2 | 449 | 62.3% | 72.0% |
 | Varanus komodoensis | Komodo Dragon | Squamata | 319 | XP_044281356.1 | 395 | 60.2% | 69.2% |
 | Alligator sinensis | Chinese Alligator | Crocodylia | 319 | XP_025068843.1 | 388 | 54.7% | 63.3% |
Aves | Haliaeetus leucocephalus | Bald Eagle | Falconiformes | 319 | XP_010564700.1 | 380 | 60.2% | 69.1% |
 | Phalacrocorax carbo | Great Cormorant | Suliformes | 319 | XP_009501755.1 | 381 | 58.5% | 67.3% |
 | Gallus gallus | Chicken | Galliformes | 319 | XP_015129410.4 | 374 | 36.6% | 45.5% |
Amphibia | Xenopus tropicalis | Frog | Anura | 352 | NP_001005016.1 | 398 | 54.2% | 64.5% |
Fish | Protopterus annectens | West African Lungfish | Lepidosireniformes | 408 | XP_043937251.1 | 393 | 46.4% | 57.0% |
 | Latimeria chalumnae | West Indian Ocean Coelacanth | Coelacanthiformes | 415 | XP_014348608.1 | 381 | 58.2% | 69.7% |
 | Danio rerio | Zebrafish | Cypriniformes | 429 | NP_001038706.1 | 392 | 48.6% | 59.5% |
 | Leucoraja erinacea | Little Skate | Rajiformes | 462 | XP_055519598.1 | 395 | 53.0% | 64.2% |
 | Petromyzon marinus | Sea Lamprey | Petromyzontiformes | 563 | XP_032803651.1 | 452 | 41.6% | 52.4% |
Arthropods | Rhipicephalus sanguineus | Brown Dog Tick | Ixodida | 686 | XP_037499932.1 | 428 | 29.3% | 39.9% |
 | Biomphalaria glabrata | Bloodfluke Planorb | Planorbidae | 686 | XP_055879100.1 | 370 | 26.6% | 36.0% |
 | Polistes fuscatus | Northern Paper Wasp | Hymenoptera | 686 | XP_043494673.1 | 409 | 24.1% | 39.3% |
Plants | Gossypium anomalum | Wild Cotton | Malvales | 1530 | KAG8495680.1 | 266 | 11.5% | 20.6% |
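The Identity and Similarity columns are percentages derived from a pairwise alignment of each ortholog against the human protein. The sketch below illustrates one common convention, in which both values are counted over all alignment columns, gaps included. The alignment tool and scoring scheme behind the table are not stated here, so the residue groups in `SIMILAR_GROUPS`, the function names, and the toy sequences are all hypothetical; real tools score similarity with a substitution matrix.

```python
# Hypothetical groups of chemically similar residues (an assumption made for
# illustration; alignment tools normally use a substitution matrix instead).
SIMILAR_GROUPS = [set("ILVM"), set("FWY"), set("KRH"), set("DE"), set("ST"), set("NQ")]


def is_similar(a: str, b: str) -> bool:
    """True if two residues are identical or fall in the same similarity group."""
    return a == b or any(a in group and b in group for group in SIMILAR_GROUPS)


def identity_similarity(aligned_a: str, aligned_b: str) -> tuple[float, float]:
    """Percent identity and similarity of two aligned sequences ('-' marks a gap).

    Both percentages use the full alignment length as the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = len(aligned_a)
    identical = similar = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gap columns count toward the length but never match
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif is_similar(a, b):
            similar += 1
    return 100 * identical / columns, 100 * similar / columns


# Toy 8-column alignment, not taken from C19orf47.
identity, similarity = identity_similarity("MKT-LLDE", "MRTALIDD")
print(f"identity {identity:.1f}%, similarity {similarity:.1f}%")
```

Because identical residues also count as similar, similarity is never lower than identity, which matches every row of the table.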
One study identified four novel mutations in the TUBB4A gene associated with laryngeal and cervical dystonia, a rare neurological disorder. The mutations were found in several affected families, and the study highlights the complexity of this genetic condition, with evidence of incomplete penetrance in some cases. Laryngeal dystonia, often the initial symptom, is a prominent feature of the disease. In one family, a variant in the C19orf47 gene was also present. The TUBB4A variant was judged the more likely source of the phenotype, however, because C19orf47 has low expression in the brain. [4]