KIAA2012 | |
---|---|
Aliases | KIAA2012 |
External IDs | MGI: 2685819, HomoloGene: 124277, GeneCards: KIAA2012 |

KIAA2012 is a protein which, in humans, is encoded by the KIAA2012 gene. It is expressed at very low levels throughout the body, with its highest expression in the ovary, lungs, and brain. [5]
The KIAA2012 gene is located on the plus (sense) strand of chromosome 2 at position 2q33.1. [6] It has 24 exons and spans 131,934 bases including introns. No aliases or common names are used in addition to KIAA2012.
Within the promoter region of KIAA2012, there is a highly conserved transcription factor binding site that has no common SNPs. [7] The RFX transcription factors, specifically RFX1-6, bind to this highly conserved region and regulate cellular specialization and differentiation. [8] The image below shows the promoter region of KIAA2012 with the highly conserved RFX1-6 binding site. [7]
KIAA2012 is expressed differentially in the body at low levels. Of this overall low expression, KIAA2012 is expressed most highly in the brain, lungs, and ovary. [5] [9] KIAA2012 is expressed at lower levels in the liver, trachea, and testes. [10] [11] [12]
Unmodified KIAA2012 is 1,181 amino acids in length, has a molecular weight of 136 kDa, and has an isoelectric point of around 8. [13] [14]
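As a rough plausibility check on the reported mass, a protein's molecular weight can be approximated from its length using an average residue mass of about 110 Da. The sketch below applies that rule of thumb; the 110 Da constant and the function name are illustrative assumptions and are not taken from the cited sources.

```python
# Rule-of-thumb estimate: an average amino acid residue weighs roughly 110 Da.
# This constant is an assumption for illustration, not a value from the cited sources.
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(length_aa: int,
                      residue_mass_da: float = AVERAGE_RESIDUE_MASS_DA) -> float:
    """Approximate a protein's mass in kDa from its length in amino acids."""
    return length_aa * residue_mass_da / 1000.0


estimated = estimate_mass_kda(1181)  # length reported in the text
reported = 136.0                     # kDa, reported in the text

print(f"estimated: {estimated:.1f} kDa, reported: {reported:.1f} kDa")
print(f"relative difference: {(reported - estimated) / reported:.1%}")
```

The estimate falls a few percent below the reported value, which would be consistent with a composition rich in glutamic acid and glutamine, both of which are heavier than the average residue.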
KIAA2012 is rich in glutamic acid and glutamine, and it is poor in valine. [13] There is also one mixed charge cluster spanning amino acids 951–1118. [15] KIAA2012 contains one Domain of Unknown Function (DUF 4670), spanning amino acids 635 to 1137. [6] In contrast to the full-length protein, DUF 4670 is also rich in arginine and poor in glycine and phenylalanine. [13]
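The residue coordinates above can be related to one another with simple interval arithmetic. The following is a minimal sketch, assuming the reported ranges are inclusive and 1-based; the helper names are hypothetical.

```python
PROTEIN_LENGTH = 1181               # amino acids, as reported in the text
DUF4670 = (635, 1137)               # assumed inclusive, 1-based residue range
MIXED_CHARGE_CLUSTER = (951, 1118)  # assumed inclusive, 1-based residue range


def span_length(region):
    """Number of residues in an inclusive (start, end) range."""
    start, end = region
    return end - start + 1


def contains(outer, inner):
    """True if the inner range lies entirely within the outer range."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


duf_length = span_length(DUF4670)
print(f"DUF 4670 covers {duf_length} residues "
      f"({duf_length / PROTEIN_LENGTH:.1%} of the protein)")
print(f"Mixed charge cluster covers {span_length(MIXED_CHARGE_CLUSTER)} residues")
print(f"Cluster inside DUF 4670: {contains(DUF4670, MIXED_CHARGE_CLUSTER)}")
```

Under these assumptions the domain accounts for a bit over two fifths of the protein, and the mixed charge cluster lies entirely inside it.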
The secondary structure of KIAA2012 consists primarily of alpha helices. On the left, a high-confidence prediction of the secondary structure is shown. On the right, the entire 3-D structure is shown, illustrating how the alpha helices fold to form the complete KIAA2012 protein.
KIAA2012 has a highly conserved cGMP-dependent protein kinase binding domain. These cGMP-dependent protein kinases (PRKG) are a part of the NO/cGMP signaling pathway, and they are important factors in many signal transduction processes. [16] Additionally, there are many potential sites for phosphorylation, SUMOylation, and myristoylation. In instances where KIAA2012 is post-translationally modified in these ways, the resulting charge, structure, function, and sub-cellular localization can be altered. [17] [18]
Proteins tagged with localization signals are transported to specific regions of the cell. KIAA2012 contains nuclear localization signal sequences, which are short stretches of amino acids that mediate the transport of nuclear proteins into the nucleus. [19] The table below lists the predicted subcellular localization of human KIAA2012 and two orthologs, with a confidence value for each compartment. [20]
Species | Nuclear | Plasma Membrane | Cytoskeletal | Mitochondrial | Cytoplasmic | Secretory Vesicles |
---|---|---|---|---|---|---|
Human | 82% | 4% | 9% | 4% | --- | --- |
Sardinian Tree Frog | 78% | 9% | 9% | 4% | --- | --- |
Zebrafish | 74% | 9% | 4% | --- | 9% | 4% |
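The localization values in the table can be compared programmatically. The sketch below copies the percentages from the table, assuming the first numeric column is the nuclear score, and picks the highest-scoring compartment for each species; the dictionary layout and function name are illustrative only.

```python
# Predicted localization confidence (%) copied from the table above.
# Compartments shown as "---" in the table are omitted here.
PREDICTIONS = {
    "Human": {
        "Nuclear": 82, "Plasma Membrane": 4,
        "Cytoskeletal": 9, "Mitochondrial": 4,
    },
    "Sardinian Tree Frog": {
        "Nuclear": 78, "Plasma Membrane": 9,
        "Cytoskeletal": 9, "Mitochondrial": 4,
    },
    "Zebrafish": {
        "Nuclear": 74, "Plasma Membrane": 9, "Cytoskeletal": 4,
        "Cytoplasmic": 9, "Secretory Vesicles": 4,
    },
}


def top_compartment(scores):
    """Return the (compartment, confidence) pair with the highest confidence."""
    return max(scores.items(), key=lambda item: item[1])


for species, scores in PREDICTIONS.items():
    compartment, confidence = top_compartment(scores)
    print(f"{species}: {compartment} ({confidence}%)")
```

For all three species the nucleus is the highest-scoring compartment by a wide margin.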
KIAA2012 has predicted protein interactions with STAG2 and SMC1A. [21] STAG2 encodes a subunit of the cohesin complex, which regulates sister chromatid separation during cell division. [22] SMC1A is an important part of functional kinetochores due to its role in the multiprotein cohesin complex required for sister chromatid cohesion. [23] Because KIAA2012 is predicted to localize to the nucleus and to interact with STAG2 and SMC1A, its function may relate to DNA manipulation or cell division.
Protein Name | Aliases | Location |
---|---|---|
SMC1A | SMC1, SMCB, CDLS2, SB1.8, SMC1L1, DXS423E, SMC1alpha, RP6-29D12.1 | Xp11.22 [23] |
STAG2 | SA2, SA-2, SCC3B, bA517O1.1, RP11-517O1.1 | Xq25 [22] |
Twenty organisms with a KIAA2012 ortholog are shown below, sorted by date of divergence and then by sequence identity. No orthologs were found in birds, but orthologs of KIAA2012 exist in mammals, reptiles, amphibians, and fish. An unrooted phylogenetic tree showing each taxonomic group and its divergence pattern can be found below the ortholog table.
Genus & Species | Common Name | Date of Divergence (MYA) | Accession # | Sequence Length | % Identity | % Similarity |
---|---|---|---|---|---|---|
Homo sapiens | Human | 0 | NM_001277372.4 | 1181 | 100 | 100 |
Hylobates moloch | Silvery Gibbon | 19.5 | XP_032610815 | 1181 | 94.2 | 96.4 |
Sciurus carolinensis | Gray Squirrel | 87 | XP_047398902 | 1130 | 64.1 | 72.3 |
Mus caroli | Mouse | 87 | XP_029333762 | 1160 | 61.3 | 72 |
Panthera uncia | Snow Leopard | 94 | XP_049471125 | 1180 | 75.3 | 82.7 |
Orcinus orca | Killer Whale | 94 | XP_033285753 | 1172 | 74.5 | 82.7 |
Bubalus bubalis | Water Buffalo | 94 | XP_006080602 | 1185 | 72.9 | 81.1 |
Alligator mississippiensis | American Alligator | 319 | XP_059583055 | 1325 | 37.1 | 49.7 |
Caretta caretta | Loggerhead Turtle | 319 | XP_048725054 | 1329 | 37 | 51.4 |
Chelonia mydas | Green Sea Turtle | 319 | XP_037768210 | 1325 | 36.8 | 50.9 |
Crotalus tigris | Tiger Rattlesnake | 319 | XP_039210533 | 1220 | 32.4 | 46.9 |
Xenopus tropicalis | Western Clawed Frog | 352 | XP_031749269 | 1339 | 31.9 | 45.8 |
Rhinatrema bivittatum | Two-Lined Caecilian | 352 | XP_029462137 | 1499 | 30.6 | 44.8 |
Spea bombifrons | Plains Spadefoot Toad | 352 | XP_053326593 | 1436 | 29.6 | 44 |
Hyla sarda | Sardinian Tree Frog | 352 | XP_056391303 | 1428 | 29.5 | 44.9 |
Protopterus annectens | West African Lungfish | 408 | XP_043931036 | 1412 | 29.8 | 45 |
Takifugu rubripes | Japanese Puffer | 429 | XP_029701411 | 1129 | 25.3 | 39.4 |
Danio rerio | Zebrafish | 429 | XP_009302807 | 1484 | 24.9 | 37.1 |
Anarrhichthys ocellatus | Wolf Eel | 429 | XP_031729884 | 1204 | 23.9 | 37.3 |
Amblyraja radiata | Thorny Skate | 462 | XP_032880336 | 1392 | 25.7 | 40.5 |
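One way to summarize the ortholog table is to average sequence identity within each divergence date. The sketch below does this with values copied from the table; the grouping approach and function name are illustrative choices and not part of the original analysis.

```python
from collections import defaultdict
from statistics import mean

# (common name, divergence in MYA, % identity to human), copied from the table above
ORTHOLOGS = [
    ("Human", 0, 100.0),
    ("Silvery Gibbon", 19.5, 94.2),
    ("Gray Squirrel", 87, 64.1),
    ("Mouse", 87, 61.3),
    ("Snow Leopard", 94, 75.3),
    ("Killer Whale", 94, 74.5),
    ("Water Buffalo", 94, 72.9),
    ("American Alligator", 319, 37.1),
    ("Loggerhead Turtle", 319, 37.0),
    ("Green Sea Turtle", 319, 36.8),
    ("Tiger Rattlesnake", 319, 32.4),
    ("Western Clawed Frog", 352, 31.9),
    ("Two-Lined Caecilian", 352, 30.6),
    ("Plains Spadefoot Toad", 352, 29.6),
    ("Sardinian Tree Frog", 352, 29.5),
    ("West African Lungfish", 408, 29.8),
    ("Japanese Puffer", 429, 25.3),
    ("Zebrafish", 429, 24.9),
    ("Wolf Eel", 429, 23.9),
    ("Thorny Skate", 462, 25.7),
]


def mean_identity_by_divergence(rows):
    """Average % identity for each divergence date, ordered from recent to ancient."""
    groups = defaultdict(list)
    for _name, mya, identity in rows:
        groups[mya].append(identity)
    return {mya: mean(values) for mya, values in sorted(groups.items())}


for mya, identity in mean_identity_by_divergence(ORTHOLOGS).items():
    print(f"{mya:>6} MYA: {identity:5.1f}% mean identity")
```

With these figures, identity generally declines with divergence time, although not strictly: the rodent entries (87 MYA) average lower identity than the entries at 94 MYA.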
Several genome-wide association studies report trait-associated variants in KIAA2012. The reported traits with the highest number of associations are heel bone mineral density, taste liking measurement, educational attainment, lung function, and height. [24] Additionally, KIAA2012 is downregulated in women with polycystic ovary syndrome (PCOS) compared to women without PCOS. [25]