Chromosome 1 open reading frame 68, or C1orf68, is a human gene that encodes skin-specific protein 32. The C1orf68 gene is expressed in the skin, [1] is part of the epidermal differentiation complex, and potentially plays a role in epidermal cornification and epidermal barrier function. [2] [3]
C1orf68 maps to the plus strand of chromosome 1 at 1q21.3 and spans 949 base pairs in the human genome. [4] [5] Aliases include Late envelope protein 7 (LEP7), XP32, and Skin-Specific Protein (Xp32). [4] The gene has a single exon and no introns. It is part of the epidermal differentiation complex (1q21).
Skin-specific protein 32 has only one isoform, with a sequence length of 250 amino acids. [6] It has a molecular mass of 26 kDa [7] [8] and a predicted pI of 8.41. [8] The amino acid sequence contains a high proportion of cysteine relative to other human protein sequences. [9]
Skin-specific protein 32 has one domain, PRK10264, which is a DNA translocase FtsK. [4] It also contains a cysteine-rich region, which is conserved across most mammalian orthologs, excluding monotremes. [10]
The protein sequence also contains three contiguous repeat sequences, located from amino acid position Gln65 to Cys127. [11] The repeat sequences can be observed in the conceptual translation on the right. They lie within both the DNA translocase FtsK domain and the cysteine-rich region, and they are conserved across mammalian orthologs. The conservation of each individual amino acid can be observed in the sequence logo below.
One promoter was identified for C1orf68 using ElDorado (Genomatix). [12] This promoter, GXP_1818199, spans 1,040 bases and overlaps C1orf68 by 40 bases. [12] Because C1orf68 does not contain a 5'-UTR, the promoter overlaps the start codon, which can be visualized in the diagram below.
C1orf68 is expressed in only a few tissues, specifically skin and breast tissue. [13] In humans, C1orf68 protein abundance is moderate. [14] Within the skin, C1orf68 is expressed in suprabasal keratinocytes, which are a type of epithelial cell. [15] C1orf68 is also moderately expressed in the stratum corneum and granular layer of the skin. [16] This could be because the protein remains in the cell as it differentiates and matures.
C1orf68 does not contain a 5'-UTR, but does contain a 3'-UTR. The predicted secondary structure of the C1orf68 3'-UTR mRNA contains several stem loops. One stem loop, which contains a binding site for the RNA-binding protein PUM2, appears in all of the predicted structures generated by mFold. [17]
Skin-specific protein 32 is predicted to localize to the cytoplasm. Immunofluorescence staining in the Subcellular section of the Human Protein Atlas shows the protein in the cytoplasm of skin cells. [18]
There are no known paralogs of C1orf68.
C1orf68 has orthologs in mammals and in some amphibians, specifically two frog species. Relative to the human protein, ortholog sequence identity ranges from 94% to 23%, and sequence similarity from 96% to 33%. No orthologs were found in birds, fish, or reptiles. Within the mammals, no orthologs were found in Cetacea (marine mammals). [19] The most highly conserved residues across mammals and amphibians with available sequences are all prolines: Pro61, Pro73, Pro126, and Pro182.
Genus Species | Common Name | Taxonomic Group | Median Divergence Time (MYA) [20] | Accession Number | Query Cover | Sequence Length (aa) | Sequence Identity (%) | Sequence Similarity (%) |
---|---|---|---|---|---|---|---|---|
Homo sapiens | Human | Primates | 0 | NP_001019850 | 100% | 250 | 100% | 100% |
Rhinopithecus roxellana | Golden snub-nosed monkey | Primates | 29 | XP_030792113 | 98% | 250 | 94% | 96% |
Callithrix jacchus | Common marmoset | Primates | 43 | XP_035135776 | 96% | 256 | 83% | 85% |
Cavia porcellus | Guinea pig | Rodentia | 89 | XP_005007858 | 96% | 249 | 80% | 84% |
Ochotona curzoniae | Plateau pika | Glires | 89 | XP_040854203 | 100% | 241 | 77% | 80% |
Sus scrofa | Wild boar | Artiodactyla | 94 | XP_003125804 | 96% | 248 | 79% | 82% |
Myotis brandtii | Brandt's bat | Chiroptera | 94 | XP_005880696 | 98% | 248 | 75% | 79% |
Sorex araneus | Common shrew | Eulipotyphla | 94 | XP_004618165 | 76% | 261 | 70% | 73% |
Orycteropus afer | Aardvark | Afrotheria | 102 | XP_007956474 | 98% | 270 | 73% | 76% |
Echinops telfairi | Lesser hedgehog tenrec | Afrotheria | 102 | XP_004717741 | 91% | 262 | 73% | 75% |
Dasypus novemcinctus | Nine-banded armadillo | Xenarthra | 102 | XP_004469783 | 100% | 257 | 71% | 76% |
Trichosurus vulpecula | Common brushtail possum | Diprotodontia | 160 | XP_036609710 | 100% | 254 | 62% | 67% |
Phascolarctos cinereus | Koala | Diprotodontia | 160 | XP_020847076 | 100% | 266 | 60% | 65% |
Vombatus ursinus | Common wombat | Diprotodontia | 160 | XP_027726322 | 100% | 291 | 52% | 58% |
Sarcophilus harrisii | Tasmanian devil | Dasyuromorphia | 160 | XP_003770670 | 99% | 268 | 52% | 60% |
Dromiciops gliroides | Colocolo opossum | Microbiotheria | 160 | XP_043856143 | 100% | 317 | 51% | 57% |
Monodelphis domestica | Gray short-tailed opossum | Didelphimorphia | 160 | XP_016285839 | 96% | 292 | 54% | 59% |
Ornithorhynchus anatinus | Platypus | Monotremata | 180 | XP_028910439 | 98% | 244 | 50% | 56% |
Tachyglossus aculeatus | Short-beaked echidna | Monotremata | 180 | XP_038624254 | 98% | 264 | 47% | 54% |
Ranitomeya imitator | Mimic poison frog | Anura | 352 | CAF5025995 | 96% | 251 | 26% | 33% |
Xenopus tropicalis | Western clawed frog | Anura | 352 | KAE8606393 | 54% | 296 | 23% | 33% |
The figure below shows more information about the evolutionary rate of C1orf68 across its orthologs. C1orf68 evolves quickly compared with cytochrome c: it appears to evolve at a rate similar to that of fibrinogen alpha, which serves as a standard for rapidly evolving genes.
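The article does not state how the evolutionary rate was calculated. As a minimal sketch, one common approach is to convert percent identity into a Poisson-corrected divergence, m = -ln(identity fraction) × 100, and relate it to the median divergence time. The use of that correction, the choice of rows, and the function names below are assumptions made for illustration; the identity and divergence values are copied from a subset of the ortholog table above.

```python
import math

# (common name, median divergence time in MYA, percent identity to human),
# copied from a subset of rows in the ortholog table.
ORTHOLOGS = [
    ("Golden snub-nosed monkey", 29, 94),
    ("Common marmoset", 43, 83),
    ("Guinea pig", 89, 80),
    ("Wild boar", 94, 79),
    ("Nine-banded armadillo", 102, 71),
    ("Koala", 160, 60),
    ("Platypus", 180, 50),
    ("Mimic poison frog", 352, 26),
]


def corrected_divergence(identity_percent):
    """Poisson-corrected percent divergence (assumed method, not from the article).

    Corrects the observed difference for multiple substitutions at one site:
    m = -ln(identity fraction) * 100.
    """
    return -math.log(identity_percent / 100.0) * 100.0


def rate_per_million_years(rows):
    """Least-squares slope through the origin of corrected divergence vs. time."""
    numerator = sum(t * corrected_divergence(ident) for _, t, ident in rows)
    denominator = sum(t * t for _, t, _ in rows)
    return numerator / denominator


if __name__ == "__main__":
    for name, mya, ident in ORTHOLOGS:
        print(f"{name}: {mya} MYA, corrected divergence {corrected_divergence(ident):.1f}")
    print(f"Approximate rate: {rate_per_million_years(ORTHOLOGS):.3f} per million years")
```

Computing the same slope for reference proteins such as cytochrome c and fibrinogen alpha, using their own ortholog identities, would allow the kind of comparison described above; those values are not given in this article and are not included here.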
Three transcription factors were predicted for C1orf68 using MatInspector (Genomatix). [12]
Grainyhead-like 2 has been shown to impair keratinocyte differentiation through transcriptional inhibition of genes in the epidermal differentiation complex. [21] It also shows enhanced protein and mRNA levels in chronic skin lesions, such as those in psoriasis. [21]
Zinc finger E-box-binding homeobox 1 has been shown to regulate corneal epithelial terminal phenotype. [22]
GATA-binding factor 3 (GATA3) has been shown to localize to the cytoplasm and nucleus of proliferating keratinocytes but only to the nucleus in differentiated keratinocytes. [23] GATA3 has also been shown to induce differentiation of primary keratinocytes, and it has been suggested that it may regulate human interfollicular epidermal renewal. [23]
Proteins that potentially interact with C1orf68 are listed in the table below. These proteins were selected from the results of prediction tools [4] [24] [25] because of their participation in the epidermal cornified envelope, the location of their genes within the epidermal differentiation complex, and their localization to the cytoplasm.
Abbreviated Name | Full Name | Basis of Identification | Protein Description |
---|---|---|---|
KPRP | Keratinocyte proline-rich protein | Affinity Capture-MS [26] | The gene for this protein is located in the epidermal differentiation complex on chromosome 1q21. The protein has a potential role in keratinocyte differentiation. [27] |
TGM3 | Transglutaminase 3 | Affinity Capture-MS [26] | An epidermal cross-linking enzyme involved in the formation of the cornified envelope. [28] |
CYLD | CYLD Lysine 63 Deubiquitinase | Affinity chromatography technology [29] | This protein functions as a deubiquitinating enzyme, and is localized in the cytoplasm. [30] |
C1orf68 expression differs among healthy skin, non-lesional psoriatic skin, and lesional psoriatic skin. [31] One study suggested that proteins with significant differences in expression between non-lesional and lesional psoriatic skin could contribute to maintaining the non-lesional state and may add to the understanding of lesion formation. [32]