C1orf74 | | |
---|---|---|
Identifiers | | |
Aliases | C1orf74, URLC4, chromosome 1 open reading frame 74 | |
External IDs | MGI: 2441776, HomoloGene: 17592, GeneCards: C1orf74 | |
Orthologs | | |
Species | Human | Mouse |
Location (UCSC) | Chr 1: 209.78 – 209.78 Mb | Chr 1: 193.17 – 193.18 Mb |
PubMed search | [3] | [4] |

UPF0739 protein C1orf74 is a protein that in humans is encoded by the C1orf74 gene. [5]
C1orf74 is a protein-coding gene on human chromosome 1, at locus 1q32.2. [5] [6] It is also known as URLC4. The gene is 2229 base pairs long and contains two exons.
C1orf74 lies downstream of the gene interferon regulatory factor 6 (IRF6) in humans. [7]
C1orf74 is transcribed into an mRNA that is 1642 nucleotides long in humans. [8] The transcript contains two exons and one upstream in-frame stop codon. The 5' UTR of this transcript is 343 nucleotides long and the 3' UTR is 570 nucleotides long.
Both exons are usually transcribed. [9] In a few cases, only the second exon is transcribed. A fusion transcript containing IRF6 and the first exon of C1orf74 has also been found, but it yields only a short polypeptide. [10]
The protein encoded by C1orf74 in humans is most commonly known as UPF0739 protein C1orf74. [11] The human protein contains 269 amino acids and has a molecular mass of 29430 Da. Amino acids 19 to 269 form a domain of unknown function known as DUF4504. This domain contains two conserved sequence motifs, LLGYP and SFS.
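The conserved motifs named above can be located in a protein sequence with a simple exact-match scan. The following is a minimal sketch only: the function name `find_motifs` and the short `TOY_SEQUENCE` are hypothetical and chosen for illustration; the real UPF0739 sequence is not reproduced here.

```python
def find_motifs(sequence, motifs):
    """Return {motif: [1-based start positions]} for exact, possibly overlapping matches."""
    hits = {}
    for motif in motifs:
        positions = []
        start = sequence.find(motif)
        while start != -1:
            positions.append(start + 1)  # convert 0-based index to 1-based residue number
            start = sequence.find(motif, start + 1)
        hits[motif] = positions
    return hits


# Hypothetical toy sequence, NOT the real C1orf74 protein sequence.
TOY_SEQUENCE = "MAAALLGYPKTTSFSQ"

hits = find_motifs(TOY_SEQUENCE, ["LLGYP", "SFS"])
print(hits)
```

Applied to a real sequence retrieved from a protein database, the reported positions could then be compared against the DUF4504 boundaries (residues 19 to 269) quoted in the text.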
The translational start site of C1orf74 is after the exon-exon junction, which means the protein is made only by translating the second exon.
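The transcript and protein lengths quoted above can be cross-checked with simple arithmetic. The sketch below uses only the figures given in the text; the function names are hypothetical. It computes the coding length implied by the transcript (mRNA length minus both UTRs) and the coding length implied by the protein (three nucleotides per residue plus one stop codon).

```python
# Lengths quoted in the text.
MRNA_LENGTH = 1642     # nucleotides
UTR5_LENGTH = 343      # nucleotides
UTR3_LENGTH = 570      # nucleotides
PROTEIN_LENGTH = 269   # amino acids


def cds_length_from_transcript(mrna, utr5, utr3):
    """Nucleotides left for the coding sequence once both UTRs are removed."""
    return mrna - utr5 - utr3


def cds_length_from_protein(n_residues):
    """Coding nucleotides needed for n_residues plus one stop codon."""
    return 3 * (n_residues + 1)


from_transcript = cds_length_from_transcript(MRNA_LENGTH, UTR5_LENGTH, UTR3_LENGTH)
from_protein = cds_length_from_protein(PROTEIN_LENGTH)
difference = from_protein - from_transcript

print(from_transcript, from_protein, difference)
```

With the figures as quoted, the two estimates differ by 81 nucleotides. The source does not explain the gap; one possibility, offered here only as an assumption, is that the transcript and protein figures come from different annotation records.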
C1orf74 is expressed in most human tissues, from embryonic development through adulthood. [12] It is expressed throughout the nervous system, in the mammary and salivary glands, in skin, and in most internal organs.
One suggestion of C1orf74's function in humans comes from data deposited in NCBI by Daigo and Nakamura ahead of a paper that had not yet been published at the time of writing. The authors report that C1orf74 is up-regulated in lung cancer, and they added the alias URLC4 to the protein record (BAQ19750).
C1orf74's locus, 1q32.2, has been associated with schizophrenia. [13] This suggests that C1orf74, or one of its neighbors, may contribute to the risk of schizophrenia. Mutations in IRF6, C1orf74's upstream neighbor, can result in cleft palate and Van der Woude syndrome. [14] Mutations in regions upstream and downstream of IRF6, such as C1orf74, may also result in Van der Woude syndrome, or may act together with a mutation in IRF6 to cause the disease.
The human gene C1orf74 has no known paralogs, but it has many orthologs that contain the same DUF. Orthologs occur in most vertebrates and in some invertebrates, such as worms, leeches and sea snails. [15] Species with orthologs include mouse, hedgehog, chicken, zebrafish, alligator, and leech. C1orf74 also has a distant ortholog in white rust ( Albugo candida ), which is an oomycete and not a true fungus. [16] No orthologs were found in plants, fungi, or bacteria.
HIKESHI is a protein important in lung and multicellular organismal development that, in humans, is encoded by the HIKESHI gene. HIKESHI is found on chromosome 11 in humans and chromosome 7 in mice. Similar sequences (orthologs) are found in most animal and fungal species. The mouse homolog, lethal gene on chromosome 7 Rinchik 6 protein, is encoded by the l7Rn6 gene. When the l7Rn6 protein is disrupted in mice, the mice display severe emphysema at birth as a result of disorganization of the Golgi apparatus and formation of aberrant vesicular structures within club cells.
C9orf64 is a gene located on chromosome 9 that in humans encodes the queuosine salvage protein. The function and biological process of the queuosine salvage protein are not well understood, but some evidence from orthologs indicates that it may be involved in tRNA processing. The most common mRNA contains 4 coding exons, and the gene has 2 additional alternatively spliced exons. C9orf64 has been found in 5 different splice variants.
METTL26, previously designated C16orf13, is a protein-coding gene for Methyltransferase Like 26, also known as JFP2. Though the function of this gene is unknown, various data have revealed that it is expressed at high levels in various cancerous tissues. Underexpression of this gene has also been linked to disease consequences in humans.
CZIB is a gene in the human genome that encodes the protein CXXC motif containing zinc binding protein. CZIB was previously referred to as C1orf123.
Family with Sequence Similarity 203, Member B (FAM203B) is a protein encoded by the FAM203B gene (8q24.3) in humans. While FAM203B is only found in humans and possibly non-human primates, its paralog, FAM203A, is highly conserved. The FAM203B protein contains two conserved domains of unknown function, DUF383 and DUF384, and no transmembrane domains. This protein has no known function yet, although the homolog of FAM203A in Caenorhabditis elegans (Y54H5A.2) is thought to help regulate the actin cytoskeleton.
TMEM260 is a protein that in humans is encoded by the TMEM260 gene. The function of TMEM260 is not yet clearly understood. TMEM260 is also known as UPF0679, c14orf101, and FLJ0392.
Cilia And Flagella Associated Protein 206 (CFAP206) is a gene that in humans encodes a protein containing the domain “DUF3508”. The function of this protein is not currently well understood. Other known aliases are “dJ382I10.1” and “UPF0704 Protein C6orf165”. In humans, the gene coding sequence is 56,501 base pairs long, with an mRNA of 2,215 base pairs and a protein sequence of 622 amino acids. The C6orf165 gene is conserved in chimpanzee, rhesus monkey, dog, cow, mouse, rat, chicken, zebrafish, mosquito, frog, and more. C6orf165 is rarely expressed in humans, with relatively high expression in brain, lungs (trachea) and testis. The molecular weight of UPF0704 is 71,193 Da and the pI is 6.38.
Family with sequence similarity 167, member A is a protein in humans that is encoded by the FAM167A gene located on chromosome 8. FAM167A and its paralogs are protein encoding genes containing the conserved domain DUF3259, a protein of unknown function. FAM167A has many orthologs in which the domain of unknown function is highly conserved.
Serine-rich single pass membrane protein 1 is a protein that in humans is encoded by the SSMEM1 gene.
Chromosome 16 open reading frame 95 (C16orf95) is a gene which in humans encodes the protein C16orf95. It has orthologs in mammals, and is expressed at a low level in many tissues. C16orf95 evolves quickly compared to other proteins.
PRR29 is a protein that in humans is encoded by the PRR29 gene, located on chromosome 17.
Chromosome 8 open reading frame 82 is a protein encoded in humans by the C8orf82 gene.
Fanconi Anemia Opposite Strand Transcript protein is a predicted protein that in humans is encoded by the FANCD2OS gene. The name is derived from mRNA transcribed from the strand complementary to the FANCD2 gene.
Chromosome 6 open reading frame 62 (C6orf62), also known as X-trans-activated protein 12 (XTP12), is a gene that encodes a protein of the same name. The encoded protein is predicted to have a subcellular location within the cytosol.
Chromosome 8 open reading frame 58 is an uncharacterised protein that in humans is encoded by the C8orf58 gene. The protein is predicted to be localized in the nucleus.
C2orf81 is a human gene encoding protein c2orf81, which is predicted to have nuclear localization.
Chromosome 19 open reading frame 44 is a protein that in humans is encoded by the C19orf44 gene. C19orf44 is an uncharacterized protein with an unknown function in humans. It is not limited to humans; the protein also exists in other species. The protein contains one domain of unknown function (DUF) that is highly conserved throughout its orthologs. This protein is most highly expressed in the testis and ovary, but also has significant expression in the thyroid and parathyroid. Another name for this protein is LOC84167.
Chromosome 4 open reading frame 51 (C4orf51) is a protein which in humans is encoded by the C4orf51 gene.
Chromosome 9 open reading frame 50 is a protein that in humans is encoded by the C9orf50 gene. C9orf50 has one other known alias, FLJ35803. In humans the gene coding sequence is 10,051 base pairs long, transcribing an mRNA of 1,624 bases that encodes a 431 amino acid protein.
C22orf23 is a protein which in humans is encoded by the C22orf23 gene. Its predicted secondary structure consists of alpha helices and disordered/coil regions. It is expressed in many tissues, most highly in the testes, and it is conserved across many orthologs.