LRRC74A | |
---|---|
Identifiers | |
Aliases | LRRC74A, C14orf166B, LRRC74, leucine rich repeat containing 74A |
External IDs | MGI: 3646959; HomoloGene: 19331; GeneCards: LRRC74A; OMA: LRRC74A - orthologs |

Leucine-rich repeat-containing protein 74A (LRRC74A) is a protein encoded by the LRRC74A gene. The protein, also known by the alias C14orf166B, is localized in the cytoplasm and has a calculated molecular weight of approximately 55 kDa. [5] It is nominally expressed in the testis, salivary gland, and pancreas. [6]
The LRRC74A gene, also known as C14orf166B, is located on the positive strand of chromosome 14 at locus 14q24.3. The full unspliced gene contains 17 exons. [7] LRRC74A spans positions 76,826,408 to 76,870,304, for a total length of 43.9 kb. [8]
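The quoted gene length follows directly from the two coordinates. A minimal sketch of the arithmetic, assuming the length is taken as the plain difference between the end and start positions (an inclusive count would add one base pair and round to the same value):

```python
# Chromosome 14 coordinates of LRRC74A as quoted in the text
start = 76_826_408
end = 76_870_304

span_bp = end - start        # plain difference between the two coordinates
span_kb = span_bp / 1000     # convert base pairs to kilobases

print(f"{span_bp} bp = {span_kb:.1f} kb")  # 43896 bp = 43.9 kb
```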
LRRC74A has four transcript variants. The most abundant variant is LRRC74A transcript variant 1, which is 1710 nucleotides in length. [7]
Accession number | Transcript length (nt) | Number of exons | Protein length (aa) | Isoform |
---|---|---|---|---|
NM_194287.3 | 1710 | 14 | 488 | 1 |
NM_001322426.2 | 1861 | 5 | 471 | 2 |
NM_001105519.3 | 718 | 4 | 201 | 3 |
NM_001105519.3 | 718 | 4 | 201 | 4 |
The LRRC74A protein is 488 amino acids in length, with a predicted molecular weight of 55 kDa and an isoelectric point of 5.22. [9] It has higher-than-average levels of methionine and asparagine. [10]
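As a rough plausibility check on the predicted molecular weight, a protein's mass can be estimated from its length alone. The sketch below assumes the common rule of thumb of about 110 Da per amino acid residue; that constant and the helper name `approx_mass_kda` are illustrative assumptions, not values or methods from the cited prediction tools, which work from the actual sequence composition.

```python
# Rule-of-thumb average residue mass in daltons (an assumption, not from the article)
AVERAGE_RESIDUE_MASS_DA = 110


def approx_mass_kda(length_aa: int) -> float:
    """Estimate protein mass in kDa from its length in amino acids."""
    return length_aa * AVERAGE_RESIDUE_MASS_DA / 1000


# Isoform 1 is 488 amino acids long
print(f"{approx_mass_kda(488):.1f} kDa")  # 53.7 kDa
```

The estimate lands close to the predicted 55 kDa, which is about as much agreement as a length-only approximation can be expected to give.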
The LRRC74A protein contains eight leucine-rich repeat domains in its sequence. [11] The secondary structure of LRRC74A isoform 1 is made up of alternating alpha helices and beta sheets. [12] Tertiary structure predictions show a horseshoe-shaped protein with high similarity to ribonuclease inhibitor. [13]
LRRC74A has four splice isoforms. The most abundant is LRRC74A protein isoform 1, which is 488 amino acids in length. [7]
Name | Transcript variant | Peptide length | Domains present |
---|---|---|---|
Isoform 1 | 1 | 488 aa | 8 LRR domains |
Isoform 2 | 2 | 471 aa | 6 LRR domains |
Isoform 3 | 3 | 464 aa | 6 LRR domains |
Isoform 4 | 4 | 427 aa | 7 LRR domains |
LRRC74A is expressed at low levels overall compared to other proteins. Among the tissues in which it is expressed, levels are highest in the testes, salivary gland, and pancreas. [7] Within the cell, LRRC74A is localized to the cytosol. [15]
The 5' UTR of LRRC74A transcript variant 1 is 91 bp in length. [16] Analysis of potential folding structures identifies two possible stem-loop structures. [17]
The 3' UTR is 158 bp in length and contains one polyadenylation signal. [16] It contains four predicted stem-loop structures: three closer to the 5' end of the UTR and one closer to the 3' end.
The human LRRC74A gene has one paralog, LRRC74B, which is located at 22q11.21. [18]
LRRC74A has orthologs in species as distant as tunicates. Mammalian orthologs are moderately similar to human LRRC74A, with sequence similarity between about 70% and 86% in the table below. Orthologs in reptiles, birds, and amphibians show similarity from about 68% down to 50%. In fish and invertebrates, sequence identity ranges from about 48% down to 24%. No orthologs were found in fungi, bacteria, or plants.
Class | Genus species | Common name | Taxonomic order | Estimated date of divergence (MYA) | Accession number | Sequence length (aa) | Sequence identity (%) | Sequence similarity (%) |
---|---|---|---|---|---|---|---|---|
Mammalia | Homo sapiens | Human | Primates | 0 | NP_919263.2 | 488 | 100 | 100 |
Mammalia | Mus musculus | Mouse | Rodentia | 87 | NP_001182696.1 | 487 | 65.7 | 77.4 |
Mammalia | Gulo gulo | Wolverine | Carnivora | 94 | KAI5767761.1 | 488 | 74.6 | 86.3 |
Mammalia | Ursus maritimus | Polar bear | Carnivora | 94 | XP_040497188.1 | 548 | 60.6 | 70.6 |
Mammalia | Balaenoptera musculus | Blue whale | Artiodactyla | 94 | XP_036697954.1 | 482 | 68.9 | 80.1 |
Mammalia | Gracilinanus agilis | Agile gracile opossum | Marsupialia | 106 | XP_044518037.1 | 468 | 52.5 | 71.5 |
Aves | Gallus gallus | Chicken | Galliformes | 319 | XP_040528719.1 | 476 | 42.8 | 60.9 |
Aves | Melopsittacus undulatus | Budgerigar | Psittaciformes | 319 | XP_005149032.1 | 494 | 46.4 | 64.6 |
Aves | Aquila chrysaetos | Golden eagle | Accipitriformes | 319 | XP_029863093.1 | 492 | 46 | 62.1 |
Aves | Phaethon lepturus | White-tailed tropicbird | Phaethontiformes | 319 | XP_010285698.1 | 478 | 44.3 | 61.5 |
Reptilia | Pelodiscus sinensis | Chinese softshell turtle | Testudines | 319 | XP_025037771.1 | 486 | 49.5 | 68 |
Reptilia | Pogona vitticeps | Central bearded dragon | Squamata | 319 | XP_020649579.1 | 483 | 48 | 64.5 |
Reptilia | Notechis scutatus | Tiger snake | Squamata | 319 | XP_026520078.1 | 491 | 45.2 | 61.9 |
Amphibia | Geotrypetes seraphini | Gaboon caecilian | Gymnophiona | 353 | XP_033809167.1 | 540 | 35.6 | 50.3 |
Amphibia | Bufo bufo | Common toad | Anura | 353 | XP_040268304.1 | 536 | 34.5 | 51.4 |
Fish | Latimeria chalumnae | West Indian Ocean coelacanth | Latimeriidae | 414 | XP_014341482.1 | 456 | 47.5 | 66.2 |
Fish | Lepisosteus oculatus | Spotted gar | Lepisosteiformes | 431 | XP_015205589.1 | 450 | 42 | 62.5 |
Fish | Salmo salar | Atlantic salmon | Salmoniformes | 431 | XP_045549789.1 | 648 | 32.3 | 45.1 |
Fish | Carcharodon carcharias | Great white shark | Chondrichthyes | 464 | XP_041070161.1 | 727 | 24 | 37.4 |
Fish | Petromyzon marinus | Sea lamprey | Agnatha | 510 | XP_032820627.1 | 510 | 32.1 | 49.6 |
Invertebrata | Ciona intestinalis | Vase tunicate | Enterogona | 603 | XP_002120047.1 | 661 | 24.5 | 40.6 |
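
The spread of sequence identity within each group can be summarized with a few lines of code. This is a minimal sketch, assuming the identity values are copied by hand from the table (the human reference row is omitted); the variable names are illustrative only.

```python
from collections import defaultdict

# (class, percent identity to human LRRC74A), copied from the ortholog table
identities = [
    ("Mammalia", 65.7), ("Mammalia", 74.6), ("Mammalia", 60.6),
    ("Mammalia", 68.9), ("Mammalia", 52.5),
    ("Aves", 42.8), ("Aves", 46.4), ("Aves", 46.0), ("Aves", 44.3),
    ("Reptilia", 49.5), ("Reptilia", 48.0), ("Reptilia", 45.2),
    ("Amphibia", 35.6), ("Amphibia", 34.5),
    ("Fish", 47.5), ("Fish", 42.0), ("Fish", 32.3), ("Fish", 24.0), ("Fish", 32.1),
    ("Invertebrata", 24.5),
]

# Group the identity values by taxonomic class
by_class = defaultdict(list)
for cls, pct in identities:
    by_class[cls].append(pct)

# Lowest and highest identity observed in each class
ranges = {cls: (min(values), max(values)) for cls, values in by_class.items()}

for cls, (lowest, highest) in ranges.items():
    print(f"{cls}: {lowest}-{highest}% identity")
```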
The most distant LRRC74A orthologs appear in tunicates, which diverged from the human lineage approximately 603 million years ago. [19] Orthologs of both LRRC74A and LRRC74B occur in tunicates. LRRC74A evolves at a moderately fast rate: a 1% change in amino acid sequence takes around 10 million years. Based on sequence similarity of orthologs, its rate of evolution is intermediate between those of cytochrome c and fibrinogen alpha.
A genome-wide association study (GWAS) evaluating genetic mutations and clinical outcomes in patients who contracted COVID-19 found that a mutation in the LRRC74A gene was associated with a higher mortality rate: the mutation was 7.4% more prevalent in deceased patients than in surviving patients. [20]