LRRC74A

Identifiers
Aliases: LRRC74A, C14orf166B, LRRC74, leucine rich repeat containing 74A
External IDs: MGI: 3646959; HomoloGene: 19331; GeneCards: LRRC74A; OMA: LRRC74A - orthologs

Orthologs
Species | Human | Mouse
Ensembl | ENSG00000100565 [1] | ENSMUSG00000059114 [2]
RefSeq (mRNA) | NM_194287, NM_001385106, NM_001385107, NM_001385108 | NM_001195767
RefSeq (protein) | NP_919263 | NP_001182696
Location (UCSC) | Chr 14: 76.83 – 76.87 Mb | Chr 12: 86.78 – 86.81 Mb
PubMed search | [3] | [4]

Leucine-rich repeat-containing protein 74A (LRRC74A) is a protein encoded by the LRRC74A gene, located at 14q24.3. The protein, also known by the alias C14orf166B, is localized in the cytoplasm and has a calculated molecular weight of approximately 55 kDa. [5] LRRC74A is expressed at low levels, most notably in the testis, salivary gland, and pancreas. [6]

Gene

The LRRC74A gene, also known as C14orf166B, is located on the plus strand at locus 14q24.3. The full unspliced gene contains 17 exons. [7] LRRC74A spans positions 76,826,408 to 76,870,304 on chromosome 14, for a total length of 43.9 kb. [8]
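
As a quick arithmetic check on the coordinates above, the span length can be computed directly. This is a minimal sketch: the helper name `span_length_bp` is hypothetical, and treating the coordinates as 1-based and inclusive is an assumption rather than something the cited source states.

```python
def span_length_bp(start: int, end: int, inclusive: bool = True) -> int:
    """Return the length in base pairs of a genomic interval.

    Assumes start <= end on the same chromosome. With inclusive=True the
    end coordinate itself is counted (1-based, closed interval).
    """
    if end < start:
        raise ValueError("end coordinate must not be smaller than start")
    return end - start + (1 if inclusive else 0)


# Coordinates of LRRC74A on chromosome 14 as given in the text.
GENE_START = 76_826_408
GENE_END = 76_870_304

length_bp = span_length_bp(GENE_START, GENE_END)
print(f"{length_bp} bp, about {length_bp / 1000:.1f} kb")
```

Whether or not the end coordinate is counted, the result rounds to the same 43.9 kb quoted above.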


Transcripts

LRRC74A has four transcript variants. The most abundant variant is LRRC74A transcript variant 1, which is 1710 nucleotides in length. [7]

LRRC74A transcript variants
Accession number | Transcript length (nt) | Number of exons | Protein length (aa) | Isoform
NM_194287.3 | 1710 | 14 | 488 | 1
NM_001322426.2 | 1861 | 5 | 471 | 2
NM_001105519.3 | 718 | 4 | 201 | 3
NM_001105519.3 | 718 | 4 | 201 | 4

Protein

The LRRC74A protein is 488 amino acids in length, with a predicted molecular weight of 55 kDa and an isoelectric point of 5.22. [9] Compared with typical proteins, it is relatively rich in methionine and asparagine. [10]
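
A rough sanity check on the predicted molecular weight can be made from the sequence length alone. The sketch below uses the common rule of thumb of about 110 Da per amino acid residue. That average is an assumption introduced here for illustration, not the method of the cited Compute pI/Mw tool, which works from the actual sequence. The function name `estimate_mass_kda` is hypothetical.

```python
# Rule-of-thumb average mass of one amino acid residue in a protein, in daltons.
# This is an approximation; the true value depends on the sequence composition.
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(length_aa: int) -> float:
    """Estimate protein mass in kDa from its length in amino acids."""
    if length_aa <= 0:
        raise ValueError("protein length must be positive")
    return length_aa * AVERAGE_RESIDUE_MASS_DA / 1000.0


# LRRC74A isoform 1 is 488 amino acids long according to the text.
print(f"Estimated mass: {estimate_mass_kda(488):.1f} kDa")
```

The estimate of roughly 54 kDa is in line with the 55 kDa prediction quoted above. The small difference is expected, since the rule of thumb ignores the actual amino acid composition.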

Domains and structure

The LRRC74A protein contains eight leucine-rich repeat (LRR) domains. [11] The secondary structure of LRRC74A isoform 1 is predicted to consist of alternating alpha helices and beta strands. [12] Tertiary structure predictions show a horseshoe-shaped protein with high similarity to ribonuclease inhibitor. [13]

LRRC74A tertiary structure prediction by I-TASSER. [14]

Isoforms

LRRC74A has four splice isoforms. The most abundant is LRRC74A protein isoform 1, which is 488 amino acids in length. [7]

LRRC74A protein isoforms
Name | Transcript variant | Peptide length | Domains present
Isoform 1 | 1 | 488 aa | 8 LRR domains
Isoform 2 | 2 | 471 aa | 6 LRR domains
Isoform 3 | 3 | 464 aa | 6 LRR domains
Isoform 4 | 4 | 427 aa | 7 LRR domains

Regulation

Expression pattern

LRRC74A is expressed at low levels overall compared with other proteins. Among the tissues in which it is expressed, it is most prominent in the testes, salivary gland, and pancreas. [7] Within the cell, LRRC74A is predicted to localize to the cytosol. [15]

Transcript level regulation

Predicted 5' UTR folding structure of LRRC74A

The 5' UTR of LRRC74A transcript variant 1 is 91 nucleotides in length. [16] Analysis of its potential folding structures identifies two possible stem-loop structures. [17]

Predicted folding structure of the 3' UTR of LRRC74A

The 3' UTR is 158 nucleotides in length and contains one polyadenylation signal. [16] It contains four predicted stem-loop structures: three closer to the 5' end of the UTR and one closer to the 3' end.

Homology and evolution

Paralogs

The human LRRC74A gene has one paralog, LRRC74B, which is located at 22q11.21. [18]

Orthologs

LRRC74A has orthologs in species as distant as tunicates. Among the orthologs listed below, mammalian sequences are the most similar to human LRRC74A, with roughly 53% to 75% identity and 71% to 86% similarity. Orthologs in birds, reptiles and amphibians show roughly 35% to 50% identity, and those in fish and invertebrates roughly 24% to 48% identity. No orthologs were found in fungi, bacteria or plants.

Unrooted phylogenetic tree depicting the evolution of LRRC74A
Taxonomic group | Genus species | Common name | Taxonomic order | Estimated date of divergence (MYA) | Accession number | Sequence length (aa) | Sequence identity (%) | Sequence similarity (%)
Mammalia | Homo sapiens | Human | Primates | 0 | NP_919263.2 | 488 | 100 | 100
Mammalia | Mus musculus | Mouse | Rodentia | 87 | NP_001182696.1 | 487 | 65.7 | 77.4
Mammalia | Gulo gulo | Wolverine | Carnivora | 94 | KAI5767761.1 | 488 | 74.6 | 86.3
Mammalia | Ursus maritimus | Polar bear | Carnivora | 94 | XP_040497188.1 | 548 | 60.6 | 70.6
Mammalia | Balaenoptera musculus | Blue whale | Artiodactyla | 94 | XP_036697954.1 | 482 | 68.9 | 80.1
Mammalia | Gracilinanus agilis | Agile gracile opossum | Marsupialia | 106 | XP_044518037.1 | 468 | 52.5 | 71.5
Aves | Gallus gallus | Chicken | Galliformes | 319 | XP_040528719.1 | 476 | 42.8 | 60.9
Aves | Melopsittacus undulatus | Budgerigar | Psittaciformes | 319 | XP_005149032.1 | 494 | 46.4 | 64.6
Aves | Aquila chrysaetos | Golden eagle | Accipitriformes | 319 | XP_029863093.1 | 492 | 46 | 62.1
Aves | Phaethon lepturus | White-tailed tropicbird | Phaethontiformes | 319 | XP_010285698.1 | 478 | 44.3 | 61.5
Reptilia | Pelodiscus sinensis | Chinese softshell turtle | Testudines | 319 | XP_025037771.1 | 486 | 49.5 | 68
Reptilia | Pogona vitticeps | Central bearded dragon | Squamata | 319 | XP_020649579.1 | 483 | 48 | 64.5
Reptilia | Notechis scutatus | Tiger snake | Squamata | 319 | XP_026520078.1 | 491 | 45.2 | 61.9
Amphibia | Geotrypetes seraphini | Gaboon caecilian | Gymnophiona | 353 | XP_033809167.1 | 540 | 35.6 | 50.3
Amphibia | Bufo bufo | Common toad | Anura | 353 | XP_040268304.1 | 536 | 34.5 | 51.4
Fish | Latimeria chalumnae | West Indian Ocean coelacanth | Latimeriidae | 414 | XP_014341482.1 | 456 | 47.5 | 66.2
Fish | Lepisosteus oculatus | Spotted gar | Lepisosteiformes | 431 | XP_015205589.1 | 450 | 42 | 62.5
Fish | Salmo salar | Atlantic salmon | Salmoniformes | 431 | XP_045549789.1 | 648 | 32.3 | 45.1
Fish | Carcharodon carcharias | Great white shark | Chondrichthyes | 464 | XP_041070161.1 | 727 | 24 | 37.4
Fish | Petromyzon marinus | Sea lamprey | Agnatha | 510 | XP_032820627.1 | 510 | 32.1 | 49.6
Invertebrata | Ciona intestinalis | Vase tunicate | Enterogona | 603 | XP_002120047.1 | 661 | 24.5 | 40.6
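
The identity and similarity columns of the ortholog table can be summarized per taxonomic group with a few lines of code. The following is a minimal sketch: the values are copied from the table (the human reference row is left out), while the names `ORTHOLOGS` and `ranges_by_group` are hypothetical and introduced only for illustration.

```python
from collections import defaultdict

# (taxonomic group, species, identity %, similarity %) taken from the table.
# The human reference sequence (100% / 100%) is deliberately excluded.
ORTHOLOGS = [
    ("Mammalia", "Mus musculus", 65.7, 77.4),
    ("Mammalia", "Gulo gulo", 74.6, 86.3),
    ("Mammalia", "Ursus maritimus", 60.6, 70.6),
    ("Mammalia", "Balaenoptera musculus", 68.9, 80.1),
    ("Mammalia", "Gracilinanus agilis", 52.5, 71.5),
    ("Aves", "Gallus gallus", 42.8, 60.9),
    ("Aves", "Melopsittacus undulatus", 46.4, 64.6),
    ("Aves", "Aquila chrysaetos", 46.0, 62.1),
    ("Aves", "Phaethon lepturus", 44.3, 61.5),
    ("Reptilia", "Pelodiscus sinensis", 49.5, 68.0),
    ("Reptilia", "Pogona vitticeps", 48.0, 64.5),
    ("Reptilia", "Notechis scutatus", 45.2, 61.9),
    ("Amphibia", "Geotrypetes seraphini", 35.6, 50.3),
    ("Amphibia", "Bufo bufo", 34.5, 51.4),
    ("Fish", "Latimeria chalumnae", 47.5, 66.2),
    ("Fish", "Lepisosteus oculatus", 42.0, 62.5),
    ("Fish", "Salmo salar", 32.3, 45.1),
    ("Fish", "Carcharodon carcharias", 24.0, 37.4),
    ("Fish", "Petromyzon marinus", 32.1, 49.6),
    ("Invertebrata", "Ciona intestinalis", 24.5, 40.6),
]


def ranges_by_group(rows):
    """Return {group: {"identity": (min, max), "similarity": (min, max)}}."""
    grouped = defaultdict(list)
    for group, _species, identity, similarity in rows:
        grouped[group].append((identity, similarity))

    summary = {}
    for group, values in grouped.items():
        identities = [identity for identity, _ in values]
        similarities = [similarity for _, similarity in values]
        summary[group] = {
            "identity": (min(identities), max(identities)),
            "similarity": (min(similarities), max(similarities)),
        }
    return summary


for group, stats in ranges_by_group(ORTHOLOGS).items():
    lo_id, hi_id = stats["identity"]
    lo_sim, hi_sim = stats["similarity"]
    print(f"{group}: identity {lo_id}-{hi_id}%, similarity {lo_sim}-{hi_sim}%")
```

Grouping by the table's own labels keeps the summary tied to the listed species only. It says nothing about orthologs that are not in the table.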

Evolution

The most distant species in which LRRC74A orthologs are found are tunicates, which diverged from the human lineage approximately 603 million years ago. [19] Orthologs of both LRRC74A and LRRC74B occur in tunicates. LRRC74A evolves at a moderately fast rate, with a 1% change in amino acid sequence taking around 10 million years. Based on the sequence similarity of its orthologs, its rate of evolution lies between those of cytochrome c and fibrinogen alpha.


Clinical significance

Disease association

COVID-19

A genome-wide association study (GWAS) of genetic mutations and clinical outcomes in patients who contracted COVID-19 found that a mutation in the LRRC74A gene was associated with higher mortality in infected patients, being 7.4% more prevalent in deceased patients than in surviving patients. [20]

References

  1. GRCh38: Ensembl release 89: ENSG00000100565. Ensembl, May 2017.
  2. GRCm38: Ensembl release 89: ENSMUSG00000059114. Ensembl, May 2017.
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. "Protein LRRC74A isoform 1 Homo sapiens". National Center for Biotechnology Information. U.S. National Library of Medicine. Retrieved 7 December 2022.
  6. "LRRC74A". The Human Protein Atlas. Knut and Alice Wallenberg Foundation. Retrieved 7 December 2022.
  7. "LRRC74A Leucine-rich Repeat-containing Protein 74A [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov.
  8. "Genome Data Viewer - NCBI". www.ncbi.nlm.nih.gov. Retrieved 7 December 2022.
  9. "ExPASy - Compute pI/Mw tool". Expasy.
  10. "SAPS < Sequence Statistics < EMBL-EBI". www.ebi.ac.uk.
  11. "protein LRRC74A isoform 1 [Homo sapiens] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 13 December 2022.
  12. "A Protein Secondary Structure Prediction Server". JPred4.
  13. "AlphaFold Protein Structure Database". alphafold.ebi.ac.uk. Retrieved 13 December 2022.
  14. "I-TASSER". ZhangGroup.
  15. "PSORT II Prediction". psort.hgc.jp. Retrieved 16 December 2022.
  16. "Homo sapiens leucine rich repeat containing 74A (LRRC74A), transcript variant 1, mRNA". Retrieved 13 December 2022.
  17. "RNA Folding Form". UNAFold.
  18. "LRRC74B leucine rich repeat containing 74B [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 13 December 2022.
  19. Kumar S, Stecher G, Suleski M. "TimeTree: The Timescale of Life". www.timetree.org. Retrieved 15 December 2022.
  20. Pandit R, Singh I, Ansari A, Raval J, Patel Z, Dixit R, et al. (June 2022). "First report on genome wide association study in western Indian population reveals host genetic factors for COVID-19 severity and outcome". Genomics. 114 (4). doi:10.1016/j.ygeno.2022.110399. PMC 9169419. PMID 35680011.