LRRC40

Identifiers
Aliases: LRRC40, dJ677H15.1, leucine rich repeat containing 40
External IDs: MGI: 1914394; HomoloGene: 9825; GeneCards: LRRC40
Orthologs
Species | Human | Mouse
Entrez | |
Ensembl | |
UniProt | |
RefSeq (mRNA) | NM_017768 | NM_001289524, NM_001289525, NM_024194, NM_001359763
RefSeq (protein) | NP_060238 | NP_001276453, NP_001276454, NP_077156, NP_001346692
Location (UCSC) | Chr 1: 70.14 – 70.21 Mb | Chr 3: 157.74 – 157.77 Mb
PubMed search | [3] | [4]

Leucine rich repeat containing 40 (LRRC40) is a protein that in humans is encoded by the LRRC40 gene. [5]

Species distribution

LRRC40 is conserved across its known orthologs. The entire protein is highly conserved in mammals, while in more distant orthologs conservation is high mainly within the leucine rich repeats. [6] Orthologs were found as far back as the scarlet sea anemone, and homologs were found in bacteria and Archaea using BLAST. [7] The following table gives information on the homologs of LRRC40.

Genus species | Organism common name | Divergence from humans (MYA) [8] | NCBI mRNA accession | Sequence similarity [7] | Protein length (aa) | Common gene name
Homo sapiens [9] | Humans | -- | NM_017768 | 100% | 602 | LRRC40
Pan troglodytes [10] | Common chimp | 6.4 | XM_513483 | 99% | 602 | Hypothetical protein
Pongo abelii [11] | Orangutan | 15.8 | NM_001131180 | 99% | 602 | LRRC40
Macaca fascicularis [12] | Long-tailed macaque | 30.2 | AB179219 | 99% | 602 | Full LRRC40
Callithrix jacchus [13] | Common marmoset | 43.9 | XM_002750952.1 | 99% | 602 | Predicted: LRRC40
Sus scrofa [14] | Wild boar | 92.5 | XM_003127928 | 96% | 602 | Predicted: LRRC40 like protein
Mus musculus [15] | Mouse | 94.1 | NM_024194 | 92% | 602 | LRRC40
Monodelphis domestica [16] | Opossum | 160.2 | XM_001379417 | 86% | 598 | Hypothetical protein
Gallus gallus [17] | Chicken | 274.8 | NM_001031295 | 85% | 603 | LRRC40
Taeniopygia guttata [18] | Zebra finch | 274.8 | XM_002188367 | 85% | 605 | Predicted: LRRC40
Xenopus (Silurana) tropicalis [19] | Western clawed frog | 389.7 | NM_001011310 | 80% | 605 | LRRC40
Danio rerio [20] | Zebrafish | 444.3 | NM_199862 | 83% | 601 | LRRC40
Salmo salar [21] | Salmon | 444.3 | BT043621 | 82% | 600 | LRRC40
Nematostella vectensis [22] | Scarlet sea anemone | 830.3 | XM_001640230 | 66% | 602 | Predicted protein
Culex quinquefasciatus [23] | Southern house mosquito | 838.3 | XM_001842697.1 | 58% | 612 | LRRC40
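
The table suggests that sequence similarity falls as divergence time from humans grows. The Python sketch below is one way to check that trend; it is an illustration rather than part of the original analysis. It copies the divergence and similarity columns from the table and computes a Pearson correlation coefficient. The names `homologs` and `pearson` are illustrative choices, and the human row is left out because it has no divergence value.

```python
from math import sqrt

# (divergence from humans in MYA, sequence similarity in %), copied from the
# homolog table. The human reference row is omitted (no divergence value).
homologs = {
    "Pan troglodytes": (6.4, 99),
    "Pongo abelii": (15.8, 99),
    "Macaca fascicularis": (30.2, 99),
    "Callithrix jacchus": (43.9, 99),
    "Sus scrofa": (92.5, 96),
    "Mus musculus": (94.1, 92),
    "Monodelphis domestica": (160.2, 86),
    "Gallus gallus": (274.8, 85),
    "Taeniopygia guttata": (274.8, 85),
    "Xenopus tropicalis": (389.7, 80),
    "Danio rerio": (444.3, 83),
    "Salmo salar": (444.3, 82),
    "Nematostella vectensis": (830.3, 66),
    "Culex quinquefasciatus": (838.3, 58),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


divergence = [d for d, _ in homologs.values()]
similarity = [s for _, s in homologs.values()]
r = pearson(divergence, similarity)
print(f"Pearson r = {r:.2f}")
```

A strongly negative coefficient is consistent with the pattern in the table. With only fourteen species and approximate divergence estimates, it should be read as a rough summary rather than a statistical result.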

Gene

LRRC40 is located on the negative DNA strand (see Sense (molecular biology)) of chromosome 1, spanning positions 70,611,483 to 70,671,223. [24] The gene produces a 2,958 base pair mRNA. The human gene has 15 predicted exons, [9] and GeneCards lists four other splice patterns predicted by the Alternative Splice Database. [25]

Gene neighborhood

LRRC40 is flanked downstream by LRRC7 (70,225,888-70,587,570) and upstream by SRSF11 (70,687,320-70,716,488), both of which lie on the positive DNA strand.
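
The coordinates quoted for LRRC40 and its neighbors can be turned into lengths and distances with simple arithmetic. The Python sketch below assumes the positions are 1-based and inclusive, which the text does not state. The helper names `span_length` and `gap_between` are illustrative.

```python
# Chromosome 1 coordinates as quoted in the text.
# Assumption: positions are 1-based and inclusive.
LRRC40 = (70_611_483, 70_671_223)
LRRC7 = (70_225_888, 70_587_570)
SRSF11 = (70_687_320, 70_716_488)


def span_length(interval):
    """Number of bases covered by an inclusive (start, end) interval."""
    start, end = interval
    return end - start + 1


def gap_between(left, right):
    """Number of bases strictly between two non-overlapping intervals."""
    return right[0] - left[1] - 1


print(span_length(LRRC40))          # 59741 bases spanned by the gene
print(gap_between(LRRC7, LRRC40))   # 23912 bases between LRRC7 and LRRC40
print(gap_between(LRRC40, SRSF11))  # 16096 bases between LRRC40 and SRSF11
```

Under that assumption the gene spans roughly 59.7 kb of genomic DNA, about twenty times the length of the 2,958 base pair mRNA.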

Gene expression

LRRC40 is expressed between the 50th and 100th percentile in almost every tissue in the body. [26]

Expression of LRRC40 in 79 human tissues.

Protein

The exact function of the LRRC40 protein is not yet understood. Because it belongs to the leucine rich repeat family, whose members are known to participate in protein-protein interactions, it is believed to do the same. [27]

Properties

LRRC40 is a 602 amino acid protein with a molecular weight of 68.254 kDa and an isoelectric point of 6.04. [28] It is predicted to localize to the nucleus [29] and has no transmembrane domains that would anchor it to the nuclear membrane. LRRC40 has many predicted phosphorylation sites. Of the 19 predicted phosphoserine sites, only two, S38 and S391, are conserved among the orthologs. [30]
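
A molecular weight like the one above is typically computed from the amino acid sequence alone. As a rough sketch of that kind of calculation (not the actual code of the tool cited), the Python below sums commonly tabulated average residue masses and adds one water molecule for the free N- and C-termini. The LRRC40 sequence is not reproduced in this article, so the example peptide is purely hypothetical. The names `AVERAGE_RESIDUE_MASS` and `molecular_weight` are illustrative.

```python
# Average residue masses in daltons (amino acid minus one water),
# using commonly tabulated values.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water added back for the free termini


def molecular_weight(sequence):
    """Approximate average molecular weight (Da) of an unmodified peptide."""
    seq = sequence.upper()
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in seq) + WATER


# Hypothetical peptide for illustration only; it is NOT taken from LRRC40.
peptide = "MLSLLK"
print(round(molecular_weight(peptide), 2))

# Sanity check on the reported figures: average mass per residue.
print(round(68_254 / 602, 1))  # 113.4
```

The reported 68.254 kDa over 602 residues works out to about 113.4 Da per residue, in the range expected for an unmodified protein of typical composition.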

Protein structure

The secondary structure of the protein follows a repeating pattern within the leucine rich repeat regions: each repeat contains a β-strand and an α-helix. The image below shows the characteristic horseshoe-like structure of a protein with many leucine rich repeats. Depending on where the LRRs are located, other proteins can bind within the curve of the horseshoe or attach to the outside of the protein.

Structure of the Inla S192n G194S protein without its binding partner, sHEC1. The binding site was left empty to highlight the leucine rich repeats (in yellow), demonstrating the protein-binding properties of LRRs.

Protein interactions

According to GeneCards, LRRC40 has 756 possible protein interactions. [25] These predictions draw on results in the Molecular Interaction (MINT) database, which reports two possible interacting proteins. The two proteins are described in the table below.

Abbreviation | Protein name | NCBI protein accession | Cellular location | Function
CDC5L | Cell division cycle 5-like protein | NP_001244 | nucleus | transcription regulation and mRNA processing [32]
SNW1 | Ski-interacting protein | NP_036377.1 | nucleus | mRNA processing [33]

References

  1. GRCh38: Ensembl release 89: ENSG00000066557 - Ensembl, May 2017
  2. GRCm38: Ensembl release 89: ENSMUSG00000063052 - Ensembl, May 2017
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. "Entrez Gene: leucine rich repeat containing 40".
  6. Chenna R, Sugawara H, Koike T, Lopez R, Gibson TJ, Higgins DG, Thompson JD (July 2003). "Multiple sequence alignment with the Clustal series of programs". Nucleic Acids Res. 31 (13): 3497–500. doi:10.1093/nar/gkg500. PMC 168907. PMID 12824352.
  7. "NCBI BLAST".
  8. "Time Tree".
  9. "NCBI Nucleotide: NM_017768.4". 24 June 2018.
  10. "NCBI Nucleotide: XP_513483". 20 March 2018.
  11. "NCBI Nucleotide: NM_001131180". 19 February 2022.
  12. "NCBI Nucleotide: AB179219". 6 October 2006.
  13. "NCBI Nucleotide: XM_002750952.1". 18 May 2010.
  14. "NCBI Nucleotide: XM_003127928". 13 May 2017.
  15. "NCBI Nucleotide: NM_024194". 13 August 2022.
  16. "NCBI Nucleotide: XM_001379417". 27 April 2016.
  17. "NCBI Nucleotide: NM_001031295". 9 March 2022.
  18. "NCBI Nucleotide: XM_002188367". 12 February 2013.
  19. "NCBI Nucleotide: NM_001011310". 19 June 2021.
  20. "NCBI Nucleotide: NM_199862". 20 November 2021.
  21. "NCBI Nucleotide: BT043621". 24 November 2009.
  22. "NCBI Nucleotide: XM_001640230". 31 January 2009.
  23. "NCBI Nucleotide: XM_001842697.1". December 2009.
  24. "NCBI Gene: 55631".
  25. "GeneCards: LRRC40".
  26. "GEO Profiles: LRRC40 GDS596".
  27. Kobe B, Kajava AV (December 2001). "The leucine-rich repeat as a protein recognition motif". Curr. Opin. Struct. Biol. 11 (6): 725–32. doi:10.1016/S0959-440X(01)00266-4. PMID 11751054.
  28. "ExPASy: Compute PI/Mw". Archived from the original on 2003-07-23.
  29. "PSORTII: Protein Localization Tool". [permanent dead link]
  30. "NetPhos 2.0 Server: Phosphorylation Prediction".
  31. "NCBI MMDB: Inla S192n G194S".
  32. "MINT: CDC5L". Archived from the original on 2013-02-18.
  33. "MINT: SNW1". Archived from the original on 2013-02-18.