LRRC40

Identifiers
Aliases	LRRC40, dJ677H15.1, leucine rich repeat containing 40
External IDs	MGI: 1914394; HomoloGene: 9825; GeneCards: LRRC40
Gene location (Human)
Chr.	Chromosome 1 (human)
End	70,205,579 bp
Gene location (Mouse)
Chr.	Chromosome 3 (mouse)
End	157,774,124 bp
RNA expression pattern
	Top expressed in (human)
	jejunal mucosa; oocyte; secondary oocyte; endothelial cell; Brodmann area 23; embryo; cavity of mouth; synovial joint; cerebellar vermis; superficial temporal artery
	Top expressed in (mouse)
	ureter; medullary collecting duct; secondary oocyte; cumulus cell; renal corpuscle; primitive streak; endocardial cushion; maxillary prominence; abdominal wall; superior cervical ganglion
Orthologs
Species	Human	Mouse
Entrez	55631	67144
Ensembl	ENSG00000066557	ENSMUSG00000063052
UniProt	Q9H9A6	Q9CRC8
RefSeq (mRNA)	NM_017768	NM_001289524; NM_001289525; NM_024194; NM_001359763
RefSeq (protein)	NP_060238	NP_001276453; NP_001276454; NP_077156; NP_001346692

Last updated January 21, 2024

Leucine rich repeat containing 40 (LRRC40) is a protein that in humans is encoded by the LRRC40 gene.^[5]

Species distribution

LRRC40 is conserved across all of its known orthologs. The entire protein is highly conserved in mammals, while in more distant orthologs conservation is highest within the leucine rich repeats.^[6] Orthologs were found in species as distant as the starlet sea anemone, and homologs were found in bacteria and Archaea using BLAST.^[7] The following table gives information on the homologs of LRRC40.

Genus species	Organism common name	Divergence from humans (MYA) ^[8]	NCBI mRNA accession	Sequence similarity ^[7]	Protein length	Common gene name
Homo sapiens ^[9]	Humans	--	NM_017768	100%	602	LRRC40
Pan troglodytes ^[10]	Common chimp	6.4	XM_513483	99%	602	Hypothetical protein
Pongo abelii ^[11]	Orangutan	15.8	NM_001131180	99%	602	LRRC40
Macaca fascicularis ^[12]	Long-tailed macaque	30.2	AB179219	99%	602	Full LRRC40
Callithrix jacchus ^[13]	Common marmoset	43.9	XM_002750952.1	99%	602	Predicted: LRRC40
Sus scrofa ^[14]	Wild boar	92.5	XM_003127928	96%	602	Predicted: LRRC40 like protein
Mus musculus ^[15]	Mouse	94.1	NM_024194	92%	602	LRRC40
Monodelphis domestica ^[16]	Opossum	160.2	XM_001379417	86%	598	Hypothetical protein
Gallus gallus ^[17]	Chicken	274.8	NM_001031295	85%	603	LRRC40
Taeniopygia guttata ^[18]	Zebra finch	274.8	XM_002188367	85%	605	Predicted: LRRC40
Xenopus (Silurana) tropicalis ^[19]	Western clawed frog	389.7	NM_001011310	80%	605	LRRC40
Danio rerio ^[20]	Zebrafish	444.3	NM_199862	83%	601	LRRC40
Salmo salar ^[21]	Salmon	444.3	BT043621	82%	600	LRRC40
Nematostella vectensis ^[22]	Starlet sea anemone	830.3	XM_001640230	66%	602	Predicted protein
Culex quinquefasciatus ^[23]	Southern house mosquito	838.3	XM_001842697.1	58%	612	LRRC40
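
The trend in the table, with similarity falling as divergence time grows, can be summarized with a correlation coefficient. The following is a minimal sketch that copies the divergence and similarity columns from the table; the `pearson` helper and the variable names are illustrative assumptions, and the result only describes these 14 rows, not a formal evolutionary-rate analysis.

```python
from math import sqrt

# (organism, divergence from humans in MYA, sequence similarity in %),
# copied from the table above. The human row is omitted because it has
# no divergence time.
ORTHOLOGS = [
    ("Pan troglodytes", 6.4, 99),
    ("Pongo abelii", 15.8, 99),
    ("Macaca fascicularis", 30.2, 99),
    ("Callithrix jacchus", 43.9, 99),
    ("Sus scrofa", 92.5, 96),
    ("Mus musculus", 94.1, 92),
    ("Monodelphis domestica", 160.2, 86),
    ("Gallus gallus", 274.8, 85),
    ("Taeniopygia guttata", 274.8, 85),
    ("Xenopus tropicalis", 389.7, 80),
    ("Danio rerio", 444.3, 83),
    ("Salmo salar", 444.3, 82),
    ("Nematostella vectensis", 830.3, 66),
    ("Culex quinquefasciatus", 838.3, 58),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


divergence = [row[1] for row in ORTHOLOGS]
similarity = [row[2] for row in ORTHOLOGS]
r = pearson(divergence, similarity)
print(f"Pearson r between divergence time and similarity: {r:.2f}")
```

A strongly negative value is consistent with the statement that the protein is most conserved among mammals and less so in distant orthologs.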

Gene

LRRC40 is located on the negative DNA strand (see Sense (molecular biology)) of chromosome 1, spanning positions 70,611,483 to 70,671,223.^[24] The gene produces a 2,958 base pair mRNA. The human gene has 15 predicted exons,^[9] and GeneCards lists four other splice patterns predicted by the Alternative Splice Database.^[25]
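
As a worked example of the figures above, the genomic span of the gene can be compared with the length of its mRNA. This sketch assumes the quoted coordinates are 1-based and inclusive, which is an assumption about the source annotation; the constant and function names are illustrative.

```python
# Coordinates and mRNA length as quoted in the text.
# Assumption: positions are 1-based and inclusive.
LRRC40_START = 70_611_483
LRRC40_END = 70_671_223
MRNA_LENGTH = 2958


def span_length(start: int, end: int) -> int:
    """Number of bases covered by an inclusive interval."""
    return end - start + 1


gene_span = span_length(LRRC40_START, LRRC40_END)
mrna_fraction = MRNA_LENGTH / gene_span

print(f"Genomic span: {gene_span} bp")
print(f"mRNA length as a fraction of the span: {mrna_fraction:.1%}")
```

Under these assumptions the gene covers 59,741 bp, so the mRNA accounts for only a small share of the locus, as expected for a gene with many exons separated by introns.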

Gene neighborhood

LRRC40 is neighbored downstream by LRRC7 (70,225,888-70,587,570) and upstream by SRSF11 (70,687,320-70,716,488), both on the positive DNA strand.
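
Because LRRC40 lies on the negative strand, its downstream neighbor sits at lower chromosome coordinates and its upstream neighbor at higher ones. The sketch below computes the intergenic gaps from the coordinates quoted in this article, assuming 1-based inclusive positions on the same assembly; the `Gene` type and `gap` helper are hypothetical names used only for illustration.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int   # lower chromosome coordinate
    end: int     # higher chromosome coordinate
    strand: str  # "+" or "-"


# Coordinates as quoted in the text (assumed 1-based, inclusive).
LRRC7 = Gene("LRRC7", 70_225_888, 70_587_570, "+")
LRRC40 = Gene("LRRC40", 70_611_483, 70_671_223, "-")
SRSF11 = Gene("SRSF11", 70_687_320, 70_716_488, "+")


def gap(left: Gene, right: Gene) -> int:
    """Bases strictly between two non-overlapping genes (left lies first)."""
    return right.start - left.end - 1


gap_downstream = gap(LRRC7, LRRC40)   # between LRRC7 and LRRC40
gap_upstream = gap(LRRC40, SRSF11)    # between LRRC40 and SRSF11

print(f"LRRC7 to LRRC40: {gap_downstream} bp")
print(f"LRRC40 to SRSF11: {gap_upstream} bp")
```

With these coordinates the gaps come to 23,912 bp and 16,096 bp respectively.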

Gene expression

LRRC40 is expressed between the 50th and 100th percentile in almost every tissue in the body.^[26]

Expression of LRRC40 in 79 human tissues.

Protein

While the exact function of the LRRC40 protein is not yet understood, it is believed to participate in protein-protein interactions because it belongs to the leucine rich repeat family of proteins, whose members are known to mediate such interactions.^[27]

Properties

LRRC40 is a 602 amino acid protein with a molecular weight of 68.254 kDa and an isoelectric point of 6.04.^[28] It is predicted to localize to the nucleus ^[29] and has no transmembrane domains that would anchor it to the nuclear membrane. LRRC40 has many predicted phosphorylation sites; of the 19 predicted phosphoserine sites, only two, S38 and S391, are conserved among the orthologs.^[30]
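
A molecular weight like the one quoted above is typically estimated by summing average residue masses over the sequence and adding one water for the free termini. The sketch below illustrates that calculation with standard average residue masses; it is not the ExPASy implementation, the function name is an assumption, and the input is a toy dipeptide because the LRRC40 sequence is not reproduced here.

```python
# Standard average residue masses in daltons (amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594,
    "N": 114.1038, "D": 115.0886, "Q": 128.1307, "K": 128.1741,
    "E": 129.1155, "M": 131.1926, "H": 137.1411, "F": 147.1766,
    "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.015  # one water per chain for the free N- and C-termini


def molecular_weight(sequence: str) -> float:
    """Average molecular weight (Da) of an unmodified peptide chain."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER_MASS


# Toy input (glycylglycine), used only to show the calculation.
toy_mw = molecular_weight("GG")

# Figures quoted in the text for LRRC40.
REPORTED_MW_DA = 68_254
REPORTED_LENGTH = 602
mean_residue_mass = (REPORTED_MW_DA - WATER_MASS) / REPORTED_LENGTH

print(f"Toy dipeptide: {toy_mw:.2f} Da")
print(f"Implied mean residue mass of LRRC40: {mean_residue_mass:.1f} Da")
```

The implied mean residue mass is close to the mass of a leucine residue, which is unsurprising for a leucine rich protein, though the composition cannot be inferred from this number alone.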

Protein structure

The secondary structure of the protein follows a repeating pattern within the leucine rich repeat (LRR) regions: each repeat contains a β-strand and an α-helix. Proteins with many leucine rich repeats adopt a characteristic horseshoe-like structure. Depending on where the LRRs are located, other proteins can bind within the curve of the horseshoe or attach to the outside of the protein.

Protein interactions

According to GeneCards, LRRC40 has 756 possible protein interactions.^[25] The Molecular INTeraction database (MINT) lists two possible protein interactions; the two proteins are described in the table below.

Abbreviation	Protein name	NCBI protein accession	Cellular location	Function
CDC5L	Cell division cycle 5-like protein	NP_001244	nucleus	transcription regulation and mRNA processing ^[32]
SNW1	Ski-interacting protein	NP_036377.1	nucleus	mRNA processing ^[33]

References

[1] GRCh38: Ensembl release 89: ENSG00000066557 - Ensembl, May 2017
[2] GRCm38: Ensembl release 89: ENSMUSG00000063052 - Ensembl, May 2017
[3] "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
[4] "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
[5] "Entrez Gene: leucine rich repeat containing 40".
[6] Chenna R, Sugawara H, Koike T, Lopez R, Gibson TJ, Higgins DG, Thompson JD (July 2003). "Multiple sequence alignment with the Clustal series of programs". Nucleic Acids Res. 31 (13): 3497–500. doi:10.1093/nar/gkg500. PMC 168907. PMID 12824352.
[7] "NCBI BLAST".
[8] "Time Tree".
[9] "NCBI Nucleotide: NM_017768.4". 24 June 2018.
[10] "NCBI Nucleotide: XP_513483". 20 March 2018.
[11] "NCBI Nucleotide: NM_001131180". 19 February 2022.
[12] "NCBI Nucleotide: AB179219". 6 October 2006.
[13] "NCBI Nucleotide: XM_002750952.1". 18 May 2010.
[14] "NCBI Nucleotide: XM_003127928". 13 May 2017.
[15] "NCBI Nucleotide: NM_024194". 13 August 2022.
[16] "NCBI Nucleotide: XM_001379417". 27 April 2016.
[17] "NCBI Nucleotide: NM_001031295". 9 March 2022.
[18] "NCBI Nucleotide: XM_002188367". 12 February 2013.
[19] "NCBI Nucleotide: NM_001011310". 19 June 2021.
[20] "NCBI Nucleotide: NM_199862". 20 November 2021.
[21] "NCBI Nucleotide: BT043621". 24 November 2009.
[22] "NCBI Nucleotide: XM_001640230". 31 January 2009.
[23] "NCBI Nucleotide: XM_001842697.1". December 2009.
[24] "NCBI Gene: 55631".
[25] "GeneCards: LRRC40".
[26] "GEO Profiles: LRRC40 GDS596".
[27] Kobe B, Kajava AV (December 2001). "The leucine-rich repeat as a protein recognition motif". Curr. Opin. Struct. Biol. 11 (6): 725–32. doi:10.1016/S0959-440X(01)00266-4. PMID 11751054.
[28] "ExPASy: Compute PI/Mw". Archived from the original on 2003-07-23.
[29] "PSORTII: Protein Localization Tool". [permanent dead link]
[30] "NetPhos 2.0 Server: Phosphorylation Prediction".
[31] "NCBI MMDB: Inla S192n G194S".
[32] "MINT: CDC5L". Archived from the original on 2013-02-18.
[33] "MINT: SNW1". Archived from the original on 2013-02-18.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
