LRRC40 | |||||||||||||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Identifiers | |||||||||||||||||||||||||||||||||||||||||||||||||||
Aliases | LRRC40 , dJ677H15.1, leucine rich repeat containing 40 | ||||||||||||||||||||||||||||||||||||||||||||||||||
External IDs | MGI: 1914394 HomoloGene: 9825 GeneCards: LRRC40 | ||||||||||||||||||||||||||||||||||||||||||||||||||
| |||||||||||||||||||||||||||||||||||||||||||||||||||
| |||||||||||||||||||||||||||||||||||||||||||||||||||
| |||||||||||||||||||||||||||||||||||||||||||||||||||
| |||||||||||||||||||||||||||||||||||||||||||||||||||
Wikidata | |||||||||||||||||||||||||||||||||||||||||||||||||||
|
Leucine rich repeat containing 40 (LRRC40) is a protein that in humans is encoded by the LRRC40 gene. [5]
LRRC40 is conserved throughout all of its orthologs. The entire protein is highly conserved in mammals, while conservation is high within the leucine rich repeats in the rest of the orthologs. [6] Orthologs were found all the way back to the scarlet sea anemone and homologs were found in bacteria and Archaea using BLAST. [7] The following table gives information on the homologs of LRRC40.
Genus species | Organism common name | Divergence from humans (MYA) [8] | NCBI mRNA accession | Sequence similarity [7] | Protein length | Common gene name |
---|---|---|---|---|---|---|
Homo sapiens [9] | Humans | -- | NM_017768 | 100% | 602 | LRRC40 |
Pan troglodytes [10] | Common chimp | 6.4 | XM_513483 | 99% | 602 | Hypothetical protein |
Pongo abelii [11] | Orangutan | 15.8 | NM_001131180 | 99% | 602 | LRRC40 |
Macaca fascicularis [12] | Long-tailed macaque | 30.2 | AB179219 | 99% | 602 | Full LRRC40 |
Callithrix jacchus [13] | Common marmoset | 43.9 | XM_002750952.1 | 99% | 602 | Predicted: LRRC40 |
Sus scrofa [14] | Wild boar | 92.5 | XM_003127928 | 96% | 602 | Predicted: LRRC40 like protein |
Mus musculus [15] | Mouse | 94.1 | NM_024194 | 92% | 602 | LRRC40 |
Monodelphis domestica [16] | Opossum | 160.2 | XM_001379417 | 86% | 598 | Hypothetical protein |
Gallus gallus [17] | Chicken | 274.8 | NM_001031295 | 85% | 603 | LRRC40 |
Taeniopygia guttata [18] | Zebra finch | 274.8 | XM_002188367 | 85% | 605 | Predicted: LRRC40 |
Xenopus (Silurana) tropicalis [19] | Western clawed frog | 389.7 | NM_001011310 | 80% | 605 | LRRC40 |
Danio rerio [20] | Zebrafish | 444.3 | NM_199862 | 83% | 601 | LRRC40 |
Salmo salar [21] | Salmon | 444.3 | BT043621 | 82% | 600 | LRRC40 |
Nematostella vectensis [22] | Scarlet sea anemone | 830.3 | XM_001640230 | 66% | 602 | Predicted protein |
Culex quinquefasciatus [23] | Southern house mosquito | 838.3 | XM_001842697.1 | 58% | 612 | LRRC40 |
LRRC40 is located on the negative DNA strand (see Sense (molecular biology)) of chromosome 1 from 70,611,483- 70,671,223. [24] The gene produces a 2958 base pair mRNA. There are 15 predicted exons in the human gene [9] with four other splice patterns predicted on GeneCards by the Alternative Splice Database. [25]
LRRC40 is neighbored downstream by LRRC7 (70,225,888 - 70,587,570) on the positive DNA strand and upstream by SRSF11 (70,687,320-70,716,488) on the positive DNA strand.
LRRC40 is expressed between the 50th and 100th percentile in almost every tissue in the body. [26]
While the exact function of the LRRC40 protein is not yet understood, it is believed to participate in protein-protein interactions because it is a member of the leucine rich repeat family of proteins which are known to participate in protein-protein interactions. [27]
LRRC40 is a 602 amino acid protein with a molecular weight of 68.254 kDa and an isoelectric point of 6.04. [28] LRRC40 is expected to localize to the nucleus [29] and has no transmembrane domains to anchor it to the nuclear membrane. LRRC40 has many predicted phosphorylation sites. Of the 19 predicted phosphoserine sites, only two are conserved within the orthologs. [30] These two sites are S38 and S391.
The secondary structure of the protein has a pattern within the leucine repeat regions. Each leucine repeat has a β-sheet and α-helix. The image to the right shows the particular horseshoe-like structure of a protein with many leucine rich repeats. Depending on the area where the LRRs are located, other proteins can bind within the curve of the horseshoe or attach to the outside of the protein.
According to Genecards, LRRC40 has 756 possible protein interactions. [25] These interactions are based on results in the Molecular Interaction database which provided two possible protein interactions. The two proteins are described in the table below.
Abbreviation | Protein name | NCBI protein accession | Cellular location | Function |
---|---|---|---|---|
CDC5L | Cell division cycle 5-like protein | NP_001244 | nucleus | transcription regulation and mRNA processing [32] |
SNW1 | Ski-interacting protein | NP_036377.1 | nucleus | mRNA processing [33] |
MORN1 containing repeat 1, also known as Morn1, is a protein that in humans is encoded by the MORN1 gene.
The family with sequence similarity 43 member A (FAM43A) gene, also known as; GCO3P195887, GC03P194406, GC03P191784, and NM_153690.3, codes for a 423 bp protein that is conserved in primates, and orthologs have been found in vertebrate and invertebrate species. Three transcripts have been identified, two protein coding isoforms, and a non-coding transcript (cAug10). Molecular weight of 45.8 kdal in the unphosphorylated state and isoelectric point of 6.1.
Transmembrane protein 131 (TMEM131) is a protein that is encoded by the TMEM131 gene in humans. The TMEM131 protein contains three domains of unknown function 3651 (DUF3651) and two transmembrane domains. This protein has been implicated as having a role in T cell function and development. TMEM131 also resides in a locus (2q11.1) that is associated with Nievergelt's Syndrome when deleted.
Protein FAM46B also known as family with sequence similarity 46 member B is a protein that in humans is encoded by the FAM46B gene. FAM46B contains one protein domain of unknown function, DUF1693. Yeast two-hybrid screening has identified three proteins that physically interact with FAM46B. These are ATX1, PEPP2 and DAZAP2.
RUN and FYVE domain containing 2 (RUFY2) is a protein that in humans is encoded by the RUFY2 gene. The RUFY2 gene is named for two of its domains, the RUN domain and FYVE domains. RUFY2 is a member of the RUFY family of proteins that include RUFY1, RUFY2, RUFY3, and RUFY4. RUFY2 protein has a dynamic role in endosomal membrane trafficking.
Family with Sequence Similarity 203, Member B (FAM203B) is a protein encoded by the FAM203B gene (8q24.3) in humans. While FAM203B is only found in humans and possibly non-human primates, its paralog, FAM203A, is highly conserved. The FAM203B protein contains two conserved domains of unknown function, DUF383 and DUF384, and no transmembrane domains. This protein has no known function yet, although the homolog of FAM203A in Caenorhabditis elegans (Y54H5A.2) is thought to help regulate the actin cytoskeleton.
Coiled-coil domain containing 94 (CCDC94) is a protein that in humans is encoded by the CCDC94 gene. The CCDC94 protein contains a coiled-coil domain, a domain of unknown function (DUF572), an uncharacterized conserved protein (COG5134), and lacks a transmembrane domain.
Coiled Coil Domain Containing protein 42B, also known as CCDC42B, is a protein encoded by the protein-coding gene CCDC42B.
CXorf66 also known as Chromosome X Open Reading Frame 66, is a 361aa protein in humans that is encoded by the CXorf66 gene. The protein encoded is predicted to be a type 1 transmembrane protein; however, its exact function is currently unknown.
KIAA1841 is a gene in humans that encodes a protein known as KIAA1841. KIAA1841 is targeted for the nucleus and it predicted to play a role in regulating transcription.
Family with sequence similarity 98, member A, or FAM98A, is a gene that in the human genome encodes the FAM98A protein. FAM98A has two paralogs in humans, FAM98B and FAM98C. All three are characterized by DUF2465, a conserved domain shown to bind to RNA. FAM98A is also characterized by a glycine-rich C-terminal domain. FAM98A also has homologs in vertebrates and invertebrates and has distant homologs in choanoflagellates and green algae.
EVI5L is a protein that in humans is encoded by the EVI5L gene. EVI5L is a member of the Ras superfamily of monomeric guanine nucleotide-binding (G) proteins, and functions as a GTPase-activating protein (GAP) with a broad specificity. Measurement of in vitro Rab-GAP activity has shown that EVI5L has significant Rab2A- and Rab10-GAP activity.
PROSER2, also known as proline and serine rich 2, is a protein that in humans is encoded by the PROSER2 gene. PROSER2, or c10orf47(Chromosome 10 open reading frame 47), is found in band 14 of the short arm of chromosome 10 (10p14) and contains a highly conserved SARG domain. It is a fast evolving gene with two paralogs, c1orf116 and specifically androgen-regulated gene protein isoform 1. The PROSER2 protein has a currently uncharacterized function however, in humans, it may play a role in cell cycle regulation, reproductive functioning, and is a potential biomarker of cancer.
Ankyrin repeat domain-containing protein 24 is a protein in humans that is coded for by the ANKRD24 gene. The gene is also known as KIAA1981. The protein's function in humans is currently unknown. ANKRD24 is in the protein family that contains ankyrin-repeat domains.
Leucine-rich repeats and IQ motif containing 1 is a protein that in humans is encoded by the LRRIQ1 gene. The protein is likely a nuclear encoding mitochondrial protein and is found in all Metazoans.
C14orf93 is a protein that is encoded in humans by the C14orf93 gene. It is a globular protein with a conserved C-terminus that is localized to the nucleus. While expressed relatively highly in all tissues except nervous tissue, it is expressed particularly highly in T cells and other immune tissues.
ProteinFAM89A is a protein which in humans is encoded by the FAM89A gene. It is also known as chromosome 1 open reading frame 153 (C1orf153). Highest FAM89A gene expression is observed in the placenta and adipose tissue. Though its function is largely unknown, FAM89A is found to be differentially expressed in response to interleukin exposure, and it is implicated in immune responses pathways and various pathologies such as atherosclerosis and glioma cell expression.
Family with Sequence Similarity 155 Member B is a protein in humans that is encoded by the FAM155B gene. It belongs to a family of proteins whose function is not yet well understood by the scientific community. It is a transmembrane protein that is highly expressed in the heart, thyroid, and brain.
Family with sequence 98, member C or FAM98C is a gene that encodes for FAM98C has two aliases FLJ44669 and hypothetical protein LOC147965. FAM98C has two paralogs in humans FAM98A and FAM98B. FAM98C can be characterized for being a Leucine-rich protein. The function of FAM98C is still not defined. FAM98C has orthologs in mammals, reptiles, and amphibians and has a distant orhtologs in Rhinatrema bivittatum and Nanorana parkeri.
Leucine-rich repeat-containing protein 74A (LRRC74A), is a protein encoded by the LRRC74A gene. The protein LRRC74A is localized in the cytoplasm. It has a calculated molecular weight of approximately 55 kDa. The LRRC74A protein is nominally expressed in the testis, salivary gland, and pancreas.