LOC105377021 | |
---|---|
Identifiers | |
Symbol | UNQ6490, PRO21339, (LOC389102, YPLR6490) |
LOC105377021 is a protein which in humans is encoded by the LOC105377021 gene. [1] [2] Its expression is altered in breast cancer, specifically triple negative breast cancer. [3] [4] LOC105377021 contains a serine-rich region in addition to predicted alpha helix motifs. [5] [2]
LOC105377021 localizes to Homo sapiens chromosome 3 (3p2; antisense strand), near the reading frame of TRIM71. [2] [6] [7] The corresponding gene has 2,473 nucleotides. [2] The LOC105377021 mRNA has one exon. [2] The NCBI gene database predicts no alternative splicing. [2]
The figure below shows the basic primary protein structure, with N-terminus and C-terminus in their respective annotations. The orange domain is a predicted nuclear localization sequence, while the blue domain is the remainder of the LOC105377021 exon. [8]
Amino Acid Count | Calculated Isoelectric Point | Calculated Molecular Weight | Competitive Repeat Unit | Over-represented Amino Acids |
---|---|---|---|---|
168 [2] | 11.438 [9] | 18.2 kDa [10] [11] | RARP | None [10] [11] |
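As a rough sanity check on the table, the calculated weight of about 18.2 kilodaltons can be compared with a back-of-the-envelope estimate from the residue count alone. The sketch below assumes the common rule of thumb of roughly 110 Da per amino acid residue; that constant and the function name are illustrative assumptions, not values taken from the cited tools.

```python
# Rule-of-thumb average mass of one amino acid residue, in daltons.
# This is an assumption for illustration, not a value from the cited sources.
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(residue_count, average_residue_mass_da=AVERAGE_RESIDUE_MASS_DA):
    """Return a crude molecular weight estimate in kilodaltons."""
    return residue_count * average_residue_mass_da / 1000.0


if __name__ == "__main__":
    # 168 residues, as listed in the table above.
    print(f"{estimate_mass_kda(168):.2f} kDa")
```

The estimate lands within a few percent of the tabulated figure, which is about as close as a composition-blind approximation can be expected to get.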
According to Ali2D (a multiple sequence alignment structural predictor for proteins), LOC105377021 is predicted to form mostly alpha helix (highlighted in red; blue highlights mark beta sheet). [5]
LOC105377021 has a prominent C-terminal repeat of serine residues, potentially for disulfide bonding. [2] One disulfide bond (139-148) was predicted by DISULFIDE software. [12] Additionally, the I-TASSER model shows several alpha helices, drawn in different colors, as well as potential turn motifs (see I-TASSER 3D Prediction of LOC105377021). [13]
A predicted protein modification of LOC105377021 is phosphorylation, with sites throughout the protein, including the serine-rich region near the C-terminus. [14] [15] In addition, O-linked β-N-acetylglucosamine modifications are predicted in the C-terminal region. [16] [17] A nuclear localization sequence is predicted at the N-terminus by PSORT, with partial support from PHOBIUS software. [8] [18]
LOC105377021 is expressed at 0.9% of the average level for human proteins, which is classified as low. [6] In humans, cranial, intestinal, ovarian, renal, and testicular tissues corroborate this trend. [6]
Microarray data indicate that LOC105377021 is expressed in certain breast cancer tissues, including metastases to lymphatic and lung tissue. [4] There is potential evidence for higher expression of LOC105377021 in triple negative breast cancer, exceeding the normal levels of the protein. [3] The figure below shows a potential trend line for this pattern (shown in green, with the triple negative microarray on the left). As the figure legend states, the red bars refer to the left axis for sample counts, whereas the blue dots show the percentage of LOC105377021 expression within each sample (the right axis).
This photo is courtesy of NCBI GEO Profiles Accession GDS4069.
Seven key brain tissues express LOC105377021 according to an Allen Brain Atlas probe. [19] These seven tissues fall within five overarching regions: the temporal lobe, parietal lobe, cingulate gyrus, parahippocampal gyrus, and insula. The annotated figures below give an overview of LOC105377021 expression in the brain. Light blue shading indicates denser expression of LOC105377021, whereas darker green and brighter red indicate lower and the lowest expression, respectively. All seven expression areas (the middle temporal gyrus, the short insular gyrus, the postcentral gyrus, the cingulate gyrus, the inferior temporal gyrus, the parahippocampal gyrus, and the superior temporal gyrus) are depicted in the Allen Brain Atlas profiles below.
These photos are courtesy of the Allen Brain Atlas.
The Basic Local Alignment Search Tool (BLAST) shows that LOC105377021 orthologs are largely homogeneous and mammalian. [20] Important orthologs are summarized into three categories: primates, aquatic mammals, and ferrets/ferret-like animals. Pongo abelii and Tursiops truncatus are the most closely related and most distant orthologs, respectively. The baiji, a river dolphin, is the first ortholog to fall below 80% similarity. Select orthologs are listed below:
Common Name | Genus species | NCBI Accession Number [21] | % Similarity | % Identity | Protein Length (in amino acids) | Ortholog Aliases [21] |
---|---|---|---|---|---|---|
Human | Homo sapiens | XP_011532636.1 | 100 | 100 | 168 | LOC389102, YPLR6490, UNQ6490, PRO21339 |
Sumatran orangutan | Pongo abelii | XP_002814002.1 | 98 | 98 | 170 | LOC100446670 |
Squirrel Monkey | Saimiri boliviensis boliviensis | XP_010339850.1 | 96 | 96 | 167 | LOC101034964 |
Golden snub-nosed monkey | Rhinopithecus roxellana | XP_010352756.1 | 96 | 95 | 170 | LOC104655070 |
Olive baboon | Papio anubis | XP_003895275.1 | 96 | 95 | 170 | LOC101001601 |
Angolan Black and White Colobus | Colobus angolensis palliatus | XP_011816934.1 | 95 | 92 | 171 | LOC105525785 |
Mouse lemur | Microcebus murinus | XP_012624670.1 | 93 | 92 | 169 | LOC105873906 |
Northern greater galago | Otolemur garnettii | XP_012661211.1 | 89 | 87 | 167 | LOC100965366 |
Sunda Flying Lemur | Galeopterus variegatus | XP_008581853 | 87 | 86 | 151 | LOC103599484 |
Killer Whale | Orcinus orca | XP_004279764.1 | 82 | 80 | 282 | SEC31 |
Sperm Whale | Physeter catodon | XP_007125317.1 | 82 | 80 | 230 | LOC10299578 |
Baiji | Lipotes vexillifer | XP_007464989.1 | 78 | 76 | 217 | LOC103081437 |
Star-Nosed Mole | Condylura cristata | XP_012590632.1 | 77 | 74 | 162 | LOC101633309 |
Aardvark | Orycteropus afer afer | XP_007946855.1 | 76 | 71 | 186 | LOC103203593 |
Cape Elephant Shrew | Elephantulus edwardii | XP_006890725.1 | 71 | 67 | 225 | LOC102845592 |
Cape Golden Mole | Chrysochloris asiatica | XP_006859232.1 | 69 | 66 | 264 | LOC102816436 |
Bactrian camel | Camelus bactrianus | XP_010958002.1 | 54 | 54 | 126 | LOC105072685 |
Common Bottlenose Dolphin | Tursiops truncatus | XP_004315393.1 | 50 | 46 | 166 | LOC101335194 |
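The ranking claims made about this table can be checked directly from its numbers. The following is a minimal sketch: the rows are copied from the similarity and identity columns above, while the function names and the 80% default threshold argument are illustrative choices rather than part of any BLAST tooling.

```python
# (common name, % similarity, % identity), copied from the ortholog table.
ORTHOLOGS = [
    ("Sumatran orangutan", 98, 98),
    ("Squirrel monkey", 96, 96),
    ("Golden snub-nosed monkey", 96, 95),
    ("Olive baboon", 96, 95),
    ("Angolan black and white colobus", 95, 92),
    ("Mouse lemur", 93, 92),
    ("Northern greater galago", 89, 87),
    ("Sunda flying lemur", 87, 86),
    ("Killer whale", 82, 80),
    ("Sperm whale", 82, 80),
    ("Baiji", 78, 76),
    ("Star-nosed mole", 77, 74),
    ("Aardvark", 76, 71),
    ("Cape elephant shrew", 71, 67),
    ("Cape golden mole", 69, 66),
    ("Bactrian camel", 54, 54),
    ("Common bottlenose dolphin", 50, 46),
]


def closest_and_most_distant(rows):
    """Return the names with the highest and lowest % identity to human."""
    ranked = sorted(rows, key=lambda row: row[2], reverse=True)
    return ranked[0][0], ranked[-1][0]


def first_below(rows, threshold=80):
    """Return the first ortholog, in descending similarity, under the threshold."""
    for name, similarity, _identity in sorted(rows, key=lambda row: row[1], reverse=True):
        if similarity < threshold:
            return name
    return None


if __name__ == "__main__":
    print(closest_and_most_distant(ORTHOLOGS))
    print(first_below(ORTHOLOGS))
```

Ranking by identity rather than similarity gives the same extremes for this table, so the choice of column does not change the conclusion here.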
The evolution of LOC105377021 since its inception (in humans) is modeled to be slow. This pace is measured relative to cytochrome c 6A1 and alpha fibrinogen using corrected divergence methods.
The corrected divergence graph above shows three lines: alpha fibrinogen in red, the LOC ortholog (LOC105377021) in blue, and cytochrome c 6A1 in green. These lines compare the evolutionary pace of LOC105377021 with that of the two reference proteins, as tested using a corrected divergence analysis.
The following diagram shows single nucleotide polymorphisms (SNPs) in various regions of the protein. SNPs are highlighted in green, with the notation on the right-hand side giving the resulting amino acid substitutions. [22]
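The article does not state which correction was used to produce the divergence values. One common choice, assumed here purely for illustration, is the Poisson correction m = -ln(1 - p), where p is the observed fraction of differing sites (taken as 1 minus the fractional identity). The function name below is hypothetical.

```python
import math


def corrected_divergence(percent_identity):
    """Poisson-corrected divergence from a percent identity value.

    Assumes m = -ln(1 - p), with p the observed fraction of differing
    sites. This formula is an assumption; the source does not name one.
    """
    p = 1.0 - percent_identity / 100.0
    if not 0.0 <= p < 1.0:
        raise ValueError("percent identity must be in the range (0, 100]")
    return -math.log(1.0 - p)


if __name__ == "__main__":
    # Identity values taken from the ortholog table.
    for name, identity in [("Sumatran orangutan", 98), ("Common bottlenose dolphin", 46)]:
        print(name, round(corrected_divergence(identity), 3))
```

Plotting such values against estimated divergence times for each ortholog would give a line whose slope reflects the evolutionary pace, which is how a comparison with faster or slower reference proteins is typically drawn.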
Polymorphism | Chemical Nature prior to Polymorphism | Chemical Nature after Polymorphism |
---|---|---|
G7R | Non polar | Basic |
R8L,P | Basic | Non polar |
L25F | Non polar | Non polar |
L29F | Non polar | Non polar |
A34P | Non polar | Non polar |
L51F | Non polar | Non polar |
P54R | Non polar | Basic |
H96R | Basic | Basic |
L97H | Non polar | Basic |
G128R | Non polar | Basic |
S157F | Polar | Non polar |
Y163F | Polar | Non polar |
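The classification in this table follows mechanically from the side-chain class of each residue. The sketch below reproduces it for the residues that appear in the table; the lookup uses the table's own labels, covers only those residues, and the function name is an illustrative assumption.

```python
import re

# Side-chain classes for the residues appearing in the table, using the
# table's own labels. Other amino acids are deliberately left out.
SIDE_CHAIN_CLASS = {
    "G": "Non polar", "A": "Non polar", "L": "Non polar",
    "F": "Non polar", "P": "Non polar",
    "R": "Basic", "H": "Basic",
    "S": "Polar", "Y": "Polar",
}


def classify(polymorphism):
    """Split a label such as 'R8L,P' into its before and after classes."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z](?:,[A-Z])*)", polymorphism)
    if match is None:
        raise ValueError(f"unrecognized polymorphism label: {polymorphism!r}")
    reference, _position, alternatives = match.groups()
    before = SIDE_CHAIN_CLASS[reference]
    after = [SIDE_CHAIN_CLASS[residue] for residue in alternatives.split(",")]
    return before, after


if __name__ == "__main__":
    for label in ["G7R", "R8L,P", "H96R", "S157F"]:
        print(label, classify(label))
```

A label with two alternatives, such as R8L,P, yields one class per alternative; both happen to be non polar, which is why the table shows a single entry for that row.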
According to Genomatix, LOC389102 (a synonym of LOC105377021) lies next to a 601 base pair promoter followed by a 129 base pair 5' UTR. [2] [23] Genomatix predicts several transcription factors for this region, including Gli3 and E2F1. [23]