LOC105377021

Identifiers
Symbol: UNQ6490, PRO21339 (LOC389102, YPLR6490)

LOC105377021 is a protein which in humans is encoded by the LOC105377021 gene. [1] [2] Its expression appears to be altered in breast cancer, specifically triple negative breast cancer. [3] [4] LOC105377021 contains a serine-rich region in addition to predicted alpha helix motifs. [5] [2]

Gene and mRNA

LOC105377021 localizes to Homo sapiens chromosome 3 (3p2; antisense strand), near the reading frame of TRIM71. [2] [6] [7] The corresponding gene has 2,473 nucleotides. [2] The LOC105377021 mRNA contains one exon, and the NCBI Gene database predicts no alternative splicing. [2]

Protein

Protein Primary Structure

The figure below shows the basic primary structure of the protein, with the N-terminus and C-terminus annotated. The orange domain is a predicted nuclear localization sequence, while the blue domain is the remainder of the LOC105377021 exon. [8]

LOC105377021.png

| Amino Acid Count | Calculated Isoelectric Point | Calculated Molecular Weight | Competitive Repeat Unit | Over-represented Amino Acids |
|---|---|---|---|---|
| 168 [2] | 11.438 [9] | 18.2 kDa [10] [11] | RARP | None [10] [11] |
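
As a rough plausibility check on the figures above, the molecular weight can be approximated from the residue count alone. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb rather than a value from the cited SAPS analysis; the function names are illustrative only.

```python
# Rough sanity check of a reported molecular weight from sequence length.
# AVERAGE_RESIDUE_MASS_DA is a rule-of-thumb value (an assumption here),
# not a figure taken from the cited SAPS analysis.
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(residue_count: int) -> float:
    """Approximate protein mass in kDa from the number of residues."""
    if residue_count <= 0:
        raise ValueError("residue_count must be positive")
    return residue_count * AVERAGE_RESIDUE_MASS_DA / 1000.0


def relative_difference(estimate: float, reported: float) -> float:
    """Fractional difference between an estimate and a reported value."""
    return abs(estimate - reported) / reported


estimate = estimate_mass_kda(168)                 # 168 residues, per the table
difference = relative_difference(estimate, 18.2)  # 18.2 kDa, per the table
print(f"estimated {estimate:.2f} kDa, {difference:.1%} from the reported value")
```

The estimate lands within a few percent of the reported 18.2 kDa, which is about as close as a length-only approximation can be expected to get; a composition-aware calculation would need the actual sequence.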

Protein Secondary Structure

According to Ali2D (a secondary structure predictor that works from multiple sequence alignments), LOC105377021 is predicted to form mostly alpha helices (highlighted in red; blue highlights mark beta sheets). [5]

Secondary Structure Text Outline.png

I TASSER 3D Prediction of LOC105377021.png
I TASSER 3D Prediction of LOC105377021

Protein Tertiary Structure

LOC105377021 has a prominent C-terminal repeat of serine residues. [2] One disulfide bond (139-148) was predicted by DISULFIND software; [12] such bonds form between cysteine residues rather than serines. Additionally, the I-TASSER model shows several alpha helices, rendered in different colors, along with potential turn motifs (see I TASSER 3D Prediction of LOC105377021). [13]

Protein Modifications and Localization

A predicted protein modification of LOC105377021 is phosphorylation, with sites throughout the protein, including the serine-rich region near the C-terminus. [14] [15] In addition, O-linked β-N-acetylglucosamine modification sites are predicted in the C-terminal region. [16] [17] A nuclear localization sequence is predicted at the N-terminus by PSORT, with partial support from PHOBIUS software. [8] [18]

Microarray Expression Pattern and Pathology

Basic Expression and Breast Cancer

LOC105377021 is expressed at 0.9% of the average expression level of human proteins, which is classified as low. [6] In humans, brain, intestinal, ovarian, renal, and testicular tissues all show this low expression. [6]

Microarray data indicate that LOC105377021 is expressed in certain breast cancer tissues, including metastases to lymphatic and lung tissue. [4] There is potential evidence for higher expression of LOC105377021 in triple negative breast cancer, exceeding normal expression levels for the protein. [3] The figure below shows a potential trend line for this pattern (shown in green, with the triple negative microarray on the left). As the figure legend states, the red bars refer to the left axis for sample counts, whereas the blue dots show the percentage of LOC105377021 expression within each sample (the right axis).

NCBI Geo Profile for Triple Negative Breast Cancer and YPLR6490.png

This figure is courtesy of NCBI GEO Profiles, accession GDS4069.

Brain Tissue Expression

According to an Allen Brain Atlas probe, LOC105377021 is expressed in seven brain regions: the middle temporal gyrus, the short insular gyrus, the postcentral gyrus, the cingulate gyrus, the inferior temporal gyrus, the parahippocampal gyrus, and the superior temporal gyrus. [19] These fall within five broader regions: the temporal lobe, parietal lobe, cingulate gyrus, parahippocampal gyrus, and insula. In the annotated figures below, light blue shading indicates denser expression of LOC105377021, whereas darker green and brighter red indicate lower and lowest expression, respectively. All seven regions are depicted in the Allen Brain Atlas profiles below.

Four Allen Brain Atlas Annotations for LOC105377021 Brain Expression.png

Three Allen Brain Atlas LOC105377021 Cranial Expression Profiles.png

These images are courtesy of the Allen Brain Atlas.

Evolutionary Relationships and Homology

Orthologs

The Basic Local Alignment Search Tool (BLAST) shows that LOC105377021 orthologs are largely homogeneous and mammalian. [20] Important orthologs are summarized into three categories: primates, aquatic mammals, and ferrets/ferret-like animals. Of the orthologs listed, Pongo abelii is the most closely related (98% similarity) and Tursiops truncatus is the most distant (50% similarity). The baiji, a river dolphin, is the first ortholog to fall below 80% similarity. Select orthologs are listed below:

| Common Name | Genus species | NCBI Accession Number [21] | % Similarity | % Identity | Protein Length (in amino acids) | Ortholog Aliases [21] |
|---|---|---|---|---|---|---|
| Human | Homo sapiens | XP_011532636.1 | 100 | 100 | 168 | LOC389102, YPLR6490, UNQ6490, PRO21339 |
| Sumatran orangutan | Pongo abelii | XP_002814002.1 | 98 | 98 | 170 | LOC100446670 |
| Squirrel monkey | Saimiri boliviensis boliviensis | XP_010339850.1 | 96 | 96 | 167 | LOC101034964 |
| Golden snub-nosed monkey | Rhinopithecus roxellana | XP_010352756.1 | 96 | 95 | 170 | LOC104655070 |
| Olive baboon | Papio anubis | XP_003895275.1 | 96 | 95 | 170 | LOC101001601 |
| Angolan black and white colobus | Colobus angolensis palliatus | XP_011816934.1 | 95 | 92 | 171 | LOC105525785 |
| Mouse lemur | Microcebus murinus | XP_012624670.1 | 93 | 92 | 169 | LOC105873906 |
| Northern greater galago | Otolemur garnettii | XP_012661211.1 | 89 | 87 | 167 | LOC100965366 |
| Sunda flying lemur | Galeopterus variegatus | XP_008581853 | 87 | 86 | 151 | LOC103599484 |
| Killer whale | Orcinus orca | XP_004279764.1 | 82 | 80 | 282 | SEC31 |
| Sperm whale | Physeter catodon | XP_007125317.1 | 82 | 80 | 230 | LOC10299578 |
| Baiji | Lipotes vexillifer | XP_007464989.1 | 78 | 76 | 217 | LOC103081437 |
| Star-nosed mole | Condylura cristata | XP_012590632.1 | 77 | 74 | 162 | LOC101633309 |
| Aardvark | Orycteropus afer afer | XP_007946855.1 | 76 | 71 | 186 | LOC103203593 |
| Cape elephant shrew | Elephantulus edwardii | XP_006890725.1 | 71 | 67 | 225 | LOC102845592 |
| Cape golden mole | Chrysochloris asiatica | XP_006859232.1 | 69 | 66 | 264 | LOC102816436 |
| Bactrian camel | Camelus bactrianus | XP_010958002.1 | 54 | 54 | 126 | LOC105072685 |
| Common bottlenose dolphin | Tursiops truncatus | XP_004315393.1 | 50 | 46 | 166 | LOC101335194 |
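
The ranking of orthologs described in this section can be reproduced from the table values. The following is a minimal sketch: the similarity and identity figures are copied from the table, while the `Ortholog` type and the helper names are illustrative and not part of any BLAST tooling.

```python
from typing import NamedTuple, Optional, Sequence


class Ortholog(NamedTuple):
    common_name: str
    similarity: int  # percent similarity to the human protein
    identity: int    # percent identity to the human protein


# Values copied from the ortholog table (human reference row omitted).
ORTHOLOGS = [
    Ortholog("Sumatran orangutan", 98, 98),
    Ortholog("Squirrel monkey", 96, 96),
    Ortholog("Golden snub-nosed monkey", 96, 95),
    Ortholog("Olive baboon", 96, 95),
    Ortholog("Angolan black and white colobus", 95, 92),
    Ortholog("Mouse lemur", 93, 92),
    Ortholog("Northern greater galago", 89, 87),
    Ortholog("Sunda flying lemur", 87, 86),
    Ortholog("Killer whale", 82, 80),
    Ortholog("Sperm whale", 82, 80),
    Ortholog("Baiji", 78, 76),
    Ortholog("Star-nosed mole", 77, 74),
    Ortholog("Aardvark", 76, 71),
    Ortholog("Cape elephant shrew", 71, 67),
    Ortholog("Cape golden mole", 69, 66),
    Ortholog("Bactrian camel", 54, 54),
    Ortholog("Common bottlenose dolphin", 50, 46),
]


def closest(orthologs: Sequence[Ortholog]) -> Ortholog:
    """Ortholog with the highest similarity (identity breaks ties)."""
    return max(orthologs, key=lambda o: (o.similarity, o.identity))


def most_distant(orthologs: Sequence[Ortholog]) -> Ortholog:
    """Ortholog with the lowest similarity (identity breaks ties)."""
    return min(orthologs, key=lambda o: (o.similarity, o.identity))


def first_below(orthologs: Sequence[Ortholog], threshold: int) -> Optional[Ortholog]:
    """Highest-ranked ortholog whose similarity is under the threshold."""
    ranked = sorted(orthologs, key=lambda o: o.similarity, reverse=True)
    return next((o for o in ranked if o.similarity < threshold), None)


print(closest(ORTHOLOGS).common_name)
print(most_distant(ORTHOLOGS).common_name)
print(first_below(ORTHOLOGS, 80).common_name)
```

Similarity is used as the primary sort key because the text ranks orthologs by that column; ranking by identity instead would give the same ordering for the extremes in this table.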

Pace of Evolution

The pace of evolution of LOC105377021 since its inception (in humans) is modeled to be slow. This pace is measured relative to cytochrome c 6A1 and alpha fibrinogen using corrected divergence methods.

LOC105377021 Corrected Divergence Graph.png

The corrected divergence graph above shows three lines: alpha fibrinogen in red, the LOC105377021 orthologs (labeled "LOC Ortholog") in blue, and cytochrome c 6A1 in green. The lines compare the evolutionary pace of LOC105377021 with that of the two reference proteins, as estimated by corrected divergence analysis.
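
The article does not state which correction was applied. One widely used option is the Poisson correction, m = -ln(1 - n), where n is the observed fraction of differing sites; the sketch below assumes that formula and derives n from percent identity values such as those in the ortholog table. The function name is illustrative.

```python
import math


def corrected_divergence(percent_identity: float) -> float:
    """Poisson-corrected divergence, m = -ln(1 - n).

    n is the observed fraction of differing sites, taken here as
    1 - identity. The choice of the Poisson correction is an
    assumption; the source does not name the method it used.
    """
    if not 0.0 < percent_identity <= 100.0:
        raise ValueError("percent_identity must be in the range (0, 100]")
    observed_difference = 1.0 - percent_identity / 100.0
    return -math.log(1.0 - observed_difference)


# Percent identity to the human protein, copied from the ortholog table.
for name, identity in [
    ("Sumatran orangutan", 98),
    ("Killer whale", 80),
    ("Common bottlenose dolphin", 46),
]:
    print(f"{name}: m = {corrected_divergence(identity):.3f}")
```

The correction matters most for distant orthologs: at high identity m is close to the raw difference, but it grows faster than the raw difference as sequences diverge, because repeated substitutions at the same site are hidden in a simple count.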

Single Nucleotide Polymorphisms

The following diagram shows single nucleotide polymorphisms (SNPs) in various regions of the protein. SNPs are highlighted in green, with the resulting amino acid substitutions listed on the right. [22]

SNP's in LOC105377021.png

| Polymorphism | Chemical Nature prior to Polymorphism | Chemical Nature after Polymorphism |
|---|---|---|
| G7R | Nonpolar | Basic |
| R8L,P | Basic | Nonpolar |
| L25F | Nonpolar | Nonpolar |
| L29F | Nonpolar | Nonpolar |
| A34P | Nonpolar | Nonpolar |
| L51F | Nonpolar | Nonpolar |
| P54R | Nonpolar | Basic |
| H96R | Basic | Basic |
| L97H | Nonpolar | Basic |
| G128R | Nonpolar | Basic |
| S157F | Polar | Nonpolar |
| Y163F | Polar | Nonpolar |
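
The chemical classes in the table follow from the one-letter substitution codes. Below is a minimal sketch that parses codes such as `G7R` or `R8L,P` and looks up the class of each residue. The grouping of residues into classes is a common textbook scheme chosen to match the table, and the names `CLASS_BY_RESIDUE` and `classify` are illustrative.

```python
import re

# Textbook-style side-chain classes, chosen to agree with the table above.
CLASS_BY_RESIDUE = {}
for residues, label in (
    ("GAVLIMFWP", "Nonpolar"),
    ("STCYNQ", "Polar"),
    ("KRH", "Basic"),
    ("DE", "Acidic"),
):
    for residue in residues:
        CLASS_BY_RESIDUE[residue] = label

# Original residue, position, then one or more comma-separated variants.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z](?:,[A-Z])*)$")


def classify(substitution: str) -> list:
    """Return (position, class before, class after) for each variant."""
    match = SUBSTITUTION.match(substitution)
    if match is None:
        raise ValueError(f"unrecognized substitution: {substitution!r}")
    original, position, variants = match.groups()
    return [
        (int(position), CLASS_BY_RESIDUE[original], CLASS_BY_RESIDUE[variant])
        for variant in variants.split(",")
    ]


for code in ["G7R", "R8L,P", "H96R", "S157F", "Y163F"]:
    for position, before, after in classify(code):
        change = "class change" if before != after else "same class"
        print(f"{code} (position {position}): {before} -> {after} ({change})")
```

Substitutions that stay within a class, such as H96R, are usually treated as more conservative than those that cross classes, such as G7R, although class membership alone does not predict functional impact.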

Promoter and Gene Regulation

According to Genomatix, LOC389102 (a synonym of LOC105377021) has a 601 base pair promoter followed by a 5' UTR 129 base pairs long. [2] [23] Genomatix predicts binding sites for several transcription factors, including Gli3 and E2F1. [23]

References

  1. "HGNC database of human gene names | HUGO Gene Nomenclature Committee". www.genenames.org. Retrieved 2016-04-27.
  2. "LOC105377021 putative uncharacterized protein UNQ6490/PRO21339 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2016-02-06.
  3. "GDS4069 / 8086024 / YPLR6490". www.ncbi.nlm.nih.gov. Retrieved 2016-04-22.
  4. "101977830 - GEO Profiles - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2016-04-22.
  5. Remmert M. "Ali2D". toolkit.tuebingen.mpg.de. Archived from the original on 2009-05-14. Retrieved 2016-04-27.
  6. Thierry-Mieg D, Thierry-Mieg J. "AceView: a comprehensive annotation of human and worm genes with mRNAs or ESTs". www.ncbi.nlm.nih.gov. Retrieved 2016-02-20.
  7. "Human BLAT Search". genome.ucsc.edu. Retrieved 2016-05-06.
  8. "Phobius". phobius.sbc.su.se. Retrieved 2016-04-24.
  9. Toldo L, Kindler B (2016). "Isoelectric point determination". EMBL WWW Gateway to Isoelectric Point Service.
  10. Brendel V, Bucher P, Nourbakhsh IR, Blaisdell BE, Karlin S (1992). "Methods and algorithms for statistical analysis of protein sequences". Proceedings of the National Academy of Sciences of the United States of America. 89 (6): 2002–6. Bibcode:1992PNAS...89.2002B. doi:10.1073/pnas.89.6.2002. PMC 48584. PMID 1549558.
  11. Brendel V (1992). "SAPS". Department of Mathematics, Stanford University.
  12. Ceroni A, Passerini A, Vullo A, Frasconi P (July 2006). "DISULFIND: a disulfide bonding state and cysteine connectivity prediction server". Nucleic Acids Research. 34 (Web Server issue): W177–81. doi:10.1093/nar/gkl266. PMC 1538823. PMID 16844986.
  13. "I-TASSER server for protein structure and function prediction". zhanglab.ccmb.med.umich.edu. Retrieved 2016-05-06.
  14. "Motif Scan". myhits.isb-sib.ch. Retrieved 2016-04-27.
  15. Blom N, Gammeltoft S, Brunak S (1999). "Sequence-and structure-based prediction of eukaryotic protein phosphorylation sites". Journal of Molecular Biology. 294 (5): 1351–1362. doi:10.1006/jmbi.1999.3310. PMID 10600390. Retrieved 2016-04-27.
  16. Gupta R. "Prediction of glycosylation sites in proteomes: from post-translational modifications to protein function". www.cbs.dtu.dk. Retrieved 2016-05-04.
  17. Gupta R, Brunak S (2002). "Prediction of glycosylation across the human proteome and the correlation to protein function". Pacific Symposium on Biocomputing: 310–22. doi:10.1142/9789812799623_0029. ISBN 978-981-02-4777-5. PMID 11928486. S2CID 13924422.
  18. "Welcome to psort.org!". www.psort.org. Retrieved 2016-04-24.
  19. "Microarray Data :: Allen Brain Atlas: Human Brain". human.brain-map.org. Retrieved 2016-05-03.
  20. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990). "Basic local alignment search tool". Journal of Molecular Biology. 215 (3): 403–10. doi:10.1016/S0022-2836(05)80360-2. PMID 2231712. S2CID 14441902.
  21. "National Center for Biotechnology Information". www.ncbi.nlm.nih.gov. Retrieved 2016-05-03.
  22. "dbSNP Home Page". www.ncbi.nlm.nih.gov. Retrieved 2016-05-03.
  23. "Genomatix - NGS Data Analysis & Personalized Medicine". www.genomatix.de. Retrieved 2016-04-24.