PBDC1

Predicted CXorf26 structure by SWISS-MODEL

Identifiers
Aliases: PBDC1, CXorf26, polysaccharide biosynthesis domain containing 1
External IDs: MGI: 1914933; HomoloGene: 9542; GeneCards: PBDC1

Orthologs
Species: Human | Mouse
RefSeq (mRNA): NM_016500, NM_001300888 | NM_001281871, NM_026312
RefSeq (protein): NP_001287817, NP_057584 | NP_001268800, NP_080588
Location (UCSC): Chr X: 76.17 – 76.18 Mb | Chr X: 104.12 – 104.16 Mb
PubMed search: [3] | [4]

CXorf26 (Chromosome X Open Reading Frame 26), also known as PBDC1 or MGC874, is a well-conserved human gene found on the plus strand of the long arm of the X chromosome (Xq13.3). The exact function of the gene is poorly understood, but the polysaccharide biosynthesis domain that spans a major portion of its protein product (known as UPF0368) and the yeast homolog, YPL225W, offer insights into its possible function.

Proposed function

Based on the available data, the function of CXorf26 is likely related to RNA polymerase II, ubiquitination, and ribosomes in the cytoplasm. This argument rests on interaction data for human CXorf26 and its yeast homolog, YPL225W. Both homologs interact with multiple ubiquitinated proteins as well as with the transcriptional enzyme RNA polymerase II. For example, ubiquitination and subsequent degradation by the 26S proteasome serves an important function in regulating transcription in eukaryotes. [5] The yeast protein RPN11, which interacts with YPL225W, has a human homolog that is a metalloprotease component of the 26S proteasome, which degrades proteins targeted for destruction by the ubiquitin pathway. [6] These functions do not appear to relate to polysaccharide biosynthesis, as the conserved domain would suggest, but the domain may still play a role in secondary structure or provide sites of phosphorylation.

Further experiments could clarify the exact role of CXorf26 in these key cellular processes. For example, treating cells with an RNA polymerase II inhibitor and then measuring CXorf26 expression, or knocking out YPL225W in yeast, could shed light on its function.

Gene

Gene neighborhood around CXorf26. Black arrows pointing right indicate genes on the positive strand of the X chromosome; gray arrows indicate genes on the negative strand.

CXorf26 is found on the plus strand of the long arm of the X chromosome, at locus Xq13.3, spanning bases 75,393,420-75,397,740. [7] The primary mRNA transcript is 1214 base pairs long, and its protein product, UPF0368, is composed of 233 amino acids with a predicted mass of 26,057 Da. [7] The Xq13.3 locus has known associations with X-linked mental retardation. [8] The third gene upstream of CXorf26 is ATRX, which encodes a protein with an ATPase/helicase domain; when mutated, it causes an X-linked mental retardation syndrome along with alpha thalassemia syndrome, both of which are known to cause changes in DNA methylation patterns. [9] The third gene downstream of CXorf26, ZDHHC15, causes mental retardation X-linked type 91 when mutated. [10] One noteworthy nearby gene is Xist, which plays a role in X chromosome inactivation. X inactivation relates to CXorf26, and is discussed below in the relevant research section.
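The figures quoted above can be cross-checked with a little arithmetic. The sketch below is only illustrative: it assumes the genomic coordinates are 1-based and inclusive (a common convention, but not stated in the source), and the variable names are invented for this example.

```python
# Values quoted in the text (attributed there to GeneCards).
GENE_START = 75_393_420      # first base of the gene on the X chromosome
GENE_END = 75_397_740        # last base of the gene
PROTEIN_LENGTH_AA = 233      # residues in UPF0368
PROTEIN_MASS_DA = 26_057     # predicted mass in daltons


def span_length(start: int, end: int) -> int:
    """Length of an interval, assuming 1-based inclusive coordinates."""
    return end - start + 1


gene_length_bp = span_length(GENE_START, GENE_END)
mean_residue_mass = PROTEIN_MASS_DA / PROTEIN_LENGTH_AA

print(f"Genomic span: {gene_length_bp} bp")
print(f"Mean residue mass: {mean_residue_mass:.1f} Da")
```

Under that assumption the gene covers 4,321 bases, and the mean residue mass works out to about 111.8 Da, which is in the usual range for an average amino acid residue.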

Expression

Expression levels of CXorf26 in common human tissues, as referenced in GEO profiles from NCBI

Expression data show that CXorf26 is highly and ubiquitously expressed across human tissues and appears in ESTs under nearly all conditions. The GEO profile shows expression levels for CXorf26 in common human tissues consistently around the 75th percentile, suggesting it may have a housekeeping function given its seemingly ubiquitous expression. If the conserved domain does play a role in polysaccharide biosynthesis of some sort, this high expression would be consistent with that function.

CXorf26 expression goes down greatly when CLDN1 is overexpressed, suggesting a relationship between CXorf26 and the cell surface, as predicted by its polysaccharide biosynthesis domain

Gene expression profiles in the Gene Expression Omnibus (GEO) repository at NCBI show that few treatments change CXorf26 expression in the tissues examined. However, one experiment compared CXorf26 expression in lung adenocarcinoma CL1-5 cells either overexpressing or underexpressing Claudin-1 (CLDN1). CXorf26 expression dropped greatly when CLDN1 was overexpressed. [12] CLDN1 is a major component of tight junction complexes, which mediate cell-cell adhesion between cell membranes. [13] One possible explanation is that more tight junctions formed by CLDN1 lead to decreased CXorf26 expression because the cell membrane is used for tight junctions rather than for its normal function related to heparan sulfate.

Alternative splice forms

Alternative splice form of CXorf26 human transcript. The alternative splice form, shown in red, appears to be missing exon 5, but it is likely added onto the original exon 6.

There is only one alternative splice form of CXorf26. Its mRNA is considerably shorter, at 977 base pairs compared with 1214 for the primary transcript, but it still yields a protein product of 232 amino acids. [14] This splice form appears to be missing exon 5 of the transcript, but that sequence may instead be joined onto exon 6, creating a larger exon than in the consensus transcript.

There were no other predicted exons within the genomic CXorf26 sequence when 3000 base pairs were added on either side in the search. [15]

Promoter region

The promoter for CXorf26 is predicted to span bases 75392235 to 75393075 on the positive strand of the X chromosome. [16] The promoter region is extensively conserved in all primates and most mammalian homologs, but conservation decreases in more distantly related species. Given that the primary transcript begins at base 75392771, the promoter overlaps it by 304 bases. Twenty predicted transcription factor binding sites, along with their transcription factor families, were also collected. Many of the transcription factors are zinc finger factors, in which a bound zinc ion stabilizes the protein fold, while none appear to relate to a potential polysaccharide biosynthesis function. One transcription factor family predicted to bind the promoter region, V$CHRF, is involved in regulation of the cell cycle. This regulation could be related to ubiquitin function, since proteins with ubiquitination-type functions were found to interact with CXorf26.
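The 304-base overlap can be reproduced from the coordinates. This sketch assumes the transcript interval X:75392771-75398039 taken from the Ensembl link in the reference list; the function name and the end-minus-start convention are choices made for illustration, not something the source specifies.

```python
# Predicted promoter interval quoted in the text (Genomatix prediction).
PROMOTER_START = 75_392_235
PROMOTER_END = 75_393_075

# Transcript interval, assumed from the Ensembl URL in the reference list.
TRANSCRIPT_START = 75_392_771
TRANSCRIPT_END = 75_398_039


def overlap(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """Bases shared by two intervals, measured as end minus start.

    Returns 0 when the intervals do not overlap. Counting both endpoints
    inclusively would give a value one larger.
    """
    return max(0, min(a_end, b_end) - max(a_start, b_start))


promoter_transcript_overlap = overlap(
    PROMOTER_START, PROMOTER_END, TRANSCRIPT_START, TRANSCRIPT_END
)
print(promoter_transcript_overlap)  # 304
```

The result matches the quoted overlap only when the transcript start has eight digits (75392771), which is why that value is used here.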

Protein

Subcellular distribution

The CXorf26 protein is predicted to have a 56.5% likelihood of localizing to the cytoplasm [17] and a 17.4% likelihood of localizing to the mitochondria. CXorf26's yeast homolog, YPL225W, was GFP-tagged and found to localize to the cytoplasm. [18] A cytoplasmic rather than transmembrane location is further supported by the absence of a hydrophobic signal peptide sequence and by TMAP, [19] which predicted no transmembrane segments in CXorf26 or any of its homologs in other species.

Polysaccharide domain

Summary of features on the Cxorf26 protein sequence, with conserved polysaccharide biosynthesis domain highlighted in green

CXorf26 contains a conserved domain known as DUF757. [20] The domain spans a majority of the protein sequence, from amino acids 39 to 159. Conservation of the domain is strong across all homologs compared, including mammals, invertebrates such as insects, and even sponges. The yeast homolog, YPL225W, shows 42.4% identity and 62% similarity in this domain. Conservation is especially high in regions that contain one of the alpha helices or beta sheets. There are also multiple conserved phosphorylation sites in the amino acid sequence, at tyrosine 72 and serine 126.
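As a quick check on the claim that the domain covers a majority of the protein, the following sketch computes the covered fraction. It assumes residue positions 39 and 159 are both included in the domain and uses the 233-residue protein length quoted elsewhere in the article.

```python
DOMAIN_START = 39      # first residue of the conserved domain
DOMAIN_END = 159       # last residue of the conserved domain (assumed inclusive)
PROTEIN_LENGTH = 233   # length of the human protein in residues

domain_length = DOMAIN_END - DOMAIN_START + 1
coverage = domain_length / PROTEIN_LENGTH

print(f"{domain_length} residues, {coverage:.1%} of the protein")
```

On these numbers the domain is 121 residues long, or roughly 52% of the protein, so "majority" holds, though only narrowly.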

According to NCBI, [21] this domain belongs to the Pfam PF04669 family of proteins, which are expected to play a role in xylan biosynthesis in plant cell walls, although their exact role in the synthesis pathway is unknown. As animal cells do not have cell walls, the domain's function in other organisms such as humans is unknown.

Xylan is made from units of the pentose sugar xylose, which is the first saccharide in the biosynthetic pathways of several anionic polysaccharides such as heparan sulfate and chondroitin sulfate. Like xylan, heparan sulfate is found on the cell surface; [22] because it is needed for both the cell surface and the extracellular matrix, it may explain CXorf26's high expression in nearly all human tissues. Heparan sulfate biosynthesis occurs in the lumen of the endoplasmic reticulum [23] and is initiated by the transfer of a xylose from UDP-xylose by xylosyltransferase to specific serine residues within the protein core. PSORT II predicts a KKXX-like motif, GEKA, near the C-terminus of CXorf26. KKXX-like motifs are predicted endoplasmic reticulum membrane retention signals. This motif is conserved only in primates. However, another KKXX-like motif, QDKE, is found at the end of the domain, and the K in this motif is highly conserved back to most invertebrates. In contrast, NetNGlyc predicted no N-glycosylation sites, suggesting CXorf26 does not undergo special folding in the endoplasmic reticulum lumen. [24] Given that the conserved domain cannot function to create xylan, since animal cells have no cell walls, its function may instead be related to this heparan sulfate pathway.

Secondary structure

Predictions across multiple programs suggest that CXorf26 has 7 alpha helices and 2 beta sheets, most of them within the conserved domain. Experimental evidence for the yeast homolog shows 4 alpha helices and 2 beta sheets, all in the polysaccharide domain, [25] just as the predicted SWISS-MODEL structure above shows for the human protein. The locations of the secondary structures are also conserved.

Post-translational modifications

Pepsin (pH 1.3), Asp-N endopeptidase, N-terminal Glutamate, and Proteinase K were each predicted to have 50 or more cleavage sites within the protein, but none of the 10 caspases had any cleavage sites. [26] This suggests CXorf26 is unlikely to be cleaved or degraded during apoptosis, which is consistent with the observation that CXorf26 is highly expressed in nearly all tissues and experimental conditions.

Lysines 63 and 66 are potential sites of glycation of the epsilon amino group. [27] Lysine 63 is conserved in both Macaca mulatta and Bombus impatiens. There are 10 serine, 3 threonine, and 6 tyrosine phosphorylation sites predicted within the CXorf26 protein. The table below lists the predicted phosphorylation sites that are conserved in Macaca mulatta as well as Bombus impatiens. S127 was kept in the table even though Homo sapiens and Macaca mulatta did not score above threshold at that position. Over evolutionary time, the serine in Bombus was replaced by a tyrosine in Homo sapiens and Macaca mulatta; because tyrosine can still be phosphorylated, the substitution would likely not result in a large change to the protein or its function.

Bombus impatiens | Homo sapiens & Macaca mulatta
Serine 20 | Serine 23
Serine 91 | Serine 94
Tyrosine 69 | Tyrosine 72
Tyrosine 126 | Tyrosine 129
Serine 127* | Tyrosine 130*

Species distribution

CXorf26 is strongly evolutionarily conserved, [28] with conservation found as far back as the fungus Batrachochytrium dendrobatidis. A multiple sequence alignment of 20 orthologous protein sequences reveals very strong conservation of the polysaccharide biosynthesis domain, but conservation after the domain is essentially non-existent in invertebrates. [29] In vertebrates that have sequence after the conserved domain, it is of low complexity and filled with repeats of the amino acid motif 'GEK', corresponding to glycine, glutamic acid, and lysine. Glutamic acid and lysine are both charged, which contributes to the overall hydrophilicity of the section after the conserved domain.

Species | Common name | Accession number | Length | Protein identity | Protein similarity
Homo sapiens | Human | NP_057584.2 | 233 aa | 100% | 100%
Nomascus leucogenys | Gibbon | XP_003269034.1 | 233 aa | 99% | 99%
Macaca mulatta | Rhesus monkey | NP_001181035.1 | 233 aa | 98% | 98%
Callithrix jacchus | Marmoset | XP_002763066.1 | 232 aa | 95% | 97%
Mus musculus | Mouse | NP_080588.1 | 198 aa | 80% | 85%
Loxodonta africana | African elephant | XP_003412818.1 | 202 aa | 80% | 88%
Ailuropoda melanoleuca | Giant panda | XP_002930750.1 | 219 aa | 80% | 84%
Bos taurus | Cattle | XP_002700032.1 | 219 aa | 78% | 86%
Monodelphis domestica | Opossum | XP_001381973.1 | 226 aa | 59% | 89%
Oreochromis niloticus | Nile tilapia | XP_003453679.1 | 169 aa | 46% | 83%
Bombus impatiens | Bumblebee | XP_003487356.1 | 168 aa | 38% | 74%
Acromyrmex echinatior | Ant | EGI60293.1 | 197 aa | 32% | 74%
Amphimedon queenslandica | Sponge | XP_003383281.1 | 159 aa | 31% | 74%
Saccharomyces cerevisiae | Yeast | NP_015099.1 | 146 aa | 27% | 62%
Batrachochytrium dendrobatidis | Fungus | EGF83065.1 | 74 aa | 16% | 65%
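
Identity percentages like those in the table come from pairwise alignments. The sketch below shows one simple way such a figure can be computed; it is not the method used by the source, which relied on BLAST. The two aligned strings are invented toy sequences, and the choice to ignore gapped columns is an assumption, since tools differ in the denominator they use.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over columns where neither sequence has a gap.

    Both inputs must be rows of the same alignment, with '-' marking gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if res_a == res_b:
            identical += 1

    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Hypothetical toy alignment, not real CXorf26 sequence.
toy_a = "MKV-LEGEK"
toy_b = "MRVALE-EK"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Dividing by the full alignment length or by the query length instead would give lower values for the same alignment, so percentages from different tools are not always directly comparable.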

Yeast homolog YPL225W

The CXorf26 homolog in yeast, YPL225W, has an overall identity of 27%, but 42.4% identity and 62% similarity within the polysaccharide biosynthesis domain. Like the predicted human secondary structure, YPL225W has been experimentally verified to contain four alpha helices and two beta sheets within the biosynthesis domain. [30] Like CXorf26, the function of YPL225W in yeast is unknown, but co-purification experiments suggest it may interact with ribosomes, since many of its 18 interacting proteins are related to RNA and ribosomes. Several interacting proteins are associated with RNA polymerase, which carries out transcription, and several others are involved in ubiquitination. The interacting yeast proteins with the highest interaction scores include UBI4, RPB8, SRO9, and NAB2.

Interacting proteins

Potential interacting proteins were identified using the tools provided at the I2D Interlogous Interaction Database [31] and the STRING 9.0 program. [32] Although more proteins were predicted, those shown below had the highest scores and showed the greatest possibility of relating to potential CXorf26 function.

SMAD2, PHB, and CTNNB1 were found in an experiment investigating transcriptional factor networks. [33] The BABAM1 interaction was found in both databases using an anti-tag coimmunoprecipitation assay [34] while POLR2H was based on a tandem affinity purification assay using the yeast homolog, YPL225W. [35]

Interacting protein | Accession number | Protein function
SMAD2 | AAC39657.1 | Part of family acting as signal transducer and transcriptional modulator
PHB | CAG46507.1 | Evolutionarily conserved, ubiquitously expressed, negative regulator of cell proliferation
CTNNB1 | NP_001091679.1 | Catenin associated, part of protein complex that constructs adherens junctions
BABAM1 | NP_001028721.1 | Part of complex that recognizes Lys-63 ubiquitinated histones
BRIX1 | NP_060791.3 | Required for biogenesis of 60S large eukaryotic ribosomal subunit
POLR2H | NP_006223.2 | Encodes essential subunit of RNA Polymerase II

References

  1. GRCh38: Ensembl release 89: ENSG00000102390 - Ensembl, May 2017
  2. GRCm38: Ensembl release 89: ENSMUSG00000031226 - Ensembl, May 2017
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. Dhananjayan SC, Ismail A, Nawaz Z (2005). "Ubiquitin and control of transcription". Essays Biochem. 41: 69–80. doi:10.1042/EB0410069. PMID 16250898.
  6. Yang WL, Zhang X, Lin HK (August 2010). "Emerging role of Lys-63 ubiquitination in protein kinase and phosphatase activation and cancer development". Oncogene. 29 (32): 4493–503. doi:10.1038/onc.2010.190. PMC 3008764. PMID 20531303.
  7. GeneCard for CXorf26
  8. Aceview Gene Annotation
  9. Stevenson RE (2000). "Alpha-Thalassemia X-Linked Intellectual Disability Syndrome". In Pagon RA, Bird TD, Dolan CR, Stephens K, Adam MP, Stevenson RE (eds.). GeneReviews. Seattle: University of Washington. OCLC 61197798. PMID 20301622.
  10. Q96MV8
  11. Dezso Z, Nikolsky Y, Sviridov E, Shi W, Serebriyskaya T, Dosymbekov D, Bugrim A, Rakhmatulin E, Brennan RJ, Guryanov A, Li K, Blake J, Samaha RR, Nikolskaya T (2008). "A comprehensive functional analysis of tissue specificity of human gene expression". BMC Biol. 6: 49. doi:10.1186/1741-7007-6-49. PMC 2645369. PMID 19014478.
  12. NCBI GEO Profile GDS3510: Claudin-1 overexpression effect on lung adenocarcinoma cell line
  13. Chao YC, Pan SH, Yang SC, Yu SL, Che TF, Lin CW, et al. (January 2009). "Claudin-1 is a metastasis suppressor and correlates with clinical outcome in lung adenocarcinoma". Am. J. Respir. Crit. Care Med. 179 (2): 123–33. doi:10.1164/rccm.200803-456OC. PMID 18787218.
  14. Ensembl Genome Browser. http://useast.ensembl.org/Homo_sapiens/Gene/Summary?g=ENSG00000102390;r=X:75392771-75398039
  15. SoftBerry FGENESH
  16. Genomatix: Eldorado Genome Annotation and Browser. www.genomatix.de
  17. Nakai K, Horton P (1999). "PSORT: A program for detecting sorting signals in proteins and predicting their subcellular localization". Trends in Biochemical Sciences. 24 (1): 34–6. doi:10.1016/S0968-0004(98)01336-X. PMID 10087920.
  18. Huh WK, Falvo JV, Gerke LC, Carroll AS, Howson RW, Weissman JS, O'Shea EK (October 2003). "Global analysis of protein localization in budding yeast". Nature. 425 (6959): 686–91. doi:10.1038/nature02026. PMID 14562095. S2CID 669199.
  19. SDSC Biology Workbench: TMAP
  20. NCBI BLAST Assembled RefSeq Genomes
  21. NCBI Conserved Domain Database
  22. Sasisekharan R, Venkataraman G (December 2000). "Heparin and heparan sulfate: biosynthesis, structure and function". Curr Opin Chem Biol. 4 (6): 626–31. doi:10.1016/S1367-5931(00)00145-9. PMID 11102866.
  23. Pinhal MA, Smith B, Olson S, Aikawa J, Kimata K, Esko JD (November 2001). "Enzyme interactions in heparan sulfate biosynthesis: uronosyl 5-epimerase and 2-O-sulfotransferase interact in vivo". Proc. Natl. Acad. Sci. U.S.A. 98 (23): 12984–9. Bibcode:2001PNAS...9812984P. doi:10.1073/pnas.241175798. PMC 60811. PMID 11687650.
  24. ExPASy Tools
  25. "A Novel Solution NMR Structure of Protein yst0336 from Saccharomyces cerevisiae". https://www.ncbi.nlm.nih.gov/Structure/mmdb/mmdbsrv.cgi?uid=61478&Dopt=s
  26. ExPASy Tools: Peptide Cutter. http://expasy.org/tools/
  27. ExPASy Tools: NetGlycate. http://expasy.org/tools/
  28. NCBI BLAST Alignment Tool. http://blast.ncbi.nlm.nih.gov/Blast.cgi
  29. SDSC Biology Workbench tools
  30. Wu B, Yee A, Fares C, Lemak A, Gutmanas A, Semest A, Arrowsmith CH. "A Novel Solution NMR Structure of Protein yst0336 from Saccharomyces cerevisiae". https://www.ncbi.nlm.nih.gov/Structure/mmdb/mmdbsrv.cgi?uid=61478&Dopt=s
  31. I2D Protein Interaction Database
  32. STRING 9.0 Protein Interaction Predictor
  33. Miyamoto-Sato E, Fujimori S, Ishizaka M, Hirai N, Masuoka K, Saito R, et al. (February 2010). "A comprehensive resource of interacting protein regions for refining human transcription factor networks". PLOS ONE. 5 (2): e9289. Bibcode:2010PLoSO...5.9289M. doi:10.1371/journal.pone.0009289. PMC 2827538. PMID 20195357.
  34. Sowa ME, Bennett EJ, Gygi SP, Harper JW (July 2009). "Defining the human deubiquitinating enzyme interaction landscape". Cell. 138 (2): 389–403. doi:10.1016/j.cell.2009.04.042. PMC 2716422. PMID 19615732.
  35. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. (March 2006). "Global landscape of protein complexes in the yeast Saccharomyces cerevisiae". Nature. 440 (7084): 637–43. Bibcode:2006Natur.440..637K. doi:10.1038/nature04670. PMID 16554755. S2CID 72422.