Chitinase domain-containing protein 1

CHID1
Identifiers
Aliases: CHID1, SI-CLP, SICLP, GL008, chitinase domain containing 1
External IDs: OMIM: 615692; MGI: 1915288; HomoloGene: 12229; GeneCards: CHID1
Orthologs

| Species | Human | Mouse |
| --- | --- | --- |
| Entrez | | |
| Ensembl | | |
| UniProt | | |
| RefSeq (mRNA) | NM_001142674, NM_001142675, NM_001142676, NM_001142677, NM_023947 | NM_001142681, NM_026522, NM_001369350, NM_001369351, NM_001369352 |
| RefSeq (protein) | NP_001136146, NP_001136147, NP_001136148, NP_001136149, NP_076436 | NP_001136153, NP_080798, NP_001356279, NP_001356280, NP_001356281 |
| Location (UCSC) | Chr 11: 0.87 – 0.92 Mb | Chr 7: 141.07 – 141.12 Mb |
| PubMed search | [3] | [4] |

Chitinase domain-containing protein 1 (CHID1) is a highly conserved protein of unknown function. It is encoded by the CHID1 gene, located on the short (p) arm of chromosome 11 near the telomere. [5] The gene has 27 introns, which allows for many isoforms. It has several aliases, the most common of which is Stabilin-1 interacting chitinase-like protein (SI-CLP). As the alias indicates, CHID1 is known to interact with the protein STAB1. [6] CHID1 is expressed ubiquitously at nearly six times the level of the average gene, [7] and is conserved in organisms as distant as Caenorhabditis elegans and possibly in some prokaryotes. The protein has carbohydrate binding sites, which could be involved in carbohydrate catabolism.

Gene

CHID1 is located on chromosome 11 at band p15.5. [8] It lies just downstream of TSPAN4 and upstream of AP2A2. [9] CHID1 is ubiquitously expressed at high levels: microarray analysis shows that it is generally expressed at 5.7 times the level of the average gene. [7] CHID1 has many known variants, which is attributed to its 37 exons. No repeats or hairpin structures are found within the coding region of CHID1. The gene is a member of the GH18 superfamily, which dictates some of its protein structure. Its most common alias, Stabilin-1 interacting chitinase-like protein (SI-CLP), reflects its known interaction with STAB1. [10]

Variations

Because of its large size and many exons, CHID1 has many transcript variants, which have been identified through mRNA sequencing. CHID1 has 27 exons and 22 known splice forms. [7] These forms suggest that multiple promoter regions and transcription start sites may be used within the gene. The most commonly found transcripts translate to about 400 amino acids each.
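As a rough worked example of what "about 400 amino acids" implies for transcript size, the sketch below converts a protein length into the length of its coding sequence, using the standard rule of three nucleotides per codon plus one stop codon. The function name is hypothetical, and the figure of 400 is the approximate value quoted above, not a measured length for any particular CHID1 transcript.

```python
def coding_sequence_length(n_amino_acids: int, include_stop: bool = True) -> int:
    """Return the coding-sequence length in nucleotides for a protein.

    Each amino acid is encoded by one codon of three nucleotides; the
    stop codon adds one more codon when include_stop is True.
    """
    codons = n_amino_acids + (1 if include_stop else 0)
    return 3 * codons


# Approximate protein length quoted in the text (an illustrative value only).
print(coding_sequence_length(400))  # 1203
```

Untranslated regions are not counted, so the full mRNA would be longer than this estimate.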

Homology

CHID1 shows a very high level of conservation. It has been identified in many model organisms, including Drosophila melanogaster, Caenorhabditis elegans, and Oryza sativa. [11] When conservation is examined across the full sequence, it becomes clear which regions are most important to CHID1. By far the best conserved features are the exon junctions and several known carbohydrate binding sites. Another important region appears to be the last 15 amino acids, which remain highly conserved even in insect sequences.

A multiple sequence alignment of several CHID1 homologs with highlighted conserved sequences and important features indicated.
| Species | Common name | Divergence (millions of years) | Accession number | Similarity to human sequence |
| --- | --- | --- | --- | --- |
| Gorilla gorilla gorilla | Gorilla | 8.8 | XP_004050445.1 | 100% |
| Macaca mulatta | Rhesus macaque | 29.0 | AFI38071.1 | 94% |
| Equus caballus | Horse | 94.2 | XP_003362692.1 | 91% |
| Felis catus | House cat | 94.2 | XP_003993852.1 | 90% |
| Taeniopygia guttata | Finch | 296 | XP_002199280.2 | 78% |
| Ictalurus punctatus | Catfish | 400.1 | NP_001187344.1 | 70% |
| Apis florea | Honey bee | 782.7 | XP_003691984.1 | 61% |
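
The trend in the table, lower similarity at greater divergence time, can be checked with a short calculation. The sketch below is illustrative only: it hard-codes the seven rows of the table and fits an ordinary least-squares line through them. The variable and function names are assumptions made for this example and do not come from any of the cited analyses.

```python
# Rows copied from the ortholog table:
# (species, divergence in millions of years, percent similarity to human CHID1)
ORTHOLOGS = [
    ("Gorilla gorilla gorilla", 8.8, 100),
    ("Macaca mulatta", 29.0, 94),
    ("Equus caballus", 94.2, 91),
    ("Felis catus", 94.2, 90),
    ("Taeniopygia guttata", 296.0, 78),
    ("Ictalurus punctatus", 400.1, 70),
    ("Apis florea", 782.7, 61),
]


def least_squares_slope(points):
    """Return the slope of the ordinary least-squares line through (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


points = [(mya, pct) for _, mya, pct in ORTHOLOGS]
slope = least_squares_slope(points)

# A negative slope means similarity falls as divergence time grows.
print(f"Change in similarity per million years: {slope:.3f} percentage points")
```

With only seven species, the fitted slope is a rough summary of the table rather than an estimate of an evolutionary rate.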

Paralogs

There are also several proposed paralogs of CHID1: di-N-acetylchitobiase, chitinase-3-like protein 2, and chitotriosidase-1. Each has roughly 40% similarity to CHID1 and aligns with human CHID1 transcripts over 40-50% of the gene. [12] Based on this level of conservation, the paralogs likely all diverged at a time close to the split from bacteria.

Phylogenetic tree based on the differences in various CHID1 homologs and its paralogs.
| Name | Accession number | Length (amino acids) | Similarity to CHID1 in humans |
| --- | --- | --- | --- |
| di-N-acetylchitobiase | NP_004379.1 | 385 | 38% |
| Chitinase-3-like protein 2 | NP_001020368.1 | 380 | 39% |
| Chitotriosidase-1 | NP_001243054.2 | 447 | 40% |
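
The accession numbers in the ortholog and paralog tables follow the usual `accession.version` layout, and RefSeq accessions carry a prefix that indicates the record type (for example `NP_` for protein records and `XP_` for computationally predicted protein records). The sketch below shows one way to split and label such identifiers. The helper name and the label strings are hypothetical, and only four common RefSeq prefixes are covered.

```python
# A few common RefSeq accession prefixes and the record type each denotes.
REFSEQ_PREFIXES = {
    "NM_": "mRNA",
    "NP_": "protein",
    "XM_": "predicted mRNA",
    "XP_": "predicted protein",
}


def describe_accession(accession: str) -> dict:
    """Split 'ACCESSION.VERSION' and label the record by its RefSeq prefix."""
    base, _, version = accession.partition(".")
    record_type = REFSEQ_PREFIXES.get(base[:3], "other")
    return {
        "accession": base,
        "version": int(version) if version else None,
        "type": record_type,
    }


# Identifiers taken from the tables in this article.
for acc in ("NP_004379.1", "XP_002199280.2", "AFI38071.1", "NM_023947"):
    print(acc, describe_accession(acc))
```

Identifiers without one of the listed prefixes, such as AFI38071.1, are simply labeled "other" here rather than classified further.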

Protein

The protein product of CHID1 is typically about 400 amino acids long, though this varies among the many known forms, and has few post-translational modifications. [13] In most forms and orthologs, CHID1 is predicted to have a signal peptide whose length varies by sequence. [14] Structure predictions and the X-ray crystal structure of SI-CLP indicate that its secondary structure is rich in alpha helices, with some beta strands also present. CHID1 is also a member of the GH18 superfamily of proteins, which has a unique conserved structure. [15] Members of this family contain an eight-stranded beta/alpha barrel. The protein has also been shown to interact with copies of itself as a homodimer, which also involves sulfate ion interactions of unknown purpose.

The barrel structure of CHID1 found by x-ray crystallography

Expression

An expression profile analysis of CHID1 in normal tissues

The expression of CHID1 is consistently shown to be ubiquitous and higher than average in all human tissues. [16] [17] [18] Analysis of the human CHID1 promoter region suggests an explanation: it contains a large number of binding sites for transcription factors that are active in a wide range of tissues or are themselves ubiquitous. [19] In Drosophila melanogaster, tissue expression varies widely between larval and adult tissue. [20] Expression is extremely widespread in adults, but not at as high a level as in humans. In larvae, CHID1 is expressed only in specific tissues, and is up-regulated in very different tissues than in adults: larval expression is highest in the salivary gland, whereas adult expression is by far highest in the male accessory gland. [20] Although expression is generally high in some species, it is not necessarily abnormal. In situ hybridization of gene transcripts in mouse brains shows that CHID1 is expressed in a relatively normal pattern. [21] The pattern of high and low expression within the brain is very similar to that of another widely expressed gene, beta actin. [22] This is one indication that CHID1 may show normal expression patterning in mammals, despite its general upregulation.

Function

Although the function of CHID1 is not known with certainty, several activities have been proposed based on current knowledge. The protein may participate in metabolic processes such as chitin catabolism or carbohydrate metabolism. It may localize to various cellular compartments, such as the cytoplasm or lysosomes, depending on certain post-translational modifications. [23] Another study proposed that CHID1 may have a role in pathogen sensing. [15] The function of CHID1 may also depend on its putative binding partner, STAB1, which is proposed to participate in cell signaling and defense against bacteria. [24]

Interactions

Only one protein is strongly suggested to interact with CHID1. The transmembrane protein stabilin-1 (STAB1) has been detected as an interactant by in vitro, in vivo, and yeast two-hybrid assays. [10] STAB1 is a large transmembrane receptor that may function in processes such as lymphocyte homing and angiogenesis. It is expressed at over twice the level of the average gene and is expected to play a role in cell defense against bacteria. [25] This interaction may give insight into the function of both proteins.

Clinical Significance

Overall, the clinical significance of CHID1 is unknown. Expression profiles show that CHID1 expression does not change in any of the common diseases or conditions studied so far. [26] Given how little is known about the function of CHID1, it is difficult to study its role in human health and disease further.

References

  1. GRCh38: Ensembl release 89: ENSG00000177830 - Ensembl, May 2017
  2. GRCm38: Ensembl release 89: ENSMUSG00000025512 - Ensembl, May 2017
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. Taylor TD, Noguchi H, Totoki Y, Toyoda A, Kuroki Y, Dewar K, Lloyd C, Itoh T, Takeda T, Kim DW, She X, Barlow KF, Bloom T, Bruford E, Chang JL, Cuomo CA, Eichler E, FitzGerald MG, Jaffe DB, LaButti K, Nicol R, Park HS, Seaman C, Sougnez C, Yang X, Zimmer AR, Zody MC, Birren BW, Nusbaum C, Fujiyama A, Hattori M, Rogers J, Lander ES, Sakaki Y (Mar 2006). "Human chromosome 11 DNA sequence and analysis including novel gene identification". Nature. 440 (7083): 497–500. Bibcode:2006Natur.440..497T. doi:10.1038/nature04632. PMID 16554811.
  6. "CHID1 Gene". GeneCards. Weizmann Institute of Science. Retrieved 30 April 2013.
  7. "Homo sapiens complex locus CHID1, encoding chitinase domain containing 1". AceView. NCBI. Retrieved 30 April 2013.
  8. GeneCards Human Gene Database. "CHID1 Gene - GeneCards | CHID1 Protein | CHID1 Antibody". GeneCards. Retrieved 2013-07-26.
  9. "Human chr11:867,357-915,058 - UCSC Genome Browser v286". Genome.ucsc.edu. Retrieved 2013-07-26.
  10. Kzhyshkowska J, Mamidi S, Gratchev A, Kremmer E, Schmuttermaier C, Krusell L, Haus G, Utikal J, Schledzewski K, Scholtze J, Goerdt S (Apr 2006). "Novel stabilin-1 interacting chitinase-like protein (SI-CLP) is up-regulated in alternatively activated macrophages and secreted via lysosomal pathway". Blood. 107 (8): 3221–8. doi:10.1182/blood-2005-07-2843. PMID 16357325.
  11. "HomoloGene Home". Ncbi.nlm.nih.gov. Retrieved 2013-07-26.
  12. "BLAST: Basic Local Alignment Search Tool". Blast.ncbi.nlm.nih.gov. 2013-06-20. Retrieved 2013-07-26.
  13. "Welcome to". Psort.org. 2010-05-17. Retrieved 2013-07-26.
  14. "LipoP 1.0 Server". Cbs.dtu.dk. Retrieved 2013-07-26.
  15. Meng G, Zhao Y, Bai X, Liu Y, Green TJ, Luo M, Zheng X (Dec 2010). "Structure of human stabilin-1 interacting chitinase-like protein (SI-CLP) reveals a saccharide-binding cleft with lower sugar-binding selectivity". The Journal of Biological Chemistry. 285 (51): 39898–904. doi:10.1074/jbc.M110.130781. PMC 3000971. PMID 20724479.
  16. "BioGPS - your Gene Portal System".
  17. Danielle Thierry-Mieg; Jean Thierry-Mieg, NCBI/NLM/NIH. "AceView: Gene:CHID1, a comprehensive annotation of human, mouse and worm genes with mRNAs or ESTsAceView". Ncbi.nlm.nih.gov. Retrieved 2013-07-26.
  18. GeneCards Human Gene Database. "CHID1 Gene - GeneCards | CHID1 Protein | CHID1 Antibody". GeneCards. Retrieved 2013-07-26.
  19. "Genomatix: Login Page". Genomatix.de. Retrieved 2013-07-26.
  20. "FlyBase Gene Report: Dmel\CG8460". Flybase.org. Retrieved 2013-07-26.
  21. "Experiment Detail :: Allen Brain Atlas: Mouse Brain". Mouse.brain-map.org. Retrieved 2013-07-26.
  22. "Experiment Detail :: Allen Brain Atlas: Mouse Brain". Mouse.brain-map.org. Retrieved 2013-07-26.
  23. Kzhyshkowska J, Mamidi S, Gratchev A, Kremmer E, Schmuttermaier C, Krusell L, Haus G, Utikal J, Schledzewski K, Scholtze J, Goerdt S (Apr 2006). "Novel stabilin-1 interacting chitinase-like protein (SI-CLP) is up-regulated in alternatively activated macrophages and secreted via lysosomal pathway". Blood. 107 (8): 3221–8. doi:10.1182/blood-2005-07-2843. PMID 16357325.
  24. Danielle Thierry-Mieg; Jean Thierry-Mieg, NCBI/NLM/NIH. "AceView: Gene:STAB1, a comprehensive annotation of human, mouse and worm genes with mRNAs or ESTsAceView". Ncbi.nlm.nih.gov. Retrieved 2013-07-26.
  25. Danielle Thierry-Mieg; Jean Thierry-Mieg, NCBI/NLM/NIH. "AceView: Gene:STAB1, a comprehensive annotation of human, mouse and worm genes with mRNAs or ESTsAceView". Ncbi.nlm.nih.gov. Retrieved 2013-07-26.
  26. "Home - GEO Profiles - NCBI". Ncbi.nlm.nih.gov. 2013-03-25. Retrieved 2013-07-26.