GOLGA8H

Identifiers
Aliases: GOLGA8H, GOLGA6L11, golgin A8 family member H
External IDs: HomoloGene: 85511; GeneCards: GOLGA8H
Orthologs
Species | Human | Mouse
Entrez
Ensembl
UniProt
RefSeq (mRNA) | NM_001282490 | n/a
RefSeq (protein) | NP_001269419 | n/a
Location (UCSC) | Chr 15: 30.6 – 30.62 Mb | n/a
PubMed search | [2] | n/a

Golgin subfamily A member 8H, also known as GOLGA8H, is a protein that in Homo sapiens is encoded by the GOLGA8H gene. GOLGA8H is associated with Golgi organization, the cellular process that results in the assembly, arrangement of constituent parts, or disassembly of the Golgi apparatus.


Gene

Aliases

The most common aliases for GOLGA8H are GOLGA6L11 and golgin A8 family member H. [3]

Prevalence and Location

Unlike many other genes, GOLGA8H is found at many different locations spanning multiple chromosomes. [4] NCBI lists the gene's location on the long (q) arm of chromosome 15 in the q13.2 region, from 30,604,030 to 30,617,827 (13,798 nt in length). [3]
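The quoted length follows from treating the two coordinates as a closed, 1-based interval. A minimal sketch of that arithmetic, assuming both endpoints are included (the function name is illustrative, not part of any genome browser API):

```python
def inclusive_length(start: int, end: int) -> int:
    """Length of a 1-based, fully closed interval [start, end]."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates quoted in the text for GOLGA8H on chromosome 15 (q13.2).
gene_start, gene_end = 30_604_030, 30_617_827
print(inclusive_length(gene_start, gene_end))  # 13798
```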

Location of GOLGA8H on chromosome 15 q13.2 region

When the FASTA protein sequence of GOLGA8H is run through BLAT (the BLAST-Like Alignment Tool), it matches 85 or 87 different locations, depending on whether the individual carries a Y chromosome. [4] Of these, 81 copies lie on chromosome 15, one copy each on chromosomes 7, 9, 10, and 12, and two copies on the Y chromosome. [4]

Copies of GOLGA8H in Homo sapiens

Chromosome | Number of Copies
Chromosome 7 | 1
Chromosome 9 | 1
Chromosome 10 | 1
Chromosome 12 | 1
Chromosome 15 | 81
Y Chromosome | 2
TOTAL | 85 (XX chromosomes*); 87 (XY chromosomes*)
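The two totals can be checked by summing the per-chromosome counts given in the text (one copy each on chromosomes 7, 9, 10, and 12, 81 on chromosome 15, and two on Y). The sketch below is only a tally; the dictionary and function names are illustrative:

```python
# Copy counts per chromosome as stated in the text (BLAT matches).
autosomal_copies = {"7": 1, "9": 1, "10": 1, "12": 1, "15": 81}
y_copies = 2


def total_copies(has_y: bool) -> int:
    """Sum the autosomal copies and add the Y-linked copies when a Y is present."""
    return sum(autosomal_copies.values()) + (y_copies if has_y else 0)


print(total_copies(has_y=False))  # 85
print(total_copies(has_y=True))   # 87
```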

Neighborhood

Because GOLGA8H maps to as many as 87 locations, only the gene neighborhood on chromosome 15 in the q13.2 region, as listed by NCBI, is given here: [3]

Gene Neighborhood of GOLGA8H on Chromosome 15 q13.2 in Homo sapiens [5]:

Gene | Additional Information
LOC106736468 | Gamma inversion proximal recombination region
ARHGAP11B | Rho GTPase activating protein
LOC106736476 | Proximal CHRNA7 low-copy repeat recombination region
DNM1P50 | Pseudogene of DNM1, which is involved in producing microtubule bundles and is able to bind and hydrolyze GTP
LOC106736480 | Proximal microdeletion recombination region
ULK4P2 | Pseudogene of ULK4, which encodes a member of the unc-51-like serine/threonine kinase (STK) family, whose members play a role in neuronal growth and endocytosis
RN7SL628P | Pseudogene stemming from cytoplasmic 7SL, an RNA component of the SRP (signal recognition particle)
LOC106783506 | Nonconserved acetylation island sequence 49 enhancer, which can function as an enhancer in Jurkat T cells

Transcript

There are no isoforms of GOLGA8H. [3]

Multiple Sequence Alignment

Paralogs

A multiple sequence alignment (MSA) of GOLGA8H and its top seven paralogs was created using Clustal Omega. [1] [Appendix A] All eight proteins from the Golgin Subfamily A Member 8 group were 632 amino acids in length. [1] All 632 positions were compared across the eight sequences in an attempt to understand what makes Golgin Subfamily A Member 8H, GOLGA8H, a distinct entity. Two amino acids make GOLGA8H unique: valine at position 32 and cysteine at position 169. [6] In all seven paralogs, position 32 is isoleucine and position 169 is arginine. [6]
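The comparison described above amounts to scanning the alignment column by column for positions where the paralogs agree with one another but differ from the target. A minimal sketch, assuming the sequences are already aligned and of equal length; the function name and the toy sequences are hypothetical and are not real GOLGA8 data:

```python
def unique_positions(target: str, others: list[str]) -> list[tuple[int, str, str]]:
    """Return (1-based position, target residue, shared residue) for every
    column where all other sequences agree but the target differs."""
    if any(len(seq) != len(target) for seq in others):
        raise ValueError("all sequences must be aligned to the same length")
    hits = []
    for i, residue in enumerate(target):
        column = {seq[i] for seq in others}
        # Skip gap characters and columns where the others disagree.
        if residue != "-" and len(column) == 1 and residue not in column:
            hits.append((i + 1, residue, next(iter(column))))
    return hits


# Hypothetical, pre-aligned toy sequences (not taken from the real alignment).
target = "MAVKC"
paralogs = ["MAIKR", "MAIKR", "MAIKR"]
print(unique_positions(target, paralogs))  # [(3, 'V', 'I'), (5, 'C', 'R')]
```

Applied to the real alignment, a scan of this kind would be expected to report positions 32 and 169, as stated in the text.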

Protein

The predicted molecular weight of GOLGA8H, rounded to three significant figures, is 71.3 kDa. [7] This is a theoretical value, since predicted molecular weights are based only on the amino acids present in the protein. The theoretical isoelectric point (pI) of GOLGA8H, rounded to one significant figure, is 8. [7]
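A theoretical molecular weight of this kind is obtained by summing average residue masses and adding one water molecule for the free termini. The sketch below illustrates that calculation with standard average residue masses; it is not the ExPASy implementation, and the function name and test peptide are illustrative only:

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added back for the N- and C-termini


def molecular_weight_da(sequence: str) -> float:
    """Theoretical average molecular weight of an unmodified peptide, in Da."""
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


# Toy check with the dipeptide glycylglycine (about 132.12 Da).
print(round(molecular_weight_da("GG"), 2))  # 132.12
```

Dividing the result by 1000 gives the value in kDa, the unit used in the text.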

Composition

Compared with other human proteins, GOLGA8H is moderately enriched in glutamine and glutamate. [8] In contrast, GOLGA8H is depleted in threonine, phenylalanine, and tyrosine. [8]
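Enrichment and depletion statements like these rest on the percentage of each residue in the sequence, which is then compared against a reference proteome. A minimal sketch of the first step only, using a hypothetical toy sequence; the comparison against human averages performed by SAPS is not reproduced here:

```python
from collections import Counter


def composition_percent(sequence: str) -> dict[str, float]:
    """Percentage of each amino acid in the sequence."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    counts = Counter(sequence.upper())
    total = sum(counts.values())
    return {aa: 100.0 * n / total for aa, n in counts.items()}


# Hypothetical toy sequence, not the GOLGA8H protein.
comp = composition_percent("QQEEQKLAQE")
print(comp["Q"], comp["E"])  # 40.0 30.0
```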

Amino Acid Multiplets in GOLGA8H retrieved via Statistical Analysis of Protein Sequences (SAPS)

There are no charge runs, hydrophobic segments, or transmembrane domains in the GOLGA8H protein. [8] The protein contains 62 amino acid multiplets, which is above the expected range. [8] It also has amino acid patterns with high periodicity. [8]

Motifs

There are 11 motifs present in GOLGA8H. [9] The single experimentally verified motif is a glutamine-rich region spanning amino acids 323-416.

GOLGA8H Motifs [9]

Motif # | Motif Information
1 | N-Glycosylation site
2 | cAMP- and cGMP-dependent protein kinase phosphorylation site
3 | Casein kinase II phosphorylation site
4 | N-myristoylation site
5 | Protein kinase C phosphorylation site
6 | Alanine-rich region profile
7 | Glutamine-rich region profile (experimentally verified)
8 | K-box domain profile
9 | Bipartite nuclear localization signal profile
10 | HCaRG protein
11 | Involucrin repeat

Post-Translational Modifications

GOLGA8H is predicted to undergo phosphorylation at multiple serine, threonine, and tyrosine residues throughout its structure. [10] Phosphorylation is predicted most frequently on serine residues. [10] Furthermore, there is one predicted N-linked glycosylation site, at amino acid 39. [10] The sequence at this site is NGS. [10] N-linked glycosylation functions intrinsically and extrinsically to assist in regulating the migration patterns of cells. [11]
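The NGS sequence fits the general N-glycosylation sequon Asn-X-Ser/Thr, where X is any residue except proline; the PROSITE pattern additionally excludes proline immediately after the Ser/Thr. A minimal sketch of a scan for that pattern, using a hypothetical toy sequence rather than the GOLGA8H protein (dedicated predictors apply further criteria beyond this motif):

```python
import re

# PROSITE-style pattern N-{P}-[ST]-{P}; the lookahead permits overlapping matches.
SEQUON = re.compile(r"(?=(N[^P][ST][^P]))")


def find_sequons(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based position of the Asn, matched four-residue window)."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(sequence.upper())]


# Hypothetical toy sequence: NGS at position 4 matches, NPT at position 9 does not.
print(find_sequons("MKANGSLPNPTA"))  # [(4, 'NGSL')]
```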

Primary Sequence

The protein is 632 amino acids long. [3] The gene has 19 exons and two polyadenylation signals. [3] The sequence around its start codon only partially matches the Kozak consensus sequence. [3]

Amino Acid Periodicity in GOLGA8H retrieved via Statistical Analysis of Protein Sequences (SAPS)

Secondary Structure

The predicted secondary structure of GOLGA8H is composed of 81% alpha helices, 25.6% beta sheets, and 17.2% turns. [12] These percentages sum to more than 100%, which suggests that the predicted categories overlap.

Using Phyre2, 284 residues (45% of GOLGA8H) were modeled with 97.8% confidence by the single highest-scoring template. [13] This structure shows an extremely high proportion of alpha helices: [13]

284 residues (45% of GOLGA8H) were modeled with 97.8% confidence by the single highest-scoring template. The N-terminus begins on the red side and goes down the rainbow to the C-terminus (blue side).

Tertiary Structure

A predicted model of the tertiary structure of GOLGA8H was generated using I-TASSER. [14]

I-TASSER Predicted Tertiary Structure of GOLGA8H. The N-terminus begins on the red side and goes down the rainbow to the C-terminus (blue side).

Transcript level regulation

Promoter

There is one promoter for the GOLGA8H gene, GXP_2235212, which is 1197 nt long. [15] It spans base pairs 30,603,030 to 30,604,226 on the positive strand. [15]
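The promoter length and its position relative to the gene can be checked from the coordinates quoted in this article. The sketch below assumes that the promoter and gene coordinates refer to the same assembly and are closed, 1-based intervals; the helper names are illustrative:

```python
Interval = tuple[int, int]

# Coordinates quoted in the text (chromosome 15, positive strand).
promoter: Interval = (30_603_030, 30_604_226)
gene: Interval = (30_604_030, 30_617_827)


def inclusive_length(interval: Interval) -> int:
    """Length of a closed interval, counting both endpoints."""
    return interval[1] - interval[0] + 1


def overlap_length(a: Interval, b: Interval) -> int:
    """Number of positions shared by two closed intervals (0 if disjoint)."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return max(0, end - start + 1)


print(inclusive_length(promoter))      # 1197
print(gene[0] - promoter[0])           # 1000 bases upstream of the gene start
print(overlap_length(promoter, gene))  # 197 bases overlapping the gene
```

Under these assumptions the promoter covers 1000 nt upstream of the annotated gene start and extends 197 nt into the gene.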

Transcription Factor Binding Sites

Several transcription factors are predicted to bind to the promoter sequence. [15]

Homology and evolution

Paralogs

GOLGA8H has several dozen paralogs. Seven of them share more than 90% sequence similarity with GOLGA8H; they are listed below, with GOLGA8H itself included as a reference point: [3]

GOLGA8H Paralogs (>90% Similarity) [3]

# | Gene Name | Accession # | Similarity (%)
1. | GOLGA8H | NP_001269419.1 | 100.0
2. | GOLGA8J | NP_001269401.1 | 97.8
3. | GOLGA8T | NP_001342398.1 | 97.2
4. | GOLGA8K | NP_001269422.1 | 93.5
5. | GOLGA8I pseudogene | A6NC78.2 | 96.0
6. | GOLGA8M | NP_001269397.1 | 95.9
7. | GOLGA8O | NP_001264237.1 | 90.7
8. | GOLGA8N | NP_001269423.1 | 90.4
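The rows above are not listed in strict order of similarity. A short sketch that applies the 90% threshold and ranks the seven paralogs by the values in the table; the variable names are illustrative:

```python
# (gene name, similarity %) pairs taken from the paralog table; GOLGA8H itself is omitted.
paralogs = [
    ("GOLGA8J", 97.8), ("GOLGA8T", 97.2), ("GOLGA8K", 93.5),
    ("GOLGA8I pseudogene", 96.0), ("GOLGA8M", 95.9),
    ("GOLGA8O", 90.7), ("GOLGA8N", 90.4),
]

# Keep entries above the threshold, then rank from most to least similar.
above_90 = [p for p in paralogs if p[1] > 90.0]
ranked = sorted(above_90, key=lambda p: p[1], reverse=True)

for name, similarity in ranked:
    print(f"{name}\t{similarity:.1f}")
```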

Orthologs

Running the amino acid sequence of GOLGA8H through a protein BLAST at NCBI does not yield any hits for orthologs. [3] However, running the same sequence through BLAT (the BLAST-Like Alignment Tool) yields multiple orthologs. [4]

GOLGA8H Orthologs (Inclusion criteria: 1+ Characterized Chromosome Locations or 50+ BLAT Hits)

Organism Common Name | Scientific Name | Divergence (MYA) [16] | BLAT Hits | Main Chromosome Location | Other Chromosome Locations
Human | Homo sapiens | - | 87 | Chromosome 15 | Chromosomes 7, 9, 10, 12, Y
Rhesus Macaque | Macaca mulatta | 6 | 70 | Uncharacterized** | Chromosomes 2, 3, 7, 9, 11, 15
Golden Snub-Nosed Monkey | Rhinopithecus roxellana | 6 | 54 | Uncharacterized** | -
Olive Baboon | Papio anubis | 9 | 45 | Uncharacterized** | Chromosomes 2, 7, 9, 11
Gorilla | Gorilla gorilla | 15 | 37 | Chromosome 15 | Chromosomes 7, 10, 12
Crab-Eating Macaque | Macaca fascicularis | 20 | 36 | Chromosome 7 | Chromosomes 2, 3, 6, 9, 11, 15
Chimpanzee | Pan troglodytes | 29 | 35 | Chromosome 15 | Chromosomes 7, 8, 12, Y
Bornean Orangutan | Pongo pygmaeus | 29 | 34 | Chromosome 15 | Chromosomes 5, 7, 9, 10, 12, 19
Bonobo | Pan paniscus | 29 | 25 | Chromosome 15 | Chromosomes 6, 7, 9, 10, 12
Northern White-Cheeked Gibbon | Nomascus leucogenys | 29 | 24 | Chromosome 6 | Chromosomes 5, 8, 10, 16, 17, 18
Green Monkey | Chlorocebus sabaeus | 29 | 21 | Chromosome 26 | Chromosomes 9, 11, 12, 21, 22, 29
Proboscis Monkey | Nasalis larvatus | 29 | 19 | Chromosome 7 | Chromosomes 3, 9, 11, 15
Common Marmoset | Callithrix jacchus | 42 | 4 | Chromosome 6 | Chromosome 9
Horse (Domesticated) | Equus ferus caballus | 89 | 3 | Chromosome 29 | Chromosome 25
Gray Short-Tailed Opossum | Monodelphis domestica | 94 | 3 | Chromosome 3 | -
Common House Mouse | Mus musculus | 94 | 3 | Chromosome 11 | -
Dog (Domesticated) | Canis familiaris | 94 | 3 | Chromosome 15 | -
Taurine Cow | Bos taurus | 160 | 1 | Chromosome 13 | -

**Chromosomes labeled as 'uncharacterized' have clone contigs (an assembled set of overlapping DNA sequences) that cannot be confidently placed on a specific chromosome. Similar contigs are concatenated together into short pseudo-chromosomes.

Expression

Data from NCBI show that GOLGA8H in Homo sapiens is most strongly expressed in the thyroid and testis, with RPKMs of 12.2 and 12.1, respectively. It is also expressed at lower levels in 25 other tissues. [3] Data from a GEO DataSet show that tissue expression is highest in bone marrow and pancreas. [17] However, samples from all tissues were above the 90th percentile, indicating that GOLGA8H expression is much higher than that of most other genes on the array. [17]
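RPKM (reads per kilobase of transcript per million mapped reads) normalizes a raw read count by both transcript length and sequencing depth. A minimal sketch of the standard formula; the input numbers are hypothetical and are not GOLGA8H data:

```python
def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total reads must be positive")
    # 1e9 combines the per-kilobase (1e3) and per-million (1e6) scaling factors.
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 2,000 bp transcript in a 20-million-read library.
print(rpkm(read_count=500, gene_length_bp=2_000, total_mapped_reads=20_000_000))  # 12.5
```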

Normal Human Tissue Expression Profiling for GOLGA8H. Two samples are used for each tissue type. Red bars represent count and blue squares represent relative percentile rank within the sample.


When GOLGA8H tissue expression under abnormal conditions is compared with normal human tissue levels, there is no significant deviation in its expression for any variable. [17] This supports the notion that GOLGA8H is ubiquitously expressed.

Interactions

GOLGA8H has been shown to interact with ubiquitin C (UBC). [18] UBC is a polyubiquitin precursor, a chain of ubiquitin units that can be turned into an active form by post-translational modifications. Ubiquitin can mark proteins for degradation, alter their cellular location, affect their activity, and promote or prevent protein interactions. Further research on the link between ubiquitin and the Golgi apparatus indicates that certain processes around the Golgi apparatus rely on ubiquitin. [19] [20]

STRING lists the following genes as interacting with GOLGA8H: [21]

Genes Interacting with GOLGA8H:

Gene Name | Full Name | Accession Number [5] | Experimentally Determined | Coexpression | Text Mining
STX5 | Syntaxin 5 | NC_000011.10 | | |
GORASP1 | Golgi reassembly stacking protein 1 | NC_000003.12 | | |
GOSR1 | Golgi SNAP receptor complex member 1 | NC_000017.1 | | |
USO1 | USO1 vesicle transport factor | NC_000004.12 | | |
GOLGB1 | Golgin B1 | NC_000003.12 | | |

Splice variants

The Homo sapiens GOLGA8H gene has one splice variant. [22]


References

  1. GRCh38: Ensembl release 89: ENSG00000261794 - Ensembl, May 2017
  2. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  3. "GOLGA8H golgin A8 family member H [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2019-02-24.
  4. "BLAT Search: GOLGA8H". genome.ucsc.edu. Retrieved 2019-05-13.
  5. "GOLGA8H golgin A8 family member H [Homo sapiens (human)]". National Center for Biotechnology Information.
  6. "Clustal Omega < Multiple Sequence Alignment < EMBL-EBI". www.ebi.ac.uk. Retrieved 2019-05-15.
  7. "ExPASy". web.expasy.org. Retrieved 2019-05-13.
  8. "SAPS < Sequence Statistics < EMBL-EBI". www.ebi.ac.uk. Retrieved 2019-05-13.
  9. "PROSITE". prosite.expasy.org. Retrieved 2019-05-13.
  10. "NetPhos 3.1 Server". www.cbs.dtu.dk. Retrieved 2019-05-15.
  11. Taylor ME, Drickamer K (2006). Introduction to glycobiology (2nd ed.). Oxford: Oxford University Press. ISBN 0199282781. OCLC 62307306.
  12. "CFSSP: Chou & Fasman Secondary Structure Prediction Server". www.biogem.org. Retrieved 2019-05-13.
  13. "Phyre 2 Results for GOLGA8H". www.sbg.bio.ic.ac.uk. Retrieved 2019-05-13.
  14. "I-TASSER results". zhanglab.ccmb.med.umich.edu. Archived from the original on 2019-05-13. Retrieved 2019-05-13.
  15. "Genomatix: Gene2Promoter". www.genomatix.de. Archived from the original on 2022-11-27. Retrieved 2019-05-14.
  16. "TimeTree :: The Timescale of Life". timetree.org. Retrieved 2019-05-15.
  17. "About GEO Profiles - GEO - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2019-05-13.
  18. "Gene Set - GOLGA8H". amp.pharm.mssm.edu. Retrieved 2019-02-24.
  19. Alberts B, Johnson A, Lewis J, Raff M, Roberts K, Walter P (2002). Molecular Biology of the Cell (4th ed.). Garland Science. ISBN 9780815332183.
  20. Glickman MH, Ciechanover A (April 2002). "The ubiquitin-proteasome proteolytic pathway: destruction for the sake of construction". Physiological Reviews. 82 (2): 373–428. doi:10.1152/physrev.00027.2001. PMID 11917093.
  21. "STRING: functional protein association networks". string-db.org. Retrieved 2019-05-15.
  22. "Gene: GOLGA8H (ENSG00000261794) - Splice variants - Homo sapiens - Ensembl genome browser 95". useast.ensembl.org. Retrieved 2019-03-04.