ARMH1

Identifiers
Aliases: ARMH1, NCRNA00082, p40, chromosome 1 open reading frame 228, C1orf228, armadillo-like helical domain containing 1
External IDs: MGI: 2686507; HomoloGene: 28727; GeneCards: ARMH1
Orthologs
Species: Human, Mouse
Entrez
Ensembl
UniProt
RefSeq (mRNA)

NM_001004307
NM_001145636

NM_001033774
NM_001145637

RefSeq (protein)

NP_001139108

n/a

Location (UCSC): Chr 1: 44.67 – 44.73 Mb (human); Chr 4: 117.07 – 117.11 Mb (mouse)
PubMed search [3] [4]

Armadillo-like helical domain containing 1 (ARMH1) is a protein that in humans is encoded by the ARMH1 gene, also known as chromosome 1 open reading frame 228 (C1orf228). Expression of the gene is significantly higher in bone marrow, lymph nodes, and testis than in other tissues. [5] The function of the gene and its protein product is still uncertain.

Gene

The ARMH1 gene is found on the plus strand of chromosome 1 between base pairs 45,140,361 and 45,191,784. Other known aliases include P40, NCRNA00082, and, most commonly, C1orf228. The gene has 13 exons; most are concentrated near the poly-A site at the end of the gene, and two are located upstream of the start codon. The gene is highly expressed in bone marrow and lymph nodes, which may point to an immunological function. [6]

Gene expression

RNA-seq data have been produced from multiple samples of human tissue at varying stages of development. One study of 20 separate human tissue samples showed significantly more expression of ARMH1 in the thymus, trachea, and lungs. [7] A second study covered 27 tissues sampled from 95 individuals; expression was significantly higher in bone marrow, lymph nodes, and testis. [8] A third showed high expression in white blood cells and, again, in testis, corroborating the previous studies. [9] A temporal study of expression across development collected 35 human fetal samples from 6 distinct tissues between 10 and 20 weeks of gestation and sequenced them using Illumina TruSeq Stranded Total RNA. The data slightly favored expression in the adrenal glands throughout development. None of the other tissues showed stark changes in expression over time, only a small decline as development progressed. [10]

Gene transcripts

The ARMH1 gene gives rise to many isoforms that differ in size and potentially in function. Gene isoforms are mRNAs that are produced from the same locus but differ in their transcription start sites, protein-coding DNA sequences, and/or untranslated regions, potentially altering gene function. All known isoforms are listed below, with information gathered from NCBI Gene [11] and a bioinformatics tool for calculating molecular weight. [12]

Protein isoform | Protein accession | Protein length | Molecular weight | mRNA isoform | mRNA accession | mRNA length
X1 | XP_047275293 | 446 aa | 49.58 kDa | X5 | XM_011541340 | 1693 bp
X2 | XP_011539647 | 433 aa | 48.17 kDa | X7 | XM_011541345 | 1909 bp
X3 | XP_047275308 | 431 aa | 47.39 kDa | X8 | XM_047419352 | 1782 bp
X4 | XP_047275309 | 419 aa | 46.17 kDa | X9 | XM_047419353 | 1507 bp
X5 | XP_047275314 | 405 aa | 44.49 kDa | X12 | XM_047419358 | 1588 bp
X6 | XP_016856631 | 391 aa | 43.58 kDa | X13 | XM_017001142 | 1546 bp
X7 | XP_047275318 | 379 aa | 41.32 kDa | X14 | XM_047419362 | 1393 bp
X8 | XP_011539651 | 376 aa | 41.67 kDa | X15 | XM_011541349 | 1645 bp
X9 | XP_016856632 | 365 aa | 40.47 kDa | X16 | XM_017001143 | 1468 bp
X10 | XP_047275323 | 364 aa | 40.17 kDa | X17 | XM_047419367 | 1342 bp
X11 | XP_054192270 | 338 aa | 37.06 kDa | X18 | XM_054336295 | 1264 bp
X12 | XP_054192271 | 336 aa | 36.46 kDa | X19 | XM_054336296 | 1207 bp
X13 | XP_054192272 | 333 aa | 36.84 kDa | X20 | XM_054336297 | 1474 bp
X14 | XP_047275327 | 332 aa | 36.65 kDa | X21 | XM_047419371 | 1262 bp
X15 | XP_054192274 | 274 aa | 30.61 kDa | X23 | XM_054336299 | 1670 bp
X16 | XP_016856635 | 263 aa | 29.31 kDa | X24 | XM_017001146 | 1146 bp
X17 | XP_054192276 | 242 aa | 27.05 kDa | X25 | XM_054336301 | 2306 bp
X18 | XP_054192277 | 213 aa | 23.69 kDa | X26 | XM_054336302 | 1380 bp
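
The molecular weights in the isoform table can be sanity-checked against protein length. The sketch below is only a rough cross-check, not the method used by the cited tool: it assumes the common rule of thumb of about 110 Da per amino acid residue, and the function names are illustrative. The three rows are copied from the table.

```python
# Rule-of-thumb average mass of one amino acid residue in daltons.
# This is an approximation; sequence-based tools sum exact residue masses.
AVG_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(length_aa: int) -> float:
    """Estimate protein mass in kDa from its length in amino acids."""
    return length_aa * AVG_RESIDUE_MASS_DA / 1000.0


# (isoform, length in aa, listed molecular weight in kDa), from the table.
ISOFORMS = [("X1", 446, 49.58), ("X6", 391, 43.58), ("X18", 213, 23.69)]


def compare(isoforms):
    """Return (isoform, estimated kDa, percent deviation from listed kDa)."""
    rows = []
    for name, length_aa, listed_kda in isoforms:
        estimate = estimate_mass_kda(length_aa)
        deviation_pct = 100.0 * (estimate - listed_kda) / listed_kda
        rows.append((name, round(estimate, 2), round(deviation_pct, 1)))
    return rows


if __name__ == "__main__":
    for name, estimate, deviation_pct in compare(ISOFORMS):
        print(f"{name}: ~{estimate} kDa ({deviation_pct:+.1f}% vs. table)")
```

For these three rows the estimate falls within about 2% of the listed value, which is the agreement expected from a length-only approximation.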

mRNA

The mRNA for this gene can be spliced in many different ways, giving rise to approximately 20 known isoforms. The most common mRNA is spliced down to a transcript about 1693 nucleotides long that encodes 440 amino acids in total. [13] A comprehensive study of oral squamous cell carcinoma, the sixth most prevalent cancer worldwide, identified ARMH1 as a gene of interest by comparing mRNA from healthy subjects with mRNA from affected individuals. In leukemia research, mRNA inhibition of ARMH1 significantly reduced leukemic cell proliferation (P = .0041) and leukemic cell migration (P = .0001) and decreased resistance to the chemotherapy drug cytarabine. [14] [15]
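
The relationship between protein length and nucleotide length can be checked with simple codon arithmetic: each amino acid is encoded by one three-nucleotide codon, and the coding sequence ends with one stop codon. The sketch below assumes the 1693-nucleotide figure refers to the whole transcript, so the remainder is attributed to untranslated regions; the function names are illustrative.

```python
def coding_length_nt(protein_length_aa: int, include_stop: bool = True) -> int:
    """Nucleotides needed to encode a protein, optionally counting the stop codon."""
    codons = protein_length_aa + (1 if include_stop else 0)
    return 3 * codons


def noncoding_length_nt(transcript_length_nt: int, protein_length_aa: int) -> int:
    """Nucleotides of a transcript left over after the coding sequence."""
    cds = coding_length_nt(protein_length_aa)
    if cds > transcript_length_nt:
        raise ValueError("coding sequence is longer than the transcript")
    return transcript_length_nt - cds


if __name__ == "__main__":
    print(coding_length_nt(440))           # 1323
    print(noncoding_length_nt(1693, 440))  # 370
```

Under that assumption, a 440-residue protein needs 1323 nucleotides of coding sequence, which leaves 370 nucleotides of the 1693-nucleotide transcript as non-coding sequence.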

Protein

The protein encoded by the gene goes by the same name, armadillo-like helical domain containing 1. The isoelectric point of the ARMH1 protein is around pH 5.5. [16] The protein has two known major domains: a transmembrane domain and a coiled coil. [17] Within the coiled-coil domain, the ARMH1 protein has 24 alpha helices. [18] [19] [20] [21] The European Bioinformatics Institute's analysis of ARMH1 reveals a significantly enriched lysine content and a significantly deficient proline content. [22] The protein has one major reported interaction, with the human protein ABAT. [23] Gamma-aminobutyric acid transaminase (ABAT) catalyzes the conversion of gamma-aminobutyric acid (GABA) into succinic semialdehyde. Additionally, ABAT expression has been associated with glycolysis-related genes, infiltrating immune cells, immunoinhibitors, and immunostimulators in hepatocellular carcinoma (HCC). [24]

Figure (ARMH1 alphaphold.png): AlphaFold, the state-of-the-art AI system developed by DeepMind, is able to computationally predict protein structures in 3D space.

Homology and evolution

The ARMH1 gene is widely distributed and is found in thousands of different species. From primates to fungi, the gene has been evolutionarily relevant for hundreds of millions of years. In close relatives such as cows, sequence similarity to the human protein is 91%, whereas in species of fungi the similarity ranges between 20 and 30%. [26] Searches for homologs in roundworms, flatworms, single-celled eukaryotes, prokaryotes, plants, and fungi other than chytrids found no significantly similar genes. Below is a table of orthologous genes, ordered by date of divergence, with sequence similarity and identity measured against the human ARMH1 isoform X1.

Species | Common name | Accession number | Date of divergence | Sequence length (aa) | Sequence similarity | Sequence identity
Homo sapiens | Human | NP_001139108 | 0 mya | 440 | 100% | 100%
Microcebus murinus | Grey mouse lemur | XP_012631405.1 | 74 mya | 441 | 88% | 82%
Rattus norvegicus | Brown rat | NP_001119769.2 | 87 mya | 441 | 80% | 78%
Bos taurus | Cow | XP_005204913.1 | 94 mya | 442 | 91% | 83%
Ornithorhynchus anatinus | Platypus | XP_028938784.1 | 180 mya | 459 | 75% | 60%
Apteryx rowi | Okarito kiwi | XP_025942684 | 319 mya | 419 | 73% | 59%
Haliaeetus leucocephalus | Bald eagle | XP_010581029 | 319 mya | 418 | 70% | 56%
Gopherus flavomarginatus | Bolson tortoise | XP_050817160 | 319 mya | 421 | 78% | 65%
Xenopus tropicalis | Western clawed frog | XP_017949069 | 352 mya | 409 | 70% | 55%
Danio rerio | Zebrafish | XP_001341083.1 | 429 mya | 410 | 71% | 53%
Leucoraja erinacea | Little skate | XP_055497706 | 462 mya | 406 | 69% | 53%
Lytechinus pictus | Painted urchin | XP_054764007 | 619 mya | 406 | 67% | 51%
Owenia fusiformis | Segmented worm | CAH1776102.1 | 686 mya | 410 | 71% | 51%
Aplysia californica | California sea hare | XP_012936639.1 | 708 mya | 410 | 69% | 52%
Adineta steineri | Rotifer | CAF4083605.1 | 708 mya | 420 | 56% | 37%
Pocillopora verrucosa | Colonial coral | XP_058955966.1 | 708 mya | 404 | 67% | 49%
Geodia barretti | Sea sponge | CAI8036895.1 | 758 mya | 404 | 50% | 35%
Blastocladiella britannica | Chytrids | KAI9218662 | 1275 mya | 423 | 34% | 22%
Borealophlyctis nickersoniae | Rhizophlyctidales | KAJ3289137 | 1275 mya | 453 | 19% | 11%
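
The identity and similarity columns come from pairwise alignments of each ortholog against the human protein. The sketch below shows one way such percentages can be computed from an existing alignment; it is not the procedure behind the table. The residue groupings are a simplified assumption (real tools score similarity with a substitution matrix such as BLOSUM62), percentages are taken over the full alignment length, which is only one of several conventions, and the two aligned strings are made-up toy sequences, not ARMH1.

```python
# Simplified groups of chemically similar residues (illustrative assumption).
SIMILAR_GROUPS = ["AVLIM", "FYW", "KRH", "DE", "STNQ"]


def _group(residue: str):
    """Return the index of the similarity group of a residue, or None."""
    for index, members in enumerate(SIMILAR_GROUPS):
        if residue in members:
            return index
    return None


def identity_and_similarity(aligned_a: str, aligned_b: str):
    """Percent identity and similarity of two gapped, equal-length sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    identical = similar = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # gap columns count toward length but never match
        if x == y:
            identical += 1
            similar += 1
        elif _group(x) is not None and _group(x) == _group(y):
            similar += 1
    length = len(aligned_a)
    return 100.0 * identical / length, 100.0 * similar / length


if __name__ == "__main__":
    # Hypothetical toy alignment, eight columns with one gap.
    print(identity_and_similarity("MKLV-DES", "MRLIADQS"))  # (50.0, 75.0)
```

Similarity is always at least as high as identity because every identical column also counts as similar, which matches the pattern in the table.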

Clinical significance

The ARMH1 gene and its protein product have been linked to leukemia, specifically T-cell acute lymphoblastic leukemia (T-ALL). [27] In cell lines, mostly from lymphatic tissue, T-ALL showed dramatically increased expression of the ARMH1 gene. Bone marrow samples were taken at initial diagnosis and at the conclusion of treatment, and ARMH1, along with 5 other genes, was found to be dramatically changed in expression. Corroborating these findings, ARMH1 showed a 1.8-fold increase in expression in samples taken after a diagnosis of leukemia. Higher ARMH1 expression was significantly associated with poor overall survival. [28]
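
Expression changes such as the 1.8x increase are usually reported as a fold change, the ratio of expression in one condition to expression in another, and often as its base-2 logarithm. The sketch below illustrates that arithmetic only; the expression values are hypothetical numbers chosen to give a ratio of 1.8 and are not data from the cited studies.

```python
import math


def fold_change(after: float, before: float) -> float:
    """Ratio of expression after versus before; both values must be positive."""
    if after <= 0 or before <= 0:
        raise ValueError("expression values must be positive")
    return after / before


def log2_fold_change(after: float, before: float) -> float:
    """Base-2 logarithm of the fold change (0 means no change)."""
    return math.log2(fold_change(after, before))


if __name__ == "__main__":
    # Hypothetical expression values in arbitrary units.
    before, after = 10.0, 18.0
    print(round(fold_change(after, before), 2))       # 1.8
    print(round(log2_fold_change(after, before), 2))  # 0.85
```

On the log2 scale an increase of 1.8x is about 0.85, and a decrease of the same magnitude would be about -0.85, which makes up- and down-regulation symmetric.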

References

  1. GRCh38: Ensembl release 89: ENSG00000198520. Ensembl, May 2017.
  2. GRCm38: Ensembl release 89: ENSMUSG00000060268. Ensembl, May 2017.
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. Fagerberg L, Hallström BM, Oksvold P, Kampf C, Djureinovic D, Odeberg J, et al. (February 2014). "Analysis of the human tissue-specific expression by genome-wide integration of transcriptomics and antibody-based proteomics". Molecular & Cellular Proteomics. 13 (2): 397–406. doi:10.1074/mcp.M113.035600. PMC 3916642. PMID 24309898.
  6. https://www.genecards.org/cgi-bin/carddisp.pl?gene=ARMH1
  7. Duff MO, Olson S, Wei X, Garrett SC, Osman A, Bolisetty M, et al. (May 2015). "Genome-wide identification of zero nucleotide recursive splicing in Drosophila". Nature. 521 (7552): 376–379. Bibcode:2015Natur.521..376D. doi:10.1038/nature14475. PMC 4529404. PMID 25970244.
  8. Fagerberg L, Hallström BM, Oksvold P, Kampf C, Djureinovic D, Odeberg J, et al. (February 2014). "Analysis of the human tissue-specific expression by genome-wide integration of transcriptomics and antibody-based proteomics". Molecular & Cellular Proteomics. 13 (2): 397–406. doi:10.1074/mcp.M113.035600. PMC 3916642. PMID 24309898.
  9. "Illumina bodyMap2 transcriptome (ID 204271) - BioProject - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2023-12-08.
  10. Szabo L, Morey R, Palpant NJ, Wang PL, Afari N, Jiang C, et al. (June 2015). "Statistically based splicing detection reveals neural enrichment and tissue-specific induction of circular RNA during human fetal development". Genome Biology. 16 (1): 126. doi:10.1186/s13059-015-0690-5. PMC 4506483. PMID 26076956.
  11. "ARMH1 armadillo like helical domain containing 1 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2023-12-07.
  12. "Protein Molecular Weight". www.bioinformatics.org. Retrieved 2023-12-07.
  13. https://www.ncbi.nlm.nih.gov/gene/339541
  14. Huang SN, Li GS, Zhou XG, Chen XY, Yao YX, Zhang XG, et al. (June 2020). "Identification of an Immune Score-Based Gene Panel with Prognostic Power for Oral Squamous Cell Carcinoma". Medical Science Monitor. 26: e922854. doi:10.12659/MSM.922854. PMC 7305786. PMID 32529991.
  15. Bhasin SS, Thomas BE, Summers RJ, Sarkar D, Mumme H, Pilcher W, et al. (August 2023). "Pediatric T-cell acute lymphoblastic leukemia blast signature and MRD associated immune environment changes defined by single cell transcriptomics analysis". Scientific Reports. 13 (1): 12556. Bibcode:2023NatSR..1312556B. doi:10.1038/s41598-023-39152-z. PMC 10397284. PMID 37532715.
  16. "ARMH1 (human)". www.phosphosite.org. Retrieved 2023-12-07.
  17. https://www.ncbi.nlm.nih.gov/IEB/Research/Acembly/av.cgi?c=geneid&org=9606&l=339541
  18. "Bioinformatics Toolkit". toolkit.tuebingen.mpg.de. Retrieved 2023-12-07.
  19. "JPred Secondary Structure Prediction". www.jalview.org. Retrieved 2023-12-07.
  20. Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, et al. (August 2021). "Highly accurate protein structure prediction with AlphaFold". Nature. 596 (7873): 583–589. Bibcode:2021Natur.596..583J. doi:10.1038/s41586-021-03819-2. PMC 8371605. PMID 34265844.
  21. Rost B, Liu J (July 2003). "The PredictProtein server". Nucleic Acids Research. 31 (13): 3300–3304. doi:10.1093/nar/gkg508. PMC 168915. PMID 12824312.
  22. "SAPS Results". www.ebi.ac.uk. Retrieved 2023-12-07.
  23. Huttlin EL, Bruckner RJ, Paulo JA, Cannon JR, Ting L, Baltier K, et al. (May 2017). "Architecture of the human interactome defines protein communities and disease networks". Nature. 545 (7655): 505–509. Bibcode:2017Natur.545..505H. doi:10.1038/nature22366. PMC 5531611. PMID 28514442.
  24. Gao X, Jia X, Xu M, Xiang J, Lei J, Li Y, et al. (2022-06-24). "Regulation of Gamma-Aminobutyric Acid Transaminase Expression and Its Clinical Significance in Hepatocellular Carcinoma". Frontiers in Oncology. 12: 879810. doi:10.3389/fonc.2022.879810. PMC 9280914. PMID 35847853.
  25. Laura Howes (2020-12-05). "DeepMind AI predicts protein structures". Chemical & Engineering News: 4. doi:10.47287/cen-09847-leadcon. ISSN 1520-605X. S2CID 230619516.
  26. "ARMH1 armadillo like helical domain containing 1 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2023-12-07.
  27. Bhasin SS, Thomas BE, Summers RJ, Sarkar D, Mumme H, Pilcher W, et al. (August 2023). "Pediatric T-cell acute lymphoblastic leukemia blast signature and MRD associated immune environment changes defined by single cell transcriptomics analysis". Scientific Reports. 13 (1): 12556. Bibcode:2023NatSR..1312556B. doi:10.1038/s41598-023-39152-z. PMC 10397284. PMID 37532715.
  28. Bakhtiarigheshlaghbakhtiar, Mojtaba; Bhasin, Swati; Thomas, Beena. "Single-Cell Profiling of Acute Myeloid Leukemia Identified ARMH1, a Novel Protein Associated with Proliferation, Migration, and Drug Resistance". ashpublications.org. Retrieved 2023-12-08.