SAAL1 | Identifiers |
---|---|
Aliases | SAAL1, SPACIA1, serum amyloid A like 1 |
External IDs | MGI: 1926185, HomoloGene: 34706, GeneCards: SAAL1 |

Serum amyloid A-like 1 (also known as SAAL1, Synoviocyte proliferation-associated in collagen-induced arthritis 1, and SPACIA1) is a protein in humans encoded by the SAAL1 gene. [5] [6]
The human SAAL1 gene is located at position 11p15.1 on the minus strand spanning from base pairs 18080292-18106082 (25,790 bases). [5] It has 12 exons and 11 introns and encodes a single isoform. [5] [7]
Members of the serum amyloid A family, such as SAA1, reside at the same locus as SAAL1. [7]
The promoter region (GXP_169676) is predicted to span base pairs 18105980-18107207 and extends into the first exon of SAAL1. [9] Transcription factors predicted to bind this region include TATA-binding factors, NF-κB, KLF4, KLF5, and KLF6. [10]
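The coordinates quoted above can be checked with simple interval arithmetic. The following is a minimal sketch, not part of the cited annotations: the `Interval` type and the `span` and `overlap` helpers are hypothetical names introduced here, and lengths are taken as end minus start, which is the convention that reproduces the 25,790-base figure quoted for the gene (counting both endpoints would add one base).

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """A stretch of chromosome 11, in base-pair coordinates."""
    start: int
    end: int


def span(iv: Interval) -> int:
    """Length as end minus start (assumed convention, matches the quoted gene length)."""
    return iv.end - iv.start


def overlap(a: Interval, b: Interval) -> int:
    """Number of bases shared by two intervals, or 0 if they do not overlap."""
    return max(0, min(a.end, b.end) - max(a.start, b.start))


# Coordinates as quoted in the text above.
GENE = Interval(18080292, 18106082)
PROMOTER = Interval(18105980, 18107207)

if __name__ == "__main__":
    print("gene span:", span(GENE))
    print("promoter span:", span(PROMOTER))
    print("promoter/gene overlap:", overlap(PROMOTER, GENE))
```

Because SAAL1 lies on the minus strand, its first exon sits at the high-coordinate end of the gene, so a non-zero overlap at that end is consistent with the promoter extending into the first exon.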
SAAL1 is ubiquitously expressed at moderate levels across all human tissues with highest expression in testes as determined by RNA-sequencing and microarray expression profiling. [11] [12]
Predicted 5' UTR binding proteins of the human SAAL1 transcript include SRSF3 and FXR2. [13] Predicted 3' UTR binding proteins include SRSF5 and U2AF2. [13] All predicted proteins are involved in mRNA splicing, export, and translation. [14] [15] [16] [17]
The SAAL1 protein has a single known isoform consisting of 474 amino acids with a molecular weight of 53.5 kDa. [5] The unmodified SAAL1 protein is acidic with an isoelectric point of 4.4. [21]
SAAL1 is abundant in aspartic acid (7.8% by composition) and deficient in glycine (3.4% by composition) compared to other human proteins. [22] It also has 44 more aspartic acid and glutamic acid residues than lysine and arginine residues, indicating an overall negative charge. [23] Two negatively charged, glutamic acid-rich segments were identified and labeled in the SAAL1 conceptual translation. [22]
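Composition and charge figures of this kind come from simple residue counts over the protein sequence. The sketch below illustrates the calculation only: the function names are hypothetical, the example sequence is a made-up toy string rather than the SAAL1 sequence, and it is not the tool used by the cited analyses.

```python
from collections import Counter


def composition_percent(seq: str) -> dict[str, float]:
    """Percentage of each amino acid (one-letter code) in the sequence."""
    counts = Counter(seq.upper())
    total = sum(counts.values())
    return {aa: 100.0 * n / total for aa, n in counts.items()}


def acidic_excess(seq: str) -> int:
    """(Asp + Glu) minus (Lys + Arg); a positive value suggests a net negative charge."""
    counts = Counter(seq.upper())
    acidic = counts["D"] + counts["E"]
    basic = counts["K"] + counts["R"]
    return acidic - basic


if __name__ == "__main__":
    toy = "DDEEKRGA"  # hypothetical toy sequence, NOT the SAAL1 protein
    print(composition_percent(toy))
    print("acidic excess:", acidic_excess(toy))
```

Applied to the full 474-residue human sequence (for example NP_612430.2 downloaded from NCBI), the same counts would be expected to reproduce the percentages and the acidic excess reported above.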
SAAL1 contains an armadillo-like fold with an enveloped fungal symportin-1 like region. [24] [25] Other motifs were predicted by ELM [26] and MyHits Motif Scan. [27]
Predicted Motifs | Amino Acids | Tools |
---|---|---|
Casein kinase 2 phosphorylation site | 152-155, 165-168 | MyHits, [27] ELM [26] |
Nuclear Export Signal | 72-84 | ELM [26] |
MAPK docking site | 106-115, 344-352 | ELM [26] |
Immunofluorescent staining has identified SAAL1 localization in the nucleus of Caco-2 cells. [28] However, western blotting of hepatocellular carcinoma cell lines identified SAAL1 localization in the cytoplasm with minor amounts in the cell membrane and nucleus. [29]
SAAL1 undergoes phosphorylation at two experimentally verified sites: Ser6 and Thr387. [25] Predicted post-translational modifications are detailed in the following table.
Tool | Predicted Modification | Amino Acids |
---|---|---|
NetPhos [30] [31] | Casein kinase 2 phosphorylation | Thr152, Ser165 |
YinOYang [32] [33] | O-linked glycosylation | Ser6 |
SMART [34] | Ubiquitination | Lys209, Val302 |
SAAL1 overexpression has been correlated with the proliferation of rheumatoid and osteoarthritic synovial fibroblasts as well as with disease progression. [24] [35] RNAi knockdown of SAAL1 arrested fibroblasts in G0/G1 phase and reduced proliferation by 20%, or by 50% when the fibroblasts were stimulated by TNF-α. [24] Stability assays indicate that SAAL1 promotes the G1/S transition by stabilizing CDK6 mRNA. [24] [35] This finding was corroborated by SAAL1 knockdowns in hepatocellular carcinoma cells, which also showed impaired HGF-induced migration and increased sensitivity to sorafenib and foretinib treatment. [29] Additionally, SAAL1 is overexpressed in hepatocellular carcinoma cells and in chondrocytes stimulated by interleukin-1 beta, but this effect is diminished in the presence of glucosamine. [29] [36]
Studies of the rock bream SAAL1 ortholog noted an increase in gene expression in response to bacterial and viral pathogens. [37] Human SAAL1 has been reported to interact with the M protein of SARS-CoV-2, [38] Orf4 of Kaposi's sarcoma-associated herpesvirus, [39] and the M and M2 proteins of influenza A. [40] It has also been reported as an interferon stimulator and TRIM25 interactor. [41] [42] Other interacting proteins include PNKD (which plays a role in cardiac hypertrophy via NF-κB signaling), [43] [44] TMIGD3 (which inhibits NF-κB activity), [45] [46] and MARK3. [47]
BLAST searches have found homologs of SAAL1 in organisms as distant as plants, though few orthologs were found in fungi. [49] The following table provides a sample of the ortholog space. The vertebrate orthologs shown share at least 50% identity with the human SAAL1 protein, while the invertebrate and non-metazoan orthologs shown have 30% or less identity.
Species | Organism Common Name | Multiple Sequence Alignment Abbreviation | Date of Divergence from Humans (Millions of Years Ago) [50] | Length (AAs) | Identity to Human SAAL1 (%) | NCBI Accession |
---|---|---|---|---|---|---|
Homo sapiens | Humans | Hsa_SAAL1 | 0 | 474 | 100 | NP_612430.2 |
Macaca mulatta | Rhesus Monkey | Mmu_SAAL1 | 29 | 473 | 98 | XP_001087433.2 |
Ictidomys tridecemlineatus | Thirteen-Lined Ground Squirrel | Itr_SAAL1 | 90 | 474 | 90 | XP_005326805.1 |
Monodelphis domestica | Gray Short-Tailed Opossum | Mdo_SAAL1 | 159 | 475 | 73 | XP_007497074.1 |
Ornithorhynchus anatinus | Platypus | Oan_SAAL1 | 177 | 486 | 71 | XP_028915648.1 |
Calidris pugnax | Ruff | Cpu_SAAL1 | 312 | 472 | 70 | XP_014815565.1 |
Rhinatrema bivittatum | Two-Lined Caecilian | Rbi_SAAL1 | 352 | 472 | 61 | XP_029438391.1 |
Erpetoichthys calabaricus | Reedfish | Eca_SAAL1 | 435 | 484 | 50 | XP_028650019.1 |
Callorhinchus milii | Australian Ghost Shark | Cmi_SAAL1 | 473 | 474 | 54 | XP_007885592.1 |
Saccoglossus kowalevskii | Acorn Worm | Ski_SAAL1 | 684 | 508 | 28 | XP_002732678.2 |
Pomacea canaliculata | Golden Apple Snail | Pca_SAAL1 | 797 | 563 | 30 | XP_025086883.1 |
Orbicella faveolata | Mountainous Star Coral | Ofa_SAAL1 | 824 | 561 | 25 | XP_020625180.1 |
Rhizopus microsporus | (a fungal plant pathogen) | Rmi_SAAL1 | 1105 | 323 | 14 | XP_023467779.1 |
Phycomyces blakesleeanus | (a type of fungus) | Pbl_SAAL1 | 1105 | 346 | 14 | XP_018285622.1 |
Manihot esculenta | Cassava | Mes_SAAL1 | 1496 | 536 | 20 | XP_021611223.1 |
Lactuca sativa | Lettuce | Lsa_SAAL1 | 1496 | 534 | 19 | XP_023753062.1 |
Lupinus angustifolius | Narrowleaf Lupin | Lan_SAAL1 | 1496 | 488 | 18 | XP_019436310.1 |
Elaeis guineensis | Oil Palm | Egu_SAAL1 | 1496 | 568 | 18 | XP_010933466.1 |
Phalaenopsis equestris | (a type of orchid) | Peq_SAAL1 | 1496 | 551 | 17 | XP_020591929.1 |
Phoenix dactylifera | Date Palm | Pda_SAAL1 | 1496 | 508 | 17 | XP_026661658.1 |
SAAL1 exists in up to four isoforms in other vertebrates. Across these orthologs, it is the only member of its gene family.
A multiple sequence alignment of the vertebrate homologs demonstrated high conservation of the protein, especially in the armadillo-type fold and fungal symportin-1 like motif. An alignment of invertebrate and non-metazoan orthologs indicates drastic changes in the protein's primary structure, but some conservation in the labeled motifs. Highly similar amino acids were colored red and less similar amino acids were colored blue; "*" denotes conservation and "." denotes similarity.
The date of divergence from the human ortholog was compared to the corrected % divergence for SAAL1 orthologs. Compared against data for cytochrome c and fibrinogen alpha proteins in similar orthologs, SAAL1 evolved at a moderate rate.
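The text does not state how the corrected % divergence was calculated. One common choice, assumed here purely for illustration, is the Poisson-type correction for multiple substitutions at the same site, m = -100 · ln(1 - n/100), where n is the observed percent difference (100 minus percent identity). The sketch below applies that assumed formula to a few rows of the ortholog table; the function name and the selection of rows are choices made for this example.

```python
import math


def corrected_divergence(identity_percent: float) -> float:
    """Corrected % divergence from % identity, using m = -100 * ln(1 - n/100).

    This correction is an assumption of this sketch; the source does not
    name the formula it used.
    """
    observed_fraction = (100.0 - identity_percent) / 100.0
    return -100.0 * math.log(1.0 - observed_fraction)


# (abbreviation, divergence date in millions of years ago, % identity),
# copied from the ortholog table above.
ORTHOLOGS = [
    ("Mmu_SAAL1", 29, 98),
    ("Mdo_SAAL1", 159, 73),
    ("Cpu_SAAL1", 312, 70),
    ("Cmi_SAAL1", 473, 54),
    ("Ski_SAAL1", 684, 28),
    ("Mes_SAAL1", 1496, 20),
]

if __name__ == "__main__":
    for name, mya, identity in ORTHOLOGS:
        m = corrected_divergence(identity)
        print(f"{name}: {mya} MYA, corrected divergence {m:.1f}%")
```

Plotting the corrected values against divergence date, alongside the same calculation for cytochrome c and fibrinogen alpha orthologs, gives the kind of rate comparison described above; the slope of each line is a rough measure of how quickly the protein has changed.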