Identifiers | IFFO1 |
---|---|
Aliases | IFFO1, HOM-TES-103, IFFO, intermediate filament family orphan 1 |
External IDs | OMIM: 610495; MGI: 2444516; HomoloGene: 18706; GeneCards: IFFO1 |

Intermediate filament family orphan 1 is a protein that in humans is encoded by the IFFO1 gene. Its function is uncharacterized, and it has a molecular weight of 61.98 kDa. [5] IFFO1 proteins are thought to play an important role in the cytoskeleton and the nuclear envelope of most eukaryotic cell types. [6]
In humans, IFFO1 is located on the minus strand of chromosome 12 at 12p13.3. The gene spans 17,709 nucleotide bases and encodes a protein of 570 amino acids. The basal isoelectric point of the protein is 4.83. [7] IFFO1 contains a highly conserved filament domain that spans 299 amino acids, from residue 230 to residue 529. [8] This region has been identified as a member of the pfam00038 conserved protein domain family. [9] Due to alternative splicing, there are 7 isoforms of IFFO1 in humans, with 10 typical coding exons.
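The figures above can be related to one another with a little arithmetic. The following is a minimal sketch that uses only the numbers quoted in the text; the variable names are illustrative, and the single stop codon added to the coding length is an assumption about a typical open reading frame rather than something stated in the source.

```python
# Figures quoted in the text
gene_span_nt = 17_709                 # nucleotide bases spanned by the gene
protein_length_aa = 570               # amino acids in the encoded protein
domain_start, domain_end = 230, 529   # filament domain boundaries (residue numbers)

# Minimum coding sequence: three nucleotides per amino acid,
# plus one stop codon (assumption: a single standard stop codon)
coding_nt = 3 * (protein_length_aa + 1)

# Share of the gene span that is needed to code for the protein
coding_fraction = coding_nt / gene_span_nt

# Span of the filament domain as quoted in the text (end minus start)
domain_span = domain_end - domain_start
domain_fraction = domain_span / protein_length_aa

print(coding_nt)                         # 1713
print(round(coding_fraction * 100, 1))   # 9.7
print(domain_span)                       # 299
print(round(domain_fraction * 100, 1))   # 52.5
```

Under these assumptions, only about a tenth of the gene span is needed to code for the protein, which is consistent with a multi-exon gene, and the filament domain accounts for roughly half of the protein's length.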
IFFO1 is also called Intermediate Filament Family Orphan Isoform X1, Intermediate Filament Family Orphan, HOM-TES-103, Intermediate Filament-Like MGC: 2625, and Tumor Antigen HOM-TES-10. [10]
The gene is highly conserved. The most distant orthologs are found in fish, including cartilaginous fishes such as Callorhinchus milii. [11] Very low sequence coverage and identity among candidate orthologs in fungi and invertebrates suggest that the gene is absent from those organisms. [12] It is therefore highly probable that IFFO1 originated in vertebrates.
Genus/Species | Common Name | Divergence from Human (MYA) | Length (aa) | Similarity | Identity | NCBI Accession |
---|---|---|---|---|---|---|
Homo sapiens | Human | N/A | 570 | 100% | 100% | XP_006719036.1 |
Mus musculus | Mouse | 92.3 | 563 | 93% | 95% | XP_006506337.2 |
Lipotes vexillifer | Baiji dolphin | 94.2 | 573 | 92% | 95% | XP_007469487.1 |
Loxodonta africana | African bush elephant | 98.7 | 574 | 94% | 96% | XP_003410688.1 |
Chrysemys picta bellii | Painted turtle | 296 | 557 | 78% | 84% | XP_005291351.1 |
Pseudopodoces humilis | Ground tit | 296 | 531 | 76% | 81% | XP_005523902.1 |
Python bivittatus | Burmese python | 296 | 570 | 75% | 82% | XP_007429680.1 |
Haliaeetus leucocephalus | Bald eagle | 296 | 537 | 74% | 79% | XP_010565842.1 |
Rana catesbeiana | American bullfrog | 371.2 | 511 | 25% | 44% | BAB63946.1 |
Ambystoma mexicanum | Axolotl | 371.2 | 372 | 24% | 42% | AFN68290.1 |
Notophthalmus viridescens | Eastern newt | 371.2 | 496 | 23% | 45% | CAA04656.1 |
Danio rerio | Zebrafish | 400.1 | 640 | 62% | 71% | XP_690165.5 |
Poecilia formosa | Amazon molly | 400.1 | 640 | 57% | 65% | XP_007550181.1 |
Callorhinchus milii | Australian ghostshark | 462.5 | 512 | 62% | 73% | XP_007896103.1 |
One paralog, IFFO2, has been found in humans. It shows 99% similarity and 99% coverage when compared with IFFO1. The paralogous sequence is highly conserved as far back as fish and amphibians.
Multiple sequence alignments indicate that the proline-rich region from residues 39 to 61, near the N-terminus of the protein, is highly conserved in both close and distant orthologs. [13] The filament region near the C-terminus is also highly conserved. Of the 42 conserved amino acid residues found within the IFFO1 sequence, 33 lie in the filament region.
Compared with fibrinogen and cytochrome c (CYCS), IFFO1 is evolving at a moderate rate: fibrinogen is a fast-evolving gene, whereas cytochrome c is a slow-evolving one. Because the most distant ortholog is found in the Australian ghostshark, the IFFO1 gene duplication likely took place in fish, a lineage that diverged from humans 462.5 million years ago. [14]
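A rough sense of the evolutionary rate can be obtained from the ortholog table by dividing the percentage of non-identical residues by the divergence time. The sketch below is illustrative only: it uses the Identity column of the table as given, applies no correction for multiple substitutions at the same site, and the function name `percent_change_per_my` is hypothetical. Placing IFFO1 between fibrinogen and cytochrome c would require running the same calculation on ortholog data for those two proteins, which the text does not provide.

```python
# (common name, divergence from human in MYA, percent identity),
# copied from the ortholog table
orthologs = [
    ("Mouse", 92.3, 95),
    ("Baiji dolphin", 94.2, 95),
    ("African bush elephant", 98.7, 96),
    ("Painted turtle", 296, 84),
    ("Ground tit", 296, 81),
    ("Burmese python", 296, 82),
    ("Bald eagle", 296, 79),
    ("American bullfrog", 371.2, 44),
    ("Axolotl", 371.2, 42),
    ("Eastern newt", 371.2, 45),
    ("Zebrafish", 400.1, 71),
    ("Amazon molly", 400.1, 65),
    ("Australian ghostshark", 462.5, 73),
]


def percent_change_per_my(divergence_mya, identity_pct):
    """Uncorrected percent sequence change per million years of divergence."""
    return (100 - identity_pct) / divergence_mya


rates = {
    name: percent_change_per_my(mya, identity)
    for name, mya, identity in orthologs
}

# List the orthologs from the slowest to the fastest apparent rate
for name, rate in sorted(rates.items(), key=lambda item: item[1]):
    print(f"{name:<24}{rate:.3f} % per million years")
```

The three amphibian entries stand out with much higher apparent rates than the more distantly related fish, which mirrors their unusually low identity values in the table.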
The predicted secondary structure of the protein consists mostly of alpha helices (47.19%) and random coils (44.74%). The building block of intermediate filaments is an elongated coiled-coil dimer consisting of four consecutive alpha-helical segments. [15]
Structurally, it is most similar to 1GK4, which is chain A of the human vimentin coil 2b fragment (Cys2). [16] Vimentin is a type III intermediate filament protein that is found in various non-epithelial cells, especially mesenchymal cells. [17] Vimentin is also responsible for maintaining cell shape and the integrity of the cytoplasm, and for stabilizing cytoskeletal interactions. [18] Its 1A subunit, which is most similar to the IFFO1 protein, forms a single amphipathic alpha helix that is compatible with a coiled-coil geometry. It is speculated that this chain is involved in specific dimer-dimer interactions during intermediate filament assembly. A "YRKLLEGEE" domain at the C-terminus is important for the formation of authentic tetrameric complexes and for the control of filament width during assembly. [19]
Based on experimental data from normal human tissues, the IFFO1 gene is highly expressed in the cerebellum, the cerebral cortex, and especially the spleen. Medium expression is seen in the adrenal gland, colon, lymph nodes, thymus, and ovary. Tissues with relatively low expression include CD4 and CD8 T-cells, epidymal cells, the heart, and the stomach. Extremely low levels of expression were observed in tissues obtained from the fetus, kidney, testis, thyroid, and especially the salivary gland. However, the gene has been found to be highly expressed in chondrosarcoma. [20] Chondrosarcoma is a cancer of cartilage cells, which produce collagen. This suggests a possible association between the filamentous character of IFFO1 and chondrosarcoma.
One nuclear export signal is predicted at Leucine 141. [21] The IFFO1 protein is also predicted to have one 11-amino-acid nuclear localization signal at position 373. [22] Based on this evidence, the protein is predicted to have high nuclear discrimination. [23] One negatively charged acidic cluster was found from residue 435 to residue 447. One repetitive sequence, PAPLSPAGP, appears twice: at 40 to 48 and again from 159 to 166. This proline-rich region is highly conserved. One long amino acid multiplet of five prolines is found at position 549.
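Repeats such as PAPLSPAGP can be located with a simple substring scan. The following is a minimal sketch: `find_motif` is a hypothetical helper, and the sequence used is a made-up toy string, not the real IFFO1 sequence, so the positions it reports are not those quoted above.

```python
def find_motif(sequence, motif):
    """Return 1-based (start, end) positions of every occurrence of motif.

    Overlapping occurrences are included.
    """
    hits = []
    start = sequence.find(motif)
    while start != -1:
        # Convert the 0-based index to a 1-based, inclusive residue range
        hits.append((start + 1, start + len(motif)))
        start = sequence.find(motif, start + 1)
    return hits


# Hypothetical toy sequence for illustration only (NOT the IFFO1 sequence)
toy_sequence = "MKT" + "PAPLSPAGP" + "GGSE" + "PAPLSPAGP" + "LLK"

print(find_motif(toy_sequence, "PAPLSPAGP"))  # [(4, 12), (17, 25)]
```

With inclusive 1-based numbering, a nine-residue motif that starts at residue 40 ends at residue 48, which matches the first range given in the text.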
Four ubiquitination sites are found on lysine residues Lys78, Lys103, Lys113, and Lys339. [24] Experimentally, there was evidence of 43 phosphorylation sites, located on 31 serines, 7 threonines, and 5 tyrosines. [25] Furthermore, the evidence shows with high confidence that Ser533 is a phosphorylation site specific to protein kinase C. The phosphorylation site at Ser162 also acts as an O-glycosylated site. This type of glycosylation helps proteins fold properly, stabilizes the protein, and plays a role in cell-cell adhesion. [26] Four sumoylated amino acids were found at Leu249, Leu293, Leu298, and Leu325. [27] Sumoylation has several effects: it can interfere with the interaction between the target protein and its partner, provide a binding site for an interacting partner, cause conformational changes in the modified target, and facilitate or antagonize ubiquitination. [28] Five glycation sites were predicted at Lys78, Lys256, Lys305, Lys380, and Lys478. End products of glycation are involved in protein conformation changes, loss of function, and irreversible crosslinking. [29]
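Because several modification types are reported for the same protein, it can be useful to check which residues carry more than one. The sketch below is illustrative: it encodes only the sites named in the text (for phosphorylation, just the two named sites out of the 43 reported), and the dictionary layout and variable names are assumptions made for the example.

```python
from collections import defaultdict

# Sites named in the text, grouped by modification type
ptm_sites = {
    "ubiquitination": ["Lys78", "Lys103", "Lys113", "Lys339"],
    "phosphorylation": ["Ser533", "Ser162"],  # only the two sites named in the text
    "glycosylation": ["Ser162"],
    "sumoylation": ["Leu249", "Leu293", "Leu298", "Leu325"],
    "glycation": ["Lys78", "Lys256", "Lys305", "Lys380", "Lys478"],
}

# Reported phosphorylation counts per residue type
phospho_counts = {"Ser": 31, "Thr": 7, "Tyr": 5}
total_phospho = sum(phospho_counts.values())

# Invert the mapping: residue -> list of modifications reported for it
by_residue = defaultdict(list)
for modification, residues in ptm_sites.items():
    for residue in residues:
        by_residue[residue].append(modification)

# Keep only residues reported for more than one modification type
shared = {res: mods for res, mods in by_residue.items() if len(mods) > 1}

print(total_phospho)  # 43
print(shared)
# {'Lys78': ['ubiquitination', 'glycation'], 'Ser162': ['phosphorylation', 'glycosylation']}
```

In this listing, Lys78 appears as both a ubiquitination and a glycation site, and Ser162 as both a phosphorylation and a glycosylation site.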
Evidence from two-hybrid screening exists for four protein interactions with IFFO1. [30]
An additional interaction, with ubiquitin C, was found by an affinity capture-MS assay. [37]
The IFFO1 gene has not been found to be associated with any particular disease.