RING finger protein 227, also known as RNF227 and LINC02581, is a protein that in humans is encoded by the RNF227 gene. [1] According to DNA microarray data, it is expressed in at least 15 tissues. [1]
In humans, the RNF227 gene is located on chromosome 17 at 17p13.1. Its mRNA sequence is 2850 base pairs in length and includes 2 exons. The coding sequence is from base pairs 95 to 2835. [2]
The RNF227 protein is 190 amino acids in length; its sequence is shown in the table below. [3]
1 | MQLLVRVPSL PERGELDCNI CYRPFNLGCR APRRLPGTAR ARCGHTICTA CLRELAARGD |
61 | GGGAAARVVR LRRVVTCPFC RAPSQLPRGG LTEMALDSDL WSRLEEKARA KCERDEAGNP |
121 | AKESSDADGE AEEEGESEKG AGPRSAGWRA LRRLWDRVLG PARRWRRPLP SNVLYCAEIK |
181 | DIGHLTRCTL |
Using tools at Expasy, the predicted molecular weight of the protein is 20,875 daltons (about 20.9 kDa) [3] with an isoelectric point of 9.23. [5] The Statistical Analysis of Protein Sequences (SAPS) tool detected two repetitive structures: CRAPRRLP from positions 29 to 36 and CRAPSQLP from positions 80 to 87. [6]
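The repeat positions and the approximate molecular weight can be checked directly against the sequence listed above. The following is a minimal sketch, not the method used by Expasy or SAPS: the residue mass table holds standard average residue masses (an assumption about which table to use), and the names `RNF227`, `molecular_weight`, and `find_positions` are illustrative only.

```python
# Average residue masses in daltons (standard textbook values; assumed here,
# not necessarily the exact table used by the Expasy tools).
AVG_RESIDUE_MASS = {
    "A": 71.0788, "R": 156.1875, "N": 114.1038, "D": 115.0886,
    "C": 103.1388, "Q": 128.1307, "E": 129.1155, "G": 57.0519,
    "H": 137.1411, "I": 113.1594, "L": 113.1594, "K": 128.1741,
    "M": 131.1926, "F": 147.1766, "P": 97.1167, "S": 87.0782,
    "T": 101.1051, "W": 186.2132, "Y": 163.1760, "V": 99.1326,
}
WATER = 18.015  # one water molecule for the free N- and C-termini

# The 190-residue sequence from the table above, with spaces removed.
RNF227 = (
    "MQLLVRVPSLPERGELDCNICYRPFNLGCRAPRRLPGTARARCGHTICTACLRELAARGD"
    "GGGAAARVVRLRRVVTCPFCRAPSQLPRGGLTEMALDSDLWSRLEEKARAKCERDEAGNP"
    "AKESSDADGEAEEEGESEKGAGPRSAGWRALRRLWDRVLGPARRWRRPLPSNVLYCAEIK"
    "DIGHLTRCTL"
)


def molecular_weight(seq: str) -> float:
    """Estimate the average molecular weight of a peptide in daltons."""
    return sum(AVG_RESIDUE_MASS[aa] for aa in seq) + WATER


def find_positions(seq: str, motif: str) -> list[tuple[int, int]]:
    """Return 1-based (start, end) positions of every occurrence of motif."""
    hits = []
    i = seq.find(motif)
    while i != -1:
        hits.append((i + 1, i + len(motif)))
        i = seq.find(motif, i + 1)
    return hits


print(len(RNF227))
print(round(molecular_weight(RNF227)))
print(find_positions(RNF227, "CRAPRRLP"))
print(find_positions(RNF227, "CRAPSQLP"))
```

The estimate lands close to 21 kDa. Small differences from the Expasy figure are to be expected, depending on the mass table used and on whether the initiator methionine is counted.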
RING Finger Protein 227 has a zinc finger domain from position 18 to 81, which is highly conserved throughout many eukaryotic organisms. [7]
The secondary structure was predicted by the I-TASSER server, which reported 7 alpha helices, 4 beta strands, and 12 coils. [4]
The tertiary structure was also predicted by I-TASSER, with a confidence score (C-score) of -3.42. C-scores typically range from -5 to 2, with higher values indicating greater confidence in the model. [4]
RNA-seq of tissue samples from 95 human individuals, representing 27 different tissues, was performed to determine the tissue specificity of all protein-coding genes. In this dataset, RNF227 expression was highest in the skin, at 22 ± 4.5 reads per kilobase of transcript per million mapped reads (RPKM). Transcription profiling by high-throughput sequencing of RNA from 16 human tissues, both individually and in mixtures, showed the highest expression in the testes and the lowest in the liver. RNA sequencing of total RNA from 20 human tissues showed high expression in the brain, both in the cerebellum and in fetal tissues. Finally, 35 human fetal samples from 6 tissues (3 to 7 replicates per tissue), collected between 10 and 20 weeks of gestation, were sequenced using Illumina TruSeq Stranded Total RNA. These showed very high expression in the intestine after 11 weeks and in the kidney after 10 weeks. [1]
Three experiments report conditions under which RNF227 expression rises or falls. A study of T cell-driven IL-22 amplification of IL-1β-driven inflammation in human adipose tissue found higher expression of RNF227 in obese non-diabetic patients. [8] An analysis of non-invasive NeuN cells and invasive NeuT cells exposed to interstitial fluid flow found higher expression of RNF227 in the NeuN cell line under both the static and the flow protocols. This gives insight into the molecular pathways activated by interstitial fluid flow in ERBB2-positive breast cancer cells. [9] The last experiment, which showed that the effect of Rho kinase inhibition on long-term keratinocyte proliferation is rapid and conditional, found higher expression with the control treatment than with Y-27632. [10]
The 5' untranslated region of RNF227 is predicted to form stem-loop structures. [11] The BED4.02, ZFX.01, and ZIC3.03 transcription factors are associated with RNF227, which is notable because they are all associated with zinc finger domains. [12] Translation is initiated at the AUG start codon.
The Motif Scan tool at MyHits predicted casein kinase II phosphorylation sites (positions 9 to 12, 102 to 105, and 125 to 128), N-myristoylation sites (positions 37 to 42 and 61 to 66), and protein kinase C phosphorylation sites (positions 38 to 40 and 137 to 139). [13]
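Site predictions of this kind are based on short sequence patterns, so they can be approximated with regular expressions. The sketch below translates the PROSITE consensus patterns for these three site types into Python regular expressions and scans the sequence given earlier. It is an illustration under stated assumptions, not the Motif Scan implementation: the names `PATTERNS` and `scan` are hypothetical, and `re.finditer` reports only non-overlapping matches, so a tool that reports overlapping hits may list more sites.

```python
import re

# The 190-residue RNF227 sequence from the table above, spaces removed.
RNF227 = (
    "MQLLVRVPSLPERGELDCNICYRPFNLGCRAPRRLPGTARARCGHTICTACLRELAARGD"
    "GGGAAARVVRLRRVVTCPFCRAPSQLPRGGLTEMALDSDLWSRLEEKARAKCERDEAGNP"
    "AKESSDADGEAEEEGESEKGAGPRSAGWRALRRLWDRVLGPARRWRRPLPSNVLYCAEIK"
    "DIGHLTRCTL"
)

# PROSITE consensus patterns rewritten as regular expressions:
#   casein kinase II site:  [ST]-x(2)-[DE]
#   protein kinase C site:  [ST]-x-[RK]
#   N-myristoylation site:  G-{EDRKHPFYW}-x(2)-[STAGCN]-{P}
PATTERNS = {
    "CK2 phosphorylation": r"[ST].{2}[DE]",
    "PKC phosphorylation": r"[ST].[RK]",
    "N-myristoylation": r"G[^EDRKHPFYW].{2}[STAGCN][^P]",
}


def scan(seq: str, pattern: str) -> list[tuple[int, int]]:
    """Return 1-based (start, end) positions of non-overlapping matches."""
    return [(m.start() + 1, m.end()) for m in re.finditer(pattern, seq)]


for name, pattern in PATTERNS.items():
    print(name, scan(RNF227, pattern))
```

Patterns this short match by chance in many proteins, so hits are best read as candidate sites rather than confirmed modifications.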
Additionally, PSORT II predicted a 69.6% chance for the protein sequence to be found in the nucleus of a cell. [14]
RING finger protein 227 has no paralogs. It does, however, have numerous orthologs throughout eukaryotes. The following table presents a selection of orthologs found using searches in BLAST [16] and BLAT. [17] It is not a comprehensive list but rather a small sample that shows the diversity of species in which orthologs are found.
Genus and Species | Common Name | Taxonomic Group | Date of Divergence (Million Years Ago) | Accession Number | Sequence Length (amino acids) | Sequence Identity | Sequence Similarity |
---|---|---|---|---|---|---|---|
Homo sapiens | Human | Primates | 0 | NP_001345628.1 | 190 | 100% | 100% |
Neotoma lepida | Desert Woodrat | Rodentia | 90 | OBS67541.1 | 164 | 67.9% | 73.7% |
Microtus ochrogaster | Prairie vole | Rodentia | 90 | XP_026636787.1 | 214 | 67.4% | 74.4% |
Dipodomys ordii | Ord's Kangaroo Rat | Rodentia | 90 | XP_012868576.1 | 158 | 64.8% | 68.9% |
Balaenoptera acutorostrata scammoni | Minke Whale | Artiodactyla | 90 | XP_028024073.1 | 166 | 65.3% | 71.1% |
Vulpes vulpes | Red Fox | Carnivora | 96 | XP_025861213.1 | 160 | 62.8% | 72.3% |
Vicugna pacos | Alpaca | Artiodactyla | 96 | XP_006218277.1 | 156 | 24.8% | 30.4% |
Vombatus ursinus | Common Wombat | Diprotodontia | 159 | XP_027712916.1 | 180 | 62.0% | 74.5% |
Sarcophilus harrisii | Tasmanian Devil | Dasyuromorphia | 159 | XP_023358488.2 | 172 | 58.8% | 66.2% |
Gallus gallus | Chicken | Galliformes | 312 | XP_001234238.1 | 168 | 25.9% | 36.1% |
Geotrypetes seraphini | Gaboon Caecilian | Gymnophiona | 351.8 | XP_033780950.1 | 150 | 37.2% | 46.9% |
Rhinatrema bivittatum | Two-lined Caecilian | Gymnophiona | 351.8 | XP_029437562.1 | 152 | 35.7% | 48.0% |
Microcaecilia unicolor | Cayenne Caecilian | Gymnophiona | 351.8 | XP_030043188.1 | 148 | 34.4% | 46.9% |
Xenopus tropicalis | Western Clawed Frog | Anura | 351.8 | XP_031750786.1 | 178 | 19.1% | 43.4% |
Scleropages formosus | Asian Arowana | Osteoglossiformes | 435 | XP_029113159.1 | 165 | 30.6% | 44.9% |
Astyanax mexicanus | Mexican Tetra | Characiformes | 435 | XP_007231481.2 | 161 | 28.0% | 37.4% |
Paramormyrops kinsleyae | Old Calabar Mormyrid | Osteoglossiformes | 435 | XP_023674393.1 | 165 | 26.6% | 32.3% |
Lepisosteus oculatus | Spotted Gar | Lepisosteiformes | 435 | XP_006640609.2 | 172 | 24.6% | 33.0% |
Salmo trutta | Brown Trout | Salmoniformes | 435 | XP_029622555.1 | 190 | 24.2% | 29.6% |
Danio rerio | Zebrafish | Cypriniformes | 435 | NP_001121828.1 | 187 | 23.6% | 35.6% |