FANCD2OS | |
---|---|
Identifiers | |
Aliases | FANCD2OS, C3orf24, FANCD2 opposite strand |
External IDs | MGI: 1918229, HomoloGene: 45456, GeneCards: FANCD2OS |

Fanconi Anemia Opposite Strand Transcript protein is a predicted protein that in humans is encoded by the FANCD2OS gene. [5] The name is derived from mRNA transcribed from the strand complementary to the FANCD2 gene.
The gene is located on human chromosome 3 at p25.3, on the minus strand, spanning nucleotides 10,081,320 to 10,108,339. The primary transcript is 1,105 nt long and codes for a protein 177 amino acids in length. [6] The gene lies on the strand complementary to FANCD2, 5' of CYCSP11 and 3' of the BRK1 and VHL genes. [5]
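The coordinates and lengths quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive and that the coding sequence is one codon per residue plus a single stop codon; the variable and function names are illustrative only and do not come from any database.

```python
# Figures quoted in the text (chromosome 3, minus strand).
# Assumption: coordinates are 1-based and inclusive.
GENE_START = 10_081_320
GENE_END = 10_108_339
TRANSCRIPT_LENGTH_NT = 1_105
PROTEIN_LENGTH_AA = 177


def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive interval given in either order."""
    return abs(end - start) + 1


gene_length = span_length(GENE_START, GENE_END)

# Assumption: coding nucleotides = one codon per residue plus one stop codon.
coding_nt = (PROTEIN_LENGTH_AA + 1) * 3

# Whatever remains of the transcript would be 5' and 3' untranslated sequence.
untranslated_nt = TRANSCRIPT_LENGTH_NT - coding_nt

print(f"gene span: {gene_length} nt")
print(f"coding sequence: {coding_nt} nt")
print(f"untranslated sequence: {untranslated_nt} nt")
```

Under these assumptions the locus covers about 27 kb, while roughly half of the 1,105 nt transcript is coding sequence.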
There are six alternatively spliced transcripts with differences in the 5' and 3' ends as well as changes in exon usage. [7] The most common isoform is 1105 bp. [8]
The promoter region predicted by the Genomatix El Dorado algorithm spans from 10108742 to 10108009 bp. Promoters are associated with a wide variety of tissues including B-lymphocytes, germ cells, muscle, neurons and prostate.
The secondary structure of the 5' UTR, in a sequence conserved between humans and gibbons, contains sequences recognized by RBMX, Ras and Rab proteins. The sequences forming the secondary structure of the 3' UTR are recognized by EIF4B and RBMX.
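The promoter coordinates are listed in descending order, which is consistent with a gene on the minus strand. As a rough sketch, assuming 1-based inclusive coordinates and using the gene span quoted earlier in the article, the promoter length and its overlap with the gene can be computed as follows; the helper names are hypothetical.

```python
from typing import Tuple

Interval = Tuple[int, int]

# Coordinates quoted in the text; the promoter is listed 5'->3' on the minus strand.
PROMOTER: Interval = (10_108_742, 10_108_009)
GENE: Interval = (10_081_320, 10_108_339)


def normalize(interval: Interval) -> Interval:
    """Return the interval as (low, high) regardless of strand orientation."""
    a, b = interval
    return (min(a, b), max(a, b))


def length(interval: Interval) -> int:
    """Length of a 1-based, inclusive interval."""
    low, high = normalize(interval)
    return high - low + 1


def overlap_length(x: Interval, y: Interval) -> int:
    """Number of positions shared by two 1-based, inclusive intervals."""
    x_low, x_high = normalize(x)
    y_low, y_high = normalize(y)
    return max(0, min(x_high, y_high) - max(x_low, y_low) + 1)


promoter_length = length(PROMOTER)
inside_gene = overlap_length(PROMOTER, GENE)
upstream_of_gene = promoter_length - inside_gene

print(promoter_length, inside_gene, upstream_of_gene)
```

With these figures the predicted promoter is 734 bp long, of which 403 bp lie upstream of the annotated gene start on the minus strand and 331 bp overlap the gene.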
FANCD2OS is moderately expressed in the human brain, placenta and testes. [9]
The protein from the longest transcript is 177 amino acids in length, with a mass of 20,188.59 Da (about 20.2 kDa). The protein consists of a domain of unknown function from the DUF4563 superfamily. [10]
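A protein of 177 residues is expected to weigh roughly 20 kDa, so the quoted figure of 20188.59 is most plausibly in daltons. The sketch below checks this, assuming the common rule of thumb of about 110 Da per amino acid residue; that average is an approximation, not a value taken from the cited source.

```python
# Values quoted in the text; treating the mass as daltons is an assumption.
PROTEIN_LENGTH_AA = 177
QUOTED_MASS_DA = 20188.59

# Rule-of-thumb average mass of one amino acid residue (an approximation).
AVERAGE_RESIDUE_MASS_DA = 110.0


def to_kilodaltons(mass_da: float) -> float:
    """Convert a mass in daltons to kilodaltons."""
    return mass_da / 1000.0


# Back-of-the-envelope estimate from the sequence length alone.
estimated_mass_da = PROTEIN_LENGTH_AA * AVERAGE_RESIDUE_MASS_DA

# Average residue mass implied by the quoted figure.
mass_per_residue_da = QUOTED_MASS_DA / PROTEIN_LENGTH_AA

print(f"quoted mass: {to_kilodaltons(QUOTED_MASS_DA):.2f} kDa")
print(f"estimate from length: {to_kilodaltons(estimated_mass_da):.2f} kDa")
print(f"implied mass per residue: {mass_per_residue_da:.2f} Da")
```

The implied per-residue mass of about 114 Da is close to the rule-of-thumb average, which supports reading the figure as daltons rather than kilodaltons.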
Orthologs of FANCD2OS exist throughout mammals, reptiles, birds and in the cartilaginous fish Australian ghostshark. [11] The protein sequence undergoes mutation at a rate similar to the blood protein fibrinogen.
Species | Common name | NCBI accession | Identity (%) | Similarity (%) | E-value |
---|---|---|---|---|---|
Pongo abelii | Sumatran orangutan | XP_003776240.1 | 98.9 | 99.4 | 1.21E-133 |
Cricetulus griseus | Chinese hamster | XP_003511850.1 | 73.6 | 80.9 | 1.56E-95 |
Ovis aries | Sheep | XP_004018322.1 | 97.2 | 98.3 | 6.79E-131 |
Leptonychotes weddellii | Weddell seal | XP_006739055.1 | 96.6 | 98.3 | 5.13E-130 |
Condylura cristata | Star-nosed mole | XP_004692437.1 | 91.0 | 95.5 | 3.01E-123 |
Orcinus orca | Killer whale | XP_004286877.2 | 88.1 | 92.3 | 1.05E-90 |
Trichechus manatus latirostris | Florida manatee | XP_004368376.1 | 96.6 | 98.9 | 9.95E-131 |
Alligator mississippiensis | American alligator | XP_014457972.1 | 69.3 | 88.6 | 1.40E-48 |
Chrysemys picta bellii | Painted turtle | XP_005307285.1 | 58.3 | 75 | 2.18E-78 |
Gekko japonicus | Schlegel's Japanese gecko | XP_015261231.1 | 56.5 | 72.3 | 4.39E-74 |
Thamnophis sirtalis | Common garter snake | XP_013914711.1 | 55.9 | 69.5 | 2.75E-69 |
Xenopus laevis | African clawed frog | NP_001088778.1 | 55.4 | 71.2 | 1.98E-68 |
Xenopus tropicalis | Western clawed frog | NP_001090717.1 | 52 | 71.2 | 3.57E-65 |
Callorhinchus milii | Australian ghostshark | XP_007888744.1 | 39.9 | 56.8 | 5.67E-43 |
FANCD2OS has no known paralogs in Homo sapiens.
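The ortholog table can be explored programmatically. The following is a small sketch that parses a subset of the rows above (species, identity, similarity) and picks out the most divergent ortholog and the one with the largest gap between similarity and identity; the `Ortholog` type and function names are illustrative, and the values are copied from the table.

```python
from typing import List, NamedTuple


class Ortholog(NamedTuple):
    species: str
    identity: float    # percent identity to the human protein
    similarity: float  # percent similarity to the human protein


# Subset of the ortholog table: species | identity | similarity
ROWS = """\
Pongo abelii | 98.9 | 99.4
Ovis aries | 97.2 | 98.3
Cricetulus griseus | 73.6 | 80.9
Alligator mississippiensis | 69.3 | 88.6
Xenopus laevis | 55.4 | 71.2
Callorhinchus milii | 39.9 | 56.8
"""


def parse_rows(text: str) -> List[Ortholog]:
    """Parse pipe-separated rows into Ortholog records."""
    records = []
    for line in text.strip().splitlines():
        species, identity, similarity = [cell.strip() for cell in line.split("|")]
        records.append(Ortholog(species, float(identity), float(similarity)))
    return records


orthologs = parse_rows(ROWS)

# Ortholog with the lowest identity to the human sequence.
least_conserved = min(orthologs, key=lambda o: o.identity)

# Ortholog where conservative substitutions contribute most (similarity minus identity).
largest_gap = max(orthologs, key=lambda o: o.similarity - o.identity)

print(least_conserved.species, least_conserved.identity)
print(largest_gap.species, round(largest_gap.similarity - largest_gap.identity, 1))
```

In this subset the Australian ghostshark sequence is the most divergent, while the American alligator sequence shows the widest gap between similarity and identity.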
MALSU1 is a gene on chromosome 7 in humans that encodes the protein MALSU1. This protein localizes to mitochondria and is probably involved in mitochondrial translation or the biogenesis of the large subunit of the mitochondrial ribosome.
Transmembrane protein 229b is a protein that in humans is encoded by the TMEM229b gene.
Proline-rich 12 (PRR12) is a protein of unknown function encoded by the gene PRR12.
Family with Sequence Similarity 203, Member B (FAM203B) is a protein encoded by the FAM203B gene (8q24.3) in humans. While FAM203B is only found in humans and possibly non-human primates, its paralog, FAM203A, is highly conserved. The FAM203B protein contains two conserved domains of unknown function, DUF383 and DUF384, and no transmembrane domains. This protein has no known function yet, although the homolog of FAM203A in Caenorhabditis elegans (Y54H5A.2) is thought to help regulate the actin cytoskeleton.
Protein FAM214A, also known as protein family with sequence similarity 214, A (FAM214A) is a protein that, in humans, is encoded by the FAM214A gene. FAM214A is a gene with unknown function found at the q21.2-q21.3 locus on Chromosome 15 (human). The protein product of this gene has two conserved domains, one of unknown function (DUF4210) and another one called Chromosome_Seg. Although the function of the FAM214A protein is uncharacterized, both DUF4210 and Chromosome_Seg have been predicted to play a role in chromosome segregation during meiosis.
The coiled-coil domain containing 142 (CCDC142) is a gene which in humans encodes the CCDC142 protein. The CCDC142 gene is located on chromosome 2, spans 4339 base pairs and contains 9 exons. The gene codes for the coiled-coil domain containing protein 142 (CCDC142), whose function is not yet well understood. There are two known isoforms of CCDC142. CCDC142 proteins produced from these transcripts range in size from 665 to 743 amino acids and contain signals suggesting protein movement between the cytosol and nucleus. Homologous CCDC142 genes are found in many animals, including vertebrates and invertebrates, but not in fungi, plants, protists, archaea, or bacteria. The protein contains a coiled-coil domain and a RINT1_TIP1 motif located within the coiled-coil domain.
PRR29 is a protein encoded by the PRR29 gene located in humans on chromosome 17 at 17q23.
Uncharacterized protein Chromosome 16 Open Reading Frame 71 is a protein in humans, encoded by the C16orf71 gene. The gene is expressed in epithelial tissue of the respiratory system, adipose tissue, and the testes. Predicted associated biological processes of the gene include regulation of the cell cycle, cell proliferation, apoptosis, and cell differentiation in those tissue types. 1357 bp of the gene are antisense to spliced genes ZNF500 and ANKS3, indicating the possibility of regulated alternate expression.
Chromosome 21 Open Reading Frame 58 (C21orf58) is a protein that in humans is encoded by the C21orf58 gene.
C11orf42 is an uncharacterized protein in Homo sapiens that is encoded by the C11orf42 gene. It is also known as chromosome 11 open reading frame 42 and uncharacterized protein C11orf42, with no other aliases. The gene is mostly conserved in mammals, but it has also been found in rodents, reptiles, fish and worms.
Chromosome 9 open reading frame 50 is a protein that in humans is encoded by the C9orf50 gene. C9orf50 has one other known alias, FLJ35803. In humans the gene coding sequence is 10,051 base pairs long, transcribing an mRNA of 1,624 bases that encodes a 431 amino acid protein.
C7orf50 is a gene in humans that encodes a protein known as C7orf50. This gene is ubiquitously expressed in the kidneys, brain, fat, prostate, spleen, among 22 other tissues and demonstrates low tissue specificity. C7orf50 is conserved in chimpanzees, Rhesus monkeys, dogs, cows, mice, rats, and chickens, along with 307 other organisms from mammals to fungi. This protein is predicted to be involved with the import of ribosomal proteins into the nucleus to be assembled into ribosomal subunits as a part of rRNA processing. Additionally, this gene is predicted to be a microRNA (miRNA) protein coding host gene, meaning that it may contain miRNA genes in its introns and/or exons.
TMEM275 is a protein that in humans is encoded by the TMEM275 gene. TMEM275 has two, highly-conserved, helical trans-membrane regions. It is predicted to reside within the plasma membrane or the endoplasmic reticulum's membrane.
FAM214B, also known as protein family with sequence similarity 214, B, is a protein that, in humans, is encoded by the FAM214B gene located on human chromosome 9. The protein has 538 amino acids. The gene contains 9 exons. Studies have reported low expression of this gene in patients with major depressive disorder. In most organisms, such as mammals, amphibians, reptiles, and birds, the gene is highly expressed in the bone marrow and blood. In human fetal development, FAM214B is mostly expressed in the brain and bone marrow.
C6orf136 is a protein in humans encoded by the C6orf136 gene. The gene is conserved in mammals and mollusks, as well as some Porifera. While the function of the gene is currently unknown, C6orf136 has been shown to be hypermethylated in response to FOXM1 expression in head and neck squamous cell carcinoma (HNSCC) tissue cells. Additionally, elevated expression of C6orf136 has been associated with improved survival rates in patients with bladder cancer. C6orf136 has three known isoforms.
FAM120AOS, or family with sequence similarity 120A opposite strand, codes for uncharacterized protein FAM120AOS, which currently has no known function. Its gene ontology annotation is protein binding. Across a majority of datasets, the thyroid and the placenta are the two tissues with the highest expression levels of FAM120AOS.
Family with sequence similarity 98, member C (FAM98C) is a gene that encodes the FAM98C protein. It has two aliases, FLJ44669 and hypothetical protein LOC147965. FAM98C has two paralogs in humans, FAM98A and FAM98B. FAM98C is characterized as a leucine-rich protein. Its function is not yet defined. FAM98C has orthologs in mammals, reptiles, and amphibians, with distant orthologs in Rhinatrema bivittatum and Nanorana parkeri.
Family with Sequence Similarity 166, member C (FAM166C), is a protein encoded by the FAM166C gene. The protein FAM166C is localized in the nucleus. It has a calculated molecular weight of 23.29 kDa. It also contains DUF2475, a domain of unknown function spanning amino acids 19–85. The FAM166C protein is nominally expressed in the testis, stomach, and thyroid.
GPATCH2L is a protein that is encoded by the GPATCH2L human gene located at 14q24.3. In humans, the GPATCH2L mRNA (NM_017926) is 14,021 base pairs long and the gene spans 62,422 nt at chr14: 76,151,922-76,214,343. GPATCH2L is on the positive strand. IFT43 is the gene directly before GPATCH2L on the positive strand, and LOC105370575 is an uncharacterized gene on the negative strand that is approximately one and a half times the size of GPATCH2L. Known aliases for GPATCH2L include C14orf118, FLJ20689, FLJ10033, and KIAA1152. GPATCH2L produces 28 distinct introns, 17 different mRNAs, 14 alternatively spliced variants, and 3 unspliced forms. It has 5 probable alternative promoters, 7 validated polyadenylation sites, and 6 predicted promoters of varying lengths.
KIAA2013, also known as Q8IYS2 or MGC33867, is a single-pass transmembrane protein encoded by the KIAA2013 gene in humans. The complete function of KIAA2013 has not yet been fully elucidated.