CFAP299 | |
---|---|
Identifiers | |
Aliases | CFAP299, chromosome 4 open reading frame 22, C4orf22, cilia and flagella associated protein 299 |
External IDs | MGI: 1916571; HomoloGene: 51893; GeneCards: CFAP299; OMA: CFAP299 - orthologs |
Cilia- and flagella-associated protein 299 (CFAP299) is a protein that in humans is encoded by the CFAP299 gene. CFAP299 is predicted to play a role in spermatogenesis and cell apoptosis. [5]
The CFAP299 gene is located on chromosome 4 at 4q21.21, spanning 642,492 bases from position 80,321,265 to position 80,963,756 on the plus strand. It is also known as C4orf22, chromosome 4 open reading frame 22 and uncharacterized protein C4orf22. [6] The gene lies near MRPS25P1 and BMP3 and has 13 exons. [7]
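The quoted span of 642,492 bases is consistent with the two coordinates if both endpoints are counted. The following is a minimal Python sketch of that arithmetic; the helper name `span_length` is hypothetical, and 1-based inclusive coordinates are an assumption, since the source does not state the convention.

```python
def span_length(start: int, end: int) -> int:
    """Length of a genomic interval whose start and end are both included."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates quoted for CFAP299 on chromosome 4 (plus strand).
cfap299_length = span_length(80_321_265, 80_963_756)
print(cfap299_length)  # 642492
```

Under a 0-based, half-open convention the `+ 1` would be dropped, so the agreement with the quoted length suggests, but does not prove, that the positions are inclusive.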
CFAP299 is widely expressed in a variety of normal tissues in Homo sapiens, with high expression in testis, trachea, lung, fetal lung and epididymis. [8] With respect to health state, CFAP299 expression is decreased in glioma, germ cell tumors and chondrosarcoma, whereas higher expression is seen in soft tissue tumors and muscle tissue tumors. CFAP299 expression is found only in the fetus and the adult. [9]
The promoter of the CFAP299 gene is predicted to lie 1000 base pairs upstream of the transcription start site. A variety of transcription factors, such as CCAAT binding factors, X-box binding factors and AT-rich interactive domain factors, are predicted to bind the promoter and regulate transcription. [10]
CFAP299 has 9 alternatively spliced variants and 1 unspliced form. [11]
The CFAP299 protein is 233 amino acids long. The molecular weight of the Homo sapiens CFAP299 protein is 26869 Da and the predicted isoelectric point is 5.28. It has 39 negatively charged residues and 33 positively charged residues. [12] Aspartic acid occurs at a higher frequency in the CFAP299 protein than in other human proteins. [13]
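The residue counts fit the acidic isoelectric point: with more negatively than positively charged residues, the protein should carry a net negative charge near neutral pH. The sketch below shows only this rough arithmetic. It assumes the counts follow the common convention of Asp + Glu as negative and Arg + Lys as positive, and it ignores histidine and the chain termini; the function name is hypothetical.

```python
def approximate_net_charge(positive: int, negative: int) -> int:
    """Crude net charge: positively charged minus negatively charged residues."""
    return positive - negative


# Values quoted above for human CFAP299.
net_charge = approximate_net_charge(positive=33, negative=39)
mean_residue_mass = 26869 / 233  # molecular weight (Da) divided by length

print(net_charge)                    # -6
print(round(mean_residue_mass, 1))   # 115.3
```

This is not a substitute for a proper isoelectric point calculation, which weighs each ionizable group by its pKa.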
The CFAP299 protein has two important isoforms. Cilia- and flagella-associated protein 299 isoform 1 is the longest isoform, [7] and cilia- and flagella-associated protein 299 isoform 2 is designated the canonical sequence [14] and is the form described in this article.
The CFAP299 protein contains a single conserved domain, DUF4464, spanning positions 13 to 232. [7] This domain belongs to the DUF4464 family, which is found in eukaryotes; proteins in this family are 224 to 241 amino acids long. [15] The domain is conserved across the orthologs of CFAP299, as indicated by BLAST. [16]
The secondary structure of the CFAP299 protein is predicted by GOR4 to be dominated by alpha helices and random coils. [17]
The tertiary structure of the CFAP299 protein predicted by I-TASSER is composed of alpha helices and coils. [18]
CFAP299 is predicted to undergo phosphorylation at various sites, as shown in the graph. [19] CFAP299 is also predicted to have sumoylation sites at positions 58, 137 and 232 and two SUMO-interaction motifs at positions 45-49 and 212-216. [20]
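The predicted sites can be related to the DUF4464 domain (positions 13 to 232 of the 233-residue protein) with simple interval checks. This is an illustrative sketch only: the `Region` type and variable names are hypothetical, and positions are assumed to be 1-based and inclusive.

```python
from typing import NamedTuple


class Region(NamedTuple):
    name: str
    start: int  # 1-based, inclusive (assumed)
    end: int    # 1-based, inclusive (assumed)

    def length(self) -> int:
        return self.end - self.start + 1

    def contains(self, position: int) -> bool:
        return self.start <= position <= self.end


PROTEIN_LENGTH = 233
duf4464 = Region("DUF4464", 13, 232)
sim_motifs = [Region("SIM 1", 45, 49), Region("SIM 2", 212, 216)]
sumoylation_sites = [58, 137, 232]

# Which predicted sumoylation sites fall inside the conserved domain?
sites_in_domain = [p for p in sumoylation_sites if duf4464.contains(p)]
motif_lengths = [m.length() for m in sim_motifs]
domain_coverage = duf4464.length() / PROTEIN_LENGTH

print(sites_in_domain)            # [58, 137, 232]
print(motif_lengths)              # [5, 5]
print(round(domain_coverage, 3))  # 0.944
```

Under these assumptions, all three predicted sumoylation sites lie within the domain, which covers most of the protein.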
The CFAP299 protein is believed to interact with amyloid beta (A4) precursor protein (APP) [22] and BCL2-associated athanogene 3 (BAG3). [23]
Orthologs of the CFAP299 protein exist in mammals, reptiles, birds, amphibians, fish, sponges, sea urchins, insects, fungi and plants. The most distant orthologs appear in plants. The table below shows orthologs found by BLAST. [16]
Genus and species | Common name | Taxonomic group | Date of divergence (MYA) | Accession number | Sequence length | Sequence identity | Sequence similarity |
---|---|---|---|---|---|---|---|
Homo sapiens | Human | Mammalia | 0 | NP_689983.2 | 233 | 100% | 100% |
Ochotona princeps | American pika | Lagomorpha | 88 | XP_004590671.1 | 233 | 85% | 93% |
Mus musculus | House mouse | Rodentia | 88 | NP_001019785 | 233 | 85% | 91% |
Eumetopias jubatus | Steller sea lion | Carnivora | 94 | XP_027980031 | 233 | 86% | 93% |
Erinaceus europaeus | European hedgehog | Soricomorpha | 94 | XP_007518562 | 233 | 83% | 93% |
Ornithorhynchus anatinus | Platypus | Monotremata | 169 | XP_007659769 | 164 | 74% | 88% |
Pogona vitticeps | Central bearded dragon | Reptilia | 320 | XP_020658829 | 236 | 72% | 85% |
Anolis carolinensis | Green anole | Reptilia | 320 | XP_008118093 | 193 | 71% | 85% |
Dromaius novaehollandiae | Emu | Aves | 320 | XP_025959155 | 226 | 64% | 81% |
Anas platyrhynchos | Mallard | Aves | 320 | XP_027312784.1 | 243 | 58% | 75% |
Xenopus laevis | African clawed frog | Amphibia | 353 | NP_001088722 | 233 | 73% | 89% |
Nanorana parkeri | Xizang Plateau frog | Amphibia | 353 | XP_018414504.1 | 233 | 73% | 88% |
Danio rerio | Zebrafish | Actinopterygii | 432 | NP_001108596 | 239 | 60% | 77% |
Callorhinchus milii | Australian ghostshark | Chondrichthyes | 465 | XP_007895157 | 235 | 68% | 82% |
Strongylocentrotus purpuratus | Pacific purple sea urchin | Echinoidea | 627 | XP_011663002 | 236 | 66% | 80% |
Nematostella vectensis | Starlet sea anemone | Anthozoa | 685 | XP_001619741.1 | 199 | 61% | 70% |
Drosophila melanogaster | Fruit fly | Insecta | 794 | NP_650260.1 | 233 | 31% | 46% |
Amphimedon queenslandica | Sponge | Demospongiae | 951.8 | XP_003382446 | 235 | 64% | 80% |
Batrachochytrium dendrobatidis | Amphibian chytrid fungus | Chytridiomycetes | 1150 | XP_006681372 | 238 | 61% | 78% |
Physcomitrella patens | Spreading earthmoss | Bryopsida | 1624 | XP_024379106 | 255 | 50% | 65% |
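The table can be explored programmatically to see how sequence identity tracks divergence. The sketch below copies a subset of rows (species, divergence date as listed, percent identity) into a plain Python list; the variable names are illustrative, and no values beyond those in the table are used.

```python
# (species, date of divergence as listed, percent sequence identity)
orthologs = [
    ("Mus musculus", 88, 85),
    ("Ornithorhynchus anatinus", 169, 74),
    ("Anolis carolinensis", 320, 71),
    ("Xenopus laevis", 353, 73),
    ("Danio rerio", 432, 60),
    ("Strongylocentrotus purpuratus", 627, 66),
    ("Drosophila melanogaster", 794, 31),
    ("Amphimedon queenslandica", 951.8, 64),
    ("Physcomitrella patens", 1624, 50),
]

lowest_identity = min(orthologs, key=lambda row: row[2])
most_distant = max(orthologs, key=lambda row: row[1])

# Species whose identity is higher than that of the next-closest relative listed.
ordered = sorted(orthologs, key=lambda row: row[1])
rises = [b[0] for a, b in zip(ordered, ordered[1:]) if b[2] > a[2]]

print(lowest_identity[0])  # Drosophila melanogaster
print(most_distant[0])     # Physcomitrella patens
print(rises)
```

In this subset, identity does not fall strictly with divergence: the fruit fly ortholog has the lowest identity although the moss ortholog is the most distant.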
CFAP299 expression is lowered in people with teratozoospermia, a condition that causes abnormal morphology of sperm and decreased fertility. [24]
In airway epithelial cells with excessive mucus secretion, a condition that simulates chronic lung disease, CFAP299 showed reduced expression. [25]