KIAA1257

CFAP92
Identifiers
Aliases	CFAP92, cilia and flagella associated protein 92 (putative), KIAA1257, FAP92
External IDs	HomoloGene: 131623 GeneCards: CFAP92
Gene location (Human)
Chr.	Chromosome 3 (human)
End	129,002,690 bp
RNA expression pattern
	Top expressed in
	sperm; secondary oocyte; right uterine tube; blood; islet of Langerhans; bone marrow cells; anterior pituitary; sural nerve; prefrontal cortex; Achilles tendon
Orthologs
Species	Human
Entrez	57501
Ensembl	ENSG00000114656
UniProt	Q9ULG3
RefSeq (mRNA)	NM_020741; NM_001348520; NM_001348521; NM_001348522; NM_001348523; NM_001394090
RefSeq (protein)	NP_065792; NP_001335449; NP_001335450; NP_001335451; NP_001335452

Last updated July 15, 2023

KIAA1257 is a protein that in humans is encoded by the KIAA1257 gene. KIAA1257 has been shown to participate in the activation of genes involved in sex determination.^[3]^[4]

Gene

In humans, the KIAA1257 gene is located on chromosome 3 at 3q21.3. It spans 122 kilobase pairs (kbp) and contains 22 exons. It is flanked by the gene for Ras-related protein Rab-43 and several pseudogenes and, on the opposite strand, by acyl-CoA dehydrogenase family member 9 (ACAD9) and EF-hand and coiled-coil domain containing 1 (EFCC1).

KIAA1257 genetic locus

Transcripts

The exons of KIAA1257 are alternatively spliced into 17 different isoforms (Table 1). Isoform X1 encodes the longest protein product, and isoform X4 is the most commonly translated variant. Both the 5' and 3' UTRs are capable of forming stem-loop structures that could serve as binding sites for RNA-binding proteins.^[5]

Isoform	Length (bp)
X1	8645
X2	8641
X3	8218
X4	8612
X5	8370
X6	8190
X7	3524
X8	3428
X9	7801
X10	7685
X11	7862
X12	7809
X13	13296
X14	13401
X15	7579
X16	7585
X17	2163

Table 1
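
The lengths in Table 1 can be compared programmatically. The following is a minimal sketch, not part of any published analysis: the dictionary name `ISOFORM_LENGTHS_BP` and both helper functions are hypothetical, and the values are simply copied from Table 1. It shows that the longest transcript (by mRNA length) is not X1, even though X1 encodes the longest protein product, so transcript length alone does not indicate protein length.

```python
# Transcript lengths in base pairs, copied from Table 1.
# The variable and function names are illustrative assumptions.
ISOFORM_LENGTHS_BP = {
    "X1": 8645, "X2": 8641, "X3": 8218, "X4": 8612, "X5": 8370,
    "X6": 8190, "X7": 3524, "X8": 3428, "X9": 7801, "X10": 7685,
    "X11": 7862, "X12": 7809, "X13": 13296, "X14": 13401,
    "X15": 7579, "X16": 7585, "X17": 2163,
}


def longest_and_shortest(lengths):
    """Return (name of longest transcript, name of shortest transcript)."""
    longest = max(lengths, key=lengths.get)
    shortest = min(lengths, key=lengths.get)
    return longest, shortest


def length_difference(lengths, first, second):
    """Return the length of `first` minus the length of `second`, in bp."""
    return lengths[first] - lengths[second]


if __name__ == "__main__":
    longest, shortest = longest_and_shortest(ISOFORM_LENGTHS_BP)
    print("Longest transcript:", longest, ISOFORM_LENGTHS_BP[longest], "bp")
    print("Shortest transcript:", shortest, ISOFORM_LENGTHS_BP[shortest], "bp")
    print("X1 minus X4:", length_difference(ISOFORM_LENGTHS_BP, "X1", "X4"), "bp")
```

Isoforms X1 and X4 differ by only a few dozen base pairs at the mRNA level, so any large difference between their protein products must come from where the coding sequence starts and stops rather than from transcript length.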

Protein

The protein KIAA1257 is most commonly translated from mRNA isoform X4; this product is only half the length of the isoform X1 product, even though the two mRNAs are similar in length. Protein isoform X1 is 1179 amino acids long and has a molecular weight of 136.4 kilodaltons (kDa) and an isoelectric point (pI) of 8.1.^[6]^[7] KIAA1257 contains a domain of unknown function (DUF4550) in the first third of the protein sequence that has a high lysine content (15%).^[6] Most of the protein is predicted to form random coil, but the final third contains 6 predicted alpha helices.^[8] KIAA1257 is predicted to localize to the nucleus and contains several nuclear localization signals.^[9] A summary of KIAA1257 orthologs is shown in Table 2.

Species	Identity^[10]	Length (aa)^[6]	MW (kDa)^[6]	pI^[7]	Localization (confidence)^[9]
Human	100%	1179	136.4	8.1	Nucleus (73.9%)
Chimp	97%	1147	131.7	8.5	Nucleus (65.2%)
Dog	69%	1163	133.6	8.9	Nucleus (82.6%)
Turkey	39%	1174	132.0	8.5	Nucleus (65.2%)
Spotted gar	36%	1320	148.2	7.7	Nucleus (73.9%)

Table 2
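
The relationship between protein length and transcript length can be made concrete with some simple arithmetic. The sketch below is an illustration under stated assumptions: it assumes the 1179-amino-acid protein is translated from the 8645 bp isoform X1 transcript, that each amino acid is encoded by one three-nucleotide codon, and that the coding sequence ends with a single stop codon. The function names are hypothetical.

```python
def cds_length_nt(protein_length_aa):
    """Coding sequence length in nucleotides: one codon per residue plus a stop codon."""
    return 3 * (protein_length_aa + 1)


def coding_fraction(protein_length_aa, mrna_length_bp):
    """Fraction of the transcript occupied by the coding sequence."""
    return cds_length_nt(protein_length_aa) / mrna_length_bp


def mean_residue_mass_da(mw_kda, protein_length_aa):
    """Average mass per residue in daltons, from the total molecular weight in kDa."""
    return mw_kda * 1000.0 / protein_length_aa


if __name__ == "__main__":
    # Values for human isoform X1 taken from the text and tables.
    protein_aa = 1179
    mrna_bp = 8645
    mw_kda = 136.4

    print("CDS length (nt):", cds_length_nt(protein_aa))
    print("Coding fraction of mRNA: {:.1%}".format(coding_fraction(protein_aa, mrna_bp)))
    print("Mean residue mass (Da): {:.1f}".format(mean_residue_mass_da(mw_kda, protein_aa)))
```

Under these assumptions the coding sequence accounts for well under half of the transcript, which is consistent with two transcripts of nearly equal length encoding proteins of very different sizes.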

Expression and Regulation

KIAA1257 is mainly expressed in the testes and ovaries of adult humans, although expression in these tissues is low. It is most highly expressed during the earliest stages of development: expression peaks in the 2-cell through 8-cell stages of embryonic development and declines steadily after morula and then blastocyst formation.^[11]

KIAA1257 has a promoter region upstream of the 5' UTR with several transcription factor binding sites including a Sox11 binding site.^[12] Sox11 is involved in the regulation of many developmental genes.

Clinical Significance

KIAA1257 has been shown to activate expression of nuclear receptor subfamily 5 group A member 1 (NR5A1).^[3] NR5A1 is involved in sex determination, and defects in the gene are associated with XY sex reversal.

Homology

KIAA1257 is found in all vertebrates except cartilaginous and jawless fishes. Orthologs in birds, fish, and reptiles share 30-40% identity with the human protein, while those in mammals such as goats, cats, and dogs share roughly 60-70% identity and those in primates share 85-99% identity.^[13]

Species	Identity	Cover	Length (aa)
Human	100%	100%	1179
Chimp	97%	99%	1147
Dog	69%	92%	1163
Prairie deer mouse	67%	93%	1164
Goat	61%	75%	931
Common shrew	58%	53%	660
Brown spotted pit viper	36%	77%	1080
Nile tilapia	34%	84%	1050

Table 3
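
The identity and cover columns in Table 3 are summary statistics of pairwise alignments. The sketch below shows one common way such figures are computed; it is an assumption about the general approach, not a reproduction of the tool that produced the table, and different programs define these quantities slightly differently. The function names and the short aligned sequences are made up for illustration and are not KIAA1257 sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue.

    Both inputs must be rows of the same pairwise alignment (equal length,
    gaps written as `gap`). The denominator is the alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


def percent_cover(aligned_query, query_length, gap="-"):
    """Percent of the full query sequence that appears in the alignment."""
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    aligned_residues = sum(1 for x in aligned_query if x != gap)
    return 100.0 * aligned_residues / query_length


if __name__ == "__main__":
    # Toy alignment rows (hypothetical sequences).
    query_row = "MKT-AYLK"
    subject_row = "MKSQAYLR"
    print("Identity (%):", percent_identity(query_row, subject_row))
    # Suppose the full query is 10 residues long and 7 of them were aligned.
    print("Cover (%):", percent_cover(query_row, 10))
```

Because identity is measured only over the aligned region, a short ortholog can show moderate identity with low cover, as with the common shrew entry in Table 3.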

References

1. GRCh38: Ensembl release 89: ENSG00000114656. Ensembl, May 2017.
2. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
3. Noriko Sakai et al. "Identification of NR5A1 (SF-1/AD4BP) gene expression modulators by large-scale gain and loss of function studies." J Endocrinol 198 (3): 489-497. doi:10.1677/JOE-08-0027. First published online 25 June 2008.
4. "Entrez Gene: KIAA1257". Retrieved 2017-03-02.
5. M. Zuker, D. H. Mathews & D. H. Turner (1999). "Algorithms and Thermodynamics for RNA Secondary Structure Prediction: A Practical Guide." In RNA Biochemistry and Biotechnology, 11-43, J. Barciszewski and B. F. C. Clark, eds., NATO ASI Series, Kluwer Academic Publishers, Dordrecht, NL.
6. Algorithm citation: Brendel, V., Bucher, P., Nourbakhsh, I.R., Blaisdell, B.E. & Karlin, S. (1992) "Methods and algorithms for statistical analysis of protein sequences." Proc. Natl. Acad. Sci. U.S.A. 89: 2002-2006. Program citation: Volker Brendel, Department of Mathematics, Stanford University, Stanford CA 94305, U.S.A., modified; any errors are due to the modification.
7. Program by Dr. Luca Toldo, developed at http://www.embl-heidelberg.de. Changed by Bjoern Kindler to print also the lowest found net charge. Available at EMBL WWW Gateway to Isoelectric Point Service, http://www.embl-heidelberg.de/cgi/pi-wrapper.pl (dead link; archived 2008-10-26 at https://web.archive.org/web/20081026062821/http://www.embl-heidelberg.de/cgi/pi-wrapper.pl; accessed 2014-05-10).
8. A. W. Burgess, P. K. Ponnuswamy and H. A. Scheraga (1974). "Analysis of conformations of amino acid residues and prediction of backbone topography in proteins." Israel J. Chem. 12: 239-286.
9. Psort II.
10. Altschul, S.F., Gish, W., Miller, W., Myers, E.W. & Lipman, D.J. (1990) "Basic local alignment search tool." J. Mol. Biol. 215: 403-410.
11. NCBI GEO Profiles GDS3959 / 1554852_a_at.
12. "KIAA1257 promoter analysis".
13. Algorithm citation: E. W. Myers and W. Miller (1989) CABIOS 4: 11-17; W. R. Pearson & D. J. Lipman (1988) PNAS 85: 2444-2448; W. R. Pearson (1990) "Rapid and Sensitive Sequence Comparison with FASTP and FASTA." Methods in Enzymology 183: 63-98. Program citation: © 1997 by William R. Pearson and the University of Virginia (this is from distribution "fasta20u66", version 2.0u66, Sep. 1998; sale or incorporation into a commercial product expressly forbidden without permission).