EamA

EamA
Identifiers
Symbol EamA
Pfam PF00892
Pfam clan CL0184
InterPro IPR000620
TCDB 2.A.7
cysteine and O-acetyl-L-serine efflux system
Identifiers
Organism Escherichia coli
(strain K12 substrain MG1655)
Symbol eamA
Alt. symbols ydeD
RefSeq (Prot) NP_416050.4
UniProt P31125
Other data
Chromosome genome: 1.62 - 1.62 Mb

EamA (named after the O-acetyl-serine/cysteine export gene in E. coli) is a protein domain found in a wide range of proteins. These include the Erwinia chrysanthemi PecM protein, which is involved in pectinase, cellulase and blue pigment regulation, the Salmonella typhimurium PagO protein (function unknown), and some members of the solute carrier family 35 (SLC35) nucleotide-sugar transporters. Many members of this family have no known function and are predicted to be integral membrane proteins. Many of the proteins contain two copies of the domain.

Domain

EamA was previously called DUF6 (domain of unknown function 6), and was one of the first DUF families to appear in Pfam. [1] Maximum likelihood phylogenetic analysis indicates that this family contains four stable sub-families with high bootstrap values: SLC35C/E, SLC35F, SLC35G (acyl-malonyl condensing enzyme-like, AMAC), and purine permeases. [2]

The Pfam domain organization of EamA proteins shows the two-domain structure of EamA: the EamA HMM matches each half of the protein separately. For the entries UAA, Nuc_sug_transp, and DUF914, however, which are likely derived from EamA, a single HMM covers the whole duplicated structure.

Function

AMAC (acyl-malonyl condensing enzyme) is interchangeable with, but more general than, the biochemical term FAE 3-ketoacyl-CoA synthase 1, which refers only to synthase #1. However, the transmembrane structure indicates that AMACs are transporters, not enzymes. Hence the TMEM20, TMEM22, AMAC1 and AMAC-like (AMAC1L1, AMAC1L2, AMAC1L3) sequences have been renamed SLC35G in RefSeq for human and mouse (SLC35G1–6). Furthermore, EamA is the only drug/metabolite transporter family to cross the prokaryote/eukaryote border, even though none of the original families crossed this border. [3] The highly diverse EamA Pfam family was created by iterative expansion of the original dataset.

Evolution

The likely evolutionary order of the human 5 + 5 TM nucleotide sugar transporters has been inferred. [2] This was done by training HMMs on each half of proteins from the following families: EamA, TPT, DUF914, UAA, and NST. The first method was multidimensional scaling in IBM SPSS, with a matrix of pairwise similarity measures from HMM-HMM comparisons as input. The output was a graph showing a clear bipartitioning between DMT-1 and DMT-2 domains, with EamA-1 and EamA-2 clearly in the middle. This result could be interpreted as indicating that EamA duplicated, and that the other families represent "diverged" copies of EamA.

The distance (100-p) between domain halves was measured, and the families were sorted by this distance: EamA (smallest distance between domain halves), TPT, DUF914, UAA, and NST (largest distance between domain halves). Perhaps surprisingly, the same order also reproduced the distance to EamA, so that NST had the largest "distance" to EamA, UAA the second largest, and so on. The possibility that EamA (previously DUF6) is an "artifact", a "multipotent" HMM formed through iterative expansion of diverse seed data, should be considered. [2]

During DNA replication of circular bacterial genomes, multiple proteins are involved in synthesizing the leading strand and the Okazaki fragments on the lagging strand. If a sequence contains an inverted repeat (a palindrome) longer than 10 bp, with a spacer (insert) shorter than roughly 75-150 base pairs, the sequence could be accessible to SbcCD, [4] a protein which inhibits the propagation of replicons containing long palindromic DNA sequences. Watson-Crick base pairing within the palindrome and a break in the sequence may then occur, creating an opportunity for priming DNA synthesis in the opposite direction. This may be followed by spontaneous strand switching and continuation of normal replication. This phenomenon is referred to as Tandem Inversion Duplication (TID). [5] The third (inverted) copy, which would lie in the middle, may subsequently have been degraded, possibly by strand-slippage deletion (illegitimate recombination). The presence of two palindromes in the regional duplication may increase the probability of degradation.
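The two criteria quoted above (arms longer than 10 bp, spacer below a limit somewhere in the 75-150 bp range) can be turned into a simple scan for candidate inverted repeats. The following is only a minimal sketch: the function names, the exact-match requirement, and the default thresholds are assumptions chosen for illustration, not a published method.

```python
# Lookup table for complementing DNA bases (both cases).
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq: str, min_arm: int = 11, spacer_limit: int = 75):
    """Find exact inverted repeats on one strand.

    Returns a list of (left_start, right_start, spacer_length) tuples where
    the min_arm-long window at right_start is the reverse complement of the
    window at left_start and the spacer between them is < spacer_limit.

    Assumptions (not from the source text): arms must match exactly, and
    arms longer than min_arm are reported as several overlapping seed hits.
    """
    seq = seq.lower()
    hits = []
    for i in range(len(seq) - 2 * min_arm + 1):
        target = reverse_complement(seq[i:i + min_arm])
        first_j = i + min_arm  # right arm directly adjacent (spacer 0)
        last_j = min(first_j + spacer_limit - 1, len(seq) - min_arm)
        for j in range(first_j, last_j + 1):
            if seq[j:j + min_arm] == target:
                hits.append((i, j, j - first_j))
    return hits


if __name__ == "__main__":
    # Synthetic toy sequence: an 11 bp arm, a 5 bp spacer, and the arm's
    # reverse complement, flanked by two bases on each side.
    arm = "acgtacggatc"
    toy = "tt" + arm + "aaaaa" + reverse_complement(arm) + "cc"
    print(find_inverted_repeats(toy))
```

Raising `spacer_limit` towards 150 corresponds to the upper end of the range given in the text. A realistic search would also need to tolerate mismatches within the arms.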

A concrete bioinformatic example is the DUF606 family, whose members are known to exist as both paired and fused copies in bacterial genomes. [6] A DUF606 protein (accession ACL39356.1) from Arthrobacter chlorophenolicus A6 has a 5+5 TM structure and matches the Pfam DUF606 HMM twice, and thus appears to be duplicated. The corresponding genomic sequence (positions 1530600-1531700) contains a palindrome (cgtggcggcg and gcaccgccgc) between the two domain halves, although it may be too short, and its spacer too long, to initiate a new TID.
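The relationship between the two quoted 10-mers can be checked directly. The sketch below uses only the two strings given in the text; the variable names are illustrative. Because the text does not state the orientation in which the second arm is written, the check shows only how the strings relate as written.

```python
# Lookup table for complementing lowercase DNA bases.
COMPLEMENT = str.maketrans("acgt", "tgca")

left_arm = "cgtggcggcg"   # first 10-mer quoted in the text
right_arm = "gcaccgccgc"  # second 10-mer quoted in the text

complement = left_arm.translate(COMPLEMENT)
reverse_complement = complement[::-1]

# As written, the second string is the position-by-position complement
# of the first, so the two arms can base-pair along their full length.
pairs_base_by_base = complement == right_arm

# Read left to right, the second string is not the reverse complement;
# it is the reverse complement when read in the opposite direction.
is_reverse_complement = reverse_complement == right_arm
is_reverse_complement_when_reversed = reverse_complement == right_arm[::-1]

print(pairs_base_by_base,
      is_reverse_complement,
      is_reverse_complement_when_reversed)
```

The two arms are fully complementary over 10 bp, which sits at the lower boundary of the "longer than 10 bp" criterion mentioned earlier.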

References

  1. Bateman A, Coggill P, Finn RD (October 2010). "DUFs: families in search of function". Acta Crystallographica Section F. 66 (Pt 10): 1148–52. doi:10.1107/S1744309110001685. PMC 2954198. PMID 20944204.
  2. Västermark Å, Almén MS, Simmen MW, Fredriksson R, Schiöth HB (2011). "Functional specialization in nucleotide sugar transporters occurred through differentiation of the gene cluster EamA (DUF6) before the radiation of Viridiplantae". BMC Evol. Biol. 11: 123. doi:10.1186/1471-2148-11-123. PMC 3111387. PMID 21569384.
  3. Jack DL, Yang NM, Saier MH (July 2001). "The drug/metabolite transporter superfamily". Eur. J. Biochem. 268 (13): 3620–39. doi:10.1046/j.1432-1327.2001.02265.x. PMID 11432728.
  4. Leach DR, Lloyd RG, Coulson AF (1992). "The SbcCD protein of Escherichia coli is related to two putative nucleases in the UvrA superfamily of nucleotide-binding proteins". Genetica. 87 (2): 95–100. doi:10.1007/bf00120998. PMID 1490631. S2CID 27391960.
  5. Kugelberg E, Kofoid E, Andersson DI, Lu Y, Mellor J, Roth FP, Roth JR (May 2010). "The tandem inversion duplication in Salmonella enterica: selection drives unstable precursors to final mutation types". Genetics. 185 (1): 65–80. doi:10.1534/genetics.110.114074. PMC 2870977. PMID 20215473.
  6. Lolkema JS, Dobrowolski A, Slotboom DJ (May 2008). "Evolution of antiparallel two-domain membrane proteins: tracing multiple gene duplication events in the DUF606 family" (PDF). J. Mol. Biol. 378 (3): 596–606. doi:10.1016/j.jmb.2008.03.005. PMID 18384811.
This article incorporates text from the public domain Pfam and InterPro: IPR000620