Unique Ingredient Identifier

The Unique Ingredient Identifier (UNII) is an alphanumeric identifier linked to a substance's molecular structure or descriptive information. It is generated by the Global Substance Registration System (GSRS) of the Food and Drug Administration (FDA). The GSRS classifies substances as chemical, protein, nucleic acid, polymer, structurally diverse, or mixture [1] [2] according to the standards outlined by the International Organization for Standardization in ISO 11238 [3] and ISO/TS 19844. [4] UNIIs are non-proprietary, unique, unambiguous, and free to generate and use. [2] A UNII can be generated for a substance at any level of complexity; the scope is broad enough to include "any substance, from an atom to an organism." [1]

The GSRS is used to generate permanent, unique identifiers for substances in regulated products, such as ingredients in drug and biological products. The GSRS uses molecular structure, protein and nucleic acid sequences, and descriptive information to generate the UNII. The preferred means of defining a chemical substance is its two-dimensional molecular structure, since the structure is pertinent to a substance's identity and information about a substance's stereochemistry is readily available. [5] Nucleic acids are defined by their sequences and by any modifications that may be present. For proteins, only end-group modifications are uniquely identified, along with any other modifications that are essential for activity. This is because of the inherently heterogeneous nature of proteins. As a result, two different protein substances can share the same UNII and yet have no biosimilarity or therapeutic equivalence. [5] Polymers are defined by their structural repeating units and by physical properties such as molecular weight or properties related to molecular weight (e.g. viscosity). Structurally diverse materials are inherently heterogeneous preparations from natural materials, such as plant extracts and vaccines. [2]
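The relationship between substance classes and the information used to define them can be summarised in a small lookup table. The following Python sketch is illustrative only: the names SubstanceClass, DEFINING_INFORMATION and defining_information are hypothetical and are not part of the GSRS software, and the entries only restate the paragraph above. Classes whose defining information is not spelled out in this article fall through to a placeholder string.

```python
from enum import Enum


class SubstanceClass(Enum):
    """The six substance classes named in this article (hypothetical type)."""
    CHEMICAL = "chemical"
    PROTEIN = "protein"
    NUCLEIC_ACID = "nucleic acid"
    POLYMER = "polymer"
    STRUCTURALLY_DIVERSE = "structurally diverse"
    MIXTURE = "mixture"


# Defining information as described in the text above; this is a summary
# for illustration, not an extract from the GSRS data model.
DEFINING_INFORMATION = {
    SubstanceClass.CHEMICAL: "two-dimensional molecular structure",
    SubstanceClass.NUCLEIC_ACID: "sequence and any modifications present",
    SubstanceClass.PROTEIN: (
        "sequence, end-group modifications, and modifications "
        "essential for activity"
    ),
    SubstanceClass.POLYMER: (
        "structural repeating units and physical properties "
        "such as molecular weight"
    ),
}


def defining_information(substance_class: SubstanceClass) -> str:
    """Return the defining information for a class, or a placeholder."""
    return DEFINING_INFORMATION.get(
        substance_class, "not described in this article"
    )


if __name__ == "__main__":
    for cls in SubstanceClass:
        print(f"{cls.value}: {defining_information(cls)}")
```

Keeping the mapping separate from the enumeration makes it clear which classes this article actually describes and which it only names.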

The GSRS is a freely distributable software system provided through a collaboration between the FDA, the National Center for Advancing Translational Sciences (NCATS), and the European Medicines Agency (EMA). [1] The GSRS was developed to implement the ISO 11238 standard, which is one of the core ISO Identification of Medicinal Products (IDMP) standards. The GSRS Board, which governs the GSRS, includes experts from the FDA, European regulatory agencies, and the United States Pharmacopeia (USP). [1]

Examples

Preferred Term            UNII
Methadone hydrochloride   229809935B
Methadone                 UC6VBE7V1Z
Oxygen                    S88TT14065
Hydrogen                  7YNJ3PO35Z
Water                     059QF0KO0R
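
Each UNII listed above is a ten-character string of digits and uppercase letters. A minimal Python sketch of a format check based on that observation follows. The names UNII_PATTERN and looks_like_unii are hypothetical, the ten-character rule is inferred from the examples rather than taken from the ISO standards, and the check cannot tell whether a code has actually been assigned by the GSRS.

```python
import re

# Assumption: a UNII is exactly ten characters drawn from 0-9 and A-Z,
# as seen in the examples above. This checks shape only, not registration.
UNII_PATTERN = re.compile(r"[0-9A-Z]{10}")

EXAMPLES = {
    "Methadone hydrochloride": "229809935B",
    "Methadone": "UC6VBE7V1Z",
    "Oxygen": "S88TT14065",
    "Hydrogen": "7YNJ3PO35Z",
    "Water": "059QF0KO0R",
}


def looks_like_unii(code: str) -> bool:
    """Return True if the text has the ten-character alphanumeric shape."""
    # Surrounding whitespace is ignored and letters are upper-cased first.
    return UNII_PATTERN.fullmatch(code.strip().upper()) is not None


if __name__ == "__main__":
    for term, unii in EXAMPLES.items():
        print(f"{term}: {unii} -> {looks_like_unii(unii)}")
```

A check like this is useful for catching truncated or mistyped codes before looking them up, but a string that passes it is not necessarily a valid UNII.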

References

  1. "Substance Registration System - Unique Ingredient Identifier (UNII)". fda.gov.
  2. Peryea, Tyler; Southall, Noel; Miller, Mitch; Katzel, Daniel; Anderson, Niko; Neyra, Jorge; Stemann, Sarah; Nguyễn, Ðắc-Trung; Amugoda, Dammika; Newatia, Archana; Ghazzaoui, Ramez (2020-11-02). "Global Substance Registration System: consistent scientific descriptions for substances related to health". Nucleic Acids Research. doi:10.1093/nar/gkaa962. ISSN 1362-4962. PMID 33137173.
  3. "ISO 11238:2018". ISO. Retrieved 2020-11-25.
  4. "ISO/TS 19844:2018". ISO. Retrieved 2020-11-25.
  5. "Substance Definition Manual". fda.gov. June 10, 2007. Retrieved November 25, 2020.