Unique Ingredient Identifier

The Unique Ingredient Identifier (UNII) is an alphanumeric identifier linked to a substance's molecular structure or descriptive information. It is generated by the Global Substance Registration System (GSRS) of the Food and Drug Administration (FDA). Substances are classified as chemical, protein, nucleic acid, polymer, structurally diverse, or mixture [1] [2] according to the standards outlined by the International Organization for Standardization in ISO 11238 [3] and ISO/TS 19844. [4] UNIIs are non-proprietary, unique, unambiguous, and free to generate and use. [2] A UNII can be generated for a substance at any level of complexity; the system is broad enough to include "any substance, from an atom to an organism." [1]


The GSRS is used to generate permanent, unique identifiers for substances in regulated products, such as ingredients in drug and biological products. It uses molecular structure, protein and nucleic acid sequences, and descriptive information to generate the UNII. The preferred means of defining a chemical substance is its two-dimensional molecular structure, since the structure is pertinent to the substance's identity and information about its stereochemistry is readily available. [5] Nucleic acids are defined by their sequences and by any modifications that may be present. For proteins, only end-group modifications are uniquely identified, along with any other modifications that are essential for activity, because proteins are inherently heterogeneous. As a result, two different protein substances can share the same UNII and yet have no biosimilarity or therapeutic equivalence. [5] Polymers are defined by their structural repeating units and by physical properties such as molecular weight or properties related to molecular weight (e.g. viscosity). Structurally diverse materials are inherently heterogeneous preparations from natural materials, such as plant extracts and vaccines. [2]
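As a rough illustration of the paragraph above, the six substance classes and the information used to define each one could be written down as a small lookup structure. This is only a sketch for orientation: the names `SubstanceClass`, `DEFINING_INFORMATION` and `defining_information` are hypothetical and are not part of the GSRS software, the entry for structurally diverse materials is a reading of "descriptive information" rather than a quoted rule, and the entry for mixtures is left empty because the text does not say how mixtures are defined.

```python
from enum import Enum


class SubstanceClass(Enum):
    """The six substance classes named in the article."""
    CHEMICAL = "chemical"
    PROTEIN = "protein"
    NUCLEIC_ACID = "nucleic acid"
    POLYMER = "polymer"
    STRUCTURALLY_DIVERSE = "structurally diverse"
    MIXTURE = "mixture"


# Defining information per class, paraphrased from the article text.
# Hypothetical structure for illustration; not taken from the GSRS code base.
DEFINING_INFORMATION = {
    SubstanceClass.CHEMICAL: ("two-dimensional molecular structure",),
    SubstanceClass.PROTEIN: (
        "sequence",
        "end-group modifications",
        "modifications essential for activity",
    ),
    SubstanceClass.NUCLEIC_ACID: ("sequence", "modifications"),
    SubstanceClass.POLYMER: (
        "structural repeating units",
        "physical properties such as molecular weight",
    ),
    # Assumption: the article only says these are heterogeneous natural
    # preparations, so "descriptive information" is an inference.
    SubstanceClass.STRUCTURALLY_DIVERSE: ("descriptive information",),
    # The article does not describe how mixtures are defined.
    SubstanceClass.MIXTURE: (),
}


def defining_information(substance_class):
    """Return the tuple of defining information for a substance class."""
    return DEFINING_INFORMATION[substance_class]


if __name__ == "__main__":
    for cls in SubstanceClass:
        details = ", ".join(defining_information(cls)) or "(not stated in the article)"
        print(f"{cls.value}: {details}")
```

Keeping the protein entry limited to sequence and selected modifications mirrors the point made in the text: two protein substances that differ in other respects can still map to the same UNII.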

The GSRS is a freely distributable software system provided through a collaboration between the FDA, the National Center for Advancing Translational Sciences (NCATS) and the European Medicines Agency (EMA). [1] It was developed to implement the ISO 11238 standard, which is one of the core ISO Identification of Medicinal Products (IDMP) standards. The GSRS Board, which governs the GSRS, includes experts from the FDA, European regulatory agencies, and the United States Pharmacopeia (USP). [1]


Preferred Term            UNII
Methadone hydrochloride   229809935B
Methadone                 UC6VBE7V1Z
Oxygen                    S88TT14065
Hydrogen                  7YNJ3PO35Z
Water                     059QF0KO0R
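The example entries above can be held in a simple mapping from preferred term to UNII. The sketch below does that and adds a shape check; it assumes a UNII is ten characters drawn from the digits 0-9 and the uppercase letters A-Z, which matches every entry in the table but is not stated in the article. The names `EXAMPLE_UNIIS`, `looks_like_unii` and `lookup_unii` are hypothetical, and the check says nothing about whether a code is actually registered in the GSRS.

```python
import re

# Preferred term -> UNII, copied from the example table in the article.
EXAMPLE_UNIIS = {
    "Methadone hydrochloride": "229809935B",
    "Methadone": "UC6VBE7V1Z",
    "Oxygen": "S88TT14065",
    "Hydrogen": "7YNJ3PO35Z",
    "Water": "059QF0KO0R",
}

# Assumption: ten characters, digits or uppercase letters only.
UNII_SHAPE = re.compile(r"[0-9A-Z]{10}")


def looks_like_unii(code):
    """Return True if the string has the assumed UNII shape.

    This is a format check only; it does not confirm that the code
    has been assigned to any substance.
    """
    return UNII_SHAPE.fullmatch(code) is not None


def lookup_unii(preferred_term):
    """Case-insensitive lookup in the example table; None if absent."""
    wanted = preferred_term.casefold()
    for term, unii in EXAMPLE_UNIIS.items():
        if term.casefold() == wanted:
            return unii
    return None


if __name__ == "__main__":
    for term, unii in EXAMPLE_UNIIS.items():
        print(f"{term}: {unii} (shape ok: {looks_like_unii(unii)})")
```

Note that methadone and methadone hydrochloride carry different UNIIs in the table: the free base and its salt are registered as separate substances.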



References

  1. "Substance Registration System - Unique Ingredient Identifier (UNII)". fda.gov. August 30, 2019.
  2. Peryea, Tyler; Southall, Noel; Miller, Mitch; Katzel, Daniel; Anderson, Niko; Neyra, Jorge; Stemann, Sarah; Nguyễn, Ðắc-Trung; Amugoda, Dammika; Newatia, Archana; Ghazzaoui, Ramez; Johanson, Elaine; Diederik, Herman; Callahan, Larry; Switzer, Frank (November 10, 2020). "Global Substance Registration System: consistent scientific descriptions for substances related to health". Nucleic Acids Research. 49 (D1): D1179–D1185. doi:10.1093/nar/gkaa962. ISSN 1362-4962. PMC 7779023. PMID 33137173.
  3. "ISO 11238:2018". ISO. October 20, 2017. Retrieved November 25, 2020.
  4. "ISO/TS 19844:2018". ISO. Retrieved November 25, 2020.
  5. "Substance Definition Manual". fda.gov. June 10, 2007. Retrieved November 25, 2020.