A CAS Registry Number [1] (also referred to as CAS RN [2] or informally CAS Number) is a unique identification number, assigned by the Chemical Abstracts Service (CAS) in the US to every chemical substance described in the open scientific literature, in order to index the substance in the CAS Registry. This registry includes all substances described since 1957, plus some substances from as far back as the early 1800s; [3] it is a chemical database that includes organic and inorganic compounds, minerals, isotopes, alloys, mixtures, and nonstructurable materials (UVCBs, substances of unknown or variable composition, complex reaction products, or biological origin). [4] CAS RNs are generally serial numbers (with a check digit), so they do not contain any information about the structures themselves the way SMILES and InChI strings do.
The CAS Registry is an authoritative collection of disclosed chemical substance information. It identifies more than 204 million unique organic and inorganic substances and 69 million protein and DNA sequences, [3] and holds additional information about each substance. It is updated with around 15,000 new substances daily. [5] A collection of almost 500 thousand CAS Registry Numbers is made available under a CC BY-NC license at CAS Common Chemistry. [6]
Historically, chemicals have been identified by a wide variety of synonyms. One of the biggest challenges in the early development of substance indexing, a task undertaken by the Chemical Abstracts Service, was determining whether a substance reported in the literature was new or had been described previously. Well-known chemicals may additionally be known by multiple generic, historical, commercial, and/or black-market names, and even systematic nomenclature based on structure alone was not universally useful. An algorithm was developed to translate the structural formula of a chemical into a computer-searchable table. This provided the basis for the CAS Chemical Registry System, the service that lists each chemical with its CAS Registry Number, which became operational in 1965. [7]
CAS Registry Numbers (CAS RN) are simple and regular, convenient for database searches. They offer a reliable, common and international link to every specific substance across the various nomenclatures and disciplines used by branches of science, industry, and regulatory bodies. Almost all molecule databases today allow searching by CAS Registry Number, and it is used as a global standard. [8]
A CAS Registry Number has no inherent meaning, but is assigned in sequential, increasing order when the substance is identified by CAS scientists for inclusion in the CAS Registry database.
A CAS RN is separated by hyphens into three parts: the first consists of two to seven digits, [9] the second of two digits, and the third of a single digit serving as a check digit. With up to nine digits preceding the check digit, this format gives CAS a maximum capacity of 1,000,000,000 unique numbers.
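The three-part layout described above can be checked mechanically. The following is a minimal Python sketch, not an official CAS tool; the names `CAS_RN_PATTERN` and `split_cas_rn` are hypothetical and chosen only for illustration. It tests the layout alone and does not verify the check digit.

```python
import re

# Layout only: two to seven digits, two digits, one digit, joined by hyphens.
# [0-9] is used instead of \d so that only ASCII digits are accepted.
CAS_RN_PATTERN = re.compile(r"([0-9]{2,7})-([0-9]{2})-([0-9])")


def split_cas_rn(text):
    """Return the three parts of a CAS RN as strings, or None if the layout is wrong."""
    match = CAS_RN_PATTERN.fullmatch(text.strip())
    if match is None:
        return None
    return match.groups()


print(split_cas_rn("7732-18-5"))  # ('7732', '18', '5')
print(split_cas_rn("7732185"))    # None: hyphens are missing
```

Using `fullmatch` rather than `match` rejects strings with trailing characters after the final digit.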
The check digit is found by multiplying the last digit before the check digit by 1, the preceding digit by 2, the one before that by 3, and so on, then adding the products and taking the sum modulo 10. For example, the CAS number of water is 7732-18-5: the check digit 5 is calculated as (8×1 + 1×2 + 2×3 + 3×4 + 7×5 + 7×6) = 105; 105 mod 10 = 5.
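The weighted-sum rule above can be expressed in a few lines of Python. This is a minimal sketch under the assumption that the input is a plain string; the function names `cas_check_digit` and `is_valid_cas_rn` are hypothetical, introduced here for illustration only.

```python
def cas_check_digit(body_digits: str) -> int:
    """Compute the check digit for the digits that precede it (hyphens removed)."""
    # Weights count up from 1, starting at the rightmost digit.
    total = sum(
        weight * int(digit)
        for weight, digit in enumerate(reversed(body_digits), start=1)
    )
    return total % 10


def is_valid_cas_rn(cas_rn: str) -> bool:
    """Return True if the string has the three-part layout and a matching check digit."""
    parts = cas_rn.split("-")
    if len(parts) != 3:
        return False
    first, second, check = parts
    # Reject empty parts and anything that is not an ASCII digit string.
    if not all(part.isascii() and part.isdigit() for part in parts):
        return False
    if not (2 <= len(first) <= 7 and len(second) == 2 and len(check) == 1):
        return False
    return cas_check_digit(first + second) == int(check)


print(cas_check_digit("773218"))     # 5, matching the water example
print(is_valid_cas_rn("7732-18-5"))  # True
print(is_valid_cas_rn("7732-18-4"))  # False: wrong check digit
```

A passing check only shows that the number is well formed; it does not show that the number has actually been assigned to a substance.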
The American Chemical Society (ACS) is a scientific society based in the United States that supports scientific inquiry in the field of chemistry. Founded in 1876 at New York University, the ACS currently has more than 155,000 members at all degree levels and in all fields of chemistry, chemical engineering, and related fields. It is one of the world's largest scientific societies by membership. The ACS is a 501(c)(3) non-profit organization and holds a congressional charter under Title 36 of the United States Code. Its headquarters are located in Washington, D.C., and it has a large concentration of staff in Columbus, Ohio.
The Merck Index is an encyclopedia of chemicals, drugs and biologicals with over 10,000 monographs on single substances or groups of related compounds published online by the Royal Society of Chemistry.
A chemical database is a database specifically designed to store chemical information. This information is about chemical and crystal structures, spectra, reactions and syntheses, and thermophysical data.
Chemical Abstracts Service (CAS) is a division of the American Chemical Society. It is a source of chemical information and is located in Columbus, Ohio, United States.
Tautomers are structural isomers of chemical compounds that readily interconvert. The chemical reaction interconverting the two is called tautomerization. This conversion commonly results from the relocation of a hydrogen atom within the compound. The phenomenon of tautomerization is called tautomerism, or desmotropism. Tautomerism is relevant, for example, to the behavior of amino acids and nucleic acids, two of the fundamental building blocks of life.
Mercury(II) oxide, also called mercuric oxide or simply mercury oxide, is the inorganic compound with the formula HgO. It has a red or orange color. Mercury(II) oxide is a solid at room temperature and pressure. The mineral form montroydite is very rarely found.
A chemical file format is a type of data file used specifically for depicting molecular data. One of the most widely used is the chemical table file format, which is similar to Structure Data Format (SDF) files; these are text files that represent multiple chemical structure records and associated data fields. The XYZ file format is a simple format that usually gives the number of atoms in the first line, a comment on the second, followed by a number of lines with atomic symbols and Cartesian coordinates. The Protein Data Bank Format is commonly used for proteins but is also used for other types of molecules. There are many other formats, and various software systems are available to convert from one to another.
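As an illustration of the XYZ layout described above (atom count, comment line, then one line per atom), the following Python sketch reads such a file from a string. The function name `parse_xyz` is hypothetical, and the water coordinates are approximate values included only as sample input. Real XYZ files vary between programs, so this covers only the simple case.

```python
def parse_xyz(text):
    """Parse simple XYZ text into (comment, [(symbol, x, y, z), ...])."""
    lines = text.strip().splitlines()
    atom_count = int(lines[0].split()[0])      # first line: number of atoms
    comment = lines[1] if len(lines) > 1 else ""  # second line: free-form comment
    atoms = []
    for line in lines[2:2 + atom_count]:
        # Each atom line: element symbol followed by x, y, z coordinates.
        symbol, x, y, z = line.split()[:4]
        atoms.append((symbol, float(x), float(y), float(z)))
    if len(atoms) != atom_count:
        raise ValueError("fewer atom lines than the declared atom count")
    return comment, atoms


# Illustrative input only; the coordinates are approximate.
SAMPLE_XYZ = """3
water (illustrative coordinates)
O 0.000 0.000 0.117
H 0.000 0.757 -0.471
H 0.000 -0.757 -0.471
"""

comment, atoms = parse_xyz(SAMPLE_XYZ)
print(comment)
print(len(atoms))
```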
Chemical space is a concept in cheminformatics referring to the property space spanned by all possible molecules and chemical compounds adhering to a given set of construction principles and boundary conditions. It contains millions of compounds which are readily accessible and available to researchers. It is a library used in the method of molecular docking.
The International Chemical Identifier is a textual identifier for chemical substances, designed to provide a standard way to encode molecular information and to facilitate the search for such information in databases and on the web. Initially developed by the International Union of Pure and Applied Chemistry (IUPAC) and National Institute of Standards and Technology (NIST) from 2000 to 2005, the format and algorithms are non-proprietary. Since May 2009, it has been developed by the InChI Trust, a nonprofit charity from the United Kingdom which works to implement and promote the use of InChI.
PubChem is a database of chemical molecules and their activities against biological assays. The system is maintained by the National Center for Biotechnology Information (NCBI), a component of the National Library of Medicine, which is part of the United States National Institutes of Health (NIH). PubChem can be accessed for free through a web user interface. Millions of compound structures and descriptive datasets can be freely downloaded via FTP. PubChem contains multiple substance descriptions and small molecules with fewer than 100 atoms and 1,000 bonds. More than 80 database vendors contribute to the growing PubChem database.
The European Community number is a unique seven-digit identifier that was assigned to substances for regulatory purposes within the European Union by the European Commission. The EC Inventory comprises three individual inventories, EINECS, ELINCS and the NLP list.
Substructure search (SSS) is a method to retrieve from a database only those chemicals matching a pattern of atoms and bonds which a user specifies. It is an application of graph theory, specifically subgraph matching in which the query is a hydrogen-depleted molecular graph. The mathematical foundations for the method were laid in the 1870s, when it was suggested that chemical structure drawings were equivalent to graphs with atoms as vertices and bonds as edges. SSS is now a standard part of cheminformatics and is widely used by pharmaceutical chemists in drug discovery.
A chemical substance is a unique form of matter with constant chemical composition and characteristic properties. Chemical substances may take the form of a single element or chemical compounds. If two or more chemical substances can be combined without reacting, they may form a chemical mixture. If a mixture is separated to isolate one chemical substance to a desired degree, the resulting substance is said to be chemically pure.
ChemSpider is a freely accessible online database of chemicals owned by the Royal Society of Chemistry. It contains information on more than 100 million molecules from over 270 data sources, each of them receiving a unique identifier called ChemSpider Identifier.
Dichlorodiphenyldichloroethane (DDD) is an organochlorine insecticide that is slightly irritating to the skin. DDD is a metabolite of DDT. DDD is colorless and crystalline; it is closely related chemically and is similar in properties to DDT, but it is considered to be less toxic to animals than DDT. The molecular formula for DDD is (ClC6H4)2CHCHCl2 or C14H10Cl4, whereas the formula for DDT is (ClC6H4)2CHCCl3 or C14H9Cl5.
The Hazardous Substances Data Bank (HSDB) was a toxicology database on the U.S. National Library of Medicine's (NLM) Toxicology Data Network (TOXNET). It focused on the toxicology of potentially hazardous chemicals, and included information on human exposure, industrial hygiene, emergency handling procedures, environmental fate, regulatory requirements, and related areas. All data were referenced and derived from a core set of books, government documents, technical reports, and selected primary journal literature. Prior to 2020, all entries were peer-reviewed by a Scientific Review Panel (SRP), members of which represented a spectrum of professions and interests. The last chairs of the SRP were Dr. Marcel J. Cassavant, MD, Toxicology Group, and Dr. Roland Everett Langford, PhD, Environmental Fate Group. The SRP was terminated due to budget cuts and realignment of the NLM.
The CompTox Chemicals Dashboard is a freely accessible online database created and maintained by the U.S. Environmental Protection Agency (EPA). The database provides access to multiple types of data including physicochemical properties, environmental fate and transport, exposure, usage, in vivo toxicity, and in vitro bioassay. EPA and other scientists use the data and models contained within the dashboard to help identify chemicals that require further testing and reduce the use of animals in chemical testing. The Dashboard is also used to provide public access to information from EPA Action Plans, e.g. around perfluorinated alkylated substances.
Poly(ethyl methacrylate) (PEMA) is a hydrophobic synthetic acrylate polymer. It has properties similar to the more common PMMA; however, it produces less heat during polymerization, has a lower modulus of elasticity, and has an overall softer texture. It may be vulcanized using lead oxide as a catalyst and it can be softened using ethanol.
To find the CAS number of a compound given its name, formula or structure, the following free resources can be used: