A CAS Registry Number [1] (also referred to as CAS RN [2] or informally CAS Number) is a unique identification number assigned by the Chemical Abstracts Service (CAS) in the US to every chemical substance described in the open scientific literature. The registry covers all substances described from 1957 through the present, plus some substances from as far back as the early 1800s. [3] It is a chemical database that includes organic and inorganic compounds, minerals, isotopes, alloys, mixtures, and nonstructurable materials (UVCBs: substances of unknown or variable composition, complex reaction products, or biological origin). [4] CAS RNs are generally serial numbers (with a check digit), so they do not contain any information about the structures themselves the way SMILES and InChI strings do.
The registry maintained by CAS is an authoritative collection of disclosed chemical substance information. It identifies more than 182 million unique organic and inorganic substances and 68 million protein and DNA sequences, [3] plus additional information about each substance. It is updated with around 15,000 new substances daily. [5] A collection of almost 500 thousand CAS Registry Numbers is made available under a CC BY-NC license at CAS Common Chemistry. [6]
Historically, chemicals have been identified by a wide variety of synonyms. Frequently these are arcane and constructed according to regional naming conventions relating to chemical formulae, structures or origins. Well-known chemicals may additionally be known via multiple generic, historical, commercial, and/or (black)-market names.
CAS Registry Numbers (CAS RNs) are simple and regular, which makes them convenient for database searches. They offer a reliable, common and international link to every specific substance across the various nomenclatures and disciplines used by branches of science, industry, and regulatory bodies. Almost all molecule databases today allow searching by CAS Registry Number.
A CAS Registry Number has no inherent meaning, but is assigned in sequential, increasing order when the substance is identified by CAS scientists for inclusion in the CAS REGISTRY database.
A CAS RN is separated by hyphens into three parts: the first consists of two to seven digits, [7] the second of two digits, and the third of a single digit serving as a check digit. Because the first two parts together hold at most nine digits, this format gives CAS a maximum capacity of 1,000,000,000 unique numbers.
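The three-part layout described above can be checked mechanically. The following is a minimal sketch in Python, not an official CAS tool; the names `CAS_RN_PATTERN` and `split_cas_rn` are hypothetical and chosen only for illustration. It tests the layout alone and does not verify the check digit.

```python
import re

# Layout as described in the text: 2-7 digits, then 2 digits, then 1 check digit,
# separated by hyphens. (Hypothetical helper names, for illustration only.)
CAS_RN_PATTERN = re.compile(r"([0-9]{2,7})-([0-9]{2})-([0-9])")


def split_cas_rn(text):
    """Return the three parts as strings, or None if the layout does not match."""
    match = CAS_RN_PATTERN.fullmatch(text.strip())
    return match.groups() if match else None


if __name__ == "__main__":
    print(split_cas_rn("7732-18-5"))  # water, the example used in this article
    print(split_cas_rn("7732-185"))   # wrong layout
```

Using `fullmatch` rather than `search` means that extra characters before or after the number cause the string to be rejected.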
The check digit is found by taking the last digit before the check digit times 1, the digit before that times 2, the one before that times 3, and so on, adding these products and taking the sum modulo 10. For example, the CAS number of water is 7732-18-5: the check digit 5 is calculated as (8×1 + 1×2 + 2×3 + 3×4 + 7×5 + 7×6) = 105; 105 mod 10 = 5.
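The check-digit rule above translates directly into a few lines of code. The sketch below is an illustrative Python version under the rule as stated in this article; the function names `cas_check_digit` and `is_valid_cas_rn` are assumptions made for the example, not part of any CAS software.

```python
def cas_check_digit(body_digits):
    """Compute the check digit for the digits that precede it (hyphens removed).

    The rightmost digit is weighted 1, the next one to the left 2, and so on;
    the check digit is the weighted sum modulo 10.
    """
    total = sum(
        weight * int(digit)
        for weight, digit in enumerate(reversed(body_digits), start=1)
    )
    return total % 10


def is_valid_cas_rn(cas_rn):
    """Return True if the string has three numeric parts and a matching check digit."""
    parts = cas_rn.split("-")
    if len(parts) != 3 or not all(p.isascii() and p.isdigit() for p in parts):
        return False
    first, second, check = parts
    if len(check) != 1:
        return False
    return cas_check_digit(first + second) == int(check)


if __name__ == "__main__":
    print(cas_check_digit("773218"))     # 5, matching the water example
    print(is_valid_cas_rn("7732-18-5"))  # True
    print(is_valid_cas_rn("7732-18-6"))  # False
```

A matching check digit only shows that the number is internally consistent; it does not show that the number has actually been assigned to a substance.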
To find the CAS number of a compound given its name, formula or structure, the following free resources can be used:
The United States National Library of Medicine (NLM), operated by the United States federal government, is the world's largest medical library.
A chemical database is a database specifically designed to store chemical information. This information is about chemical and crystal structures, spectra, reactions and syntheses, and thermophysical data.
CAS is a division of the American Chemical Society. It is a source of chemical information. CAS is located in Columbus, Ohio, United States.
Mercury(II) oxide, also called mercuric oxide or simply mercury oxide, is the inorganic compound with the formula HgO. It has a red or orange color. Mercury(II) oxide is a solid at room temperature and pressure. The mineral form montroydite is very rarely found.
A chemical file format is a type of data file which is used specifically for depicting molecular data. One of the most widely used is the chemical table file format, which is similar to Structure Data Format (SDF) files. They are text files that represent multiple chemical structure records and associated data fields. The XYZ file format is a simple format that usually gives the number of atoms in the first line, a comment on the second, followed by a number of lines with atomic symbols and Cartesian coordinates. The Protein Data Bank Format is commonly used for proteins but is also used for other types of molecules. There are many other types which are detailed below. Various software systems are available to convert from one format to another.
Registry of Toxic Effects of Chemical Substances (RTECS) is a database of toxicity information compiled from the open scientific literature without reference to the validity or usefulness of the studies reported. Until 2001 it was maintained by the US National Institute for Occupational Safety and Health (NIOSH) as a freely available publication. It is now maintained by the private company BIOVIA and is available from BIOVIA or from several value-added resellers, only for a fee or by subscription.
The International Chemical Identifier is a textual identifier for chemical substances, designed to provide a standard way to encode molecular information and to facilitate the search for such information in databases and on the web. Initially developed by the International Union of Pure and Applied Chemistry (IUPAC) and National Institute of Standards and Technology (NIST) from 2000 to 2005, the format and algorithms are non-proprietary. Since May 2009, it has been developed by the InChI Trust, a nonprofit charity from the United Kingdom which works to implement and promote the use of InChI.
Ammonium hydrosulfide is the chemical compound with the formula [NH4]SH.
The Beilstein database is the largest database in the field of organic chemistry, in which compounds are uniquely identified by their Beilstein Registry Number. The database covers the scientific literature from 1771 to the present and contains experimentally validated information on millions of chemical reactions and substances from original scientific publications. The electronic database was created from Handbuch der Organischen Chemie, founded by Friedrich Konrad Beilstein in 1881, but has appeared online under a number of different names, including Crossfire Beilstein. Since 2009, the content has been maintained and distributed by Elsevier Information Systems in Frankfurt under the product name "Reaxys".
PubChem is a database of chemical molecules and their activities against biological assays. The system is maintained by the National Center for Biotechnology Information (NCBI), a component of the National Library of Medicine, which is part of the United States National Institutes of Health (NIH). PubChem can be accessed for free through a web user interface. Millions of compound structures and descriptive datasets can be freely downloaded via FTP. PubChem contains multiple substance descriptions and small molecules with fewer than 100 atoms and 1,000 bonds. More than 80 database vendors contribute to the growing PubChem database.
The European Community number is a unique seven-digit identifier that was assigned to substances for regulatory purposes within the European Union by the European Commission. The EC Inventory comprises three individual inventories, EINECS, ELINCS and the NLP list.
ChemSpider is a database of chemicals. ChemSpider is owned by the Royal Society of Chemistry.
Dichlorodiphenyldichloroethane (DDD) is an organochlorine insecticide that is slightly irritating to the skin. DDD is a metabolite of DDT. DDD is colorless and crystalline; it is closely related chemically and is similar in properties to DDT, but it is considered to be less toxic to animals than DDT. The molecular formula for DDD is (ClC6H4)2CHCHCl2 or C14H10Cl4, whereas the formula for DDT is (ClC6H4)2CHCCl3 or C14H9Cl5.
The Hazardous Substances Data Bank (HSDB) is a toxicology database on the U.S. National Library of Medicine's (NLM) Toxicology Data Network (TOXNET). It focuses on the toxicology of potentially hazardous chemicals, and includes information on human exposure, industrial hygiene, emergency handling procedures, environmental fate, regulatory requirements, and related areas. All data are referenced and derived from a core set of books, government documents, technical reports, and selected primary journal literature. Prior to 2020, all entries were peer-reviewed by a Scientific Review Panel (SRP), members of which represented a spectrum of professions and interests. The last chairs of the SRP were Dr. Marcel J. Cassavant, MD, Toxicology Group, and Dr. Roland Everett Langford, PhD, Environmental Fate Group. The SRP was terminated due to budget cuts and realignment of the NLM.
Reaxys is a web-based tool for the retrieval of chemistry information and data from published literature, including journals and patents. The information includes chemical compounds, chemical reactions, chemical properties, related bibliographic data, substance data with synthesis planning information, as well as experimental procedures from selected journals and patents. It is licensed by Elsevier.
The CompTox Chemicals Dashboard is a freely accessible online database created and maintained by the U.S. Environmental Protection Agency (EPA). The database provides access to multiple types of data including physicochemical properties, environmental fate and transport, exposure, usage, in vivo toxicity, and in vitro bioassay. EPA and other scientists use the data and models contained within the dashboard to help identify chemicals that require further testing and reduce the use of animals in chemical testing. The Dashboard is also used to provide public access to information from EPA Action Plans, e.g. around perfluorinated alkylated substances.
Poly(ethyl methacrylate) (PEMA) is a hydrophobic synthetic acrylate polymer. It has properties similar to the more common PMMA, but it produces less heat during polymerization and has a lower modulus of elasticity and an overall softer texture. It may be vulcanized using lead oxide as a catalyst and it can be softened using ethanol.
Erbium phosphide is a binary inorganic compound of erbium and phosphorus with the chemical formula ErP.