Molecular descriptor

Last updated September 25, 2025

Molecular descriptors play a fundamental role in chemistry, pharmaceutical sciences, environmental protection policy, health research, and quality control. They are the means by which molecules, thought of as real bodies, are transformed into numbers, allowing mathematical treatment of the chemical information contained in the molecule. The molecular descriptor was defined by Todeschini and Consonni as:^[1]

"The molecular descriptor is the final result of a logic and mathematical procedure which transforms chemical information encoded within a symbolic representation of a molecule into a useful number or the result of some standardized experiment."

By this definition, the molecular descriptors are divided into two main categories: experimental measurements, such as log P, molar refractivity, dipole moment, polarizability, and, in general, additive physico-chemical properties, and theoretical molecular descriptors, which are derived from a symbolic representation of the molecule and can be further classified according to the different types of molecular representation.^[2]

The main classes of theoretical molecular descriptors are: 1) 0D-descriptors (i.e. constitutional descriptors, count descriptors); 2) 1D-descriptors (i.e. lists of structural fragments, fingerprints); 3) 2D-descriptors (i.e. graph invariants); 4) 3D-descriptors (e.g. 3D-MoRSE descriptors, WHIM descriptors, GETAWAY descriptors, quantum-chemical descriptors, and size, steric, surface and volume descriptors); 5) 4D-descriptors (e.g. those derived from GRID or CoMFA methods, Volsurf). The spread of artificial intelligence and machine learning into computational chemistry has also led to various attempts to discover new descriptors or to identify the most predictive ones among a set of candidates.^[3]^[4]
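
As an illustration of the simplest class, 0D-descriptors such as atom counts and molecular weight can be derived from the molecular formula alone. The following is a minimal sketch; the function names and the small atomic-mass table are assumptions made for this example and are not taken from any descriptor package.

```python
import re
from collections import Counter

# Rounded standard atomic masses for a few elements (illustrative subset).
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def atom_counts(formula: str) -> Counter:
    """Count atoms in a simple formula such as 'C2H6O'.

    Assumes no brackets, charges, or isotopes in the formula.
    """
    counts = Counter()
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    return counts


def molecular_weight(formula: str) -> float:
    """Sum atomic masses over all atoms in the formula."""
    return sum(ATOMIC_MASS[el] * n for el, n in atom_counts(formula).items())


counts = atom_counts("C2H6O")      # constitutional (count) descriptors
weight = molecular_weight("C2H6O") # about 46.07 for C2H6O
print(dict(counts), round(weight, 3))
```

Because only the formula is used, ethanol and dimethyl ether (both C2H6O) receive identical values, which is typical of 0D-descriptors.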

Invariance properties of molecular descriptors

The invariance properties of molecular descriptors can be defined as the ability of the algorithm used to calculate them to give a descriptor value that is independent of the particular characteristics of the molecular representation, such as atom numbering or labeling, spatial reference frame, or molecular conformation. Invariance to atom numbering or labeling is regarded as a minimal requirement for any descriptor.^[citation needed]
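
Labeling invariance can be checked directly on a small example. The sketch below computes the Wiener index (the sum of shortest-path distances over all pairs of atoms, a 2D graph invariant) for the heavy-atom graph of isobutane before and after renumbering the atoms. The adjacency-list representation and the helper names are assumptions chosen for illustration.

```python
from collections import deque


def wiener_index(adjacency: dict) -> int:
    """Sum of shortest-path lengths over all unordered atom pairs."""
    total = 0
    for source in adjacency:
        # Breadth-first search gives topological distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            atom = queue.popleft()
            for neighbour in adjacency[atom]:
                if neighbour not in dist:
                    dist[neighbour] = dist[atom] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
    return total // 2  # each pair was counted from both ends


def relabel(adjacency: dict, mapping: dict) -> dict:
    """Return the same graph with atoms renumbered according to `mapping`."""
    return {
        mapping[atom]: [mapping[nb] for nb in neighbours]
        for atom, neighbours in adjacency.items()
    }


# Heavy-atom graph of isobutane: central carbon 0 bonded to carbons 1, 2, 3.
isobutane = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
renumbered = relabel(isobutane, {0: 3, 1: 0, 2: 1, 3: 2})

print(wiener_index(isobutane), wiener_index(renumbered))  # 9 9
```

The value is the same for both numberings because it depends only on the connectivity of the graph, not on the labels attached to the atoms.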

Two other important invariance properties, translational invariance and rotational invariance, refer to the invariance of a descriptor value to any translation or rotation of the molecule in the chosen reference frame. These two properties are required for 3D-descriptors.^[citation needed]
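
A simple geometric quantity illustrates roto-translational invariance. The sketch below uses the unweighted radius of gyration as a stand-in for a 3D-descriptor and compares its value before and after a rigid motion. The water coordinates are approximate and purely illustrative, and the function names are assumptions for this example.

```python
import math


def radius_of_gyration(coords):
    """Root-mean-square distance of the atoms from their centroid (unweighted)."""
    n = len(coords)
    centroid = [sum(c[i] for c in coords) / n for i in range(3)]
    return math.sqrt(sum(math.dist(c, centroid) ** 2 for c in coords) / n)


def rotate_z_and_translate(coords, angle, shift):
    """Rotate by `angle` (radians) about the z axis, then translate by `shift`."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return [
        (
            cos_a * x - sin_a * y + shift[0],
            sin_a * x + cos_a * y + shift[1],
            z + shift[2],
        )
        for x, y, z in coords
    ]


# Approximate coordinates of a water molecule in angstroms (illustrative only).
water = [(0.000, 0.000, 0.117), (0.000, 0.757, -0.471), (0.000, -0.757, -0.471)]
moved = rotate_z_and_translate(water, math.radians(40.0), (1.5, -2.0, 0.7))

rg_original = radius_of_gyration(water)
rg_moved = radius_of_gyration(moved)
print(abs(rg_original - rg_moved) < 1e-9)  # True: the descriptor is unchanged
```

A raw coordinate such as the x value of the first atom changes under the same motion, which is why coordinates themselves are not used as descriptors.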

Degeneracy of molecular descriptors

This property refers to the ability of a descriptor to avoid equal values for different molecules. In this sense, descriptors can show no, low, intermediate, or high degeneracy. For example, the number of atoms in a molecule and the molecular weight are high-degeneracy descriptors, while 3D-descriptors usually show low or no degeneracy.^[citation needed]
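
The difference between a high-degeneracy and a lower-degeneracy descriptor can be seen on a handful of alkane isomers. The sketch below compares the heavy-atom count with the first Zagreb index (the sum of squared vertex degrees). The `degenerate_fraction` measure, defined here as the share of molecules whose value coincides with that of another molecule, is an ad hoc choice for this example rather than a standard definition.

```python
from collections import Counter

# Heavy-atom (hydrogen-suppressed) graphs as adjacency lists.
molecules = {
    "n-butane": {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]},
    "isobutane": {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]},
    "n-pentane": {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]},
    "isopentane": {0: [1], 1: [0, 2, 4], 2: [1, 3], 3: [2], 4: [1]},
    "neopentane": {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]},
}


def atom_count(graph):
    """0D-descriptor: number of heavy atoms."""
    return len(graph)


def first_zagreb_index(graph):
    """2D-descriptor: sum of squared vertex degrees."""
    return sum(len(neighbours) ** 2 for neighbours in graph.values())


def degenerate_fraction(values):
    """Share of molecules whose descriptor value is shared with another one."""
    counts = Counter(values)
    return sum(1 for v in values if counts[v] > 1) / len(values)


count_values = [atom_count(g) for g in molecules.values()]
zagreb_values = [first_zagreb_index(g) for g in molecules.values()]

print(count_values, degenerate_fraction(count_values))    # [4, 4, 5, 5, 5] 1.0
print(zagreb_values, degenerate_fraction(zagreb_values))  # [10, 12, 14, 16, 20] 0.0
```

On this small set the atom count cannot separate any isomer from another, while the first Zagreb index separates all five. On larger sets the Zagreb index also becomes degenerate, so the result should not be generalized.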

Criteria for molecular descriptors

Molecular descriptors are numerical values that encapsulate chemical information about molecules, facilitating their mathematical analysis. Given the vast array of available descriptors, it's essential to establish foundational principles to ensure their reliability and utility. A robust molecular descriptor should:^[5]^[6]

Be invariant to atom labeling and numbering
Be invariant to roto-translation of the molecule
Be defined by an unambiguous algorithm
Have a well-defined applicability on molecular structures

Beyond these foundational criteria, to be practically valuable, a molecular descriptor should also:

Have a structural interpretation
Correlate well with at least one experimental property
Not be trivially related to other molecular descriptors
Not be based on experimental properties
Preferably be continuous
Preferably show minimal degeneracy
Preferably be simple
Preferably be applicable to a broad class of molecules
Preferably be able to discriminate among isomers
Preferably have calculated values in a suitable numerical range for the set of molecules to which it applies

The initial set of principles ensures that a descriptor is well-defined and invariant to manipulations that don't alter the intrinsic molecular structure. Historically, many descriptors were designed for small organic molecules. However, contemporary challenges necessitate descriptors that can be applied to diverse compounds, including salts, ionic liquids, peptides, polymers, and nanostructures.

The subsequent set of guidelines emphasizes the descriptor's practical utility. An effective descriptor should be interpretable, correlate with experimental properties, and provide unique information not captured by other descriptors. Continuity and low degeneracy are crucial, as they ensure the descriptor can sensitively reflect minor structural variations. Ultimately, the information a descriptor provides is contingent upon the chosen molecular representation and its alignment with the specific property or activity being studied.^[2]
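
One way to screen for a trivial relation between descriptors is to compute their pairwise correlation over a set of molecules. The sketch below does this for the linear alkanes from methane to pentane. The choice of Pearson correlation and of this data set are assumptions for illustration; the molecular weights follow from the formula CnH2n+2, and the Wiener index of a chain of n atoms uses the closed form n(n² − 1)/6.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


carbons = [1, 2, 3, 4, 5]  # methane .. n-pentane

# Molecular weight of CnH(2n+2) with rounded atomic masses.
weights = [12.011 * n + 1.008 * (2 * n + 2) for n in carbons]

# Wiener index of a linear chain of n heavy atoms: n(n^2 - 1)/6.
wiener = [n * (n * n - 1) // 6 for n in carbons]

r_weight = pearson(carbons, weights)  # essentially 1: redundant pair
r_wiener = pearson(carbons, wiener)   # below 1: carries some extra information
print(round(r_weight, 6), round(r_wiener, 3))
```

Within this homologous series the carbon count and the molecular weight are linearly related, so keeping both adds no information, whereas the Wiener index grows non-linearly with chain length.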

Software for molecular descriptors calculation

The following is a selection of commercial and free descriptor calculation tools.

Name	0D descriptors	Fingerprints	3D descriptors	Python library	CLI	GUI	KNIME	Comments	License	Website
alvaDesc ^[7]^[8]	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Available for Windows, Linux and macOS. Last update 2025.	Proprietary, commercial	https://www.alvascience.com/alvadesc/
Dragon^[9]	Yes	Yes	Yes	No	Yes	Yes	Yes	Discontinued.	Proprietary, commercial	https://chm.kode-solutions.net/products_dragon.php
Mordred^[10]	Yes	No	Yes	Yes	Yes	No	No	Based on RDKit. Official version discontinued (last update 2019), but has a community-maintained fork.	Free open source	https://github.com/mordred-descriptor/mordred, https://github.com/JacksonBurns/mordred-community
PaDEL-descriptor^[11]	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Based on CDK. Discontinued (last update 2014).	Free	http://www.yapcwsoft.com/dd/padeldescriptor/
RDKit	Yes	Yes	Yes	Yes	No	No	Yes	Last update 2024	Free open source	https://github.com/rdkit/rdkit
scikit-fingerprints^[12]	Yes	Yes	Yes	Yes	No	No	No	Last update 2025	Free open source	https://github.com/scikit-fingerprints/scikit-fingerprints

References

1. Todeschini, Roberto; Consonni, Viviana (2000). Handbook of Molecular Descriptors. Methods and Principles in Medicinal Chemistry. Wiley. doi:10.1002/9783527613106. ISBN 978-3-527-29913-3.
2. Mauri, Andrea; Consonni, Viviana; Todeschini, Roberto (2017). "Molecular Descriptors". Handbook of Computational Chemistry. Springer International Publishing. pp. 2065–2093. doi:10.1007/978-3-319-27282-5_51.
3. Mueller, Tim; Kusne, Aaron Gilad; Ramprasad, Rampi (2016-04-01). "Machine Learning in Materials Science". In Parrill, Abby L.; Lipkowitz, Kenny B. (eds.). Reviews in Computational Chemistry. Vol. 29 (1st ed.). Wiley. pp. 186–273. doi:10.1002/9781119148739.ch4. ISBN 978-1-119-10393-6.
4. Ghiringhelli, Luca M.; Vybiral, Jan; Levchenko, Sergey V.; Draxl, Claudia; Scheffler, Matthias (2015-03-10). "Big Data of Materials Science: Critical Role of the Descriptor". Physical Review Letters. 114 (10). 105503. arXiv:1411.7437. Bibcode:2015PhRvL.114j5503G. doi:10.1103/PhysRevLett.114.105503. PMID 25815947.
5. Randić, M. (1996). Molecular bonding profiles. Journal of Mathematical Chemistry, 19(3), 375–392. https://doi.org/10.1007/BF01166727
6. Guha, R., & Willighagen, E. (2012). A Survey of Quantitative Descriptions of Molecular Structure. Current Topics in Medicinal Chemistry, 12(18), 1946–1956. https://doi.org/10.2174/156802612804910278
7. Mauri, Andrea (2020). "alvaDesc: A Tool to Calculate and Analyze Molecular Descriptors and Fingerprints". Methods in Pharmacology and Toxicology. New York, NY: Springer US. pp. 801–820. doi:10.1007/978-1-0716-0150-1_32.
8. Mauri, Andrea; Bertola, Matteo (2022). "Alvascience: A New Software Suite for the QSAR Workflow Applied to the Blood–Brain Barrier Permeability". International Journal of Molecular Sciences. 23 (21). 12882. doi:10.3390/ijms232112882. PMC 9655980.
9. Mauri, A., Consonni, V., Pavan, M., & Todeschini, R. (2006). Dragon software: An easy approach to molecular descriptor calculations. MATCH Communications in Mathematical and in Computer Chemistry, 56(2), 237–248.
10. Moriwaki, H., Tian, Y. S., Kawashita, N., & Takagi, T. (2018). Mordred: A molecular descriptor calculator. Journal of Cheminformatics, 10(1), 1–14. https://doi.org/10.1186/s13321-018-0258-y
11. Yap, C. W. (2011). PaDEL-descriptor: An open source software to calculate molecular descriptors and fingerprints. Journal of Computational Chemistry. https://doi.org/10.1002/jcc.21707
12. Adamczyk, J., & Ludynia, P. (2024). Scikit-fingerprints: Easy and efficient computation of molecular fingerprints in Python. SoftwareX, 28, 101944. https://doi.org/10.1016/j.softx.2024.101944
