VACUUM [1] [2] [3] [4] is a set of normative guidance principles for achieving training and test dataset quality for structured datasets in data science and machine learning. The garbage-in, garbage out principle motivates a solution to the problem of data quality but does not offer a specific solution. Unlike the majority of the ad-hoc data quality assessment metrics often used by practitioners [5] VACUUM specifies qualitative principles for data quality management and serves as a basis for defining more detailed quantitative metrics of data quality. [6]
VACUUM is an acronym that stands for:
{{cite web}}
: CS1 maint: bot: original URL status unknown (link)