Auto-WEKA is an automated machine learning system based on Weka, developed by Chris Thornton, Frank Hutter, Holger H. Hoos and Kevin Leyton-Brown. [1] An extended version was published as Auto-WEKA 2.0. [2] A neutral comparison study named Auto-WEKA the first prominent AutoML system. [3]
In 2023, it received the SIGKDD Test of Time Award. [4]
Auto-WEKA introduced the Combined Algorithm Selection and Hyperparameter optimization (CASH) problem, which extends both the algorithm selection problem and the hyperparameter optimization problem by searching simultaneously for the best algorithm and the best hyperparameter settings for a given dataset. Baratchi et al. state that "[T]he real power of AutoML was unlocked through the definition of the combined algorithm selection and hyperparameter optimisation problem". [5]
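In notation adapted from [1]: given learning algorithms \(\mathcal{A} = \{A^{(1)}, \ldots, A^{(n)}\}\) with associated hyperparameter spaces \(\Lambda^{(1)}, \ldots, \Lambda^{(n)}\), and \(k\) cross-validation folds of the dataset, CASH asks for the jointly optimal choice

\[
A^{*}_{\lambda^{*}} \in \operatorname*{argmin}_{A^{(j)} \in \mathcal{A},\; \lambda \in \Lambda^{(j)}} \; \frac{1}{k} \sum_{i=1}^{k} \mathcal{L}\!\left(A^{(j)}_{\lambda},\, \mathcal{D}^{(i)}_{\mathrm{train}},\, \mathcal{D}^{(i)}_{\mathrm{valid}}\right),
\]

where \(\mathcal{L}(A^{(j)}_{\lambda}, \mathcal{D}^{(i)}_{\mathrm{train}}, \mathcal{D}^{(i)}_{\mathrm{valid}})\) denotes the loss achieved by algorithm \(A^{(j)}\) with hyperparameters \(\lambda\) when trained on the \(i\)-th training fold and evaluated on the \(i\)-th validation fold. Auto-WEKA treats the choice of algorithm itself as a top-level hyperparameter, turning CASH into a single, hierarchically structured hyperparameter optimization problem that it solves with Bayesian optimization.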
The CASH formalism was picked up and extended by later AutoML systems and methods such as Auto-sklearn [6], ATM [7], AutoPrognosis [8], MCPS [9], MOSAIC [10], naive AutoML [11] and ADMM [12].
In machine learning, a neural network is a model inspired by the structure and function of biological neural networks in animal brains.
Reinforcement learning (RL) is an interdisciplinary area of machine learning and optimal control concerned with how an intelligent agent ought to take actions in a dynamic environment in order to maximize the cumulative reward. Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning.
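As an illustration of this reward-maximization loop, the following is a minimal sketch of tabular Q-learning, one classic RL algorithm; the five-state "chain" environment, its reward of 1 at the right end, and all constants are illustrative assumptions rather than part of any particular system.

```python
# A minimal sketch of tabular Q-learning on a toy 5-state "chain": the agent
# starts on the left and earns reward 1 for reaching the rightmost state.
import random

N_STATES = 5
ACTIONS = [0, 1]                        # 0: move left, 1: move right
GAMMA, ALPHA, EPSILON = 0.9, 0.1, 0.2   # discount, learning rate, exploration

def step(state, action):
    """Transition on the chain; reaching the last state ends the episode."""
    nxt = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward, nxt == N_STATES - 1

Q = [[0.0, 0.0] for _ in range(N_STATES)]
for _ in range(300):                    # episodes
    s, done = 0, False
    while not done:
        if random.random() < EPSILON:   # explore
            a = random.choice(ACTIONS)
        else:                           # exploit current value estimates
            a = max(ACTIONS, key=lambda act: Q[s][act])
        s2, r, done = step(s, a)
        # Q-learning update: nudge Q(s, a) toward r + gamma * max_a' Q(s', a').
        Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2]) - Q[s][a])
        s = s2

print([round(max(q), 2) for q in Q])    # state values rise toward the goal
```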
Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can learn from data and generalize to unseen data and thus perform tasks without explicit instructions. Recently, artificial neural networks have been able to surpass many previous approaches in performance.
Waikato Environment for Knowledge Analysis (Weka) is a collection of free machine learning and data analysis software licensed under the GNU General Public License. It was developed at the University of Waikato, New Zealand, and is the companion software to the book "Data Mining: Practical Machine Learning Tools and Techniques".
Holger H. Hoos is a German-Canadian computer scientist and an Alexander von Humboldt Professor of artificial intelligence at RWTH Aachen University. He also holds a part-time appointment as a professor of machine learning at Leiden University, and he is an adjunct professor at the Computer Science Department of the University of British Columbia, where he held a full-time professorial appointment from 2000 until 2016. His research interests are focused on artificial intelligence, at the intersection of machine learning, automated reasoning and optimization, with applications in empirical algorithmics, bioinformatics and operations research. In particular, he works on automated algorithm design and on stochastic local search algorithms. He has been a Fellow of the Association for the Advancement of Artificial Intelligence (AAAI) since 2015 and, since 2020, a Fellow of both the European Association for Artificial Intelligence (EurAI) and the Association for Computing Machinery (ACM).
In machine learning, a hyperparameter is a parameter, such as the learning rate or the choice of optimizer, that specifies details of the learning process, hence the name hyperparameter. This is in contrast to parameters, which determine the model itself. An additional contrast is that hyperparameters typically cannot be inferred while fitting the model to the training set, because the objective function is typically non-differentiable with respect to them. As a result, gradient-based optimization methods cannot be applied directly.
Kevin Leyton-Brown is a Professor of Computer Science at the University of British Columbia. He received his Ph.D. at Stanford University in 2003. He was the recipient of a 2014 NSERC E.W.R. Steacie Memorial Fellowship, a 2013/14 Killam Teaching Prize, and a 2013 Outstanding Young Computer Science Researcher Prize from the Canadian Association of Computer Science. Leyton-Brown co-teaches a popular game theory course on Coursera.org, along with Matthew O. Jackson and Yoav Shoham. Leyton-Brown serves as an associate editor for the Journal of Artificial Intelligence Research, the Artificial Intelligence journal, and ACM Transactions on Economics and Computation, and was program chair for the ACM Conference on Electronic Commerce in 2012. Leyton-Brown and coauthors have received the IJCAI-JAIR Best Paper Prize, the ACM SIGKDD test-of-time award for his work on Auto-WEKA, and numerous medals in international SAT competitions (2003–12). He was elected an AAAI Fellow in 2018.
Bayesian optimization is a sequential design strategy for global optimization of black-box functions that does not assume any functional forms. It is usually employed to optimize expensive-to-evaluate functions.
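The following is a minimal sketch of that sequential loop, assuming a Gaussian-process surrogate and the expected-improvement acquisition function (one common pairing); the one-dimensional objective f and all settings are illustrative.

```python
# A minimal sketch of Bayesian optimization: a Gaussian-process surrogate plus
# the expected-improvement acquisition function, minimizing a toy 1-D function.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def f(x):                                   # stands in for an expensive black box
    return np.sin(3 * x) + 0.5 * x

rng = np.random.default_rng(0)
X = rng.uniform(0, 2, size=(3, 1))          # a few initial evaluations
y = f(X).ravel()
grid = np.linspace(0, 2, 200).reshape(-1, 1)

for _ in range(10):                         # sequential design loop
    gp = GaussianProcessRegressor(alpha=1e-6, normalize_y=True).fit(X, y)
    mu, sigma = gp.predict(grid, return_std=True)
    best = y.min()
    # Expected improvement E[max(best - f(x), 0)] under the GP posterior:
    z = (best - mu) / np.maximum(sigma, 1e-9)
    ei = (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)
    x_next = grid[np.argmax(ei)].reshape(1, 1)  # most promising next point
    X = np.vstack([X, x_next])
    y = np.append(y, f(x_next).ravel())

print("best x:", X[np.argmin(y)].item(), "best f(x):", y.min())
```

Each iteration refits the surrogate to all evaluations so far and spends the next expensive evaluation where expected improvement is highest, trading the posterior mean off against its uncertainty.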
Feature engineering is a preprocessing step in supervised machine learning and statistical modeling which transforms raw data into a more effective set of inputs. Each input comprises several attributes, known as features. By providing models with relevant information, feature engineering can significantly enhance their predictive accuracy and decision-making capability.
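A minimal sketch of the idea, assuming a toy table of orders: raw timestamp, price and category columns are turned into numeric features that a model can consume directly.

```python
# A minimal sketch of feature engineering on a toy "orders" table: raw
# timestamp, price and category columns become numeric model inputs.
import numpy as np
import pandas as pd

raw = pd.DataFrame({
    "timestamp": pd.to_datetime(["2024-01-05 08:30", "2024-01-06 22:10"]),
    "price": [19.99, 250.00],
    "category": ["book", "electronics"],
})

features = pd.DataFrame({
    "hour": raw["timestamp"].dt.hour,               # time-of-day signal
    "is_weekend": raw["timestamp"].dt.dayofweek >= 5,
    "log_price": np.log1p(raw["price"]),            # compress a skewed scale
})
# One-hot encode the categorical column so models can use it numerically.
features = features.join(pd.get_dummies(raw["category"], prefix="cat"))
print(features)
```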
Shlomo Zilberstein is an Israeli-American computer scientist. He is a Professor of Computer Science and Associate Dean for Research and Engagement in the College of Information and Computer Sciences at the University of Massachusetts, Amherst. He graduated with a B.A. in Computer Science summa cum laude from Technion – Israel Institute of Technology in 1982, and received a Ph.D. in Computer Science from the University of California, Berkeley in 1993, advised by Stuart J. Russell. He is known for his contributions to artificial intelligence, anytime algorithms, multi-agent systems, and automated planning and scheduling algorithms, notably within the context of Markov decision processes (MDPs), Partially Observable MDPs (POMDPs), and Decentralized POMDPs (Dec-POMDPs).
Algorithm selection is a meta-algorithmic technique to choose an algorithm from a portfolio on an instance-by-instance basis. It is motivated by the observation that on many practical problems, different algorithms have different performance characteristics. That is, one algorithm may perform well in some scenarios but poorly in others, while another algorithm shows the opposite pattern. If we can identify when to use which algorithm, we can optimize for each scenario and improve overall performance. This is what algorithm selection aims to do. The only prerequisite for applying algorithm selection techniques is that there exists a set of complementary algorithms.
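A minimal sketch of per-instance selection, assuming synthetic instance features and hypothetical runtimes for a two-solver portfolio: a meta-model is trained to predict which solver will be faster on a new instance.

```python
# A minimal sketch of per-instance algorithm selection: a meta-model predicts,
# from instance features, which of two solvers will be faster. Instance
# features and runtimes are synthetic, illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
inst_features = rng.random((200, 4))           # features describing instances
runtime_a = inst_features[:, 0] + 0.1 * rng.random(200)  # hypothetical solver A
runtime_b = inst_features[:, 1] + 0.1 * rng.random(200)  # hypothetical solver B
label = (runtime_b < runtime_a).astype(int)    # 1 if solver B was faster

selector = RandomForestClassifier(random_state=0).fit(inst_features, label)

new_instance = rng.random((1, 4))
print("dispatch solver", "B" if selector.predict(new_instance)[0] == 1 else "A")
```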
Apache SINGA is an Apache top-level project for developing an open source machine learning library. It provides a flexible architecture for scalable distributed training, is extensible to run over a wide range of hardware, and has a focus on health-care applications.
An outline of machine learning serves as an overview of and topical guide to the field.
In machine learning, hyperparameter optimization or tuning is the problem of choosing a set of optimal hyperparameters for a learning algorithm. A hyperparameter is a parameter whose value is used to control the learning process.
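A minimal sketch of one common approach, random search over a cross-validated model, here using scikit-learn's RandomizedSearchCV; the estimator choice, dataset and search ranges are illustrative assumptions.

```python
# A minimal sketch of hyperparameter optimization by random search with
# cross-validation; the SVC estimator, Iris data and log-uniform ranges
# are illustrative assumptions.
from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
search = RandomizedSearchCV(
    SVC(),
    param_distributions={
        "C": loguniform(1e-2, 1e2),        # regularization strength
        "gamma": loguniform(1e-4, 1e0),    # RBF kernel width
    },
    n_iter=20,   # sample 20 random configurations
    cv=5,        # score each by 5-fold cross-validation
    random_state=0,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```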
Automated machine learning (AutoML) is the process of automating the tasks of applying machine learning to real-world problems. It is the combination of automation and ML.
Neural architecture search (NAS) is a technique for automating the design of artificial neural networks (ANN), a widely used model in the field of machine learning. NAS has been used to design networks that are on par with or outperform hand-designed architectures. Methods for NAS can be categorized according to the search space, search strategy and performance estimation strategy used.
In machine learning and statistics, the learning rate is a tuning parameter in an optimization algorithm that determines the step size at each iteration while moving toward a minimum of a loss function. Since it influences to what extent newly acquired information overrides old information, it metaphorically represents the speed at which a machine learning model "learns". In the adaptive control literature, the learning rate is commonly referred to as gain.
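A minimal sketch of this effect, assuming the toy quadratic loss L(w) = (w - 3)^2: a small rate makes slow progress, a moderate one converges quickly, and a too-large one diverges.

```python
# A minimal sketch of the learning rate as the step size of gradient descent
# on the toy loss L(w) = (w - 3)^2, whose minimum is at w = 3 (illustrative).
def grad(w):
    return 2 * (w - 3)          # dL/dw

for lr in (0.01, 0.1, 1.1):     # small, moderate, and too-large step sizes
    w = 0.0
    for _ in range(50):
        w -= lr * grad(w)       # move against the gradient, scaled by lr
    print(f"lr={lr}: w after 50 steps = {w:.3f}")
```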
Federated learning is a sub-field of machine learning focusing on settings in which multiple entities collaboratively train a model while ensuring that their data remains decentralized. This stands in contrast to machine learning settings in which data is centrally stored. One of the primary defining characteristics of federated learning is data heterogeneity. Due to the decentralized nature of the clients' data, there is no guarantee that data samples held by each client are independently and identically distributed.
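A minimal sketch of federated averaging (FedAvg), one common federated learning algorithm, assuming two clients holding synthetic, non-identically distributed linear-regression data; only model weights, never raw data, reach the server.

```python
# A minimal sketch of federated averaging (FedAvg): each client fits a model
# on its private data, and the server averages only the model weights.
# Clients, data distributions and constants are illustrative assumptions.
import numpy as np

def local_sgd(w, X, y, lr=0.05, steps=100):
    """Full-batch gradient descent for linear regression on one client's data."""
    for _ in range(steps):
        w = w - lr * 2 * X.T @ (X @ w - y) / len(y)
    return w

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for shift in (0.0, 1.0):     # each client sees a different feature region (non-IID)
    X = rng.random((50, 2)) + shift
    y = X @ true_w + 0.01 * rng.standard_normal(50)
    clients.append((X, y))

w_global = np.zeros(2)
for _ in range(10):          # communication rounds
    # Each client trains locally; raw data never leaves the client.
    local = [local_sgd(w_global.copy(), X, y) for X, y in clients]
    w_global = np.mean(local, axis=0)   # server averages the weight vectors

print("global weights ~", w_global)     # should approach true_w = [2, -1]
```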
Marius Lindauer is a German computer scientist and professor of machine learning at the Institute of Artificial Intelligence of Leibniz University Hannover. He is known for his research on automated machine learning and other meta-algorithmic approaches.
Himabindu "Hima" Lakkaraju is an Indian-American computer scientist who works on machine learning, artificial intelligence, algorithmic bias, and AI accountability. She is currently an Assistant Professor at the Harvard Business School and is also affiliated with the Department of Computer Science at Harvard University. Lakkaraju is known for her work on explainable machine learning. More broadly, her research focuses on developing machine learning models and algorithms that are interpretable, transparent, fair, and reliable. She also investigates the practical and ethical implications of deploying machine learning models in domains involving high-stakes decisions such as healthcare, criminal justice, business, and education. Lakkaraju was named as one of the world's top Innovators Under 35 by both Vanity Fair and the MIT Technology Review.