Rules extraction system family


The rules extraction system (RULES) family is a family of inductive learning algorithms comprising several covering algorithms. It is used to build predictive models from given observations. Following the separate-and-conquer strategy, it induces rules directly from a given training set and builds its knowledge repository.

Algorithms of the RULES family are usually available in data mining tools such as KEEL and WEKA, which are known for knowledge extraction and decision making.

Overview

RULES family algorithms are mainly used in data mining to create a model that predicts an outcome from given input features. They fall under the umbrella of inductive learning, a machine learning approach in which the agent is provided with historical data from which to derive descriptive knowledge. RULES algorithms thus follow a supervised learning paradigm: they work as a data analysis tool that uses the knowledge gained through training to reach a general conclusion and classify new objects with the produced classifier.

Inductive learning algorithms are commonly divided into two types: decision tree (DT) algorithms and covering algorithms (CA). DTs discover rules via a decision tree built on the divide-and-conquer principle, whereas CAs induce rules directly from the training set following the separate-and-conquer principle. Although DT algorithms were well recognized in the past few decades, CAs have started to attract attention because of their direct rule induction property, as emphasized by Kurgan et al. [1]. Several families of covering algorithms have been developed and improved under this inductive learning approach. The RULES family [2], short for rule extraction system, is one such family; it separates out one instance (example) at a time when inducing the best rules. The resulting rules are stored in an 'IF condition THEN conclusion' structure, and the family has its own induction procedure for inducing the best rules and building the knowledge repository.
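The 'IF condition THEN conclusion' structure can be represented directly in code. The following minimal sketch (the attribute names and class labels are illustrative assumptions, not taken from the RULES literature) shows one such rule being applied to a new example:

```python
# A minimal illustration of the 'IF condition THEN conclusion' rule structure
# used by covering algorithms. Attributes and classes are made up for the example.
rule = {"conditions": {"outlook": "sunny", "windy": "no"},
        "conclusion": "play"}

def fires(rule, example):
    """The rule fires when every one of its conditions matches the example."""
    return all(example.get(a) == v for a, v in rule["conditions"].items())

example = {"outlook": "sunny", "windy": "no", "humidity": "high"}
prediction = rule["conclusion"] if fires(rule, example) else None
```

A rule base is then simply a list of such rules, checked in order until one fires.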

Induction procedure

To induce the best rules from given observations, RULES family algorithms start by selecting (separating) a seed example and building a rule for it, condition by condition. The rule that covers the most positive examples and the fewest negative examples is chosen as the best rule for the current seed example. The best rule is allowed to cover some negative examples; this increases flexibility and reduces both overfitting and the effect of noisy data during rule induction. When the coverage performance reaches a specified threshold, the algorithm marks the examples that match the induced rule rather than deleting them. This prevents the same rule from being discovered repeatedly while preserving the coverage accuracy and the generality of new rules. The procedure then repeats, selecting (conquering) another seed example, until all examples are covered. Hence, exactly one rule is generated at each step.
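The procedure above can be sketched in code. This is a highly simplified separate-and-conquer loop in the spirit of the RULES family, not the published procedure of any specific member: the greedy precision-based scoring, the threshold parameter, and the toy attribute names are all illustrative assumptions.

```python
def matches(rule, example):
    """A rule is a dict mapping attribute -> required value."""
    return all(example[a] == v for a, v in rule.items())

def induce_rule(seed, label, examples, labels, threshold=1.0):
    """Grow a rule for the seed example condition by condition, always adding
    the seed condition that gives the best precision on the training set.
    A threshold below 1.0 lets the rule cover some negative examples."""
    rule, candidates = {}, dict(seed)
    while candidates:
        best, best_prec = None, -1.0
        for attr, value in candidates.items():
            trial = dict(rule, **{attr: value})
            covered = [l for e, l in zip(examples, labels) if matches(trial, e)]
            if not covered:
                continue
            prec = covered.count(label) / len(covered)
            if prec > best_prec:
                best, best_prec = attr, prec
        if best is None:
            break
        rule[best] = candidates.pop(best)
        if best_prec >= threshold:
            break
    return rule

def induce_rules(examples, labels, threshold=1.0):
    """Separate-and-conquer: pick an uncovered seed, induce its best rule,
    mark (without deleting) the examples it covers, and repeat."""
    rules, covered = [], [False] * len(examples)
    while not all(covered):
        i = covered.index(False)  # next uncovered example becomes the seed
        rule = induce_rule(examples[i], labels[i], examples, labels, threshold)
        rules.append((rule, labels[i]))
        for j, e in enumerate(examples):
            if matches(rule, e):
                covered[j] = True
    return rules

# Toy training set (attribute names are illustrative).
examples = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rain",  "windy": "no"},
    {"outlook": "rain",  "windy": "yes"},
]
labels = ["play", "play", "play", "stay"]
rules = induce_rules(examples, labels)
```

Each iteration emits one rule in 'IF condition THEN conclusion' form, and already-covered examples stay in the training set so later rules are scored against the full data, mirroring the marking-without-deletion step described above.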

Algorithms

Several versions and algorithms have been proposed within the RULES family, including RULES [3], RULES-3 Plus [6], RULES-5 [8], RULES-6 [10], RULES-F [11], SRI [12], RULES-IS [14], RULES3-EXT [15], RULES-TL [18], and RULES-IT [19].

Applications

Covering algorithms, in general, can be applied to any machine learning application field, as long as the algorithm supports the data types involved. Witten, Frank and Hall [20] identified six main fielded applications that actively use ML, including sales and marketing, judgment decisions, image screening, load forecasting, diagnosis, and web mining.

RULES algorithms, in particular, have been applied in various manufacturing and engineering applications [21]. RULES-3 EXT was also applied to signature verification, and its performance was verified by Aksoy and Mathkour [22]. More recently, Salem and Schmickl [23] studied the efficiency of RULES-4 in predicting the density of agents.


References

[1] L. A. Kurgan, K. J. Cios, and S. Dick, "Highly Scalable and Robust Rule Learner: Performance Evaluation and Comparison," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 36, pp. 32–53, 2006.

[2] M. S. Aksoy, "A review of rules family of algorithms," Mathematical and Computational Applications, vol. 13, pp. 51–60, 2008.

[3] D. T. Pham and M. S. Aksoy, "RULES: A simple rule extraction system," Expert Systems with Applications, vol. 8, pp. 59–65, 1995.

[4] D. T. Pham and M. S. Aksoy, "An algorithm for automatic rule induction," Artificial Intelligence in Engineering, vol. 8, pp. 277–282, 1993.

[5] D. T. Pham and M. S. Aksoy, "A new algorithm for inductive learning," Journal of Systems Engineering, vol. 5, pp. 115–122, 1995.

[6] D. T. Pham and S. S. Dimov, "The RULES-3 Plus inductive learning algorithm," in Proceedings of the Third World Congress on Expert Systems, Seoul, Korea, 1996, pp. 917–924.

[7] D. T. Pham and S. S. Dimov, "An algorithm for incremental inductive learning," Journal of Engineering Manufacture, vol. 211, pp. 239–249, 1997.

[8] D. Pham, S. Bigot, and S. Dimov, "RULES-5: a rule induction algorithm for classification problems involving continuous attributes," in Institution of Mechanical Engineers, 2003, pp. 1273–1286.

[9] S. Bigot, "A new rule space representation scheme for rule induction in classification and control applications," Proceedings of the Institution of Mechanical Engineers, Part I: Journal of Systems and Control Engineering, 2011.

[10] D. T. Pham and A. A. Afify, "RULES-6: A Simple Rule Induction Algorithm for Supporting Decision Making," in 31st Annual Conference of IEEE Industrial Electronics Society (IECON '05), 2005, pp. 2184–2189.

[11] D. T. Pham, S. Bigot, and S. S. Dimov, "RULES-F: A fuzzy inductive learning algorithm," Proceedings of the Institution of Mechanical Engineers, Part C: Journal of Mechanical Engineering Science, vol. 220, pp. 1433–1447, 2006.

[12] A. A. Afify and D. T. Pham, "SRI: A Scalable Rule Induction Algorithm," Proceedings of the Institution of Mechanical Engineers, Part C: Journal of Mechanical Engineering Science, vol. 220, pp. 537–552, 2006.

[13] Ö. Akgöbek, Y. S. Aydin, E. Öztemel, and M. S. Aksoy, "A new algorithm for automatic knowledge acquisition in inductive learning," Knowledge-Based Systems, vol. 19, pp. 388–395, 2006.

[14] D. T. Pham and A. J. Soroka, "An Immune-network inspired rule generation algorithm (RULES-IS)," in Third Virtual International Conference on Innovative Production Machines and Systems, WhittlesDunbeath, 2007.

[15] H. I. Mathkour, "RULES3-EXT Improvement on RULES-3 Induction Algorithm," Mathematical and Computational Applications, vol. 15, pp. 318–324, 2010.

[16] K. Shehzad, "EDISC: A Class-tailored Discretization Technique for Rule-based Classification," IEEE Transactions on Knowledge and Data Engineering, vol. 24, pp. 1435–1447, 2012.

[17] D. Pham, "A novel rule induction algorithm with improved handling of continuous valued attributes," Doctor of Philosophy, School of Engineering, Cardiff University, Cardiff, 2012.

[18] H. ElGibreen and M. S. Aksoy, "RULES – TL: A simple and Improved RULES Algorithm for Incomplete and Large Data," Journal of Theoretical and Applied Information Technology, vol. 47, pp. 28–40, 2013.

[19] H. Elgibreen and M. Aksoy, "RULES-IT: incremental transfer learning with RULES family," Frontiers of Computer Science, vol. 8, pp. 537–562, 2014.

[20] I. H. Witten, E. Frank, and M. A. Hall, Data Mining Practical Machine Learning Tools and Techniques, Third ed.: Morgan Kaufmann, 2011.

[21] D. Pham and A. Afify, "Machine-learning techniques and their applications in manufacturing," Proceedings of the Institution of Mechanical Engineers, Part B: Journal of Engineering Manufacture, vol. 219, pp. 395–412, 2005.

[22] M. S. Aksoy and H. Mathkour, "Signature verification using rules 3-ext inductive learning system," International Journal of the Physical Sciences, vol. 6, pp. 4428–4434, 2011.

[23] Z. Salem and T. Schmickl, "The efficiency of the RULES-4 classification learning algorithm in predicting the density of agents," Cogent Engineering, vol. 1, p. 986262, 2014.