Associative classifier

An associative classifier (AC) is a kind of supervised learning model that uses association rules to assign a target value. The term associative classification was coined by Bing Liu et al. [1], who defined a model made of rules "whose right-hand side are restricted to the classification class attribute".

Model

The model generated by an AC and used to label new records consists of association rules whose consequent corresponds to the class label. As such, the model can also be seen as a list of "if-then" clauses: if a record matches some criteria (expressed in the left side of the rule, also called the antecedent), it is labeled according to the class on the right side of the rule (the consequent).

Most ACs read the list of rules in order, and apply the first matching rule to label the new record. [2]
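The first-match strategy described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any specific AC: the `Rule` class, the `classify` function, the `default` fallback label, and the toy weather attributes are all hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: a set of (attribute, value) pairs implies a class."""
    antecedent: frozenset
    consequent: str


def classify(record: dict, rules: list, default: Optional[str] = None) -> Optional[str]:
    """Return the consequent of the first rule whose antecedent matches the record."""
    items = set(record.items())
    for rule in rules:  # rules are assumed to be already sorted by priority
        if rule.antecedent <= items:  # every condition of the rule holds
            return rule.consequent
    return default  # assumption: fall back to a default class if nothing matches


# Toy ordered rule list (illustrative values only).
rules = [
    Rule(frozenset({("outlook", "sunny"), ("humidity", "high")}), "no"),
    Rule(frozenset({("outlook", "sunny")}), "yes"),
]

print(classify({"outlook": "sunny", "humidity": "high"}, rules))
print(classify({"outlook": "sunny", "humidity": "normal"}, rules))
print(classify({"outlook": "rainy", "humidity": "high"}, rules, default="unknown"))
```

Because the first matching rule wins, the order of the list matters: in the toy list, the more specific rule is placed before the more general one so that it is not shadowed.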

Metrics

The rules of an AC inherit some of the metrics of association rules, such as support and confidence. [3] Metrics can be used to order or filter the rules in the model [4] and to evaluate their quality.
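As a rough sketch of how these metrics might be computed and used, the Python snippet below takes the support of a rule to be the fraction of all training records that match its antecedent and carry its class, and the confidence to be the fraction of records matching the antecedent that carry its class. The function name `rule_metrics`, the thresholds, and the toy data are assumptions made for illustration. Sorting by confidence and then by support is only one possible ordering.

```python
def rule_metrics(antecedent, label, records, labels):
    """Return (support, confidence) of the rule `antecedent -> label`."""
    # Labels of the records that satisfy every condition in the antecedent.
    covered = [y for x, y in zip(records, labels)
               if antecedent <= set(x.items())]
    correct = sum(1 for y in covered if y == label)
    support = correct / len(records) if records else 0.0
    confidence = correct / len(covered) if covered else 0.0
    return support, confidence


# Toy training data (illustrative values only).
records = [
    {"outlook": "sunny"},
    {"outlook": "sunny"},
    {"outlook": "rainy"},
    {"outlook": "sunny"},
]
labels = ["yes", "yes", "no", "no"]

# Candidate rules as (antecedent, class label) pairs.
candidates = [
    (frozenset({("outlook", "sunny")}), "yes"),
    (frozenset({("outlook", "rainy")}), "no"),
]

# Hypothetical thresholds used to filter out weak rules.
MIN_SUPPORT, MIN_CONFIDENCE = 0.2, 0.6

scored = [(a, c, *rule_metrics(a, c, records, labels)) for a, c in candidates]
kept = [r for r in scored if r[2] >= MIN_SUPPORT and r[3] >= MIN_CONFIDENCE]

# Order the surviving rules by confidence, breaking ties by support.
ranked = sorted(kept, key=lambda r: (r[3], r[2]), reverse=True)
for antecedent, label, support, confidence in ranked:
    print(sorted(antecedent), "->", label, round(support, 2), round(confidence, 2))
```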

Implementations

The first proposal of a classification model made of association rules was FBM. The approach was popularized by CBA, [1] although other authors had previously proposed mining association rules for classification. [5] Other authors have since proposed multiple changes to the initial model, such as the addition of a redundant rule pruning phase [6] or the exploitation of Emerging Patterns. [7]

Notable implementations include:

References

  1. Liu, Bing; Hsu, Wynne; Ma, Yiming (1998). "Integrating Classification and Association Rule Mining": 80–86. CiteSeerX 10.1.1.48.8380.
  2. Thabtah, Fadi (2007). "A review of associative classification mining" (PDF). The Knowledge Engineering Review. 22 (1): 37–65. doi:10.1017/s0269888907001026. ISSN 0269-8889. S2CID 15986963.
  3. Liao, T Warren; Triantaphyllou, Evangelos (2008). Recent Advances in Data Mining of Enterprise Data: Algorithms and Applications. Series on Computers and Operations Research. World Scientific. doi:10.1142/6689. ISBN 9789812779854. S2CID 34599426.
  4. "CBA homepage". Retrieved 2018-10-04.
  5. Ali, Kamal; Manganaris, Stefanos; Srikant, Ramakrishnan (1997-08-14). "Partial classification using association rules". KDD'97. AAAI Press: 115–118.
  6. Wenmin Li; Jiawei Han; Jian Pei (2001). "CMAR: Accurate and efficient classification based on multiple class-association rules". Proceedings 2001 IEEE International Conference on Data Mining. IEEE Comput. Soc. pp. 369–376. CiteSeerX 10.1.1.13.219. doi:10.1109/icdm.2001.989541. ISBN 978-0769511191. S2CID 2243455.
  7. Dong, Guozhu; Zhang, Xiuzhen; Wong, Limsoon; Li, Jinyan (1999), "CAEP: Classification by Aggregating Emerging Patterns", Discovery Science, Springer Berlin Heidelberg, pp. 30–42, CiteSeerX 10.1.1.37.3226, doi:10.1007/3-540-46846-3_4, ISBN 9783540667131, S2CID 6465422
  8. "CMAR Implementation". cgi.csc.liv.ac.uk. Retrieved 2018-10-04.
  9. Yin, Xiaoxin; Han, Jiawei (2003), "CPAR: Classification based on Predictive Association Rules", Proceedings of the 2003 SIAM International Conference on Data Mining, Society for Industrial and Applied Mathematics, pp. 331–335, CiteSeerX 10.1.1.12.7268, doi:10.1137/1.9781611972733.40, ISBN 9780898715453
  10. "THE LUCS-KDD IMPLEMENTATIONS OF THE FOIL, PRM AND CPAR ALGORITHMS". cgi.csc.liv.ac.uk. Retrieved 2018-10-04.
  11. Baralis, E.; Chiusano, S.; Garza, P. (2008). "A Lazy Approach to Associative Classification". IEEE Transactions on Knowledge and Data Engineering. 20 (2): 156–171. doi:10.1109/tkde.2007.190677. ISSN 1041-4347. S2CID 14829459.
  12. "L3 implementation". dbdmg.polito.it. Retrieved 2018-10-08.
  13. Chen, Guoqing; Liu, Hongyan; Yu, Lan; Wei, Qiang; Zhang, Xing (2006). "A new approach to classification based on association rule mining". Decision Support Systems. 42 (2): 674–689. doi:10.1016/j.dss.2005.03.005. ISSN 0167-9236.
  14. Wang, Ke; Zhou, Senqiang; He, Yu (2000). "Growing decision trees on support-less association rules". Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining. New York, New York, USA: ACM Press. pp. 265–269. CiteSeerX 10.1.1.36.9265. doi:10.1145/347090.347147. ISBN 978-1581132335. S2CID 8296096.