Intention mining

In data mining, intention mining or intent mining is the problem of determining a user's intention from logs of their behavior while interacting with a computer system. Examples include search engines, where user intent or query intent prediction has been studied since 2002 (see Section 7.2.3 in [1]), and social media, where commercial intents expressed in posts are identified. [2]

The notion of intention mining was introduced in the Ph.D. thesis of Dr. Ghazaleh Khodabandelou in 2014. [3] [4] [5] The thesis presents a novel approach to process mining, called the Map Miner Method (MMM), which is designed to automate the construction of intentional process models from traces. MMM uses Hidden Markov Models to model the relationship between users' activities and the strategies (i.e., the different ways to fulfill the intentions). The method also includes specific algorithms developed to infer users' intentions and to construct an intentional process model (Map). MMM models the intentions as a directed graph (with different levels of granularity) in order to give a better understanding of the human way of thinking. [5]
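The general idea of relating observed activities to hidden strategies with a Hidden Markov Model can be illustrated with a minimal sketch. The code below is not the MMM algorithm itself; it is standard Viterbi decoding applied to a toy model. The strategy names, activity names, and all probabilities are hypothetical values chosen only for illustration, and the function assumes a non-empty trace and strictly positive probabilities.

```python
import math

# Hypothetical hidden strategies and observable activities (illustration only).
STATES = ["by_search", "by_browsing"]
START_P = {"by_search": 0.5, "by_browsing": 0.5}
TRANS_P = {
    "by_search": {"by_search": 0.8, "by_browsing": 0.2},
    "by_browsing": {"by_search": 0.2, "by_browsing": 0.8},
}
EMIT_P = {
    "by_search": {"type_query": 0.7, "click_link": 0.2, "scroll": 0.1},
    "by_browsing": {"type_query": 0.1, "click_link": 0.4, "scroll": 0.5},
}


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely sequence of hidden states for a non-empty trace."""
    # best[t][s]: log-probability of the best path that ends in state s at step t.
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
             for s in states}]
    # back[t][s]: predecessor of s on that best path.
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev = max(states,
                       key=lambda p: best[t - 1][p] + math.log(trans_p[p][s]))
            best[t][s] = (best[t - 1][prev]
                          + math.log(trans_p[prev][s])
                          + math.log(emit_p[s][observations[t]]))
            back[t][s] = prev
    # Follow the back-pointers from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# A made-up activity trace for one user session.
trace = ["type_query", "type_query", "scroll", "scroll"]
decoded = viterbi(trace, STATES, START_P, TRANS_P, EMIT_P)
print(decoded)
```

In this sketch the decoded sequence assigns one hidden strategy to each logged activity; consecutive strategies could then be linked as edges of a directed graph, in the spirit of the Map described above.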

Application

Intention mining has already been used in several domains:

References

  1. R. Baeza-Yates and B. Ribeiro-Neto. "Modern Information Retrieval", second edition, Addison-Wesley, 2011.
  2. Zhiyuan Chen, Bing Liu, Meichun Hsu, Malu Castellanos, and Riddhiman Ghosh. "Identifying Intention Posts in Discussion Forums." Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT-2013), June 9–15, 2013, Atlanta, USA.
  3. "Home". khodabandelou.com.
  4. "Ghazaleh Khodabandelou".
  5. Khodabandelou, G.; Hug, C.; Deneckere, R.; Salinesi, C. (2013). Supervised intentional process models discovery using Hidden Markov models. IEEE Seventh International Conference on Research Challenges in Information Science (RCIS). pp. 1–11. CiteSeerX 10.1.1.636.130. doi:10.1109/RCIS.2013.6577711. ISBN 978-1-4673-2914-9. S2CID 15772563.
  6. Hashemi, R.R., Bahrami, A., LaPlant, J. & Thurber, K. (2008). Discovery of Intent through the Analysis of Visited Sites. In Arabnia, H.A & Hashemi, R.R., (Eds.), Proceedings of the 2008 International Conference on Information & Knowledge Engineering (pp. 417-422). CSREA Press.
  7. Zheng, C., Fan, L., Huan, L., Yin, L., Wei-Ying, M. & Liu, W. (2002, November). User Intention Modeling in Web Applications Using Data Mining. World Wide Web, 5 (3) 181-191.
  8. Strohmaier, M. & Kröll, M. (2012). Acquiring knowledge about human goals from Search Query Logs. Information Processing & Management, 48 (1) 63-82.
  9. Kröll, M. & Strohmaier, M. (2009). Analyzing Human Intentions in Natural Language Text. In Gil, Y., & Fridman Noy, N. (Eds.), Proceedings of the 5th International Conference on Knowledge Capture (pp. 197-198). New York, NY, USA: ACM.
  10. Park, K., Lee, T., Jung, S., Lim, H. & Nam, S. (2010). Extracting Search Intentions from Web Search Logs. In 2nd International Conference on Information Technology Convergence and Services (pp. 1-6).
  11. Jethava, V., Calderón-Benavides, L., Baeza-Yates, R., Bhattacharyya, C. & Dubhashi, D. (2011). Scalable Multi-Dimensional User Intent Identification using Tree Structured Distributions. In Ma, W.-Y., Nie, J.-Y., Baeza-Yates, R.A., Chua, T.-S. & Croft, W.B. (Eds.), Proceedings of the 34th International ACM Conference on Research and development in Information Retrieval (pp. 395-404). New York, NY, USA: ACM.
  12. González-Caro, C. & Baeza-Yates, R. (2011). A multi-faceted approach to query intent classification. In Grossi, R., Sebastiani, F. & Silvestri F. (Eds.), Proceedings of the 18th International Conference on String Processing and Information Retrieval (pp. 368-379). Berlin, Heidelberg: Springer.
  13. Baeza-Yates, R., Calderón-Benavides, R. & González-Caro, C. (2006). The intention behind web queries. In Crestani, F., Ferragina, P. & Sanderson, M. (Eds.), Proceedings of the 13th International Conference on String Processing and Information Retrieval (pp. 98-109). Berlin, Heidelberg: Springer.
  14. Khodabandelou, Ghazaleh, et al. "Unsupervised discovery of intentional process models from event logs." Proceedings of the 11th Working Conference on Mining Software Repositories. ACM, 2014.
  15. Khodabandelou, Ghazaleh, Charlotte Hug, and Camille Salinesi. "A novel approach to process mining: Intentional process models discovery." Research Challenges in Information Science (RCIS), 2014 IEEE Eighth International Conference on. IEEE, 2014.
  16. Outmazgin, N. & Soffer, P. (2010). Business Process Workarounds: What Can and Cannot Be Detected by Process Mining. Lecture Notes in Business Information Processing, 147, 48-62.
  17. Epure, E.V. (2013). Intention-mining: A solution to process participant support in process aware information systems (Master thesis). Utrecht University, The Netherlands.
  18. Epure, E.V., Hug, C., Deneckere, R. & Brinkkemper, S. (2014). What Shall I Do Next? Intention Mining for Flexible Process Enactment. In 26th International Conference on Advanced Information Systems Engineering (CAiSE), Thessaloniki, Greece.
  19. Jankovic, M., Bajec, M., Khodabandelou, G., Deneckere, R., Hug, C. & Salinesi, C. (2013). Intelligent Agile Method Framework. In 8th International Conference on Evaluation of Novel Approaches to Software Engineering.
  20. Laflaquière, J., Lotfi, Settouti, S., Prié, Y. & Mille, A. (2006). Trace-Based framework for experience management and engineering. In Gabrys, B, Howlett, R.J. & Jain, L.C. (Eds.), Proceedings of the 10th International Conference on Knowledge-Based Intelligent Information and Engineering Systems, 1 (1) Berlin, Heidelberg: Springer, 1171-1178.
  21. Clauzel, D., Sehaba, K., & Prié, Y. (2009). Modelling and Visualising Traces for Reflexivity in Synchronous Collaborative Systems. In Badr, Y., Caballé, S., Xhafa, F., Abraham, A., & Gros, B. (Eds.), Proceedings of the 1st International Conference on Intelligent Networking and Collaborative Systems (pp. 16-23). IEEE.
  22. Khodabandelou, G., Hug, C., Deneckere, R. & Salinesi, C. (2014). Supervised vs. Unsupervised Learning for Intentional Process Model Discovery. In Proceedings of Business Process Modeling, Development, and Support (BPMDS) (pp. 282-291).
  23. Mei, T., Hua, X.-S. & Zhou, H.-Q. (2005). Tracking users' capture intention: a novel complementary view for home video content analysis. In Proceedings of the 13th annual ACM International Conference on Multimedia (pp. 531-534). New York, NY, USA: ACM.