Intention mining

In artificial intelligence, intention mining or intent mining is the problem of determining a user's intention from logs of their behavior in interaction with a computer system. Examples include search engines, where user intent or query intent prediction has been researched since 2002 (see Section 7.2.3 in [1]), and commercial intents expressed in social media posts. [2]

The notion of intention mining was introduced in the Ph.D. thesis of Ghazaleh Khodabandelou in 2014. [3] [4] [5] The thesis presents an approach to automating the construction of intention models from users' activities. The proposed model uses Hidden Markov Models to capture the relationship between users' activities and strategies (i.e., the different ways to fulfill an intention). The method also includes dedicated algorithms and optimization methods for inferring users' intentions and constructing intentional models as directed graphs (at different levels of granularity), with the aim of better understanding how people reason. [5]
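The Hidden Markov Model idea above can be illustrated with a minimal sketch. It is not the method from the thesis: it is a textbook Viterbi decoder applied to a toy model in which the hidden states stand for strategies and the observations stand for logged activities. The state names (`browse`, `purchase`), the activity names, and all probabilities are hypothetical values chosen only for illustration.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for the observations.

    Standard Viterbi decoding in log space. All probabilities are
    assumed to be strictly positive so that math.log is defined.
    """
    # best[t][s]: log-probability of the best path that ends in state s at step t
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
             for s in states}]
    # back[t][s]: predecessor of s on that best path
    back = [{}]

    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + math.log(emit_p[s][observations[t]])
            back[t][s] = prev

    # Follow the back-pointers from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Hypothetical toy model: two strategies, four kinds of logged activity.
states = ["browse", "purchase"]
start_p = {"browse": 0.7, "purchase": 0.3}
trans_p = {
    "browse": {"browse": 0.8, "purchase": 0.2},
    "purchase": {"browse": 0.1, "purchase": 0.9},
}
emit_p = {
    "browse": {"search": 0.5, "view_item": 0.4,
               "add_to_cart": 0.08, "checkout": 0.02},
    "purchase": {"search": 0.05, "view_item": 0.15,
                 "add_to_cart": 0.4, "checkout": 0.4},
}

activity_log = ["search", "view_item", "add_to_cart", "checkout"]
path = viterbi(activity_log, states, start_p, trans_p, emit_p)
print(path)  # ['browse', 'browse', 'purchase', 'purchase']
```

In this toy run the decoder assigns the first two activities to the `browse` strategy and the last two to `purchase`. In practice the model parameters would be estimated from activity logs rather than written by hand.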

Application

Intention Mining has already been used in several domains:

References

  1. R. Baeza-Yates and B. Ribeiro-Neto. "Modern Information Retrieval", second edition, Addison-Wesley, 2011.
  2. Zhiyuan Chen, Bing Liu, Meichun Hsu, Malu Castellanos, and Riddhiman Ghosh. "Identifying Intention Posts in Discussion Forums." Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT-2013), June 9–15, 2013, Atlanta, USA.
  3. "Home". khodabandelou.com.
  4. "Ghazaleh Khodabandelou".
  5. Khodabandelou, G.; Hug, C.; Deneckere, R.; Salinesi, C. (2013). "Supervised intentional process models discovery using Hidden Markov models". IEEE 7th International Conference on Research Challenges in Information Science (RCIS). pp. 1–11. CiteSeerX 10.1.1.636.130. doi:10.1109/RCIS.2013.6577711. ISBN 978-1-4673-2914-9. S2CID 15772563.
  6. Hashemi, R.R., Bahrami, A., LaPlant, J. & Thurber, K. (2008). Discovery of Intent through the Analysis of Visited Sites. In Arabnia, H.A & Hashemi, R.R., (Eds.), Proceedings of the 2008 International Conference on Information & Knowledge Engineering (pp. 417-422). CSREA Press.
  7. Zheng, C., Fan, L., Huan, L., Yin, L., Wei-Ying, M. & Liu, W. (2002, November). User Intention Modeling in Web Applications Using Data Mining. World Wide Web, 5 (3) 181-191.
  8. Strohmaier, M. & Kröll, M. (2012). Acquiring knowledge about human goals from Search Query Logs. Information Processing & Management, 48 (1) 63-82.
  9. Kröll, M. & Strohmaier, M. (2009). Analyzing Human Intentions in Natural Language Text. In Gil, Y., & Fridman Noy, N. (Eds.), Proceedings of the 5th International Conference on Knowledge Capture (pp. 197-198). New York, NY, USA: ACM.
  10. Park, K., Lee, T., Jung, S., Lim, H. & Nam, S. (2010). Extracting Search Intentions from Web Search Logs. In 2nd International Conference on Information Technology Convergence and Services (pp. 1-6).
  11. Jethava, V., Calderón-Benavides, L., Baeza-Yates, R., Bhattacharyya, C. & Dubhashi, D. (2011). Scalable Multi-Dimensional User Intent Identification using Tree Structured Distributions. In Ma, W.-Y., Nie, J.-Y., Baeza-Yates, R.A., Chua, T.-S. & Croft, W.B. (Eds.), Proceedings of the 34th International ACM Conference on Research and development in Information Retrieval (pp. 395-404). New York, NY, USA: ACM.
  12. González-Caro, C. & Baeza-Yates, R. (2011). A multi-faceted approach to query intent classification. In Grossi, R., Sebastiani, F. & Silvestri F. (Eds.), Proceedings of the 18th International Conference on String Processing and Information Retrieval (pp. 368-379). Berlin, Heidelberg: Springer.
  13. Baeza-Yates, R., Calderón-Benavides, R. & González-Caro, C. (2006). The intention behind web queries. In Crestani, F., Ferragina, P. & Sanderson, M. (Eds.), Proceedings of the 13th International Conference on String Processing and Information Retrieval (pp. 98-109). Berlin, Heidelberg: Springer.
  14. Khodabandelou, Ghazaleh, et al. "Unsupervised discovery of intentional process models from event logs." Proceedings of the 11th Working Conference on Mining Software Repositories. ACM, 2014.
  15. Khodabandelou, Ghazaleh, Charlotte Hug, and Camille Salinesi. "A novel approach to process mining: Intentional process models discovery." Research Challenges in Information Science (RCIS), 2014 IEEE Eighth International Conference on. IEEE, 2014.
  16. Outmazgin, N. & Soffer, P. (2010). Business Process Workarounds: What Can and Cannot Be Detected by Process Mining. Lecture Notes in Business Information Processing, 147, 48-62.
  17. Epure, E.V. (2013). Intention-mining: A solution to process participant support in process aware information systems (Master thesis). Utrecht University, The Netherlands.
  18. Epure, E.V., Hug, C., Deneckere, R. & Brinkkemper, S. (2014). What Shall I Do Next? Intention Mining for Flexible Process Enactment. In 26th International Conference on Advanced Information Systems Engineering (CAiSE). Thessaloniki, Greece.
  19. Jankovic, M., Bajec, M., Khodabandelou, G., Deneckere, R., Hug, C. & Salinesi, C. (2013). Intelligent Agile Method Framework. In 8th International Conference on Evaluation of Novel Approaches to Software Engineering.
  20. Laflaquière, J., Lotfi, Settouti, S., Prié, Y. & Mille, A. (2006). Trace-Based framework for experience management and engineering. In Gabrys, B, Howlett, R.J. & Jain, L.C. (Eds.), Proceedings of the 10th International Conference on Knowledge-Based Intelligent Information and Engineering Systems, 1 (1) Berlin, Heidelberg: Springer, 1171-1178.
  21. Clauzel, D., Sehaba, K., & Prié, Y. (2009). Modelling and Visualising Traces for Reflexivity in Synchronous Collaborative Systems. In Badr, Y., Caballé, S., Xhafa, F., Abraham, A., & Gros, B. (Eds.), Proceedings of the 1st International Conference on Intelligent Networking and Collaborative Systems (pp. 16-23). IEEE.
  22. Khodabandelou, G., Hug, C., Deneckere, R. & Salinesi, C. (2014). Supervised vs. Unsupervised Learning for Intentional Process Model Discovery. In Proceedings of Business Process Modeling, Development, and Support (BPMDS) (pp. 282-291).
  23. Mei, T., Hua, X.-S. & Zhou, H.-Q. (2005). Tracking users' capture intention: a novel complementary view for home video content analysis. In Proceedings of the 13th annual ACM International Conference on Multimedia (pp. 531-534). New York, NY, USA: ACM.