Business process discovery


Business process discovery (BPD), related to business process management and process mining, is a set of techniques that manually or automatically construct a representation of an organisation's current business processes and their major process variations. These techniques use data recorded in the organisation's existing methods of work, its documentation, and the technology systems that run its business processes.

The type of data required for process discovery is called an event log. Any record that contains a case id (a unique identifier used to group activities belonging to the same case), an activity name (a description of the activity taking place), and a timestamp qualifies as an event log and can be used to discover the underlying process model. The event log can contain additional information related to the process, such as the resources executing the activity, the type or nature of the events, or any other relevant details.

Process discovery aims to obtain a process model that describes the event log as closely as possible. The process model acts as a graphical representation of the process (Petri nets, BPMN, activity diagrams, state diagrams, etc.). The event logs used for discovery can contain noise, irregular information, and inconsistent or incorrect timestamps. Process discovery is challenging because of such noisy event logs and because an event log captures only part of the actual process hidden behind the system. Discovery algorithms must therefore rely solely on the small fraction of behaviour recorded in the event log to develop the closest possible model of the actual behaviour.
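As a minimal sketch of the event log structure described above, the following Python snippet groups events into traces by case id and counts which activity directly follows which. The toy log, the activity names, and the helper names `build_traces` and `directly_follows` are illustrative assumptions rather than part of any particular tool; a directly-follows count is only one simple starting point for discovery, not a complete discovery algorithm.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical toy event log: (case id, activity name, ISO timestamp).
EVENT_LOG = [
    ("c1", "register", "2024-01-01T09:00:00"),
    ("c1", "repair", "2024-01-01T10:00:00"),
    ("c1", "close", "2024-01-01T11:00:00"),
    ("c2", "register", "2024-01-02T09:00:00"),
    ("c2", "repair", "2024-01-02T09:30:00"),
    ("c2", "repair", "2024-01-02T10:30:00"),
    ("c2", "close", "2024-01-02T11:00:00"),
]


def build_traces(event_log):
    """Group events by case id and order each case by timestamp."""
    cases = defaultdict(list)
    for case_id, activity, timestamp in event_log:
        cases[case_id].append((datetime.fromisoformat(timestamp), activity))
    return {
        case_id: [activity for _, activity in sorted(events, key=lambda e: e[0])]
        for case_id, events in cases.items()
    }


def directly_follows(traces):
    """Count how often activity a is immediately followed by activity b."""
    counts = Counter()
    for trace in traces.values():
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


traces = build_traces(EVENT_LOG)
dfg = directly_follows(traces)
for (a, b), n in sorted(dfg.items()):
    print(f"{a} -> {b}: {n}")
```

The sketch assumes clean timestamps. With the noisy or inconsistent timestamps mentioned above, the ordering step would need additional handling.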


Process discovery techniques

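Various algorithms have been developed over the years for discovering a process model from an event log, including the α-algorithm, the heuristics miner, genetic process mining, region-based algorithms, and the inductive miner.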

Application

Business Process Discovery complements and builds upon the work in many other fields.

Resources are allocated according to process category: they are dedicated first to red processes, then to yellow processes, and finally to green processes. If resources become limited, they are withheld first from green processes, then from yellow processes. Resources are withheld from red processes only if failure to achieve outcomes or goals is acceptable.

The purpose and example

A small example may illustrate the business process discovery technology that is required today. Automated business process discovery tools capture the required data and transform it into a structured dataset for the actual diagnosis. A major challenge is grouping repetitive user actions into meaningful events. Next, these tools propose probabilistic process models; probabilistic behavior is essential for the analysis and diagnosis of the processes. The following shows an example in which a probabilistic repair process is recovered from user actions. The "as-is" process model shows exactly where the pain is in this business. Five percent faulty repairs is a bad sign, but worse, the repetitive fixes needed to complete those repairs are cumbersome.

[Figure: Business Process Discovery Example]

A deeper analysis of the "as-is" process data may reveal which faulty parts are responsible for the overall behavior in this example. It may lead to the discovery of subgroups of repairs that actually need management focus for improvement.

[Figure: Business Process Comprehend]

In this case, it would become obvious that the faulty parts are also responsible for the repetitive fixes. Similar applications have been documented, such as a healthcare insurance provider case in which the investment in business process analysis paid for itself within four months, through a precise understanding of the claims handling process and the discovery of its faulty parts.
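The idea of a probabilistic process model in this section can be sketched in Python by estimating, for each activity, the probability of each activity that follows it. This is only a hedged illustration: the traces, the activity names, and the function `transition_probabilities` are hypothetical, the numbers do not reproduce the figures above, and real tools may use richer models than this first-order estimate.

```python
from collections import Counter, defaultdict

# Hypothetical traces of a repair process (one list of activities per case).
TRACES = [
    ["register", "repair", "test", "close"],
    ["register", "repair", "test", "close"],
    ["register", "repair", "test", "repair", "test", "close"],
    ["register", "repair", "test", "close"],
]


def transition_probabilities(traces):
    """Estimate P(next activity | current activity) from observed traces."""
    counts = defaultdict(Counter)
    for trace in traces:
        for current, nxt in zip(trace, trace[1:]):
            counts[current][nxt] += 1
    return {
        current: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for current, followers in counts.items()
    }


probabilities = transition_probabilities(TRACES)
for current, followers in probabilities.items():
    for nxt, p in followers.items():
        print(f"{current} -> {nxt}: {p:.2f}")
```

In this toy data the loop from `test` back to `repair` stands for a repetitive fix. Its estimated probability indicates how often a repair has to be redone.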

History

Process models

The process discovery techniques applied to the event logs provide a graphical representation of a process. The result of a process discovery algorithm is generally a process model and statistics of the cases that are part of the event log. The representation and accuracy of the discovered model depend both on the technique used for the discovery and the type of visualization that is chosen.
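The case statistics that accompany a discovered model can be illustrated with a short Python sketch that counts process variants (distinct activity sequences) and measures the duration of each case. The event log and the function name `case_statistics` are assumptions made for illustration; which statistics a given discovery tool reports is not specified here.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical toy event log: (case id, activity name, ISO timestamp).
EVENT_LOG = [
    ("c1", "register", "2024-01-01T09:00:00"),
    ("c1", "repair", "2024-01-01T10:00:00"),
    ("c1", "close", "2024-01-01T11:00:00"),
    ("c2", "register", "2024-01-02T09:00:00"),
    ("c2", "repair", "2024-01-02T09:30:00"),
    ("c2", "repair", "2024-01-02T10:30:00"),
    ("c2", "close", "2024-01-02T11:00:00"),
    ("c3", "register", "2024-01-03T09:00:00"),
    ("c3", "repair", "2024-01-03T09:45:00"),
    ("c3", "close", "2024-01-03T10:30:00"),
]


def case_statistics(event_log):
    """Return variant frequencies and the duration of every case."""
    cases = defaultdict(list)
    for case_id, activity, timestamp in event_log:
        cases[case_id].append((datetime.fromisoformat(timestamp), activity))

    variants = Counter()
    durations = {}
    for case_id, events in cases.items():
        events.sort(key=lambda e: e[0])  # order events by timestamp
        variants[tuple(activity for _, activity in events)] += 1
        durations[case_id] = events[-1][0] - events[0][0]
    return variants, durations


variants, durations = case_statistics(EVENT_LOG)
for variant, n in variants.most_common():
    print(n, " -> ".join(variant))
for case_id, duration in durations.items():
    print(case_id, duration)
```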

See also



Further reading