Event monitoring

In computer science, event monitoring is the process of collecting, analyzing, and signalling event occurrences to subscribers such as operating system processes, active database rules, and human operators. These event occurrences may stem from arbitrary sources in both software and hardware, such as operating systems, database management systems, application software, and processors. Event monitoring may use a time series database.

Basic concepts

Event monitoring makes use of a logical bus to transport event occurrences from sources to subscribers: event sources signal event occurrences onto the bus, and event subscribers receive them. An event bus can be distributed over a set of physical nodes such as standalone computer systems. Typical examples of event buses are found in graphical systems such as the X Window System and Microsoft Windows, as well as in development tools such as SDT.
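
The publish/subscribe structure described above can be illustrated with a minimal sketch; the class and method names below are illustrative only and do not correspond to any particular event bus implementation:

```python
# Minimal sketch of a logical event bus (illustrative names; no real API implied).
from collections import defaultdict
from typing import Any, Callable

class EventBus:
    def __init__(self):
        self._subscribers = defaultdict(list)  # event type -> list of subscriber callbacks

    def subscribe(self, event_type: str, callback: Callable[[Any], None]) -> None:
        # An event subscriber registers interest in an event type.
        self._subscribers[event_type].append(callback)

    def signal(self, event_type: str, occurrence: Any) -> None:
        # An event source signals an occurrence to all subscribers of that type.
        for callback in self._subscribers[event_type]:
            callback(occurrence)

bus = EventBus()
bus.subscribe("disk_full", lambda occ: print("operator notified:", occ))
bus.signal("disk_full", {"host": "node-1", "usage": 0.97})
```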

Event collection is the process of collecting event occurrences in a filtered event log for analysis. A filtered event log contains only those logged event occurrences that may be of meaningful use in the future; this implies that event occurrences can be removed from the filtered event log once they are no longer useful. Event log analysis is the process of analyzing the filtered event log to aggregate event occurrences or to decide whether an event occurrence should be signalled. Event signalling is the process of signalling event occurrences over the event bus.
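
These activities can be sketched as follows; the predicates supplied by the caller are hypothetical, and a real monitor would apply far more elaborate criteria:

```python
# Illustrative sketch of event collection, pruning, and log analysis.
filtered_event_log = []

def collect(occurrence, is_meaningful):
    # Event collection: only occurrences that may be useful later are logged.
    if is_meaningful(occurrence):
        filtered_event_log.append(occurrence)

def prune(is_still_useful):
    # Occurrences can be removed once they are no longer useful.
    filtered_event_log[:] = [o for o in filtered_event_log if is_still_useful(o)]

def analyze(should_signal, signal):
    # Event log analysis decides whether an occurrence is signalled over the bus.
    for occurrence in filtered_event_log:
        if should_signal(occurrence):
            signal(occurrence)  # event signalling

collect({"type": "login_failed", "user": "alice"},
        is_meaningful=lambda o: o["type"] != "heartbeat")
analyze(should_signal=lambda o: o["type"] == "login_failed", signal=print)
```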

Something that is monitored is denoted the monitored object; for example, an application, an operating system, a database, or a piece of hardware can be a monitored object. A monitored object must be properly conditioned with event sensors to enable event monitoring; that is, an object must be instrumented with event sensors to be a monitored object. Event sensors are sensors that signal event occurrences whenever an event occurs. Whenever something is monitored, the probe effect must be managed.
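
For example, instrumenting an object with an event sensor might look like the following sketch, in which the sensor is modelled as a plain callback; the monitored object (a bounded queue) and the event name are hypothetical:

```python
# Sketch of a monitored object instrumented with an event sensor
# (names are illustrative; real systems hook into objects differently).
class BoundedQueue:
    """The monitored object: a queue that signals when it becomes full."""
    def __init__(self, capacity, on_event):
        self._items = []
        self._capacity = capacity
        self._on_event = on_event  # the event sensor: a callback that signals occurrences

    def put(self, item):
        self._items.append(item)
        if len(self._items) == self._capacity:
            # The sensor signals an occurrence whenever the event occurs.
            self._on_event({"type": "queue_full", "size": len(self._items)})

q = BoundedQueue(capacity=2, on_event=print)
q.put("a")
q.put("b")  # prints the queue_full occurrence
```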

Monitored objects and the probe effect

As discussed by Gait,[1] when an object is monitored, its behavior is changed. This poses a particular problem in concurrent systems, in which processes can run in parallel, because whenever sensors are introduced into the system, processes may execute in a different order. This can cause a problem if, for example, we are trying to localize a fault and, by monitoring the system, change its behavior in such a way that the fault does not result in a failure; in essence, the fault can be masked by monitoring the system. The probe effect is the difference in behavior between a monitored object and its un-instrumented counterpart.

According to Schütz,[2] the probe effect can be avoided, compensated for, or ignored. In critical real-time systems, in which timeliness (i.e., the ability of a system to meet time constraints such as deadlines) is significant, avoidance is the only option: if, for example, a system is instrumented for testing and the instrumentation is removed before delivery, the results of most testing based on the complete system are invalidated. In less critical real-time systems (e.g., media-based systems), compensation for the probe effect can be acceptable, for example in performance testing. In non-concurrent systems, ignoring the probe effect is acceptable, since the behavior with respect to the order of execution is left unchanged.

Event log analysis

Event log analysis is known as event composition in active databases, chronicle recognition in artificial intelligence, and real-time logic evaluation in real-time systems. Essentially, event log analysis is used for pattern matching, filtering of event occurrences, and aggregation of event occurrences into composite event occurrences. Commonly, dynamic programming strategies are employed to save the results of previous analyses for future use, since, for example, the same pattern may be matched against the same event occurrences in several consecutive analyses. In contrast to general rule processing (employed to assert new facts from other facts, cf. inference engine), which is usually based on backtracking techniques, event log analysis algorithms are commonly greedy; for example, when a composite event is said to have occurred, this fact is never revoked, as may be done in a backtracking-based algorithm.
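
The two properties mentioned above can be sketched for a hypothetical composite pattern "B after A within a time window": partial matches are cached between analyses, and a composite occurrence, once emitted, is never revoked:

```python
# Hedged sketch of greedy composite event detection (pattern and field names are hypothetical).
pending_a = []  # cached partial matches, reused by later analyses

def analyse(occurrence, window=10.0):
    composites = []
    if occurrence["type"] == "A":
        pending_a.append(occurrence)  # remember the partial match for future analyses
    elif occurrence["type"] == "B":
        for a in list(pending_a):
            if occurrence["time"] - a["time"] <= window:
                # Greedy: the composite stands and is never revoked.
                composites.append({"type": "A_then_B",
                                   "start": a["time"], "end": occurrence["time"]})
                pending_a.remove(a)  # consumed by the composite
    return composites

print(analyse({"type": "A", "time": 1.0}))  # -> []
print(analyse({"type": "B", "time": 5.0}))  # -> one A_then_B composite occurrence
```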

Several mechanisms have been proposed for event log analysis: finite state automata, Petri nets, procedural approaches (based on either an imperative or an object-oriented programming language), a modification of the Boyer–Moore string-search algorithm, and simple temporal networks.
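
As an example of the first mechanism, a finite state automaton recognizing the composite event "A followed by B followed by C" can be sketched as follows; the state names, event names, and encoding are illustrative only:

```python
# Minimal finite-state-automaton sketch for the composite pattern "A; B; C".
TRANSITIONS = {
    ("start", "A"): "seen_A",
    ("seen_A", "B"): "seen_AB",
    ("seen_AB", "C"): "accept",
}

def run(event_stream):
    state = "start"
    for event in event_stream:
        # Stay in the current state on non-matching events.
        state = TRANSITIONS.get((state, event), state)
        if state == "accept":
            yield "composite A;B;C occurred"
            state = "start"  # greedy: the detection stands; restart for the next match

print(list(run(["A", "x", "B", "C", "A", "B", "C"])))  # two composite occurrences
```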

References

  1. Gait, J. (1985). "A debugger for concurrent programs". Software: Practice and Experience, 15(6).
  2. Schütz, W. (1994). "Fundamental issues in testing distributed real-time systems". Real-Time Systems, 7(2):129–157.