Synthetic monitoring

In software design, web design, and electronic product design, synthetic monitoring (also known as active monitoring or proactive monitoring) is a monitoring technique that uses simulated transactions or scripted recordings of transactions. Behavioral scripts (or paths) are created to simulate an action or path that a customer or end user would take on a site, application, or other software (or even hardware). Those paths are then monitored continuously at specified intervals for performance measures such as functionality, availability, and response time.

Synthetic monitoring enables a webmaster or an IT/Operations professional to identify problems and determine whether a website or application is slow or experiencing downtime before the problem affects actual end users or customers. Because this type of monitoring does not require actual user traffic (hence the name synthetic), it allows companies to test applications 24x7 and to test new applications prior to a live, customer-facing launch.[1][2]

Because synthetic monitoring is a simulation of typical user behavior or navigation through a website, it is best used to monitor commonly trafficked paths and critical business processes. Synthetic tests must be scripted in advance, so it is not feasible to measure performance for every permutation of a navigational path an end user might take; that task is better suited to passive monitoring.
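
As a rough illustration of the idea, the sketch below (Python, assuming the third-party requests library is installed; the URLs and path steps are hypothetical) walks a short scripted path of two pages and records availability and response time for each step, the way a scheduled synthetic check would.

    import time
    import requests

    # Hypothetical pages making up one scripted user path.
    SCRIPTED_PATH = [
        ("home", "https://www.example.com/"),
        ("product", "https://www.example.com/products/widget"),
    ]

    def run_synthetic_check(timeout=10):
        """Walk the scripted path once and collect per-step measurements."""
        results = []
        session = requests.Session()  # reuse cookies and connections like a real visitor
        for step, url in SCRIPTED_PATH:
            start = time.monotonic()
            try:
                response = session.get(url, timeout=timeout)
                results.append({
                    "step": step,
                    "available": response.ok,  # HTTP 2xx/3xx counts as available
                    "status": response.status_code,
                    "response_time_s": round(time.monotonic() - start, 3),
                })
            except requests.RequestException as exc:
                results.append({"step": step, "available": False, "error": str(exc)})
        return results

    if __name__ == "__main__":
        # A real deployment would run this at a fixed interval and from several locations.
        for record in run_synthetic_check():
            print(record)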

Synthetic testing is useful for measuring the uptime, availability, and response time of critical pages and transactions (including how a site performs from different geographies), but it does not monitor or capture actual end-user interactions; see Website monitoring.[3]

Synthetic monitoring reports a myriad of metrics, and it is up to the webmaster or IT/Operations professional to identify which are most important. Common metrics from synthetic website monitoring include Time to First Byte, Speed Index, Time to Interactive, and Page Complete.[4]
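
Browser-centric metrics such as Speed Index and Time to Interactive require rendering the page in a real or headless browser, but a plain HTTP client can approximate a network-level metric such as Time to First Byte. A minimal sketch, again assuming the requests library and a placeholder URL:

    import requests

    def approximate_ttfb(url, timeout=10):
        """Rough Time to First Byte: time from sending the request until the
        response headers have been received, without downloading the body."""
        response = requests.get(url, stream=True, timeout=timeout)  # stream=True defers the body
        ttfb = response.elapsed.total_seconds()
        response.close()
        return ttfb

    if __name__ == "__main__":
        print(f"Approximate TTFB: {approximate_ttfb('https://www.example.com/'):.3f} s")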

Demand for synthetic monitoring has grown rapidly alongside the underlying growth in websites and applications. IT/Operations staff need mechanisms to identify health and performance issues before their customers identify and report them, in order to avoid customer-satisfaction problems. To accomplish this they can write custom simulation scripts or use one of the growing number of commercial synthetic monitoring solutions.

By implementing synthetic monitoring, IT/Operations staff can identify application issues before they become critical and take remedial action; identifying such issues in advance can otherwise be difficult.
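
A minimal sketch of that idea in Python: compare each measurement from a scripted check against a hypothetical response-time threshold (a real value would come from an SLA or a performance baseline) and flag violations so that staff can act before end users are affected.

    # Hypothetical threshold; a real value would come from an SLA or baseline measurements.
    RESPONSE_TIME_THRESHOLD_S = 2.0

    def find_violations(results):
        """Return (step, reason) pairs for measurements that breach the limits."""
        violations = []
        for record in results:
            if not record.get("available", False):
                violations.append((record["step"], "unavailable"))
            elif record.get("response_time_s", 0.0) > RESPONSE_TIME_THRESHOLD_S:
                violations.append((record["step"], f"slow: {record['response_time_s']} s"))
        return violations

    if __name__ == "__main__":
        # Example measurements shaped like the output of a scripted check run.
        sample = [
            {"step": "home", "available": True, "response_time_s": 0.8},
            {"step": "checkout", "available": True, "response_time_s": 3.4},
        ]
        for step, reason in find_violations(sample):
            print(f"ALERT: {step} - {reason}")  # in practice this would notify on-call staff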

Related Research Articles

In software quality assurance, performance testing is, in general, a testing practice performed to determine how a system performs in terms of responsiveness and stability under a particular workload. It can also serve to investigate, measure, validate or verify other quality attributes of the system, such as scalability, reliability and resource usage.

Simulation: imitation of the operation of a real-world process or system over time

A simulation is an imitative representation of a process or system that could exist in the real world. In this broad sense, simulation can often be used interchangeably with model. Sometimes a clear distinction between the two terms is made, in which simulations require the use of models; the model represents the key characteristics or behaviors of the selected system or process, whereas the simulation represents the evolution of the model over time. Another way to distinguish between the terms is to define simulation as experimentation with the help of a model. This definition includes time-independent simulations. Often, computers are used to execute the simulation.

Load testing: process of putting demand on a system and measuring its response

Load testing is the process of putting demand on a structure or system and measuring its response.

A service-level agreement (SLA) is an agreement between a service provider and a customer. Particular aspects of the service – quality, availability, responsibilities – are agreed between the service provider and the service user. The most common component of an SLA is that the services should be provided to the customer as agreed upon in the contract. As an example, Internet service providers and telcos will commonly include service level agreements within the terms of their contracts with customers to define the level(s) of service being sold in plain language terms. In this case, the SLA will typically have a technical definition of mean time between failures (MTBF), mean time to repair or mean time to recovery (MTTR); identifying which party is responsible for reporting faults or paying fees; responsibility for various data rates; throughput; jitter; or similar measurable details.

Benchmark (computing): standardized performance evaluation

In computing, a benchmark is the act of running a computer program, a set of programs, or other operations, in order to assess the relative performance of an object, normally by running a number of standard tests and trials against it.

Web analytics is the measurement, collection, analysis, and reporting of web data to understand and optimize web usage. Web analytics is not just a process for measuring web traffic but can be used as a tool for business and market research and to assess and improve website effectiveness. Web analytics applications can also help companies measure the results of traditional print or broadcast advertising campaigns. It can be used to estimate how traffic to a website changes after launching a new advertising campaign. Web analytics provides information about the number of visitors to a website and the number of page views, or creates user behavior profiles. It helps gauge traffic and popularity trends, which is useful for market research.

Systems management refers to enterprise-wide administration of distributed systems including computer systems. Systems management is strongly influenced by network management initiatives in telecommunications. Application performance management (APM) technologies are now a subset of systems management. Maximum productivity can be achieved more efficiently through event correlation, system automation, and predictive analysis, all of which are now part of APM.

Capacity management's goal is to ensure that information technology resources are sufficient to meet upcoming business requirements cost-effectively. One common interpretation of capacity management is described in the ITIL framework. ITIL version 3 views capacity management as comprising three sub-processes: business capacity management, service capacity management, and component capacity management.

In the fields of information technology and systems management, application performance management (APM) is the monitoring and management of the performance and availability of software applications. APM strives to detect and diagnose complex application performance problems to maintain an expected level of service. APM is "the translation of IT metrics into business meaning."

Website monitoring is the process of testing and verifying that end-users can interact with a website or web application as expected. Website monitoring is often used by businesses to ensure that website uptime, performance, and functionality are as expected.

Passive monitoring is a technique used to capture traffic from a network by copying traffic, often from a span port or mirror port or via a network tap. It can be used in application performance management for performance trending and predictive analysis. Passive monitoring is also used in web performance optimization in the form of real user monitoring. E-commerce and media industries use real user monitoring to correlate site performance to conversions and engagement.

Performance engineering encompasses the techniques applied during a systems development life cycle to ensure the non-functional requirements for performance will be met. It may be alternatively referred to as systems performance engineering within systems engineering, and software performance engineering or application performance engineering within software engineering.

Real user monitoring (RUM) is a passive monitoring technology that records all user interaction with a website or client interacting with a server or cloud-based application. Monitoring actual user interaction with a website or an application is important to operators to determine if users are being served quickly and without errors and, if not, which part of a business process is failing. Software as a service (SaaS) and application service providers (ASP) use RUM to monitor and manage service quality delivered to their clients. Real user monitoring data is used to determine the actual service-level quality delivered to end-users and to detect errors or slowdowns on websites. The data may also be used to determine if changes that are propagated to sites have the intended effect or cause errors.

Social media measurement, also called social media controlling, is the management practice of evaluating successful social media communications of brands, companies, or other organizations.

Business transaction management (BTM), also known as business transaction monitoring, application transaction profiling or user defined transaction profiling, is the practice of managing information technology (IT) from a business transaction perspective. It provides a tool for tracking the flow of transactions across IT infrastructure, in addition to detection, alerting, and correction of unexpected changes in business or technical conditions. BTM provides visibility into the flow of transactions across infrastructure tiers, including a dynamic mapping of the application topology.

The following outline is provided as an overview of and topical guide to project management.

Verification and validation of computer simulation models is conducted during the development of a simulation model with the ultimate goal of producing an accurate and credible model. "Simulation models are increasingly being used to solve problems and to aid in decision-making. The developers and users of these models, the decision makers using information obtained from the results of these models, and the individuals affected by decisions based on such models are all rightly concerned with whether a model and its results are 'correct'. This concern is addressed through verification and validation of the simulation model."

Performance management work (PMW) describes all activities that are necessary to ensure that performance requirements of application systems (AS) can be met. Therefore, PMW integrates software performance engineering (SPE) and application performance management (APM) activities. SPE and APM are part of different lifecycle phases of an AS, namely systems development and IT operations. PMW supports a comprehensive coordination of all SPE and APM activities, which is essential given the increased complexity of AS architectures.

In IT operations, software performance management is the subset of tools and processes in IT operations which deals with the collection, monitoring, and analysis of performance metrics. These metrics can indicate to IT staff whether a system component is up and running (available), or that the component is behaving in an abnormal way that would impact its ability to function correctly—much like how a doctor may measure pulse, respiration, and temperature to measure how the human body is "operating". This type of monitoring originated with computer network components, but has now expanded into monitoring other components, such as servers and storage devices, as well as groups of components organized to deliver specific services (business service management).

In software engineering, more specifically in distributed computing, observability is the ability to collect data about programs' execution, modules' internal states, and the communication among components. To improve observability, software engineers use a wide range of logging and tracing techniques to gather telemetry information, and tools to analyze and use it. Observability is foundational to site reliability engineering, as it is the first step in triaging a service outage. One of the goals of observability is to minimize the amount of prior knowledge needed to debug an issue.

References

  1. "Prioritizing Gartner's APM Model". APM Digest. 15 March 2012. Archived from the original on 22 March 2012. Retrieved 28 April 2012.
  2. "Are You Monitoring Your SaaS Applications? If Not, You Should Be". APMdigest - Application Performance Management. 2017-02-14. Archived from the original on 2017-04-14. Retrieved 2017-04-13.
  3. "The Anatomy of APM - 4 Foundational Elements to a Successful Strategy". APM Digest. 4 April 2012. Archived from the original on 8 June 2012. Retrieved 18 May 2012.
  4. "Site Speed Metrics Explained - TTFB, Speed Index, Interactive, Load, Complete, and more". MachMetrics Speed Blog. 2019-02-22. Archived from the original on 2020-03-18. Retrieved 2020-01-17.