Azure Data Explorer

Developer(s): Microsoft
Initial release: 2018
Platform: Microsoft Azure
Type: Cloud storage
License: Proprietary
Website: docs.microsoft.com/en-us/azure/data-explorer/

Azure Data Explorer is a fully managed [1] big data analytics cloud platform [2] [3] and data-exploration service, [4] developed by Microsoft, [5] [6] that ingests structured, semi-structured (such as JSON), and unstructured (such as free-text) data. [7] [8] [9] [10] The service stores this data and answers ad hoc analytic queries on it with a latency of seconds. It is a full-text indexing and retrieval database that includes time series analysis capabilities, [6] [7] regular expression evaluation, and text parsing. [11]

It is offered as a Platform as a Service (PaaS) as part of the Microsoft Azure platform. The product was announced by Microsoft in 2018.

History

Development of the product began in 2014 as a grassroots incubation project in Microsoft's Israeli R&D center, [12] with the internal code name 'Kusto' [9] [7] (named after Jacques Cousteau, as a reference to "exploring the ocean of data"). The project's aim was to address the needs of Azure services for fast and scalable log and telemetry analytics.

In 2016 it became the backend big-data and analytics service for Application Insights Analytics. [13]

The product was announced as a Public Preview product at the Microsoft Ignite 2018 conference, [14] and was announced as generally available at the Microsoft Ignite conference of February 2019. [15]

In March 2021, "Kusto EngineV3", Azure Data Explorer's next-generation storage and query engine, became generally available. It was designed for high-performance ingestion and querying of telemetry, logs, and time series data. [16]

Features

Azure Data Explorer provides an optimized, SQL-like query language called KQL (Kusto Query Language), [18] [19] [20] along with options for visualizing its data. [7] [8] [17]

Azure Data Explorer can ingest 200 MB per second per node. [14] Data ingestion methods include pipelines and connectors to common services such as Azure Event Grid or Azure Event Hubs, [21] and programmatic ingestion using SDKs.
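
As a rough back-of-the-envelope sketch of what the quoted figure implies, the following Python snippet estimates how long a batch of data would take to ingest. It assumes, purely for illustration, that the 200 MB per second per node rate is sustained and scales linearly with the number of nodes; neither assumption comes from the cited sources, and the function name and parameters are hypothetical.

```python
# Ingestion rate quoted in the article: 200 MB per second per node.
INGEST_MB_PER_SEC_PER_NODE = 200


def estimate_ingest_seconds(data_mb: float, nodes: int) -> float:
    """Estimate ingestion time in seconds.

    Assumption (not from the source): throughput is sustained at the
    quoted rate and scales linearly with the number of nodes.
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    if data_mb < 0:
        raise ValueError("data_mb must be non-negative")
    return data_mb / (INGEST_MB_PER_SEC_PER_NODE * nodes)


if __name__ == "__main__":
    # 1 TB (taken here as 1,000,000 MB) on a hypothetical 4-node cluster.
    seconds = estimate_ingest_seconds(1_000_000, nodes=4)
    print(f"{seconds:.0f} s (~{seconds / 60:.1f} min)")
```

Real workloads may fall well short of this idealized estimate, since data format, schema, and cluster configuration all affect throughput.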

Data can be visualized using the service's native dashboards, or with tools such as Power BI [21] [22] or Grafana. [23] [24]

Design

Azure Data Explorer is a distributed database running on a cluster of compute nodes in Microsoft Azure. It is modeled on relational database management systems (RDBMS), supporting entities such as databases, tables, functions, and columns. It supports complex analytics query operators, such as calculated columns, searching and filtering on rows, group-by aggregates, and joins. [25]
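
The operator kinds listed above can be illustrated conceptually in plain Python over in-memory rows. This is only a sketch of what the operators do, not of how the engine executes them; the rows and column names are hypothetical.

```python
from collections import Counter

# Hypothetical log rows; the column names are illustrative only.
logs = [
    {"Service": "web", "Level": "Error", "DurationMs": 1200},
    {"Service": "web", "Level": "Info", "DurationMs": 300},
    {"Service": "api", "Level": "Error", "DurationMs": 4500},
    {"Service": "web", "Level": "Error", "DurationMs": 800},
]

# Filtering on rows (comparable to a KQL `where` clause).
errors = [row for row in logs if row["Level"] == "Error"]

# Calculated column (comparable to KQL `extend`): derive seconds from ms.
errors = [{**row, "DurationSec": row["DurationMs"] / 1000} for row in errors]

# Group-by aggregate (comparable to KQL `summarize count() by Service`).
errors_per_service = Counter(row["Service"] for row in errors)

print(dict(errors_per_service))
```

Against a hypothetical table named `Logs` with these columns, a roughly equivalent KQL query would be `Logs | where Level == "Error" | extend DurationSec = DurationMs / 1000.0 | summarize count() by Service`.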

The engine service exposes a relational data model: at the top level (the cluster) there is a collection of databases, and each database contains a collection of tables and stored functions. Each table defines a schema (an ordered list of typed fields).
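
The hierarchy described above can be sketched with Python dataclasses. This is a minimal illustration of the containment relationships only; the class names, field names, and sample values are hypothetical and do not correspond to any SDK.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Column:
    name: str
    type: str  # illustrative type name, e.g. "string" or "datetime"


@dataclass
class Table:
    name: str
    schema: List[Column]  # ordered list of typed fields


@dataclass
class Database:
    name: str
    tables: Dict[str, Table] = field(default_factory=dict)
    # Stored functions, kept here simply as name -> query text.
    functions: Dict[str, str] = field(default_factory=dict)


@dataclass
class Cluster:
    name: str
    databases: Dict[str, Database] = field(default_factory=dict)


# Build a small hypothetical cluster: one database, one table, one function.
db = Database("Telemetry")
db.tables["Logs"] = Table(
    "Logs",
    [
        Column("Timestamp", "datetime"),
        Column("Service", "string"),
        Column("Level", "string"),
    ],
)
db.functions["RecentErrors"] = 'Logs | where Level == "Error"'

cluster = Cluster("demo-cluster")
cluster.databases[db.name] = db

print([c.name for c in cluster.databases["Telemetry"].tables["Logs"].schema])
```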

In Azure Data Explorer, unlike in a typical relational database management system (RDBMS), there are no constraints such as key uniqueness or primary and foreign keys. [26] The necessary relationships are established at query time. [27] Working with data in Azure Data Explorer generally follows this pattern: [28] create a database, ingest data, then query the database.
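
To illustrate what establishing relationships at query time means, the following Python sketch keeps two row collections with no key constraints and relates them only when a query runs. It is a conceptual illustration, not the engine's implementation; the tables, column names, and the `inner_join` helper are hypothetical.

```python
from collections import defaultdict

# Two hypothetical tables as plain row lists. Nothing enforces key
# uniqueness: "svc-1" appears twice in `services`, which is accepted.
services = [
    {"ServiceId": "svc-1", "Owner": "team-a"},
    {"ServiceId": "svc-1", "Owner": "team-b"},  # duplicate key, no error
    {"ServiceId": "svc-2", "Owner": "team-c"},
]
events = [
    {"ServiceId": "svc-1", "Message": "timeout"},
    {"ServiceId": "svc-3", "Message": "restart"},  # no matching service
]


def inner_join(left, right, key):
    """Relate two row lists on `key` at query time using a hash index."""
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)
    # Each left row is paired with every right row sharing its key value.
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


joined = inner_join(events, services, "ServiceId")
for row in joined:
    print(row)
```

Because no foreign key is declared, the event for `svc-3` is ingested without complaint and simply drops out of the inner join, while the duplicated `svc-1` key yields two joined rows.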

References

  1. Serra, James (2019-03-14). "Azure Data Explorer". James Serra's Blog. Retrieved 2020-04-09.
  2. "Microsoft Rolls Out Azure Data Lake Storage, Azure Data Explorer". eWEEK. Retrieved 2020-04-10.
  3. Mogenis, Max. "What is Azure Data Explorer?". blog.pragmaticworks.com. Retrieved 2020-03-21.
  4. "Creating An Azure Data Explorer Cluster And Database In Azure". www.c-sharpcorner.com. Retrieved 2020-04-09.
  5. Mackie, Kurt (2019-09-20). "Updated Tools for Office 365 and Microsoft Azure Arriving Soon -- Redmondmag.com". Redmondmag. Retrieved 2020-04-10.
  6. "Microsoft Azure Data Explorer vs. Microsoft Azure SQL Database Comparison". db-engines.com. Retrieved 2020-04-09.
  7. Mahajan, Gauri (2020-02-27). "Azure Data Explorer for beginners". SQL Shack - articles about database auditing, server performance, data recovery, and more. Retrieved 2020-04-09.
  8. "What is Azure Data Explorer". docs.microsoft.com. Retrieved 2019-12-06.
  9. Mahajan, Gauri (2020-02-27). "Azure Data Explorer for beginners". SQL Shack - articles about database auditing, server performance, data recovery, and more. Retrieved 2020-03-21.
  10. Lai, Alex (2019-01-14). "What is Azure Data Explorer and Kusto Querying Language (KQL)?". Adatis. Retrieved 2020-04-10.
  11. "Azure Data Explorer - Digital Marketplace". www.digitalmarketplace.service.gov.uk. Retrieved 2020-03-21.
  12. "Microsoft R&D". www.microsoftrnd.co.il. Retrieved 2019-12-06.
  13. orspod. "Introducing Application Insights Analytics". devblogs.microsoft.com. Retrieved 2019-12-06.
  14. "Introducing Azure Data Explorer". azure.microsoft.com. Retrieved 2019-12-06.
  15. "General Availability: Azure Data Explorer | Azure updates | Microsoft Azure". azure.microsoft.com. Retrieved 2019-12-06.
  16. orspod. "Azure Data Explorer Kusto EngineV3". docs.microsoft.com. Retrieved 2021-04-13.
  17. Brust, Andrew. "Fastly, Microsoft partner on real-time analytics with Azure Data Explorer". ZDNet. Retrieved 2020-04-09.
  18. Mackie, Kurt (2019-07-18). "Microsoft Previews Prometheus Data in Azure Monitor for Containers -- Redmondmag.com". Redmondmag. Retrieved 2020-04-09.
  19. "Getting Started with the Kusto Query Language (KQL) – System.Blog.Martens.Ben". blogs.msdn.microsoft.com. Retrieved 2019-12-06.
  20. "Exploring Data in Microsoft Azure Using Kusto Query Language and Azure Data Explorer". www.pluralsight.com. Retrieved 2020-03-21.
  21. "Architecting Data and Analytics Pipelines in Azure". Gartner. Retrieved 2020-04-10.
  22. Mahajan, Gauri (2020-02-27). "Azure Data Explorer for beginners". SQL Shack - articles about database auditing, server performance, data recovery, and more. Retrieved 2020-04-10.
  23. "Grafana vs Azure Dashboards - Which One to Use & When?". CloudIQ Tech. 2018-12-10. Retrieved 2020-04-10.
  24. "Azure Data Explorer Datasource plugin for Grafana". Grafana Labs. Retrieved 2020-04-10.
  25. orspod. "Getting started with Kusto - Azure Data Explorer". docs.microsoft.com. Retrieved 2019-12-06.
  26. "GraphDB vs. Microsoft Azure Data Explorer Comparison". db-engines.com. Retrieved 2020-04-10.
  27. "Azure Data Explorer: a big data analytics cloud platform" (PDF). Archived (PDF) from the original on 2019-12-06.
  28. Serra, James (2019-03-14). "Azure Data Explorer". James Serra's Blog. Retrieved 2020-03-21.