Azure Stream Analytics

Developer(s) Microsoft
Available in English
Type Complex event processing engine
Website azure.microsoft.com/en-us/services/stream-analytics/

Microsoft Azure Stream Analytics is a serverless, scalable complex event processing engine that enables users to develop and run real-time analytics on multiple streams of data from sources such as devices, sensors, web sites, social media, and other applications. [1] Users can set up alerts to detect anomalies, predict trends, trigger workflows when certain conditions are observed, and make data available to downstream applications and services for presentation, archiving, or further analysis. [2]

Query Language

Users can author real-time analytics using a declarative SQL-like language with built-in support for temporal logic. Callouts to custom code, written as JavaScript user-defined functions, extend the streaming logic written in SQL. [3] Callouts to Azure Machine Learning enable predictive scoring on streaming data.
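To make "temporal logic" more concrete, the following is a minimal sketch, in plain Python rather than the Stream Analytics query language, of one common temporal operation: counting events per device in fixed, non-overlapping (tumbling) time windows. The function name `tumbling_window_counts`, the event shape, and the sample device names are hypothetical and chosen only for illustration; the service expresses this kind of logic declaratively in its own SQL-like syntax.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def tumbling_window_counts(events, window_seconds):
    """Count events per (window start, device) over fixed, non-overlapping windows.

    `events` is an iterable of (timestamp, device_id) pairs; this shape is an
    assumption made for the sketch, not a Stream Analytics data format.
    """
    counts = defaultdict(int)
    for timestamp, device_id in events:
        epoch = int(timestamp.timestamp())
        # Align each event to the start of the window that contains it.
        window_start = epoch - (epoch % window_seconds)
        counts[(window_start, device_id)] += 1
    return dict(counts)


# Hypothetical sample events, a few seconds apart.
base = datetime(2017, 8, 22, 12, 0, 0, tzinfo=timezone.utc)
events = [
    (base + timedelta(seconds=1), "sensor-a"),
    (base + timedelta(seconds=4), "sensor-a"),
    (base + timedelta(seconds=7), "sensor-b"),
    (base + timedelta(seconds=12), "sensor-a"),
]

result = tumbling_window_counts(events, window_seconds=10)
for (window_start, device_id), count in sorted(result.items()):
    print(window_start, device_id, count)
```

With 10-second windows, the first three events fall into the same window and the fourth falls into the next one, so each window yields its own per-device counts.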

Scalability

Azure Stream Analytics is a serverless job service on Azure, so users do not need to provision or manage infrastructure, servers, virtual machines, or clusters. Users pay only for the processing consumed by running jobs. [1]

IoT applications

Azure Stream Analytics integrates with Azure IoT Hub to enable real-time analytics on data from IoT devices and applications. [3]

Real-time Dashboards

Users can build real-time dashboards with Power BI for a live command and control view. Real-time dashboards help transform live data into actionable and insightful visuals.

Data Input Sources

Stream Analytics supports three types of streaming input sources: Azure Event Hubs, Azure IoT Hub, and Azure Blob Storage. [2] Additionally, Stream Analytics supports Azure Blob Storage as a source of reference data, which augments fast-moving event streams with static data. [2]
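The idea of augmenting an event stream with static reference data can be sketched as a simple lookup join. The example below is an illustration in plain Python under stated assumptions, not the service's implementation: the function `enrich`, the field names (`device_id`, `site`, `max_temp`), and the sample records are all hypothetical.

```python
def enrich(events, reference):
    """Yield each event merged with the static reference row for its device.

    Events whose device has no reference row are dropped, which mirrors the
    behaviour of an inner join. Both data shapes are assumptions for the sketch.
    """
    for event in events:
        static = reference.get(event["device_id"])
        if static is None:
            continue  # no matching reference row: skip this event
        yield {**event, **static}


# Hypothetical static reference data, keyed by device identifier.
reference_data = {
    "sensor-a": {"site": "plant-1", "max_temp": 80.0},
    "sensor-b": {"site": "plant-2", "max_temp": 65.0},
}

# Hypothetical fast-moving event stream.
stream = [
    {"device_id": "sensor-a", "temp": 71.5},
    {"device_id": "sensor-b", "temp": 70.2},
    {"device_id": "sensor-z", "temp": 20.0},  # unknown device, dropped
]

enriched = list(enrich(stream, reference_data))

# Once enriched, each event carries its own threshold, so a per-device
# check needs no further lookup.
over_limit = [e["device_id"] for e in enriched if e["temp"] > e["max_temp"]]
print(over_limit)
```

Keeping the reference data as a keyed lookup keeps the per-event cost low, which matters when the event side of the join is unbounded and the static side is small.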

Stream Analytics supports a wide variety of output targets. Support for Power BI allows for real-time dashboarding. [3] Event Hubs and Service Bus topics and queues help trigger downstream workflows. Support for Azure Table Storage, Azure SQL Database, Azure SQL Data Warehouse, DocumentDB, and Azure Data Lake Store enables a variety of downstream analysis and archiving capabilities. [3]

References

  1. JennieHubbard. "Introduction to Stream Analytics". docs.microsoft.com. Retrieved 2017-08-22.
  2. "Microsoft Azure Stream Analytics - Simple Talk". Simple Talk. 2015-06-02. Retrieved 2017-08-22.
  3. "Stream Analytics Query Language Reference". msdn.microsoft.com. Retrieved 2017-08-22.