SQLstream

Initial release: 2009
Type: Software
Website: www.sqlstream.com

SQLstream is a distributed, SQL standards-compliant stream processing platform that also supports Java. SQLstream, Inc. is based in San Francisco, California, and was launched in 2009 by Damian Black, Edan Kabatchnik and Julian Hyde, author of the open source Mondrian relational OLAP server engine.


Leadership Position

In 2016, SQLstream announced it had licensed a subset of SQLstream Blaze, its flagship product suite, to Amazon's AWS for the Kinesis Analytics service, which provides Amazon customers with streaming real-time insights and transformations over Kinesis data streams. That same year, Forrester published its Wave report on Streaming Analytics, placing SQLstream in the Leadership Circle (its alternative to Gartner's Magic Quadrant). SQLstream also announced that Kontron, the world's second largest embedded systems supplier, had standardized on SQLstream Blaze for its IoT data acquisition, analysis, real-time action and dashboarding, and that Blaze is embedded within Kontron's IoT Gateways. SQLstream was listed for the fourth year in a row in the DBTA 100, Database Trends and Applications magazine's list of the 100 companies that matter most in data. In the same year, the company announced that Rubicon, a publicly traded company and a leader in real-time advertising, was using SQLstream to provide real-time insights into massive volumes of data, reducing latency from three hours with Hadoop to near real-time and cutting the number of servers required for such analytics from 180 to 12.

Technology

The rapid increase in the volume of available service, device and sensor data has led to new, real-time market segments which augment the traditional monitoring, business intelligence and data warehousing domains. [1] The Internet of Things promises to bring hundreds of billions of connected devices online, all streaming out data that must be processed in aggregate and in real time to power smart services that react and respond to their environment through these sensors. Stored-data analytics systems, in which newly arriving data are continually added to a data store and the stored data are then re-traversed for analysis, do not scale to the very large volumes of data emitted by the Internet of Things; they are not designed to issue a query or analysis for each of millions of records per second. Technologies such as SQLstream instead process the data incrementally and continuously, without storing them first, an approach known as stream processing.
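As a rough illustration of the difference, the sketch below contrasts a conventional one-shot query over stored data with a continuous streaming query. The table, stream and column names are hypothetical, and the SELECT STREAM / ROWTIME style only approximates streaming SQL dialects; it is not taken from SQLstream's documentation.

    -- One-shot query over stored data: scans the table, returns a single result, then exits.
    -- ("sensor_readings" is a hypothetical table.)
    SELECT COUNT(*)
    FROM sensor_readings
    WHERE reading_time > CURRENT_TIMESTAMP - INTERVAL '1' MINUTE;

    -- Continuous streaming query: never exits; emits an updated one-minute count
    -- as each new row arrives on the stream, without storing the data first.
    -- ("sensor_stream" is a hypothetical stream.)
    SELECT STREAM
        s.ROWTIME AS event_time,
        COUNT(*) OVER (ORDER BY s.ROWTIME
                       RANGE INTERVAL '1' MINUTE PRECEDING) AS readings_last_minute
    FROM sensor_stream AS s;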

SQLstream provides a relational stream processing platform, SQLstream Blaze, for analyzing large volumes of service, sensor, machine and log file data in real time. It performs real-time collection, aggregation, integration, enrichment and analytics on streaming data. Data streams are analyzed using industry standard SQL, with ANSI-standard, functionally rich SQL window functions used to analyze and aggregate streaming data over fixed or sliding time windows, which can be further partitioned by user-defined keys. Unlike a traditional RDBMS SQL query, which returns a result and exits, a streaming SQL query does not exit; it generates results continuously as new data become available. Patterns and exception events in data streams are detected, analyzed and reported 'on the fly' as the data arrive, that is, before the data are stored. Like a database or data warehouse, SQLstream allows multiple views to be created over the data, so that different applications and users can each get their own customized view of the streaming data. Partitioning allows many different analytics to be computed incrementally by a single SQL statement or window, effectively processing potentially millions of streams with one statement; partitioning by a customer id, for example, maintains a separate computation for each distinct customer. This is extremely concise and also allows efficient parallel execution.

SQLstream Blaze also allows queries and views to be changed without bringing down and recompiling existing applications. This matters for Internet of Things and other smart services that must operate 24x7 on a continuous real-time basis, where application changes must be made without taking down the service or rebuilding the application. StreamLab, part of SQLstream Blaze, takes advantage of this capability to guide users who wish to explore data streams and understand their structure while the data are still flowing, generating new SQL queries on the fly based on user direction and rule-driven analysis of data values. In this way it provides an effective platform for real-time operational intelligence, which can be viewed as real-time business intelligence over streaming operational data.

SQLstream uses dataflow technology to execute many queries over high-velocity, high-volume Big Data with a massively parallel, standards-compliant SQL engine in which queries are executed concurrently and incrementally. Unlike in a database, SQL in SQLstream becomes a language for continuous parallel processing rather than a language for data retrieval as commonly found in relational databases. SQLstream executes its queries in an optimized, multi-threaded C++ dataflow engine that operates lock-free, making it easy to create lock-free parallel processing applications that would otherwise require specialist skill sets and are often difficult to get working and error-prone.
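A minimal sketch of the partitioned sliding-window pattern described above, in generic streaming SQL: a view over a hypothetical stream of purchase events maintains a separate one-hour running total for every customer. The stream, view and column names are invented for illustration, and the exact syntax of SQLstream's dialect may differ.

    -- Hypothetical purchase stream with columns customer_id and amount.
    -- PARTITION BY customer_id keeps an independent one-hour sliding sum per customer,
    -- so a single statement effectively maintains a separate window for every customer.
    CREATE VIEW hourly_spend_per_customer AS
    SELECT STREAM
        p.ROWTIME     AS event_time,
        p.customer_id,
        SUM(p.amount) OVER (
            PARTITION BY p.customer_id
            ORDER BY p.ROWTIME
            RANGE INTERVAL '1' HOUR PRECEDING
        ) AS spend_last_hour
    FROM purchases AS p;

Downstream queries, dashboards or further views can then select from hourly_spend_per_customer, giving each application its own customized view of the same underlying stream.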

Applications of SQLstream Blaze include real-time service and sensor (Internet of Things) data management, real-time data integration, streaming log file analytics and real-time data warehousing. [2] [3] SQLstream Blaze provides an effective way of processing large volumes of data in real time, enabling a wide variety of smart services to respond to streaming data even at massive volumes.

Products

SQLstream launched its first product in January 2008. Its stream processing software, SQLstream Blaze, comprises s-Server, s-Studio, s-Dashboard, s-Visualizer and StreamLab. It also offers streaming application templates, known as StreamApps, that users can edit and extend; these include a Smart Cities StreamApp, a Telecom StreamApp and an emergency services StreamApp. Professional services for strategy, certification and training are also offered on the website.

Notes

  1. "All too much, monstrous amounts of data", The Economist.
  2. Rick van der Lans, "Reporting on SQLstream", BeyeNetwork.
  3. "Interview with SQLstream", Dashboard Insight.

Related Research Articles

IBM Db2 Family Relational model database server

Db2 is a family of data management products, including database servers, developed by IBM. They initially supported the relational model, but were extended to support object-relational features and non-relational structures like JSON and XML. The brand name was originally styled as DB/2, then DB2 until 2017 and finally changed to its present form.

In computing, online analytical processing (OLAP) is an approach to answering multi-dimensional analytical (MDA) queries swiftly. OLAP is part of the broader category of business intelligence, which also encompasses relational databases, report writing and data mining. Typical applications of OLAP include business reporting for sales, marketing, management reporting, business process management (BPM), budgeting and forecasting, financial reporting and similar areas, with new applications emerging, such as agriculture.

Event processing is a method of tracking and analyzing (processing) streams of information (data) about things that happen (events), and deriving a conclusion from them. Complex event processing, or CEP, consists of a set of concepts and techniques developed in the early 1990s for processing real-time events and extracting information from event streams as they arrive. The goal of complex event processing is to identify meaningful events in real-time situations and respond to them as quickly as possible.

Microsoft SQL Server Analysis Services, SSAS, is an online analytical processing (OLAP) and data mining tool in Microsoft SQL Server. SSAS is used as a tool by organizations to analyze and make sense of information possibly spread out across multiple databases, or in disparate tables or files. Microsoft has included a number of services in SQL Server related to business intelligence and data warehousing. These services include Integration Services, Reporting Services and Analysis Services. Analysis Services includes a group of OLAP and data mining capabilities and comes in two flavors - Multidimensional and Tabular.

Business intelligence software is a type of application software designed to retrieve, analyze, transform and report data for business intelligence. The applications generally read data that has been previously stored, often - though not necessarily - in a data warehouse or data mart.

StreamSQL is a query language that extends SQL with the ability to process real-time data streams. SQL is primarily intended for manipulating relations, which are finite bags of tuples (rows). StreamSQL adds the ability to manipulate streams, which are infinite sequences of tuples that are not all available at the same time. Because streams are infinite, operations over streams must be monotonic. Queries over streams are generally "continuous", executing for long periods of time and returning incremental results.


Operational intelligence (OI) is a category of real-time, dynamic business analytics that delivers visibility and insight into data, streaming events and business operations. OI solutions run queries against streaming data feeds and event data to deliver analytic results as operational instructions. OI gives organizations the ability to make decisions and act immediately on these analytic insights, through manual or automated actions.

SAP IQ is a column-based, petabyte scale, relational database software system used for business intelligence, data warehousing, and data marts. Produced by Sybase Inc., now an SAP company, its primary function is to analyze large amounts of data in a low-cost, highly available environment. SAP IQ is often credited with pioneering the commercialization of column-store technology.

Microsoft Office PerformancePoint Server is a business intelligence software product released in 2007 by Microsoft. The product was generally an integration of the acquisitions from ProClarity - the Planning Server and Monitoring Server - into Microsoft's SharePoint server product line. Although discontinued in 2009, the dashboard, scorecard, and analytics capabilities of PerformancePoint Server were incorporated into SharePoint 2010 and later versions.

Panorama Software is a Canadian software and consulting company specializing in business intelligence. The company was founded by Rony Ross in Israel in 1993; it relocated its headquarters to Toronto, Canada in 2003. Panorama sold its online analytical processing (OLAP) technology to Microsoft in 1996, which was built into Microsoft OLAP Services and later SQL Server Analysis Services, an integrated component of Microsoft SQL Server.

Microsoft SQL Server is a relational database management system developed by Microsoft. As a database server, it is a software product with the primary function of storing and retrieving data as requested by other software applications—which may run either on the same computer or on another computer across a network.

Panopticon Software was a multi-national data visualization software company specializing in monitoring and analysis of real-time data. The firm was headquartered in Stockholm, Sweden, with additional offices in New York City, London, Boston, San Francisco, and Chicago. It partnered with several large systems integrators and infrastructure software companies, including SAP, Thomson Reuters, Kx Systems, and One Market Data (OneTick). The company's name derives from the panopticon, an architectural concept originally intended to facilitate the surveillance of prisons, itself from the Greek 'pan' (all) and 'optic' (sight).

Truviso

Truviso was a venture-backed continuous analytics startup headquartered in Foster City, California, that developed and supported a proprietary analytics solution, built on PostgreSQL, for net-centric customers. Truviso was acquired by Cisco Systems, Inc. on May 4, 2012.

Database activity monitoring is a database security technology for monitoring and analyzing database activity that operates independently of the database management system (DBMS) and does not rely on any form of native (DBMS-resident) auditing or native logs such as trace or transaction logs. DAM is typically performed continuously and in real-time.

In computer science, in-memory processing is an emerging technology for processing of data stored in an in-memory database. Older systems have been based on disk storage and relational databases using SQL query language, but these are increasingly regarded as inadequate to meet business intelligence (BI) needs. Because stored data is accessed much more quickly when it is placed in random-access memory (RAM) or flash memory, in-memory processing allows data to be analysed in real time, enabling faster reporting and decision-making in business.

The following outline is provided as an overview of and topical guide to databases:

Power BI

Power BI is a business analytics service by Microsoft. It aims to provide interactive visualizations and business intelligence capabilities with an interface simple enough for end users to create their own reports and dashboards.

Azure Data Lake is a scalable data storage and analytics service. The service is hosted in Azure, Microsoft's public cloud.

Microsoft Azure Stream Analytics is a serverless scalable complex event processing engine by Microsoft that enables users to develop and run real-time analytics on multiple streams of data from sources such as devices, sensors, web sites, social media, and other applications. Users can set up alerts to detect anomalies, predict trends, trigger necessary workflows when certain conditions are observed, and make data available to other downstream applications and services for presentation, archiving, or further analysis.