SQream DB

SQream
Developer(s) SQream Technologies Ltd.
Initial release 2014
Stable release 2021.2 [1] / 13 September 2021
Written in CUDA, C++, Haskell [2]
Operating system Linux
Platform x86-64
Type RDBMS
License proprietary
Website sqream.com

SQream is a relational database management system (RDBMS) that uses graphics processing units (GPUs) from Nvidia. SQream is designed for big data analytics using the Structured Query Language (SQL). [3]

History

SQream is the first product from SQream Technologies Ltd, a company founded in 2010 by Ami Gal and Kostya Varakin in Tel Aviv, Israel. [4]

SQream was first released in 2014 [5] after a partnership with Orange S.A. in Silicon Valley. [6] [7] The firm claimed Orange S.A. saved $6 million by using SQream in 2014. [8] [9] SQream is aimed at the budget multi-terabyte analytics market because of its modest hardware requirements and use of compression. [10]

SQream is also the basis for a product named GenomeStack, which queries many DNA sequences simultaneously. [11] [12] A venture capital investment of US$7.4 million was announced in June 2015. [13] SQream is an example of general-purpose computing on graphics processing units applied to databases, alongside OmniSci and Kinetica. [14]

The firm applied for patents covering the parallel execution of database queries on CPUs and multi-core processors, and the pre-processing and processing of query operations on vector-enabled architectures. [15] [16] [17]

In February 2018, SQream Technologies partnered with Alibaba Group's division Alibaba Cloud to deliver a GPU database service on Alibaba Cloud. [18]

In December 2021, SQream announced that it had acquired no-code data platform Panoply for an undisclosed sum, as part of the push to grow its cloud computing offering. [19]

Software and features

SQream is a column-oriented database designed to manage large, fast-growing volumes of data and to run compute-intensive queries over them. The vendor claims that it improves query performance on very large datasets compared with traditional relational database systems.
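The column-oriented layout mentioned above can be illustrated with a minimal sketch. This is a generic illustration of columnar storage, not SQream's implementation; the sample data and the helper names `to_columns` and `sum_where` are hypothetical.

```python
# Generic illustration of column-oriented storage (not SQream's actual code).
# All names and sample data here are hypothetical.

# Row-oriented layout: every record keeps all of its fields together.
rows = [
    {"customer": "a", "region": "EU", "amount": 10.0},
    {"customer": "b", "region": "US", "amount": 25.5},
    {"customer": "c", "region": "EU", "amount": 4.5},
]


def to_columns(records):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def sum_where(columns, value_col, filter_col, wanted):
    """Sum one column, filtered by another; no other column is read."""
    pairs = zip(columns[value_col], columns[filter_col])
    return sum(value for value, key in pairs if key == wanted)


columns = to_columns(rows)
eu_total = sum_where(columns, "amount", "region", "EU")
print(eu_total)
```

An analytic query of this kind touches only the `amount` and `region` columns and never reads `customer`. Storing similar values contiguously also tends to make a column easier to compress.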

SQream is designed to run on premise or in the public cloud. [20]

References

  1. "What's new in 2021.2". SQream Technologies. September 13, 2021. Retrieved September 21, 2020.
  2. Wheat, Jake (September 25, 2013). "Using Haskell at SQream Technologies". SQream Technologies. Retrieved July 9, 2018.
  3. Rosbrow-Telem, Laura (June 9, 2015). "This insanely fast big data startup uses only one server – and just got $7.4M in funding". Geektime. Retrieved March 28, 2017.
  4. Wolfson, Rachel (August 15, 2016). "Q&A with Big Data Thought Leader, Ami Gal – Data Natives Tel Aviv 2016". DataConomy. Retrieved March 28, 2017.
  5. "SQream Tech unveils new big data platform". Geektime.
  6. Prickett Morgan, Timothy (March 28, 2014). "Telco Calls on GPU-Native SQream SQL Database". Enterprise Tech. Retrieved March 28, 2017.
  7. "IBM, Orange Use GPUs for Next Generation Enterprise Big Data Analytics at GTC". Nvidia Blog. Retrieved 5 October 2014.
  8. "Getting big data done on a GPU-Based database" (PDF). GPU Technology Conference. Retrieved 5 October 2014.
  9. "SQream Technologies and Orange Silicon Valley Demo Groundbreaking Big Data Platform at GTC". PRWeb. 26 March 2014.
  10. "A Shoebox-Size Data Warehouse Powered by GPUs". Datanami. 22 April 2015.
  11. "April News From the Bio-IT World Conference and Around the Industry". bio-itworld.com.
  12. וינרב, גלי (May 18, 2015). "סטארט-אפ בשבוע: מאגר מידע". Israel Globes (in Hebrew).
  13. "SQream Raises $7.4M in Funding Round". Genomeweb (Press release). June 9, 2015. Retrieved June 22, 2015.
  14. Prickett Morgan, Timothy (September 22, 2016). "Pushing Database Scalability Up and Out with GPUs". The Next Platform. Retrieved March 28, 2017.
  15. "Patent WO 2012025915 A1 - A system and method for the parallel execution of database queries over cpus and multi core processors". Google Patents. Retrieved 5 October 2014.
  16. "Patent WO 2012025915 A8 - A system and method for the parallel execution of database queries over cpus and multi core processors". Google Patents. Retrieved 5 October 2014.
  17. "Patent WO 2014020605 A1 - A method for pre-processing and processing query operation on multiple data chunk on vector enabled architecture". Google Patents. Retrieved 5 October 2014.
  18. "SQream teams with Alibaba, doubling workforce". Retrieved 20 February 2018.
  19. "SQream acquires no-code data platform Panoply". TechCrunch. 15 December 2021. Retrieved 3 January 2022.
  20. "SQream Technologies Launches Beta of GPU Database SQream DB on AWS Cloud". Yahoo Finance. Retrieved 5 October 2017.