Company type | Private |
---|---|
Industry | Data warehousing, SQL analytics [1] |
Founded | 2014 |
Headquarters | Mountain View, California, |
Key people |
|
Website | www |
Yellowbrick Data is a US-based database company delivering massively parallel processing (MPP) data warehouse and SQL analytics products. [2] [3] [4] The company is headquartered in Mountain View, California. [5] [6]
Yellowbrick Data was founded in 2014 by Neil Carson, Jim Dawson, and Mark Brinicombe to bring to market Yellowbrick Data Warehouse, a flash storage data warehouse product. [7] [8] [9] Yellowbrick’s first product used hardware consisting of analytic blades with both NVMe flash storage and CPUs, with the blades connected by an internal network. [10] The system includes a purpose built execution engine with a primary column store, built in compression, as well as erasure encoding for reliability. [11] The Yellowbrick Data Warehouse supports ANSI SQL and ACID reliability by using a Postgres based front-end, supporting any database driver or external connector. The all-flash architecture claims performance and predictability benefits compared to other data warehouses. [12]
In 2019, Yellowbrick announced two products – the Yellowbrick Cloud Data Warehouse, and Yellowbrick Cloud DR. [13] The Cloud Data Warehouse is a service offering, using its own hardware available to applications running in AWS, Azure, and GCP public clouds through dedicated network links. [14] This product allows the same speed and reliability advantages as the Data Warehouse, and complements the on-premises product. Cloud DR allows replication of on-premises datasets to the cloud service, or between cloud services at multiple physical locations. [15] [16] [17]
In 2022 Yellowbrick announced a fully cloud native version of Yellowbrick Data Warehouse, based on Kubernetes, available across all public clouds including AWS Marketplace, Azure and GCP. The cloud native product retains many of the same architectural principles as the hardware product, such as Massively Parallel Processing, column storage, NVMe flash storage, compatibility with PostgreSQL front-end interfaces and the SQL query language. Following the cloud native approach enables Yellowbrick to be deployed in any public cloud and delivers on cloud benefits such as elasticity and separation of storage and compute. The storage architecture in the cloud adds the use of cloud object storage, such as AWS S3, for persistent storage. In a departure from other similar services in the public cloud, Yellowbrick Data Warehouse does not operate a managed services layer, instead the service is deployed entirely in the target cloud account without requiring data or system metadata to be shared with the cloud operator or vendor.
PostgreSQL also known as Postgres, is a free and open-source relational database management system (RDBMS) emphasizing extensibility and SQL compliance. PostgreSQL features transactions with atomicity, consistency, isolation, durability (ACID) properties, automatically updatable views, materialized views, triggers, foreign keys, and stored procedures. It is supported on all major operating systems, including Windows, Linux, macOS, FreeBSD, and OpenBSD, and handles a range of workloads from single machines to data warehouses, data lakes, or web services with many concurrent users.
Db2 is a family of data management products, including database servers, developed by IBM. It initially supported the relational model, but was extended to support object–relational features and non-relational structures like JSON and XML. The brand name was originally styled as DB2 until 2017, when it changed to its present form.
Oracle Database is a proprietary multi-model database management system produced and marketed by Oracle Corporation.
NetApp, Inc. is an American data infrastructure company that provides unified data storage, integrated data services, and cloud operations (CloudOps) solutions to enterprise customers. The company is based in Cork City, Ireland. It has ranked in the Fortune 500 from 2012 to 2021. Founded in 1992 with an initial public offering in 1995, NetApp offers cloud data services for management of applications and data both online and physically.
In database computing, Oracle Real Application Clusters (RAC) — an option for the Oracle Database software produced by Oracle Corporation and introduced in 2001 with Oracle9i — provides software for clustering and high availability in Oracle database environments. Oracle Corporation includes RAC with the Enterprise Edition, provided the nodes are clustered using Oracle Clusterware.
In computing, the term data warehouse appliance (DWA) was coined by Foster Hinshaw for a computer architecture for data warehouses (DW) specifically marketed for big data analysis and discovery that is simple to use and has a high performance for the workload. A DWA includes an integrated set of servers, storage, operating systems, and databases.
IBM Netezza is a subsidiary of American technology company IBM that designs and markets high-performance data warehouse appliances and advanced analytics applications for the most demanding analytic uses including enterprise data warehousing, business intelligence, predictive analytics and business continuity planning.
Greenplum is a big data technology based on MPP architecture and the Postgres open source database technology. The technology was created by a company of the same name headquartered in San Mateo, California around 2005. Greenplum was acquired by EMC Corporation in July 2010.
Amazon Relational Database Service is a distributed relational database service by Amazon Web Services (AWS). It is a web service running "in the cloud" designed to simplify the setup, operation, and scaling of a relational database for use in applications. Administration processes like patching the database software, backing up databases and enabling point-in-time recovery are managed automatically. Scaling storage and compute resources can be performed by a single API call to the AWS control plane on-demand. AWS does not offer an SSH connection to the underlying virtual machine as part of the managed service.
IBM FlashSystem is an IBM Storage enterprise system that stores data on flash memory. Unlike storage systems that use standard solid-state drives, IBM FlashSystem products incorporate custom hardware based on technology from the 2012 IBM acquisition of Texas Memory Systems.
Amazon Redshift is a data warehouse product which forms part of the larger cloud-computing platform Amazon Web Services. It is built on top of technology from the massive parallel processing (MPP) data warehouse company ParAccel, to handle large scale data sets and database migrations. Redshift differs from Amazon's other hosted database offering, Amazon RDS, in its ability to handle analytic workloads on big data data sets stored by a column-oriented DBMS principle. Redshift allows up to 16 petabytes of data on a cluster.
Actian is an American software company headquartered in Santa Clara, California that provides analytics-related software, products, and services. The company sells database software and technology, cloud engineered systems, and data integration solutions.
Google Cloud Platform (GCP) is a suite of cloud computing services offered by Google that provides a series of modular cloud services including computing, data storage, data analytics, and machine learning, alongside a set of management tools. It runs on the same infrastructure that Google uses internally for its end-user products, such as Google Search, Gmail, and Google Docs, according to Verma et al. Registration requires a credit card or bank account details.
Amazon Aurora is a proprietary relational database offered as a service by Amazon Web Services (AWS) since October 2014. Aurora is available as part of the Amazon Relational Database Service (RDS).
Presto is a distributed query engine for big data using the SQL query language. Its architecture allows users to query data sources such as Hadoop, Cassandra, Kafka, AWS S3, Alluxio, MySQL, MongoDB and Teradata, and allows use of multiple data sources within a query. Presto is community-driven open-source software released under the Apache License.
Dell Technologies PowerFlex, is a commercial software-defined storage product from Dell Technologies that creates a server-based storage area network (SAN) from local server storage using x86 servers. It converts this direct-attached storage into shared block storage that runs over an IP-based network.
Kyvos is a business intelligence acceleration platform for cloud and big data platforms developed by an American privately held company named Kyvos Insights. The company, headquartered in Los Gatos, California, was founded by Praveen Kankariya, CEO of Impetus Technologies. The software provides OLAP-based multidimensional analysis on big data and cloud platforms and was launched officially in June 2015. In December the same year, the company was listed among the 10 Coolest Big Data Startups of 2015 by CRN Magazine.
Trino is an open-source distributed SQL query engine designed to query large data sets distributed over one or more heterogeneous data sources. Trino can query data lakes that contain open column-oriented data file formats like ORC or Parquet residing on different storage systems like HDFS, AWS S3, Google Cloud Storage, or Azure Blob Storage using the Hive and Iceberg table formats. Trino also has the ability to run federated queries that query tables in different data sources such as MySQL, PostgreSQL, Cassandra, Kafka, MongoDB and Elasticsearch. Trino is released under the Apache License.
RavenDB is an open-source document-oriented database written in C#, developed by Hibernating Rhinos Ltd. It is cross-platform, supported on Windows, Linux, and Mac OS. RavenDB stores data as JSON documents and can be deployed in distributed clusters with master-master replication.