WebScaleSQL

WebScaleSQL
Developer(s) Facebook, Google, LinkedIn, Twitter and Alibaba Group
Repository
Written in C, C++, Perl and Bash
Operating system Linux
Platform x86-64
Available in English
Type RDBMS
License GNU GPL version 2
Website webscalesql.org

WebScaleSQL was an open-source relational database management system (RDBMS) created as a software branch of the production-ready community releases of MySQL. By combining the efforts of several companies and incorporating various changes and new features into MySQL, WebScaleSQL aimed to meet the needs arising from the deployment of MySQL in large-scale environments, which involve large amounts of data and numerous database servers. [1] [2]

The source code of WebScaleSQL is hosted on GitHub and licensed under the terms of version 2 of the GNU General Public License. [3] [4]

The project website announced in December 2016 that the companies involved would no longer contribute to the project. [5]

Overview

Running MySQL on numerous servers with large amounts of data, at the scale of terabytes and petabytes, creates a set of difficulties that in many cases give rise to the need for customized MySQL features or for functional changes to MySQL. Several companies faced the same (or very similar) difficulties in their production environments, which resulted in multiple independent solutions to similar challenges. [3] [6] [7]

WebScaleSQL was announced on March 27, 2014 as a joint effort of Facebook, Google, LinkedIn and Twitter (with Alibaba Group joining in January 2015 [8] ), aiming to provide a centralized development structure for extending MySQL with new features specific to its large-scale deployments, such as building large replicated databases running on server farms. As a result, WebScaleSQL attempted to open a path toward deduplicating the efforts each founding company had been putting into maintaining its own branch of MySQL, and toward bringing together more developers. [1] [4] [9]

WebScaleSQL was created as a branch of MySQL's latest production-ready community release, which was version 5.6 as of March 2014. As the project aimed to closely follow new MySQL community releases, a branching path was selected instead of becoming a software fork of MySQL. The selection of MySQL production-ready community releases as WebScaleSQL's upstream, instead of one of the available MySQL forks, was the result of a consensus between the four founding companies, which concluded that the features already existing in MySQL 5.6 were suitable for large-scale deployments, while additional features of the same kind were planned for MySQL 5.7. [1] [3] [4]

Features

The initial changes and feature additions that WebScaleSQL introduced to the MySQL 5.6 codebase came from engineers employed by the four founding companies; however, the project was open to peer-reviewed community contributions. [10] As of September 15, 2014, available new features and changes included the following: [4] [9] [11] [12] [13]

As of March 28, 2014, planned new features and changes included the following: [1] [9]

Availability

WebScaleSQL is distributed in source-code-only form, with no official binaries available. As of March 27, 2014, compiling the source code and running WebScaleSQL is supported only on x86-64 Linux hosts, and requires a toolchain that supports the C99 and C++11 language standards. [4]

The source code is hosted on GitHub and available under version 2 of the GNU General Public License (GPL v2). [3] [4]

End of Contributions

In December 2016, the WebScaleSQL website announced that the companies originally collaborating on the project (Facebook, Google, LinkedIn, Twitter, and Alibaba) would no longer contribute to it. The announcement attributed the end of the collaboration to differences among the needs of the various companies.

References

  1. Steven J. Vaughan-Nichols (March 28, 2014). "WebScaleSQL: MySQL for Facebook-sized databases". ZDNet. Retrieved April 1, 2014.
  2. Klint Finley (March 27, 2014). "Google and Facebook Team Up to Modernize Old-School Databases". Wired. Retrieved April 1, 2014.
  3. Jack Clark (March 27, 2014). "Forkin' 'L! Facebook, Google and friends create WebScaleSQL from MySQL 5.6". The Register. Retrieved April 1, 2014.
  4. "Frequently Asked Questions". webscalesql.org. March 27, 2014. Retrieved April 1, 2014.
  5. "WebScaleSQL Moving Forward". December 29, 2016. Retrieved December 29, 2016.
  6. "Patches for MySQL 5 – MySQL tools released by Google". code.google.com. June 24, 2011. Retrieved April 1, 2014.
  7. "facebook/mysql-5.1". github.com. June 2013. Retrieved April 1, 2014.
  8. "Please welcome Alibaba to WebScaleSQL!". webscalesql.org. January 15, 2015. Retrieved August 15, 2015.
  9. Doug Henschen (March 27, 2014). "Facebook Debuts Web-Scale Variant of MySQL". informationweek.com. Retrieved August 15, 2015.
  10. "Is Your Change Appropriate?". webscalesql.org. March 27, 2014. Retrieved April 1, 2014.
  11. Michael Larabel (March 28, 2014). "Facebook & Others Announce WebScaleSQL". Phoronix. Retrieved April 1, 2014.
  12. Steaphan Greene (March 27, 2014). "WebScaleSQL: A collaboration to build upon the MySQL upstream". code.facebook.com. Retrieved August 16, 2015.
  13. Doug Henschen (September 15, 2014). "Facebook Announces WebScaleSQL Upgrade For MySQL". informationweek.com. Retrieved August 16, 2015.
  14. "MySQL 5.6 Reference Manual, Section 17.1.3 Replication with Global Transaction Identifiers". dev.mysql.com. Retrieved August 16, 2015.