BigQuery

Type of site: Platform as a service data warehouse
Available in: English
Owner: Google
URL: cloud.google.com/bigquery
Registration: Required
Launched: May 19, 2010
Current status: Active

BigQuery is Google's fully managed, serverless data warehouse that enables scalable analysis over petabytes of data. It is a Platform as a Service (PaaS) that supports querying using a dialect of SQL. It also has built-in machine learning capabilities. BigQuery was announced in May 2010 and made generally available in November 2011. [1]

Design

BigQuery provides external access to Google's Dremel technology, [2] [3] a scalable, interactive ad hoc query system for analysis of nested data. BigQuery requires all requests to be authenticated, supporting a number of Google-proprietary mechanisms as well as OAuth.

Features

Pricing

BigQuery pricing has two main components: the cost of processing queries and the cost of storing data. For query processing, BigQuery offers two pricing models: on-demand pricing, which charges for the volume of data processed by each query, and flat-rate pricing, which charges for reserved slots (virtual CPUs). [14]
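
As a rough illustration of how the two query-pricing models differ, the following Python sketch computes a bill under each. The function names, the size of the billing unit, and every rate used here are hypothetical placeholders chosen for the example; they are not published BigQuery prices.

```python
"""Back-of-the-envelope comparison of two query-pricing models.

All names, unit sizes, and rates are illustrative assumptions,
not actual BigQuery prices.
"""


def on_demand_cost(bytes_processed: int, bytes_per_unit: int, price_per_unit: float) -> float:
    """Cost when billing is proportional to the data a query processes."""
    if bytes_processed < 0 or bytes_per_unit <= 0:
        raise ValueError("bytes_processed must be >= 0 and bytes_per_unit must be > 0")
    return bytes_processed / bytes_per_unit * price_per_unit


def flat_rate_cost(slots: int, price_per_slot_hour: float, hours: float) -> float:
    """Cost when billing is for reserved slots, regardless of data processed."""
    if slots < 0 or hours < 0:
        raise ValueError("slots and hours must be >= 0")
    return slots * price_per_slot_hour * hours


if __name__ == "__main__":
    unit = 2**40  # assumed billing unit (one tebibyte), for illustration only

    # A query that processes five billing units at a made-up rate of 6.0 per unit.
    print(on_demand_cost(5 * unit, unit, 6.0))  # 30.0

    # 100 reserved slots for 10 hours at a made-up rate of 0.5 per slot-hour.
    print(flat_rate_cost(100, 0.5, 10))  # 500.0
```

Under the on-demand model the bill grows with the data each query processes, while under the flat-rate model it depends only on how many slots are reserved and for how long, so which one is cheaper depends on the workload.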

Partnerships & integrations

BigQuery integrates natively with a number of partner tools. [15]

Adoption

Customers of BigQuery include 20th Century Fox, American Eagle Outfitters, HSBC, CNA Insurance, Asahi Group, ATB Financial, Athena, The Home Depot, Wayfair, Carrefour, and Oscar Health, among others. [16] Gartner named Google a Leader in the 2021 Magic Quadrant™ for Cloud Database Management Systems. [17] BigQuery was also named a Leader in The 2021 Forrester Wave: Cloud Data Warehouse. [18] According to a study by Enterprise Strategy Group, BigQuery saves up to 27% in total cost of ownership over three years compared to other cloud data warehousing solutions. [19]

References

  1. Iain Thomson (November 14, 2011). "Google opens BigQuery for cloud analytics: Dangles free trial to lure doubters". The Register. Retrieved August 26, 2016.
  2. Sergey Melnik; Andrey Gubarev; Jing Jing Long; Geoffrey Romer; Shiva Shivakumar; Matt Tolton; Theo Vassilakis (2010). "Dremel: Interactive Analysis of Web-Scale Datasets". Proc. of the 36th International Conference on Very Large Data Bases (VLDB).
  3. Kazunori Sato (2012). "An Inside Look at Google BigQuery" (PDF). Retrieved August 26, 2016.
  4. "SQL Reference". Retrieved June 26, 2017.
  5. "Quota Policy". Retrieved June 26, 2017.
  6. "BigQuery Service | Apps Script | Google Developers". March 15, 2018. Retrieved April 23, 2018.
  7. "BigQuery Client Libraries". Retrieved June 26, 2017.
  8. "bigquery". [permanent dead link]
  9. "Google Clouds BigQuery Omni Now Generally Available". October 12, 2021.
  10. "Analytics Hub".
  11. "BI Engine".
  12. "With Many Updates in BigQuery". July 2, 2022.
  13. "with Many Updates in BigQuery". July 2, 2022.
  14. "BigQuery Costs". July 13, 2023.
  15. "BigQuery Section".
  16. "Customers for Data Analytics".
  17. "Whats Changed 2021 Gartner Magic Quadrant for Cloud Database Management Systems". January 13, 2022.
  18. "BigQuery named leader in forrester wave cloud data warehouse". March 30, 2021.
  19. "Economic Validation Google BigQuery vs. Cloud Based EDWS" (PDF).