Data build tool

dbt
Developer(s) dbt Labs
Initial release December 3, 2021
Stable release 1.6.5 / October 2, 2023 [1]
Written in Python
Operating system Microsoft Windows, macOS, Linux
Available in Python
Type Data analytics, data management
License Apache License 2.0
Website docs.getdbt.com

data build tool (dbt) is an open-source command line tool that helps analysts and engineers transform data in their warehouse more effectively. [2]

History

dbt started at RJMetrics in 2016 as a solution to add basic transformation capabilities to Stitch (which Talend acquired in 2018). [3] The earliest versions of dbt let analysts contribute to the data transformation process while following software engineering best practices. [4]

From the beginning, dbt was open source. [5] In 2018, the dbt Labs team (then called Fishtown Analytics) released a commercial product on top of dbt Core. [6]

Funding

In April 2020, dbt Labs announced its Series A, led by Andreessen Horowitz. [7] In November 2020, it announced its Series B, led by Andreessen Horowitz and Sequoia. [8] In June 2021, dbt Labs raised its Series C, led by Altimeter, Sequoia, and Andreessen Horowitz. [9] In February 2022, the company raised $222 million in its Series D at a $4.2 billion valuation. [10]

Overview

dbt enables analytics engineers to transform data in their warehouses by writing SQL select statements, which dbt turns into tables and views. dbt handles the transformation (T) step in extract, load, transform (ELT) processes: it does not extract or load data, but is designed to perform well at transforming data that is already inside a warehouse. Its goal is to let analysts work more like software engineers, in line with the dbt viewpoint. [11]
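
The idea of turning a select statement into a table or view can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not dbt's implementation: it uses an in-memory SQLite database as a stand-in for a warehouse, and the `materialize` helper, the `raw_orders` table, and the model names are all hypothetical.

```python
import sqlite3


def materialize(conn, name, select_sql, kind="view"):
    """Wrap a SELECT statement in DDL so its result becomes a table or view.

    Hypothetical helper for illustration only; it is not part of dbt.
    """
    if kind not in ("view", "table"):
        raise ValueError(f"unsupported materialization: {kind}")
    # Drop any earlier version so the model can be rebuilt on every run.
    conn.execute(f"DROP {kind.upper()} IF EXISTS {name}")
    conn.execute(f"CREATE {kind.upper()} {name} AS {select_sql}")


# Stand-in for data that has already been loaded into the warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_orders (id INTEGER, amount REAL, status TEXT)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [(1, 10.0, "paid"), (2, 5.0, "refunded"), (3, 7.5, "paid")],
)

# Each "model" is just a SELECT; the helper decides how it is persisted.
materialize(
    conn,
    "paid_orders",
    "SELECT id, amount FROM raw_orders WHERE status = 'paid'",
    kind="view",
)
materialize(
    conn,
    "revenue",
    "SELECT SUM(amount) AS total FROM paid_orders",
    kind="table",
)

total = conn.execute("SELECT total FROM revenue").fetchone()[0]
print(total)
```

Note that the second model selects from the first, so the order in which the models are built matters in this sketch.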

dbt uses YAML files to declare properties. A seed is a type of reference table used in dbt for static or infrequently changed data (for example, country codes or lookup tables). Seeds are CSV-based and typically stored in a seeds folder.
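
What a seed amounts to, a small CSV file that becomes a queryable reference table, can be sketched as follows. This is a rough illustration rather than how dbt loads seeds: the `load_seed` helper and the country-code data are hypothetical, SQLite stands in for the warehouse, and every column is stored as text for simplicity.

```python
import csv
import io
import sqlite3


def load_seed(conn, table, csv_text):
    """Load CSV text into a table with one TEXT column per header field.

    Hypothetical helper for illustration only; it is not part of dbt.
    Returns the number of data rows loaded.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    columns = ", ".join(f"{col} TEXT" for col in header)
    placeholders = ", ".join("?" for _ in header)
    # Rebuild the table from scratch so the CSV stays the source of truth.
    conn.execute(f"DROP TABLE IF EXISTS {table}")
    conn.execute(f"CREATE TABLE {table} ({columns})")
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", data)
    return len(data)


# Assumed contents of a small seed file such as seeds/country_codes.csv.
COUNTRY_CODES_CSV = "code,name\nDE,Germany\nFR,France\nJP,Japan\n"

conn = sqlite3.connect(":memory:")
loaded = load_seed(conn, "country_codes", COUNTRY_CODES_CSV)

# Once loaded, the seed can be queried or joined like any other table.
country = conn.execute(
    "SELECT name FROM country_codes WHERE code = 'FR'"
).fetchone()[0]
print(loaded, country)
```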

References

  1. "Release dbt-core v1.6.5 · dbt-labs/dbt-core". GitHub. Retrieved 10 Oct 2023.
  2. Atwal, Harvinder (9 December 2019). Practical DataOps: Delivering Agile Data Science at Scale. Apress. p. 223. ISBN 978-1-4842-5104-1.
  3. "Stitch is joining Talend". Stitch Data. 2018-11-07. Archived from the original on 2021-11-07. Retrieved 2021-11-07.
  4. "Goodbye RJMetrics, Hello Fishtown Analytics". dbt Blog. 2016-08-01. Archived from the original on 2021-11-07. Retrieved 2021-11-07.
  5. Cai, Kenrick. "Dbt Labs In Talks To Raise At $6 Billion Valuation, Six Months After Becoming A Unicorn". Forbes. Retrieved 2023-04-01.
  6. "Sinter Release Notes, August 2018: pull request builder, fine-grained GitHub permissions, and more". 2018-07-31. Archived from the original on 2021-11-07. Retrieved 2021-11-07.
  7. "Fishtown Analytics raises $12.9M Series A for its open-source analytics engineering tool". TechCrunch. 2020-04-22. Archived from the original on 2021-11-07. Retrieved 2021-11-07.
  8. "Fishtown Analytics raises $29.5M Series B for its data engineering platform". TechCrunch. 2020-11-11. Archived from the original on 2021-11-07. Retrieved 2021-11-07.
  9. "Of the Community, By the Community, For the Community". dbt Blog. 2021-06-30. Archived from the original on 2021-11-07. Retrieved 2021-11-07.
  10. Cai, Kenrick (24 Feb 2022). "Dbt Labs Raises At $4.2 Billion Valuation, $2 Billion Less Than First Planned". Forbes. Archived from the original on 11 May 2022. Retrieved 11 May 2022. "The Philadelphia-based data analytics startup revealed Thursday that it had settled on a $4.2 billion valuation as part of a $222 million Series D funding round"
  11. "dbt viewpoint". Archived from the original on 2021-11-07. Retrieved 2021-11-07.