| Company type | Private |
|---|---|
| Industry | Computer software |
| Founded | 2013 [1] |
| Founders | Ali Ghodsi, Andy Konwinski, Arsalan Tavakoli-Shiraji, Ion Stoica, Matei Zaharia, Patrick Wendell, Reynold Xin |
| Headquarters | San Francisco, United States |
| Key people | |
| Revenue | |
| Number of employees | 9,000 (2025) [3] |
| Website | databricks.com |
Databricks, Inc. is an American software company based in San Francisco. [4] It was founded in 2013 by the original creators of Apache Spark. [1] [5] It offers a cloud-based platform for data analytics and artificial intelligence. [6]
Databricks developed the 'data lakehouse' architecture, which combines elements of data warehouses and data lakes for managing structured and unstructured data. [7] The company develops Delta Lake, an open-source project that adds ACID transaction support to data lakes. [8]
Databricks grew out of the AMPLab project at the University of California, Berkeley, which created Apache Spark, an open-source distributed computing framework built atop Scala. [9] The company was founded by Ali Ghodsi, Andy Konwinski, Arsalan Tavakoli-Shiraji, Ion Stoica, Matei Zaharia, Patrick Wendell, and Reynold Xin. [10]
Microsoft Azure integrated Databricks as Azure Databricks in 2017. [11]
In February 2021, Databricks and Google Cloud announced integration with Google Kubernetes Engine and Google's BigQuery platform. [12] That same month, the company reported having more than 5,000 customers. [13]
Databricks announced the Data Intelligence Platform, integrating its lakehouse architecture with generative AI capabilities acquired from MosaicML. [14]
The firm was valued at $62 billion in December 2024, [15] following a $10 billion funding round. [16]
In early March 2025, Databricks announced it would invest $1 billion in San Francisco's downtown. [17]
In March 2025, Databricks entered a five-year partnership with Anthropic, incorporating Anthropic's AI products into its platform in a deal valued at $100 million. [18] [19]
In April 2025, Databricks was featured in the Forbes AI 50 list. [23]
In June 2025, Databricks entered into a four-year partnership with Alphabet to incorporate Gemini into the Databricks platform. [20] The company launched Agent Bricks, [21] a suite of tools to help organizations build AI agents, and Lakebase, [22] an OLTP database.
In September 2025, Databricks entered into a partnership with OpenAI to incorporate the company's LLMs into the Databricks platform in a deal valued at $100 million. [24]
In June 2020, Databricks bought Redash, an open-source tool for data visualization and building interactive dashboards. [25] In 2021, it bought the German no-code company 8080 Labs, whose product, bamboolib, allowed data exploration without any coding. [26] In May 2023, Databricks bought the data security group Okera, extending Databricks' data governance capabilities. [27] In June 2023, it bought the open-source generative AI startup MosaicML for $1.4 billion. [28] [29] In October 2023, Databricks bought the data replication startup Arcion for $100 million. [30] In 2024, Databricks bought the data-management startup Tabular for over $1 billion. [31]
In March 2023, in response to the popularity of OpenAI's ChatGPT, the company introduced an open-source language model, named Dolly after Dolly the sheep, that developers could use to create custom chatbots. Dolly has only 6 billion parameters. [32] Databricks claimed that Dolly had "ChatGPT-like instruction following ability", but has not released formal benchmark tests comparing it to ChatGPT. [33] [34] [35]
Databricks reported $1.6 billion in revenue for the 2023 fiscal year, an increase over the previous year. [36]
In 2025, Databricks acquired a serverless database startup, Neon, [37] for around $1 billion. [38]
In September 2013, Databricks announced it had raised $13.9 million from Andreessen Horowitz and said it aimed to offer an alternative to Google's MapReduce system. [39] [40] Microsoft invested in Databricks in 2019, participating in the company's Series E for an undisclosed amount. [41] [42] By February 2021, the company had raised $1.9 billion in funding, including a $1 billion Series G led by Franklin Templeton at a $28 billion post-money valuation. Other investors include Amazon Web Services, CapitalG (a growth equity firm under Alphabet Inc.) and Salesforce Ventures. [13] In August 2021, Databricks closed its eighth funding round, raising $1.6 billion at a valuation of $38 billion. [43] In December 2024, Databricks announced a $10 billion financing at a valuation of $62 billion. [15] In August 2025, Databricks announced a $1 billion Series K funding round, raising its valuation to over $100 billion. [44] In December 2025, Databricks completed a $4 billion Series L funding round at a valuation of $134 billion. [45]
| Series | Date | Amount (million $) | Lead investors |
|---|---|---|---|
| A | 2013 | 13.9 [39] | Andreessen Horowitz |
| B | 2014 | 33 [46] | New Enterprise Associates |
| C | 2016 | 60 [47] | New Enterprise Associates |
| D | 2017 | 140 [48] | Andreessen Horowitz |
| E | Feb. 2019 | 250 [49] | Andreessen Horowitz |
| F | Oct. 2019 | 400 [50] | Andreessen Horowitz |
| G | Jan. 2021 | 1,000 [51] | Franklin Templeton Investments |
| H | Aug. 2021 | 1,600 [52] | Morgan Stanley |
| I | Sep. 2023 | 500 [53] | Capital One Ventures, Nvidia |
| J | Dec. 2024 | 10,000 [54] | Thrive Capital |
| K | Aug. 2025 | 1,000 [44] | Thrive Capital, Insight Partners |
| L | Dec. 2025 | 4,000 [55] | Insight Partners, Fidelity, J.P. Morgan |
Databricks develops a cloud data platform referred to as a 'lakehouse', combining features of data warehouses and data lakes. [56] The platform is built on the open-source Apache Spark framework, enabling analytical queries on semi-structured data without requiring a traditional database schema. [57] In October 2022, Lakehouse received FedRAMP authorization for use with the U.S. federal government and contractors. [58]
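The idea of running analytical queries on semi-structured data without a predeclared schema can be illustrated with a minimal schema-on-read sketch. It uses only the Python standard library rather than Spark or any Databricks API, and the records, field names, and the `average_by` helper are hypothetical, chosen purely for illustration.

```python
import json
from collections import defaultdict

# Hypothetical semi-structured event records: the fields vary from line to line.
RAW_LINES = [
    '{"user": "ana", "event": "click", "ms": 120}',
    '{"user": "bo", "event": "view"}',
    '{"user": "ana", "event": "click", "ms": 80, "tags": ["promo"]}',
    '{"event": "click", "ms": 100}',
]


def average_by(lines, group_field, value_field):
    """Average value_field per group_field, skipping records that lack either."""
    totals = defaultdict(lambda: [0.0, 0])  # group -> [running sum, count]
    for line in lines:
        record = json.loads(line)  # structure is discovered at read time
        if group_field in record and value_field in record:
            bucket = totals[record[group_field]]
            bucket[0] += record[value_field]
            bucket[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


by_event = average_by(RAW_LINES, "event", "ms")
by_user = average_by(RAW_LINES, "user", "ms")
print(by_event, by_user)
```

No table definition is written in advance: each query decides which fields it needs, and records missing those fields are simply skipped.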
The company has also created Delta Lake, MLflow and Koalas, open-source projects that span data engineering, data science and machine learning. [59] [60]
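One common way to provide atomic, ordered commits on top of plain file storage is an append-only transaction log with optimistic concurrency. The sketch below illustrates that general technique in the spirit of Delta Lake's ACID support; it is not Delta Lake's code. The file naming, the `add`/`remove` action format, and the `commit` and `snapshot` helpers are assumptions made for the example, and it runs on a local temporary directory.

```python
import json
import os
import tempfile


def commit(log_dir, version, actions):
    """Try to publish `actions` as commit number `version`.

    The commit file is created with O_EXCL, so only one writer can claim a
    given version; a losing writer gets False and would re-read and retry.
    Simplification: a real system must also hide partially written files.
    """
    path = os.path.join(log_dir, f"{version:020d}.json")
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        json.dump(actions, handle)
    return True


def snapshot(log_dir):
    """Replay commits in version order to get the set of live data files."""
    live = set()
    for name in sorted(os.listdir(log_dir)):  # zero-padded names sort by version
        with open(os.path.join(log_dir, name)) as handle:
            for action in json.load(handle):
                if action["op"] == "add":
                    live.add(action["file"])
                elif action["op"] == "remove":
                    live.discard(action["file"])
    return live


log_dir = tempfile.mkdtemp()
first = commit(log_dir, 0, [{"op": "add", "file": "part-0.parquet"}])
second = commit(log_dir, 1, [{"op": "add", "file": "part-1.parquet"},
                             {"op": "remove", "file": "part-0.parquet"}])
conflict = commit(log_dir, 1, [{"op": "add", "file": "part-2.parquet"}])
print(first, second, conflict, snapshot(log_dir))
```

Readers only ever see whole commits, and two writers racing for the same version cannot both succeed, which is the property the log is meant to provide.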
In June 2020, Databricks launched Delta Engine, a fast query engine for Delta Lake, [61] compatible with Apache Spark and MLflow. [62]
In November 2020, Databricks introduced Databricks SQL (previously called SQL Analytics) for running business intelligence and analytics reporting on top of data lakes. Analysts can query data sets with standard SQL or use connectors to integrate with business intelligence tools like Holistics, Tableau, Qlik, Sigma, Looker, and ThoughtSpot. [63]
Databricks offers a platform for other workloads, including machine learning, data storage and processing, streaming analytics, and business intelligence. [64]
In early 2024, Databricks released the Mosaic set of tools for customizing, fine-tuning and building AI systems. It includes AI Vector Search for building RAG models; AI Model Serving, a service for deploying, governing, querying and monitoring models fine-tuned or pre-deployed by Databricks; and AI Pretraining, a platform for enterprises to create their own LLMs. [65]
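The retrieval step behind a vector search used for RAG can be sketched as a nearest-neighbour lookup over embedding vectors. The example below is a generic illustration in plain Python, not the Databricks product or its API: the document names, the three-dimensional hand-written "embeddings", and the `top_k` helper are all hypothetical, and a real system would obtain vectors from an embedding model and use an approximate index.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical documents with hand-written toy embeddings.
DOCS = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.1],
    "api rate limits": [0.0, 0.2, 0.9],
}


def top_k(query, docs, k=2):
    """Return the names of the k documents most similar to the query vector."""
    ranked = sorted(docs, key=lambda name: cosine(query, docs[name]), reverse=True)
    return ranked[:k]


hits = top_k([0.8, 0.2, 0.1], DOCS, k=2)
print(hits)
```

In a RAG pipeline, the text of the returned documents would then be placed in the prompt sent to a language model.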
In March 2024, Databricks released its DBRX foundation model under the Databricks Open Model License. [66] It has a mixture-of-experts architecture and is built on the MegaBlocks open-source project. [67] DBRX cost $10 million to create. According to the company, DBRX performed competitively on industry benchmarks against other open-source models available at the time. It beat other models such as Llama 2 at solving logic puzzles and answering general knowledge questions, among other tasks. While it has 136 billion parameters, it uses only 36 billion, on average, to generate outputs. [68] According to Databricks, DBRX can be used as a foundation for companies to build customized AI models using their proprietary data. [69]
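A mixture-of-experts layer uses only part of its parameters per input because a router selects a few experts for each token and the rest are skipped. The following is a minimal sketch of top-k routing in plain Python under stated assumptions: the dimensions, expert count, `TOP_K` value, and random weights are illustrative and are not DBRX's configuration or code.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rng, rows, cols):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def moe_forward(x, router, experts, top_k):
    """Route x to the top_k highest-scoring experts and mix their outputs."""
    scores = matvec(router, x)  # one routing score per expert
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    weights = softmax([scores[i] for i in chosen])  # renormalise over chosen experts
    out = [0.0] * len(x)
    for weight, index in zip(weights, chosen):
        expert_out = matvec(experts[index], x)  # only chosen experts are evaluated
        out = [o + weight * v for o, v in zip(out, expert_out)]
    return out, chosen


# Illustrative sizes only (assumptions for this sketch).
DIM, NUM_EXPERTS, TOP_K = 4, 8, 2
rng = random.Random(0)
router = random_matrix(rng, NUM_EXPERTS, DIM)
experts = [random_matrix(rng, DIM, DIM) for _ in range(NUM_EXPERTS)]

output, chosen = moe_forward([1.0, 0.5, -0.5, 2.0], router, experts, TOP_K)
active_fraction = TOP_K / NUM_EXPERTS  # share of expert weights used per input
print(chosen, active_fraction)
```

In a full model the active share is not simply `TOP_K / NUM_EXPERTS`, since attention and embedding parameters are used for every input; the sketch covers the expert layer alone.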
In addition to building the Databricks platform, the company has co-organized massive open online courses about Spark [70] and a conference for the Spark community called the Data + AI Summit, [71] formerly known as Spark Summit. [72]
At the 2025 Data + AI Summit, Databricks introduced Agent Bricks, a development platform for AI agents, [73] Lakebase, a transactional database, [74] and Databricks One, a no-code AI business intelligence platform. [75] The company also disclosed its Databricks SQL product would grow to a $1 billion revenue run rate. [76]
In December 2024, Databricks, along with Wiz and Workday, decided to make its products available on AWS through a new button called "Buy with AWS". [77]
In June 2025, Databricks announced a partnership with Google Cloud to integrate its platform with Google Cloud services. [78]