LucidDB

Developer(s): Eigenbase Foundation
Stable release: 0.9.4 [1] / 2012-01-05
Written in: Java, C++
Type: Database, Business intelligence, Data Warehouse
License: GPL 2
Website: luciddb.sourceforge.net

LucidDB is an open-source database purpose-built to power data warehouses, OLAP servers and business intelligence systems. According to the product website, its architecture is based on column-store, bitmap indexing, hash join/aggregation, and page-level multiversioning. [2]
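As a rough illustration of how a column store and bitmap indexes fit together, the following minimal Python sketch keeps each column in its own list and maps every distinct value to a bitmap of row positions. It is not LucidDB code: the table, the column names and the helpers `build_bitmap_index` and `rows_from_bitmap` are hypothetical, and the bitmaps are plain integers with no compression.

```python
from collections import defaultdict


def build_bitmap_index(column):
    """Map each distinct value to an int whose bit i is set when row i holds it."""
    index = defaultdict(int)
    for row_id, value in enumerate(column):
        index[value] |= 1 << row_id
    return dict(index)


def rows_from_bitmap(bitmap):
    """Return the row ids whose bits are set, in ascending order."""
    rows = []
    row_id = 0
    while bitmap:
        if bitmap & 1:
            rows.append(row_id)
        bitmap >>= 1
        row_id += 1
    return rows


# Column-store layout: one list per column instead of one record per row.
table = {
    "region": ["east", "west", "east", "north", "west"],
    "status": ["open", "open", "closed", "open", "closed"],
    "amount": [10, 20, 30, 40, 50],
}

region_index = build_bitmap_index(table["region"])
status_index = build_bitmap_index(table["status"])

# WHERE region = 'east' AND status = 'open' becomes a bitwise AND of two bitmaps.
matching = rows_from_bitmap(region_index["east"] & status_index["open"])

# Only the "amount" column is read to compute the aggregate.
total = sum(table["amount"][i] for i in matching)
```

In this sketch a conjunction of predicates reduces to one bitwise operation, and the aggregate touches a single column. This is the general appeal of combining the two techniques for analytical queries.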

Overview

LucidDB has been described as a "columnar Business Intelligence database". [3] [4] It handles ETL functionality through extensions to ANSI SQL, using 'wrappers' around a range of data sources (databases, text files, Web services, etc.) so that they can all be queried as though they were databases. [3] It can also be used for enterprise information integration. [3] LucidDB uses the Optiq query planning and execution framework. [5]
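The wrapper idea can be sketched in general terms with Python's standard `sqlite3` and `csv` modules: a delimited text source is exposed as a table so that it can be joined to relational data with ordinary SQL. This only illustrates the concept and does not reproduce LucidDB's wrapper mechanism or syntax. The helper `attach_csv`, the table names and the sample data are all hypothetical.

```python
import csv
import io
import sqlite3


def attach_csv(conn, table, text):
    """Expose delimited text as a SQL table (hypothetical helper)."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    columns = ", ".join(f'"{name}"' for name in header)
    marks = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', reader)


conn = sqlite3.connect(":memory:")

# A relational source: an ordinary table.
conn.execute("CREATE TABLE orders (customer_id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 9.5), (2, 20.0), (1, 0.5)],
)

# A text-file source, wrapped so it looks like a table too.
attach_csv(conn, "customers", "id,name\n1,Ada\n2,Grace\n")

# Both sources are now queried with one SQL statement.
result = conn.execute(
    """
    SELECT c.name, SUM(o.total)
    FROM customers AS c
    JOIN orders AS o ON CAST(c.id AS INTEGER) = o.customer_id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()
```

The sketch copies the text data into the database before querying it. A wrapper in the sense described above would instead present the external source through the same query interface.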

LucidDB achieves high performance by automatically identifying required indexes and creating them on the fly without the need for manual intervention. [6] It includes a bulk loader that permits merge and update operations as well as insert. [3]
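The difference between an insert-only load and a load with merge semantics can be shown with a small Python sketch, assuming rows are identified by a key column. The function `bulk_merge` and the sample rows are hypothetical and say nothing about how LucidDB's loader is implemented.

```python
def bulk_merge(target, batch, key):
    """Apply a batch to target: update rows whose key exists, insert the rest.

    target maps key values to row dicts; batch is an iterable of row dicts.
    Returns a tuple (inserted, updated).
    """
    inserted = updated = 0
    for row in batch:
        k = row[key]
        if k in target:
            target[k].update(row)  # existing key: overwrite the supplied columns
            updated += 1
        else:
            target[k] = dict(row)  # new key: insert a copy of the row
            inserted += 1
    return inserted, updated


warehouse = {
    1: {"id": 1, "name": "Ada", "city": "London"},
    2: {"id": 2, "name": "Grace", "city": "New York"},
}
batch = [
    {"id": 2, "city": "Arlington"},                   # updates an existing row
    {"id": 3, "name": "Edsger", "city": "Nuenen"},    # inserts a new row
]
counts = bulk_merge(warehouse, batch, key="id")
```

After the call, row 2 keeps its name but has a new city, and row 3 has been added. An insert-only loader would have had to reject or duplicate the first batch row.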

LucidDB server is licensed under GPL, while LucidDB client is licensed under LGPL. [7]

Current status

Based on its GitHub entry, LucidDB appears to be no longer maintained. [8] The SourceForge page has not been updated since 2010. The project's closing notice reads:

"LucidDB has had a long run as the first pure play open source column store database. However, with no commercial sponsors and no ongoing community activity it's time to OFFICIALLY shut the doors. There will be no future code, or binary releases (this repository may disappear[sic] at some point) of luciddb. All assets (wiki, issues, etc) will likely start coming down as well over the course of 2014. Appreciate all the effort by all those involved with LucidDB.

"Optiq, has given home and new life to portions of the LucidDB codebase. If you're interested in speaking SQL to NoSQL sources please checkout[sic] the Optiq project." [9]

Connectors

References

  1. "Releases - LucidDB". github.com.
  2. LucidDB web site.
  3. Casters, Matt; Bouman, Roland; van Dongen, Jos (2010). Pentaho Kettle Solutions: Building Open Source ETL Solutions with Pentaho Data Integration. John Wiley & Sons. ISBN 978-0470635179, pp. 10, 249.
  4. Bulusu, Lakshman (2012). Open Source Data Warehousing and Business Intelligence. CRC Press. ISBN 978-1439816400, p. 56.
  5. Pullokkaran, John (2013). "Introducing Cost Based Optimizer to Apache Hive" (PDF).
  6. Tas, N. C.; Raileanu, C.; Dejori, M.; Neubauer, C. (July 2010). "Bridge Sensor Mart: A flexible and scalable data storage and analysis framework for structural health monitoring". In Frangopol, Dan; Sause, Richard; Kusko, Chad (eds.). Bridge Maintenance, Safety, Management and Life-Cycle Optimization: Proceedings of the Fifth International IABMAS Conference. Philadelphia: CRC Press. p. 193. ISBN 978-1-000-00681-0.
  7. Khosrow-Pour, Mehdi (2010). Information Resources Management: Concepts, Methodologies, Tools and Applications. Engineering Science Reference. ISBN 978-1615209651, p. 632.
  8. "Luciddb". GitHub. 13 May 2020.
  9. ADO.NET provider for LucidDB.