XLDB

Last updated May 22, 2024

XLDB (eXtremely Large DataBases) was a yearly conference about databases, data management and analytics held from 2007 to 2019. Here, "extremely large" refers to data sets that are too big in volume (too much), velocity (too fast), or variety (too many places, too many formats) to be handled using conventional solutions. The conference dealt with the high end of very large databases (VLDB). It was conceived and chaired by Jacek Becla.

History

In October 2007, data experts gathered at SLAC National Accelerator Lab for the First Workshop on Extremely Large Databases. As a result, the XLDB research community was formed to meet the rapidly growing demands of the largest data systems. In addition to the original invitational workshop, an open conference, tutorials, and annual satellite events on different continents were added. The main event, held annually at Stanford University, gathered over 300 attendees. XLDB was one of the data systems events catering to both academic and industry communities. In 2009, the workshop was co-located with VLDB 2009 in France to reach out to non-US research communities.^[1] XLDB 2019 followed Stanford's Conference on Systems and Machine Learning (SysML).^[2]

Goals

The main goals of this community include:^[3]

Identify trends, commonalities and major roadblocks related to building extremely large databases
Bridge the gap between users trying to build extremely large databases and database solution providers worldwide
Facilitate development and growth of practical technologies for extremely large data stores

XLDB Community

As of 2013, the community consisted of over one thousand members including:

Scientists from laboratories who develop, use, or plan to develop or use XLDB for their research.
Commercial users of XLDB.
Providers of database products, including commercial vendors and representatives from open source database communities.
Academic database researchers.

XLDB Conferences, Workshops and Tutorials

The community met annually at Stanford University through 2019. Occasional satellite events were held in Asia and Europe.

A detailed report or videos were produced after each workshop.

Year	Place	Report	Comments
2019	Stanford		12th XLDB Conference
2018	Stanford		11th XLDB Conference
2017	Clermont-Ferrand		10th XLDB Conference
2016	Stanford		9th XLDB Conference
2015	Stanford		8th XLDB Conference
2014	Observatório Nacional, Rio de Janeiro		Satellite XLDB Workshop in South America
2014	Stony Brook University		XLDB-Healthcare Workshop
2013	Stanford		7th XLDB Conference
2013	CERN, Geneva, Switzerland		Satellite XLDB Workshop in Europe
2012	Stanford		6th XLDB Conference, Workshop & Tutorials
2012	Beijing, China		Satellite XLDB Conference in Asia
2011	SLAC		5th XLDB Conference and Workshop
2011	Edinburgh, UK	not available	Satellite XLDB Workshop in Europe
2010	SLAC		4th XLDB Conference and Workshop
2009	Lyon, France		3rd XLDB Workshop
2008	SLAC		2nd XLDB Workshop
2007	SLAC		1st XLDB Workshop

Tangible results

XLDB events led to an effort to build a new open-source science database called SciDB.^[4]

The XLDB organizers started defining a benchmark for scientific data management systems, called SS-DB.

At XLDB 2012, the XLDB organizers announced that two major databases that support arrays as first-class objects (MonetDB SciQL and SciDB) had formed a working group in conjunction with XLDB. The working group proposed a common syntax (provisionally named “ArrayQL”) for manipulating arrays, including array creation and query.

References

[1] "Building the biggest scientific databases". symmetry magazine. Retrieved 2019-04-15.
[2] "XLDB Extremely Large Databases 2019". XLDB Extremely Large Databases 2019. Retrieved 2019-04-15.
[3] Becla, Jacek (2009). "XLDB 3 Welcome". Retrieved 2009-08-29.
[4] Becla, Jacek (2008). "Report from the SciDB Workshop". Retrieved 2008-09-29. [permanent dead link]
