XLDB

XLDB (eXtremely Large DataBases) is a yearly conference about databases, data management and analytics. Here, "extremely large" refers to data sets whose volume (too much), velocity (too fast), or variety (too many places, too many formats) makes them too difficult to handle with conventional solutions. The conference deals with the high end of very large databases (VLDB). It was conceived and is chaired by Jacek Becla.

History

In October 2007, data experts gathered at SLAC National Accelerator Laboratory for the First Workshop on Extremely Large Databases. As a result, the XLDB research community was formed to meet the rapidly growing demands of the largest data systems. In addition to the original invitational workshop, an open conference, tutorials, and annual satellite events on different continents were added. The main event, held annually at Stanford University, gathers over 300 attendees. XLDB is one of the data systems events catering to both academic and industry communities. In 2009, the workshop was co-located with VLDB 2009 in France to reach out to non-US research communities. [1] XLDB 2019 followed Stanford's Conference on Systems and Machine Learning (SysML). [2]

Goals

The main goals of this community include: [3]

XLDB Community

As of 2013, the community consisted of more than one thousand members, including:

  1. Scientists from laboratories who develop, use, or plan to develop or use XLDB for their research.
  2. Commercial users of XLDB.
  3. Providers of database products, including commercial vendors and representatives from open source database communities.
  4. Academic database researchers.

XLDB Conferences, Workshops and Tutorials

The community meets annually at Stanford University, where the main event is held each spring. Those who live too far from California to attend can join occasional satellite events in Asia or Europe.

A detailed report or videos are produced after each workshop.

| Year | Place | Link | Report | Comments |
|---|---|---|---|---|
| 2019 | Stanford | | | 12th XLDB Conference |
| 2018 | Stanford | | | 11th XLDB Conference |
| 2017 | Clermont-Ferrand | | | 10th XLDB Conference |
| 2016 | Stanford | | | 9th XLDB Conference |
| 2015 | Stanford | | | 8th XLDB Conference |
| 2014 | Observatório Nacional, Rio de Janeiro | | | Satellite XLDB Workshop in South America |
| 2014 | Stony Brook University | | | XLDB-Healthcare Workshop |
| 2013 | Stanford | | | 7th XLDB Conference |
| 2013 | CERN, Geneva, Switzerland | | | Satellite XLDB Workshop in Europe |
| 2012 | Stanford | | | 6th XLDB Conference, Workshop & Tutorials |
| 2012 | Beijing, China | | | Satellite XLDB Conference in Asia |
| 2011 | SLAC | | | 5th XLDB Conference and Workshop |
| 2011 | Edinburgh, UK | | not available | Satellite XLDB Workshop in Europe |
| 2010 | SLAC | | | 4th XLDB Conference and Workshop |
| 2009 | Lyon, France | | | 3rd XLDB Workshop |
| 2008 | SLAC | | | 2nd XLDB Workshop |
| 2007 | SLAC | | | 1st XLDB Workshop |

Tangible results

XLDB events led to an effort to build SciDB, a new open-source database for science. [4]

The XLDB organizers also started defining SS-DB, a benchmark for scientific data management systems.

At XLDB 2012, the organizers announced that two major databases that support arrays as first-class objects (MonetDB SciQL and SciDB) had formed a working group in conjunction with XLDB. The working group is proposing a common syntax (provisionally named “ArrayQL”) for manipulating arrays, including array creation and query.

References

  1. "Building the biggest scientific databases". symmetry magazine. Retrieved 2019-04-15.
  2. "XLDB Extremely Large Databases 2019". XLDB Extremely Large Databases 2019. Retrieved 2019-04-15.
  3. Becla, Jacek (2009). "XLDB 3 Welcome". Retrieved 2009-08-29.
  4. Becla, Jacek (2008). "Report from the SciDB Workshop". Retrieved 2008-09-29.
