Center for Data-Driven Discovery

The Center for Data-Driven Discovery is a multi-division research group at the California Institute of Technology that focuses on methodologies for handling and analyzing large and complex data sets, facilitating the data-to-discovery process. [1] It supports applications of data-driven computing across scientific domains, including biology, physics, and astronomy. It also functions as a catalyst for new collaborations and projects between different scientific disciplines, and between the campus and JPL, with a particular interest in the sharing and transfer of methodologies, so that solutions from one field can be reapplied in another.

The Center for Data-Driven Discovery is part of a joint initiative with the Center for Data Science and Technology at the Jet Propulsion Laboratory. [2] It became operational in the fall of 2014. [3]

Directors

Related Research Articles

Pseudoscience: Unscientific claims wrongly presented as scientific

Pseudoscience consists of statements, beliefs, or practices that claim to be both scientific and factual but are incompatible with the scientific method. Pseudoscience is often characterized by contradictory, exaggerated or unfalsifiable claims; reliance on confirmation bias rather than rigorous attempts at refutation; lack of openness to evaluation by other experts; absence of systematic practices when developing hypotheses; and continued adherence long after the pseudoscientific hypotheses have been experimentally discredited. It is not the same as junk science.

Research: Systematic study undertaken to increase knowledge

Research is "creative and systematic work undertaken to increase the stock of knowledge". It involves the collection, organization and analysis of evidence to increase understanding of a topic, characterized by a particular attentiveness to controlling sources of bias and error. A research project may be an expansion of past work in the field. To test the validity of instruments, procedures, or experiments, research may replicate elements of prior projects or the project as a whole.

National Center for Supercomputing Applications: Illinois-based applied supercomputing research organization

The National Center for Supercomputing Applications (NCSA) is a state-federal partnership to develop and deploy national-scale computer infrastructure that advances research, science and engineering based in the United States. NCSA operates as a unit of the University of Illinois Urbana-Champaign, and provides high-performance computing resources to researchers across the country. Support for NCSA comes from the National Science Foundation, the state of Illinois, the University of Illinois, business and industry partners, and other federal agencies.

National Research Council Canada National Science Library

The National Science Library (NSL), formerly known as the Canada Institute for Scientific and Technical Information or CISTI, began in 1917 as the library of the National Research Council of Canada (NRC). NRC is the Government of Canada's premier research and technology organization (RTO), working with clients and partners to provide innovation support, strategic research, scientific and technical services. The library took on the role of national science library unofficially in 1957 and became the official National Science Library in 1966.

Methodology: Study of research methods

In its most common sense, methodology is the study of research methods. However, the term can also refer to the methods themselves or to the philosophical discussion of associated background assumptions. A method is a structured procedure for bringing about a certain goal, like acquiring knowledge or verifying knowledge claims. This normally involves various steps, like choosing a sample, collecting data from this sample, and interpreting the data. The study of methods concerns a detailed description and analysis of these processes. It includes evaluative aspects by comparing different methods. This way, it is assessed what advantages and disadvantages they have and for what research goals they may be used. These descriptions and evaluations depend on philosophical background assumptions. Examples are how to conceptualize the studied phenomena and what constitutes evidence for or against them. When understood in the widest sense, methodology also includes the discussion of these more abstract issues.

E-Science or eScience is computationally intensive science that is carried out in highly distributed network environments, or science that uses immense data sets that require grid computing; the term sometimes includes technologies that enable distributed collaboration, such as the Access Grid. The term was created in 1999 by John Taylor, the Director General of the United Kingdom's Office of Science and Technology, and was used to describe a large funding initiative starting in November 2000. E-science has been more broadly interpreted since then, as "the application of computer technology to the undertaking of modern scientific investigation, including the preparation, experimentation, data collection, results dissemination, and long-term storage and accessibility of all materials generated through the scientific process. These may include data modeling and analysis, electronic/digitized laboratory notebooks, raw and fitted data sets, manuscript production and draft versions, pre-prints, and print and/or electronic publications." In 2014, the IEEE eScience Conference Series condensed the definition to "eScience promotes innovation in collaborative, computationally- or data-intensive research across all disciplines, throughout the research lifecycle" in one of the working definitions used by the organizers. E-science encompasses "what is often referred to as big data [which] has revolutionized science... [such as] the Large Hadron Collider (LHC) at CERN... [that] generates around 780 terabytes per year... highly data intensive modern fields of science...that generate large amounts of E-science data include: computational biology, bioinformatics, genomics" and the human digital footprint for the social sciences.

Grounded theory: Qualitative research methodology

Grounded theory is a systematic methodology that has been largely applied to qualitative research conducted by social scientists. The methodology involves the construction of hypotheses and theories through the collection and analysis of data. Grounded theory involves the application of inductive reasoning. The methodology contrasts with the hypothetico-deductive model used in traditional scientific research.

Science Commons

Science Commons (SC) was a Creative Commons project for designing strategies and tools for faster, more efficient web-enabled scientific research. The organization's goals were to identify unnecessary barriers to research, craft policy guidelines and legal agreements to lower those barriers, and develop technology to make research data and materials easier to find and use. Its overarching goal was to speed the translation of data into discovery and thereby increase the value of research.

Discovery science: Scientific methodology

Discovery science is a scientific methodology that aims to find new patterns and correlations, and to form hypotheses, through the analysis of large-scale experimental data. The term “discovery science” encompasses various fields of study, including basic, translational, and computational science and research. Discovery-based methodologies are commonly contrasted with traditional scientific practice, in which hypotheses are formed before experimental data is closely examined. Discovery science involves inductive reasoning, that is, using observations to make generalisations, and can be applied to a range of science-related fields, e.g., medicine, proteomics, hydrology, psychology, and psychiatry.

Open research: Research made available to the public

Open research is research that is openly accessible by others. Those who publish research in this way are often concerned with making research more transparent, more collaborative, more wide-reaching, and more efficient. Open research aims to make both research methods and the resulting data freely available, often via the internet, in order to support reproducibility and, potentially, massively distributed research collaboration. In this regard, it is related to both open source software and citizen science.

Energy Sciences Network

The Energy Sciences Network (ESnet) is a high-speed computer network serving United States Department of Energy (DOE) scientists and their collaborators worldwide. It is managed by staff at the Lawrence Berkeley National Laboratory.

Open science: Generally available scientific research

Open science is the movement to make scientific research and its dissemination accessible to all levels of society, amateur or professional. Open science is transparent and accessible knowledge that is shared and developed through collaborative networks. It encompasses practices such as publishing open research, campaigning for open access, encouraging scientists to practice open-notebook science, broader dissemination and engagement in science and generally making it easier to publish, access and communicate scientific knowledge.

The following outline is provided as a topical overview of science. The discipline of science is defined as both the systematic effort of acquiring knowledge through observation, experimentation and reasoning, and the body of knowledge thus acquired. The word "science" derives from the Latin word scientia, meaning knowledge. A practitioner of science is called a "scientist". Modern science respects objective logical reasoning, and follows a set of core procedures or rules to determine the nature and underlying natural laws of all things, with a scope encompassing the entire universe. These procedures, or rules, are known as the scientific method.

Discovery Park is a 40-acre (160,000 m²) multidisciplinary research park located on Purdue University's West Lafayette campus in the U.S. state of Indiana. Tomás Díaz de la Rubia, an energy and resources industry executive who also spent a decade as a top scientist and administrator at Lawrence Livermore National Laboratory, serves as Discovery Park's Vice President.

Data: Units of information

In common usage and statistics, data is a collection of discrete or continuous values that convey information, describing the quantity, quality, fact, statistics, other basic units of meaning, or simply sequences of symbols that may be further interpreted formally. A datum is an individual value in a collection of data. Data is usually organized into structures such as tables that provide additional context and meaning, and which may themselves be used as data in larger structures. Data may be used as variables in a computational process. Data may represent abstract ideas or concrete measurements. Data is commonly used in scientific research, economics, and in virtually every other form of human organizational activity. Examples of data sets include price indices, unemployment rates, literacy rates, and census data. In this context, data represents the raw facts and figures from which useful information can be extracted.

Istituto Italiano di Tecnologia: Italian high tech research centre

The Istituto Italiano di Tecnologia (IIT) (in English: Italian Institute of Technology) is a scientific research centre based in Genoa (Italy, EU). Its main goal is the advancement of science, in Italy and worldwide, through projects and discoveries oriented to applications and technology. Some regard IIT as the best Italian scientific research centre.

E-Science librarianship refers to a role for librarians in e-Science.

Basic research, also called pure research, fundamental research, basic science, or pure science, is a type of scientific research with the aim of improving scientific theories for better understanding and prediction of natural or other phenomena. In contrast, applied research uses scientific theories to develop technology or techniques which can be used to intervene and alter natural or other phenomena. Though often driven simply by curiosity, basic research often fuels the technological innovations of applied science. The two aims are often practiced simultaneously in coordinated research and development.

Berkeley Institute for Data Science

The Berkeley Institute for Data Science (BIDS) is a central hub of research and education within the University of California, Berkeley, designed to facilitate data-intensive science and earn grants to be disseminated within the sciences. BIDS was initially funded by grants from the Gordon and Betty Moore Foundation and the Sloan Foundation as part of a three-year grant with data science institutes at New York University and the University of Washington. The objective of the three-university initiative is to bring together domain experts from the natural and social sciences, along with methodological experts from computer science, statistics, and applied mathematics. The organization has an executive director and a faculty director, Saul Perlmutter, who won the 2011 Nobel Prize in Physics. The initiative was announced at a White House Office of Science and Technology Policy event to highlight and promote advances in data-driven scientific discovery, and is a core component of the National Science Foundation's strategic plan for building national capacity in data science.

References