Open-notebook science is the practice of making the entire primary record of a research project publicly available online as it is recorded. This involves placing the personal, or laboratory, notebook of the researcher online along with all raw and processed data, and any associated material, as this material is generated. The approach may be summed up by the slogan 'no insider information'. It is the logical extreme of transparent approaches to research and explicitly includes releasing failed, less significant, and otherwise unpublished experiments, so-called 'dark data'. [1] The practice of open notebook science, although not the norm in the academic community, has gained significant recent attention in the research [2] [3] and general [1] [4] media as part of a general trend towards more open approaches in research practice and publishing. Open notebook science can therefore be described as part of a wider open science movement that includes the advocacy and adoption of open access publication, open data, crowdsourcing data, and citizen science. It is inspired in part by the success of open-source software [5] and draws on many of its ideas.
The term "open-notebook science" [6] was first used in 2006 in a blog post by Jean-Claude Bradley, an Associate Professor of Chemistry at Drexel University at the time. Bradley described open-notebook science as follows: [7]
... there is a URL to a laboratory notebook that is freely available and indexed on common search engines. It does not necessarily have to look like a paper notebook but it is essential that all of the information available to the researchers to make their conclusions is equally available to the rest of the world
— Jean-Claude Bradley
"A team of groundbreaking scientists at SGC are now sharing their lab notebooks online". [11] [12] [13]
These are initiatives more open than traditional laboratory notebooks but lacking a key component for full Open Notebook Science. Usually either the notebook is only partially shared or shared with significant delay. [40]
A public laboratory notebook makes it convenient to cite the exact instances of experiments used to support arguments in articles. For example, in a paper on the optimization of a Ugi reaction, [52] [53] three different batches of product are used in the characterization and each spectrum references the specific experiment in which that batch was used: EXP099, [54] EXP203 [55] and EXP206. [56] This work was subsequently published in the Journal of Visualized Experiments, [57] demonstrating that the integrity of data provenance can be maintained from lab notebook to final publication in a peer-reviewed journal.
Without further qualifications, Open Notebook Science implies that the research is being reported on an ongoing basis without unreasonable delay or filter. This enables others to understand exactly how research actually happens within a field or a specific research group. Such information could be of value to collaborators, prospective students or future employers. Providing access to selective notebook pages or inserting an embargo period would be inconsistent with the meaning of the term "Open" in this context. Unless error corrections, failed experiments and ambiguous results are reported, it will not be possible for an outside observer to understand exactly how science is being done. Terms such as Pseudo [58] or Partial [40] have been used as qualifiers for the sharing of laboratory notebook information in a selective way or with a significant delay.
The arguments against adopting open notebook science fall mainly into four categories, which have differing importance in different fields of science. The primary concern, expressed particularly by biological and medical scientists, is that of 'data theft' or 'being scooped'. While the degree to which research groups steal or adapt the results of others remains a subject of debate, it is certainly the case that the fear of not being first to publish drives much behavior, particularly in some fields. This is related to the focus in these fields on the published peer-reviewed paper as the main metric of career success.
The second argument advanced against open notebook science is that it constitutes prior publication, thus making it impossible to patent and difficult to publish the results in the traditional peer-reviewed literature. With respect to patents, publication on the web is clearly classified as disclosure. Therefore, while there may be arguments over the value of patents, and approaches that get around this problem, it is clear that open notebook science is not appropriate for research for which patent protection is an expected and desired outcome. With respect to publication in the peer-reviewed literature the case is less clear-cut. Most publishers of scientific journals accept material that has previously been presented at a conference or in the form of a preprint. Those publishers that accept material that has been previously published in these forms have generally indicated informally that web publication of data, including open notebook science, falls into this category. Open notebook projects have been successfully published in high-impact-factor peer-reviewed journals [59] [60] but this has not been tested with a wide range of publishers. It is to be expected that those publishers that explicitly exclude these forms of pre-publication will not accept material previously disclosed in an open notebook.
A third argument advanced against open notebook science is that it vitiates the independence of competing research and hence may result in a lack of all-important independent verification of results. This is not the same as data-scooping, but the much more subtle possibility that data sets evolving in parallel influence each other. In traditional science, large experimental collaborations often establish firewall rules preventing communication between members of competing collaborations, to prevent not just data leakage but also influence on the methodology by which data is analyzed.
The final argument relates to the problem of the 'data deluge'. If the current volume of the peer-reviewed literature is too large for any one person to manage, then how can anyone be expected to cope with the huge quantity of non–peer-reviewed material that could potentially be available, especially when some, perhaps most, would be of poor quality? A related argument is that 'my notebook is too specific' for it to be of interest to anyone else. How to discover high-quality, relevant material is a related problem. Curating and validating data and methodological quality is a serious challenge that arguably has relevance beyond open notebook science but is particularly acute here.
The Open Notebook Science Challenge, [61] now directed towards reporting solubility measurements in non-aqueous solvents, has received sponsorship from Submeta, [62] Nature [63] and Sigma-Aldrich. [64] The first of ten winners of the contest for December 2008 was Jenny Hale. [65]
Logos can be used on notebooks to indicate the conditions of sharing. Fully open notebooks are marked as "all content" and "immediate" access. Partially open notebooks can be marked as "selected content", "delayed", or both. [66]
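The two labelling axes described above (how much content is shared, and how quickly) can be sketched as a small data structure. This is only an illustrative sketch: the class and method names (`Content`, `Timing`, `SharingConditions`, `is_fully_open`, `label`) are hypothetical and do not come from the logo scheme itself; only the four quoted labels are taken from the text.

```python
from dataclasses import dataclass
from enum import Enum


class Content(Enum):
    """How much of the notebook is shared (label text taken from the article)."""
    ALL = "all content"
    SELECTED = "selected content"


class Timing(Enum):
    """How quickly entries are shared (label text taken from the article)."""
    IMMEDIATE = "immediate"
    DELAYED = "delayed"


@dataclass(frozen=True)
class SharingConditions:
    """Hypothetical record of the sharing conditions a notebook declares."""
    content: Content
    timing: Timing

    def is_fully_open(self) -> bool:
        # Fully open means all content AND immediate access; anything else is partial.
        return self.content is Content.ALL and self.timing is Timing.IMMEDIATE

    def label(self) -> str:
        # Text that a logo for these conditions would carry.
        return f"{self.content.value}, {self.timing.value}"


if __name__ == "__main__":
    for content in Content:
        for timing in Timing:
            conditions = SharingConditions(content, timing)
            kind = "fully open" if conditions.is_fully_open() else "partially open"
            print(f"{conditions.label()}: {kind}")
```

Under this sketch, only one of the four combinations counts as fully open, which matches the point that selective sharing or an embargo period is inconsistent with the unqualified term.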
Reproducibility, also known as replicability and repeatability, is a major principle underpinning the scientific method. For the findings of a study to be reproducible means that results obtained by an experiment or an observational study or in a statistical analysis of a data set should be achieved again with a high degree of reliability when the study is replicated. There are different kinds of replication but typically replication studies involve different researchers using the same methodology. Only after one or several such successful replications should a result be recognized as scientific knowledge.
A laboratory is a facility that provides controlled conditions in which scientific or technological research, experiments, and measurement may be performed. Laboratory services are provided in a variety of settings: physicians' offices, clinics, hospitals, and regional and national referral centers.
A laboratory notebook is a primary record of research. Researchers use a lab notebook to document their hypotheses, experiments and initial analysis or interpretation of these experiments. The notebook serves as an organizational tool, a memory aid, and can also have a role in protecting any intellectual property that comes from the research.
The Advanced Photon Source (APS) at Argonne National Laboratory is a storage-ring-based high-energy X-ray light source facility. It is one of five X-ray light sources owned and funded by the U.S. Department of Energy Office of Science. The APS saw first light on March 26, 1995. It is operated as a user facility, meaning that it is open to the world’s scientific community, and more than 5,500 researchers make use of its resources each year.
An electronic lab notebook is a computer program designed to replace paper laboratory notebooks. Lab notebooks in general are used by scientists, engineers, and technicians to document research, experiments, and procedures performed in a laboratory. A lab notebook is often maintained as a legal document and may be used in a court of law as evidence. Similar to an inventor's notebook, the lab notebook is also often referred to in patent prosecution and intellectual property litigation.
The National High Magnetic Field Laboratory (MagLab) is a facility at Florida State University, the University of Florida, and Los Alamos National Laboratory in New Mexico that performs magnetic field research in physics, biology, bioengineering, chemistry, geochemistry, and biochemistry. It is the only such facility in the US, and is among twelve high magnetic field facilities worldwide. The lab is supported by the National Science Foundation and the state of Florida, and works in collaboration with private industry.
The Structural Genomics Consortium (SGC) is a public-private partnership focusing on elucidating the functions and disease relevance of all proteins encoded by the human genome, with an emphasis on those that are relatively understudied. The SGC places all its research output into the public domain without restriction, does not file for patents, and continues to promote open science. Two recent publications revisit the case for open science. Founded in 2003, and modelled after the Single Nucleotide Polymorphism Database (dbSNP) Consortium, the SGC is a charitable company whose Members comprise organizations that contribute over 5.4 million euros to the SGC over a five-year period. The Board has one representative from each Member and an independent Chair, who serves one 5-year term. The current Chair is Anke Müller-Fahrnow (Germany), and previous Chairs have been Michael Morgan (U.K.), Wayne Hendrickson (U.S.A.), Markus Gruetter (Switzerland) and Tetsuyuki Maruyama (Japan). The founding and current CEO is Aled Edwards (Canada). The founding Members of the SGC Company were the Canadian Institutes of Health Research, Genome Canada, the Ontario Research Fund, GlaxoSmithKline and Wellcome Trust. The current Members comprise Bayer Pharma AG, Bristol Myers Squibb, Boehringer Ingelheim, the Eshelman Institute for Innovation, Genentech, Genome Canada, Janssen, Merck KGaA, Pfizer, and Takeda.
Laboratory automation is a multi-disciplinary strategy to research, develop, optimize and capitalize on technologies in the laboratory that enable new and improved processes. Laboratory automation professionals are academic, commercial and government researchers, scientists and engineers who conduct research and develop new technologies to increase productivity, elevate experimental data quality, reduce lab process cycle times, or enable experimentation that otherwise would be impossible.
A wet lab, or experimental lab, is a type of laboratory where it is necessary to handle various types of chemicals and potential "wet" hazards, so the room has to be carefully designed, constructed, and controlled to avoid spillage and contamination.
Apache Taverna was an open source software tool for designing and executing workflows, initially created by the myGrid project under the name Taverna Workbench, then a project under the Apache incubator. Taverna allowed users to integrate many different software components, including WSDL SOAP or REST Web services, such as those provided by the National Center for Biotechnology Information, the European Bioinformatics Institute, the DNA Databank of Japan (DDBJ), SoapLab, BioMOBY and EMBOSS. The set of available services was not finite and users could import new service descriptions into the Taverna Workbench.
ChemSpider is a database of chemicals. ChemSpider is owned by the Royal Society of Chemistry.
MeasureNet Technology Ltd. is a private, limited liability partnership based in Cincinnati, Ohio manufacturing the MeasureNet System: a brand of network-based electronic data acquisition interface for science teaching laboratories.
The Environmental Molecular Sciences Laboratory is a Department of Energy, Office of Science facility at Pacific Northwest National Laboratory in Richland, Washington, United States.
Blue Obelisk is an informal group of chemists who promote open data, open source, and open standards; it was initiated by Peter Murray-Rust and others in 2005. Multiple open source cheminformatics projects associate themselves with the Blue Obelisk, including, in alphabetical order, Avogadro, Bioclipse, cclib, Chemistry Development Kit, GaussSum, JChemPaint, JOELib, Kalzium, Open Babel, OpenSMILES, and UsefulChem.
LabKey Server is a software suite available for scientists to integrate, analyze, and share biomedical research data. The platform provides a secure data repository that allows web-based querying, reporting, and collaborating across a range of data sources. Specific scientific applications and workflows can be added on top of the basic platform and leverage a data processing pipeline.
Antony John Williams is a British chemist and expert in the fields of both nuclear magnetic resonance (NMR) spectroscopy and cheminformatics at the United States Environmental Protection Agency. He is the founder of the ChemSpider website that was purchased by the Royal Society of Chemistry in May 2009. He is a science blogger and an author.
Jean-Claude Bradley was a chemist who actively promoted Open Science in chemistry, including at the White House, for which he was awarded the Blue Obelisk award in 2007. He coined the term "Open Notebook science". He died in May 2014. A memorial symposium was held July 14, 2014 at Cambridge University, UK.
The Chemical Probes Portal is an open, online resource whose purpose is to identify and make available high quality chemical probes for use in biological research and drug discovery. While chemical probes can be valuable tools to elucidate signal transduction pathways and to validate new drug targets, many of the probes that are in use are not selective and therefore can give very misleading results.
A cloud laboratory is a heavily automated, centralized research laboratory where scientists can run an entire experimental process from a computer in a remote location. Cloud laboratories offer the execution of life science research experiments as a service, allowing researchers to retain full control over experimental design. Users create experimental protocols through a high-level API and the experiment is executed in the cloud laboratory where users do not need to track its progression.