D4Science

Nickname: D4Science
Headquarters: Istituto di Scienza e Tecnologie dell'Informazione, Pisa, Italy
Products: Virtual Research Environments, Science Gateways, cloud computing, e-infrastructure
Website: www.d4science.org

D4Science is an organisation operating a data infrastructure that offers a rich array of services through community-driven virtual research environments (VREs). [1] In particular, it supports communities of practice willing to implement open science practices. [2] The infrastructure follows the system of systems approach, in which the constituent systems (service providers) offer “resources” (namely services and, through them, data, computing and storage) that are assembled to implement the overall set of D4Science services. [3] D4Science aggregates “domain agnostic” service providers as well as community-specific ones to build a unifying space where the aggregated resources can be exploited via virtual research environments and their services.

The organisation is hosted by the Istituto di Scienza e Tecnologie dell'Informazione of the National Research Council (Italy).

At the heart of this infrastructure is an open source software system named gCube. [4]

Services

D4Science offers a rich array of services:

As of June 2023, the D4Science infrastructure served more than 20,000 registered users through 178 active VREs offered via 20 science gateways.

History

The D4Science initiative has been developed and supported by several European-funded projects.

DILIGENT (2004-2007), funded under the Sixth Framework Programme for Research and Technological Development, was the forerunner. It conceived and developed a testbed infrastructure that integrated digital library and grid computing technologies and resources to serve the needs of communities of practice involved in knowledge development. [7]

In the context of the Seventh Framework Programme for research, technological development and demonstration, the development of the D4Science initiative started with the support of D4Science (2008-2009), D4Science-II (2009-2011), ENVRI (2011-2014), EUBrazilOpenBio (2011-2013) and iMarine (2011-2014). In this period the infrastructure was established and developed to serve communities of practice from domains ranging from Earth Science to Marine Science, with worldwide scope. [8]

Under the H2020 research and innovation programme, the D4Science infrastructure had become mature enough to allow a large and very diverse set of communities of practice to benefit from it and its services and to contribute further to its development. Moreover, the services offered by the infrastructure have been developed to support open science practices. [2] The following projects contributed to D4Science development: BlueBRIDGE (2015-2018), EGI-Engage (2015-2017), ENVRIplus (2015-2019), Parthenos (2015-2019), SoBigData (2015-2019), AGINFRAplus (2017-2019), PerformFish (2017-2022), ARIADNEplus (2019-2022), EOSC-Pillar (2019-2022), DESIRA (2019-2023), RISIS2 (2019-2022), SoBigData++ (2019-2022), MOVING (2020-2024), EcoScope (2021-2025), SNAPSHOT (2020-2022), I-GENE (2021-2025), NAVIGATOR (2020-2023).

The operation and improvement of the D4Science infrastructure facilities are ongoing, while its exploitation is progressively growing. These activities are partly supported by the following Horizon Europe programme projects: BlueCloud2026 (2023-2026) and SoBigData RI PPP (2022-2025).

Supported communities and cases range from Agri-food [9] to Social Data Science, [10] Earth Science [11] and Marine Science. [12]

References

  1. Candela, L.; Castelli, D.; Pagano, P. (2023). "The D4Science Experience on Virtual Research Environments Development". Computing in Science & Engineering: 1–9. doi:10.1109/MCSE.2023.3290433. S2CID 259713679.
  2. Assante, M.; Candela, L.; Castelli, D.; Cirillo, R.; Coro, G.; Frosini, L.; Lelii, L.; Mangiacrapa, F.; Pagano, P.; Panichi, G.; Sinibaldi, F. (2019). "Enacting open science by D4Science". Future Generation Computer Systems. 101: 555–563. doi:10.1016/j.future.2019.05.063. S2CID 192644104.
  3. Assante, M.; Candela, L.; Castelli, D.; Cirillo, R.; Coro, G.; Dell'Amico, A.; Frosini, L.; Lelii, L.; Lettere, M.; Mangiacrapa, F.; Pagano, P.; Panichi, G.; Sinibaldi, F. (2019). "Virtual research environments co-creation: The D4Science experience". Concurrency and Computation: Practice and Experience. 35 (18). doi:10.1002/cpe.6925. S2CID 247426120.
  4. Assante, M.; Candela, L.; Castelli, D.; Cirillo, R.; Coro, G.; Frosini, L.; Lelii, L.; Mangiacrapa, F.; Marioli, V.; Pagano, P.; Panichi, G.; Perciante, C.; Sinibaldi, F. (2019). "The gCube system: Delivering Virtual Research Environments as-a-Service". Future Generation Computer Systems. 95: 445–453. doi:10.1016/j.future.2018.10.035. S2CID 57313947.
  5. Coro, G.; Panichi, G.; Scarponi, P.; Pagano, P. (2017). "Cloud computing in a distributed e‐infrastructure using the web processing service standard". Concurrency and Computation: Practice and Experience. 29 (18): e4219. doi:10.1002/cpe.4219. S2CID 24360342.
  6. Candela, L.; Coro, G.; Lelii, L.; Pagano, P.; Panichi, G. (2020). "Data Processing and Analytics for Data-Centric Sciences". In Zhao, Z.; Hellström, M. (eds.). Towards Interoperable Research Infrastructures for Environmental and Earth Sciences. Lecture Notes in Computer Science. Vol. 12003. pp. 176–191. doi:10.1007/978-3-030-52829-4_10. ISBN 978-3-030-52828-7. S2CID 220794538.
  7. Candela, L.; Akal, F.; Avancini, H.; Castelli, D.; Fusco, L.; Guidetti, V.; Langguth, C.; Manzi, A.; Pagano, P.; Schuldt, H.; Simi, M.; Springmann, M.; Voicu, L. (2007). "DILIGENT: integrating digital library and Grid technologies for a new Earth observation research infrastructure". International Journal on Digital Libraries. 7 (1–2): 59–80. doi:10.1007/s00799-007-0023-8. S2CID 29730933.
  8. Amaral, R.; Badia, R. M.; Blanquer, I.; Braga‐Neto, R.; Candela, L.; Castelli, D.; Flann, C.; De Giovanni, R.; Gray, W. A.; Jones, A.; Lezzi, D.; Pagano, P.; Perez‐Canhos, V.; Quevedo, F.; Rafanell, R.; Rebello, V.; Sousa‐Baena, M. S.; Torres, E. (2015). "Supporting biodiversity studies with the EUBrazilOpenBio Hybrid Data Infrastructure". Concurrency and Computation: Practice and Experience. 27 (2): 376–394. doi:10.1002/cpe.3238. hdl:10251/62543. S2CID 34424801.
  9. Assante, M.; Boizet, A.; Candela, L.; Castelli, D.; Cirillo, R.; Coro, G.; Fernández, E.; Filter, M.; Frosini, L.; Georgiev, T.; Kakaletris, G.; Katsivelis, P.; Knapen, R.; Lelii, L.; Lokers, R. M.; Mangiacrapa, F.; Manouselis, N.; Pagano, P.; Panichi, G.; Penev, L.; Sinibaldi, F. (2020). "Realizing virtual research environments for the agri‐food community: The AGINFRA PLUS experience". Concurrency and Computation: Practice and Experience (19): n.a. doi:10.1002/cpe.6087. S2CID 229459865.
  10. Grossi, V.; Giannotti, F.; Pedreschi, D.; Manghi, P.; Pagano, P.; Assante, M. (2021). "Data science: a game changer for science and innovation". International Journal of Data Science and Analytics. 11 (4): 263–278. doi:10.1007/s41060-020-00240-2.
  11. Jeffery, K.; Candela, L.; Glaves, E. (2020). "Virtual Research Environments for Environmental and Earth Sciences: Approaches and Experiences". In Zhao, Z.; Hellström, M. (eds.). Towards Interoperable Research Infrastructures for Environmental and Earth Sciences. Lecture Notes in Computer Science. Vol. 12003. pp. 272–289. doi:10.1007/978-3-030-52829-4_15. ISBN 978-3-030-52828-7. S2CID 220795026.
  12. Coro, G.; Gonzalez Vilas, L.; Magliozzi, C.; Ellenbroek, A.; Scarponi, P.; Pagano, P. (2018). "Forecasting the ongoing invasion of Lagocephalus sceleratus in the Mediterranean Sea". Ecological Modelling. 371: 37–49. doi:10.1016/j.ecolmodel.2018.01.007.