D4Science

Nickname: D4Science
Headquarters: Istituto di Scienza e Tecnologie dell'Informazione, Pisa, Italy
Products: Virtual Research Environments, Science Gateways, cloud computing, e-infrastructure
Website: www.d4science.org

D4Science is a Data Infrastructure offering services through community-driven virtual research environments. [1] In particular, it supports communities of practice willing to implement open science practices. [2] The infrastructure follows the system of systems approach, where the constituent systems (service providers) offer “resources” (namely services and, through them, data, computing, and storage) that are assembled to implement the overall set of D4Science services. [3] D4Science aggregates “domain agnostic” service providers as well as community-specific ones to build a unifying space where the aggregated resources can be exploited via Virtual Research Environments and their services.

The infrastructure is spread across several sites; the primary one is hosted by the Istituto di Scienza e Tecnologie dell'Informazione of the National Research Council (Italy).

At the heart of this infrastructure is an open source software system named gCube. [4]

Services

D4Science offers:

Community

The D4Science Infrastructure serves more than 24,000 registered users (as of August 2024) through 177 active VREs offered via 20 science gateways. It supports a diverse range of scientific communities and fosters engagement and collaboration among researchers worldwide.

Engagement within the D4Science community is robust, with users benefiting from user-friendly application environments tailored to their specific needs. The platform allows users to securely preserve, access, and share their data from anywhere, fostering a collaborative and inclusive research environment. Additionally, groups of users can create their own virtual environments and customise them with the applications they need, further enhancing the platform's flexibility and usability.

Supported communities and cases range from Agri-food [7] to Social Data Science, [8] Earth Science [9] and Marine Science. [10] These diverse applications demonstrate the versatility and broad applicability of the D4Science Infrastructure, making it an invaluable resource for researchers across various scientific domains.

History

The D4Science development has been supported by several European-funded projects.

DILIGENT (2004–2007), funded under the Sixth Framework Programme for Research and Technological Development, was the forerunner. It conceived and developed a testbed infrastructure, built by integrating digital library and grid computing technologies and resources, to serve the needs of communities of practice involved in knowledge development. [11]

The D4Science initiative was developed in the context of the Seventh Framework Programme for research, technological development and demonstration. In this period the infrastructure was established and developed to serve communities of practice from domains ranging from Earth Science to Marine Science, with worldwide scope. [12]

In the context of the H2020 research and innovation programme, the D4Science infrastructure was mature enough to allow a large and very diverse set of communities of practice to benefit from it and its services and to contribute further to its development. Moreover, the services offered by the infrastructure have been developed to support open science practices. [2]

The operation and improvement of the D4Science infrastructure facilities are still ongoing while its exploitation is progressively growing.

References

  1. Candela, L.; Castelli, D.; Pagano, P. (2023). "The D4Science Experience on Virtual Research Environments Development". Computing in Science & Engineering. 25 (2): 12–19. Bibcode:2023CSE....25b..12C. doi:10.1109/MCSE.2023.3290433. S2CID 259713679.
  2. Assante, M.; Candela, L.; Castelli, D.; Cirillo, R.; Coro, G.; Frosini, L.; Lelii, L.; Mangiacrapa, F.; Pagano, P.; Panichi, G.; Sinibaldi, F. (2019). "Enacting open science by D4Science". Future Generation Computer Systems. 101: 555–563. doi:10.1016/j.future.2019.05.063. S2CID 192644104.
  3. Assante, M.; Candela, L.; Castelli, D.; Cirillo, R.; Coro, G.; Dell'Amico, A.; Frosini, L.; Lelii, L.; Lettere, M.; Mangiacrapa, F.; Pagano, P.; Panichi, G.; Sinibaldi, F. (2019). "Virtual research environments co-creation: The D4Science experience". Concurrency and Computation: Practice and Experience. 35 (18). doi:10.1002/cpe.6925. S2CID 247426120.
  4. Assante, M.; Candela, L.; Castelli, D.; Cirillo, R.; Coro, G.; Frosini, L.; Lelii, L.; Mangiacrapa, F.; Marioli, V.; Pagano, P.; Panichi, G.; Perciante, C.; Sinibaldi, F. (2019). "The gCube system: Delivering Virtual Research Environments as-a-Service". Future Generation Computer Systems. 95: 445–453. doi:10.1016/j.future.2018.10.035. S2CID 57313947.
  5. Coro, G.; Panichi, G.; Scarponi, P.; Pagano, P. (2017). "Cloud computing in a distributed e-infrastructure using the web processing service standard". Concurrency and Computation: Practice and Experience. 29 (18): e4219. doi:10.1002/cpe.4219. S2CID 24360342.
  6. Candela, L.; Coro, G.; Lelii, L.; Pagano, P.; Panichi, G. (2020). "Data Processing and Analytics for Data-Centric Sciences". In Zhao, Z.; Hellström, M. (eds.). Towards Interoperable Research Infrastructures for Environmental and Earth Sciences. Lecture Notes in Computer Science. Vol. 12003. pp. 176–191. doi:10.1007/978-3-030-52829-4_10. ISBN 978-3-030-52828-7. S2CID 220794538.
  7. Assante, M.; Boizet, A.; Candela, L.; Castelli, D.; Cirillo, R.; Coro, G.; Fernández, E.; Filter, M.; Frosini, L.; Georgiev, T.; Kakaletris, G.; Katsivelis, P.; Knapen, R.; Lelii, L.; Lokers, R. M.; Mangiacrapa, F.; Manouselis, N.; Pagano, P.; Panichi, G.; Penev, L.; Sinibaldi, F. (2020). "Realizing virtual research environments for the agri-food community: The AGINFRA PLUS experience". Concurrency and Computation: Practice and Experience. 33 (19): n.a. doi:10.1002/cpe.6087. S2CID 229459865.
  8. Grossi, V.; Giannotti, F.; Pedreschi, D.; Manghi, P.; Pagano, P.; Assante, M. (2021). "Data science: a game changer for science and innovation". International Journal of Data Science and Analytics. 11 (4): 263–278. doi:10.1007/s41060-020-00240-2. hdl:11384/137137.
  9. Jeffery, K.; Candela, L.; Glaves, E. (2020). "Virtual Research Environments for Environmental and Earth Sciences: Approaches and Experiences". In Zhao, Z.; Hellström, M. (eds.). Towards Interoperable Research Infrastructures for Environmental and Earth Sciences. Lecture Notes in Computer Science. Vol. 12003. pp. 272–289. doi:10.1007/978-3-030-52829-4_15. ISBN 978-3-030-52828-7. S2CID 220795026.
  10. Coro, G.; Gonzalez Vilas, L.; Magliozzi, C.; Ellenbroek, A.; Scarponi, P.; Pagano, P. (2018). "Forecasting the ongoing invasion of Lagocephalus sceleratus in the Mediterranean Sea". Ecological Modelling. 371: 37–49. Bibcode:2018EcMod.371...37C. doi:10.1016/j.ecolmodel.2018.01.007.
  11. Candela, L.; Akal, F.; Avancini, H.; Castelli, D.; Fusco, L.; Guidetti, V.; Langguth, C.; Manzi, A.; Pagano, P.; Schuldt, H.; Simi, M.; Springmann, M.; Voicu, L. (2007). "DILIGENT: integrating digital library and Grid technologies for a new Earth observation research infrastructure". International Journal on Digital Libraries. 7 (1–2): 59–80. doi:10.1007/s00799-007-0023-8. S2CID 29730933.
  12. Amaral, R.; Badia, R. M.; Blanquer, I.; Braga-Neto, R.; Candela, L.; Castelli, D.; Flann, C.; De Giovanni, R.; Gray, W. A.; Jones, A.; Lezzi, D.; Pagano, P.; Perez-Canhos, V.; Quevedo, F.; Rafanell, R.; Rebello, V.; Sousa-Baena, M. S.; Torres, E. (2015). "Supporting biodiversity studies with the EUBrazilOpenBio Hybrid Data Infrastructure". Concurrency and Computation: Practice and Experience. 27 (2): 376–394. doi:10.1002/cpe.3238. hdl:10251/62543. S2CID 34424801.