D4Science

Nickname D4Science
Headquarters Istituto di Scienza e Tecnologie dell'Informazione, Pisa, Italy
Products Virtual Research Environments, Science Gateways, cloud computing, e-infrastructure
Website www.d4science.org

D4Science is an organisation operating a data infrastructure that offers services through community-driven virtual research environments. [1] In particular, it supports communities of practice willing to implement open science practices. [2] The infrastructure follows the system-of-systems approach, in which the constituent systems (service providers) offer “resources” (namely services and, through them, data, computing and storage) that are assembled to implement the overall set of D4Science services. [3] D4Science aggregates “domain agnostic” service providers as well as community-specific ones to build a unifying space where the aggregated resources can be exploited via virtual research environments and their services.

The organisation is hosted by the Istituto di Scienza e Tecnologie dell'Informazione of the National Research Council (Italy).

At the heart of this infrastructure is open-source software named the gCube system. [4]

Services

D4Science offers:

Community

As of August 2024, the D4Science infrastructure serves more than 24,000 registered users through 177 active VREs offered via 20 science gateways. It supports a diverse range of scientific communities and fosters engagement and collaboration among researchers worldwide.

Engagement within the D4Science community is robust, with users benefiting from user-friendly application environments tailored to their specific needs. The platform allows users to securely preserve, access, and share their data from anywhere, fostering a collaborative and inclusive research environment. Additionally, groups of users can create their own virtual environments and customise them with the applications they need, further enhancing the platform's flexibility and usability.

Supported communities and cases range from Agri-food [7] to Social Data Science, [8] Earth Science [9] and Marine Science. [10] These diverse applications demonstrate the versatility and broad applicability of the D4Science infrastructure across scientific domains.

History

The D4Science initiative has been developed and supported by several European-funded projects.

DILIGENT (2004–2007), funded under the Sixth Framework Programme for Research and Technological Development, was the forerunner. It conceived and developed a testbed infrastructure that integrated digital library and grid computing technologies and resources to serve the needs of communities of practice involved in knowledge development. [11]

The D4Science initiative was developed in the context of the Seventh Framework Programme for research, technological development and demonstration. In this period the infrastructure was established and developed to serve communities of practice from domains ranging from Earth Science to Marine Science, with worldwide scope. [12]

In the context of the H2020 research and innovation programme the maturity level of the D4Science infrastructure was high enough to allow a large and very diverse set of communities of practice to benefit from it and its services and further contribute to its development. Moreover, the services offered by the infrastructure have been developed to support open science practices. [2]

The operation and improvement of the D4Science infrastructure facilities are still ongoing while its exploitation is progressively growing.

References

  1. Candela, L.; Castelli, D.; Pagano, P. (2023). "The D4Science Experience on Virtual Research Environments Development". Computing in Science & Engineering. 25 (2): 12–19. Bibcode:2023CSE....25b..12C. doi:10.1109/MCSE.2023.3290433. S2CID 259713679.
  2. Assante, M.; Candela, L.; Castelli, D.; Cirillo, R.; Coro, G.; Frosini, L.; Lelii, L.; Mangiacrapa, F.; Pagano, P.; Panichi, G.; Sinibaldi, F. (2019). "Enacting open science by D4Science". Future Generation Computer Systems. 101: 555–563. doi:10.1016/j.future.2019.05.063. S2CID 192644104.
  3. Assante, M.; Candela, L.; Castelli, D.; Cirillo, R.; Coro, G.; Dell'Amico, A.; Frosini, L.; Lelii, L.; Lettere, M.; Mangiacrapa, F.; Pagano, P.; Panichi, G.; Sinibaldi, F.; Sinibaldi, F. (2019). "Virtual research environments co-creation: The D4Science experience". Concurrency and Computation: Practice and Experience. 35 (18). doi:10.1002/cpe.6925. S2CID 247426120.
  4. Assante, M.; Candela, L.; Castelli, D.; Cirillo, R.; Coro, G.; Frosini, L.; Lelii, L.; Mangiacrapa, F.; Marioli, V.; Pagano, P.; Panichi, G.; Perciante, C.; Sinibaldi, F. (2019). "The gCube system: Delivering Virtual Research Environments as-a-Service". Future Generation Computer Systems. 95: 445–453. doi:10.1016/j.future.2018.10.035. S2CID 57313947.
  5. Coro, G.; Panichi, G.; Scarponi, P.; Pagano, P. (2017). "Cloud computing in a distributed e-infrastructure using the web processing service standard". Concurrency and Computation: Practice and Experience. 29 (18): e4219. doi:10.1002/cpe.4219. S2CID 24360342.
  6. Candela, L.; Coro, G.; Lelii, L.; Pagano, P.; Panichi, G. (2020). "Data Processing and Analytics for Data-Centric Sciences". In Zhao, Z.; Hellström, M. (eds.). Towards Interoperable Research Infrastructures for Environmental and Earth Sciences. Lecture Notes in Computer Science. Vol. 12003. pp. 176–191. doi:10.1007/978-3-030-52829-4_10. ISBN 978-3-030-52828-7. S2CID 220794538.
  7. Assante, M.; Boizet, A.; Candela, L.; Castelli, D.; Cirillo, R.; Coro, G.; Fernández, E.; Filter, M.; Frosini, L.; Georgiev, T.; Kakaletris, G.; Katsivelis, P.; Knapen, R.; Lelii, L.; Lokers, R. M.; Mangiacrapa, F.; Manouselis, N.; Pagano, P.; Panichi, G.; Penev, L.; Sinibaldi, F. (2020). "Realizing virtual research environments for the agri-food community: The AGINFRA PLUS experience". Concurrency and Computation: Practice and Experience. 33 (19): n.a. doi:10.1002/cpe.6087. S2CID 229459865.
  8. Grossi, V.; Giannotti, F.; Pedreschi, D.; Manghi, P.; Pagano, P.; Assante, M. (2021). "Data science: a game changer for science and innovation". International Journal of Data Science and Analytics. 11 (4): 263–278. doi:10.1007/s41060-020-00240-2. hdl:11384/137137.
  9. Jeffery, K.; Candela, L.; Glaves, E. (2020). "Virtual Research Environments for Environmental and Earth Sciences: Approaches and Experiences". In Zhao, Z.; Hellström, M. (eds.). Towards Interoperable Research Infrastructures for Environmental and Earth Sciences. Lecture Notes in Computer Science. Vol. 12003. pp. 272–289. doi:10.1007/978-3-030-52829-4_15. ISBN 978-3-030-52828-7. S2CID 220795026.
  10. Coro, G.; Gonzalez Vilas, L.; Magliozzi, C.; Ellenbroek, A.; Scarponi, P.; Pagano, P. (2018). "Forecasting the ongoing invasion of Lagocephalus sceleratus in the Mediterranean Sea". Ecological Modelling. 371: 37–49. Bibcode:2018EcMod.371...37C. doi:10.1016/j.ecolmodel.2018.01.007.
  11. Candela, L.; Akal, F.; Avancini, H.; Castelli, D.; Fusco, L.; Guidetti, V.; Langguth, C.; Manzi, A.; Pagano, P.; Schuldt, H.; Simi, M.; Springmann, M.; Voicu, L. (2007). "DILIGENT: integrating digital library and Grid technologies for a new Earth observation research infrastructure". International Journal on Digital Libraries. 7 (1–2): 59–80. doi:10.1007/s00799-007-0023-8. S2CID 29730933.
  12. Amaral, R.; Badia, R. M.; Blanquer, I.; Braga-Neto, R.; Candela, L.; Castelli, D.; Flann, C.; De Giovanni, R.; Gray, W. A.; Jones, A.; Lezzi, D.; Pagano, P.; Perez-Canhos, V.; Quevedo, F.; Rafanell, R.; Rebello, V.; Sousa-Baena, M. S.; Torres, E. (2015). "Supporting biodiversity studies with the EUBrazilOpenBio Hybrid Data Infrastructure". Concurrency and Computation: Practice and Experience. 27 (2): 376–394. doi:10.1002/cpe.3238. hdl:10251/62543. S2CID 34424801.