D4Science

Nickname D4Science
Headquarters Istituto di Scienza e Tecnologie dell'Informazione, Pisa, Italy
Products Virtual Research Environments, Science Gateways, cloud computing, e-infrastructure
Website www.d4science.org

D4Science is an organisation operating a data infrastructure that offers services through community-driven virtual research environments. [1] In particular, it supports communities of practice willing to implement open science practices. [2] The infrastructure follows the system-of-systems approach, in which the constituent systems (service providers) offer "resources" (namely services and, through them, data, computing and storage) that are assembled to implement the overall set of D4Science services. [3] D4Science aggregates "domain-agnostic" service providers as well as community-specific ones to build a unifying space where the aggregated resources can be exploited via virtual research environments and their services.

The organisation is hosted by the Istituto di Scienza e Tecnologie dell'Informazione of the National Research Council (Italy).

At the heart of this infrastructure is an open-source software system named gCube. [4]

Services

D4Science offers:

As of June 2023, the D4Science infrastructure served more than 20,000 registered users through 178 active VREs offered via 20 science gateways.

History

The D4Science initiative has been developed and supported by several European-funded projects.

DILIGENT (2004–2007), funded under the Sixth Framework Programme for Research and Technological Development, was the forerunner. It conceived and developed a testbed infrastructure, built by integrating digital library and grid computing technologies and resources, to serve the needs of communities of practice involved in knowledge development. [7]

In the context of the Seventh Framework Programme for research, technological development and demonstration, the infrastructure was established and developed to serve communities of practice from domains ranging from Earth science to marine science, with worldwide scope. [8]

In the context of the H2020 research and innovation programme, the D4Science infrastructure became mature enough to allow a large and very diverse set of communities of practice to benefit from it and its services and to contribute further to its development. Moreover, the services offered by the infrastructure have been developed to support open science practices. [2]

The operation and improvement of the D4Science infrastructure facilities are still ongoing while its exploitation is progressively growing.

Supported communities and cases range from agri-food [9] to social data science [10], Earth science [11] and marine science. [12]

References

  1. Candela, L.; Castelli, D.; Pagano, P. (2023). "The D4Science Experience on Virtual Research Environments Development". Computing in Science & Engineering: 1–9. doi:10.1109/MCSE.2023.3290433. S2CID 259713679.
  2. Assante, M.; Candela, L.; Castelli, D.; Cirillo, R.; Coro, G.; Frosini, L.; Lelii, L.; Mangiacrapa, F.; Pagano, P.; Panichi, G.; Sinibaldi, F. (2019). "Enacting open science by D4Science". Future Generation Computer Systems. 101: 555–563. doi:10.1016/j.future.2019.05.063. S2CID 192644104.
  3. Assante, M.; Candela, L.; Castelli, D.; Cirillo, R.; Coro, G.; Dell'Amico, A.; Frosini, L.; Lelii, L.; Lettere, M.; Mangiacrapa, F.; Pagano, P.; Panichi, G.; Sinibaldi, F. (2019). "Virtual research environments co-creation: The D4Science experience". Concurrency and Computation: Practice and Experience. 35 (18). doi:10.1002/cpe.6925. S2CID 247426120.
  4. Assante, M.; Candela, L.; Castelli, D.; Cirillo, R.; Coro, G.; Frosini, L.; Lelii, L.; Mangiacrapa, F.; Marioli, V.; Pagano, P.; Panichi, G.; Perciante, C.; Sinibaldi, F. (2019). "The gCube system: Delivering Virtual Research Environments as-a-Service". Future Generation Computer Systems. 95: 445–453. doi:10.1016/j.future.2018.10.035. S2CID 57313947.
  5. Coro, G.; Panichi, G.; Scarponi, P.; Pagano, P. (2017). "Cloud computing in a distributed e‐infrastructure using the web processing service standard". Concurrency and Computation: Practice and Experience. 29 (18): e4219. doi:10.1002/cpe.4219. S2CID 24360342.
  6. Candela, L.; Coro, G.; Lelii, L.; Pagano, P.; Panichi, G. (2020). "Data Processing and Analytics for Data-Centric Sciences". In Zhao, Z.; Hellström, M. (eds.). Towards Interoperable Research Infrastructures for Environmental and Earth Sciences. Lecture Notes in Computer Science. Vol. 12003. pp. 176–191. doi:10.1007/978-3-030-52829-4_10. ISBN 978-3-030-52828-7. S2CID 220794538.
  7. Candela, L.; Akal, F.; Avancini, H.; Castelli, D.; Fusco, L.; Guidetti, V.; Langguth, C.; Manzi, A.; Pagano, P.; Schuldt, H.; Simi, M.; Springmann, M.; Voicu, L. (2007). "DILIGENT: integrating digital library and Grid technologies for a new Earth observation research infrastructure". International Journal on Digital Libraries. 7 (1–2): 59–80. doi:10.1007/s00799-007-0023-8. S2CID 29730933.
  8. Amaral, R.; Badia, R. M.; Blanquer, I.; Braga‐Neto, R.; Candela, L.; Castelli, D.; Flann, C.; De Giovanni, R.; Gray, W. A.; Jones, A.; Lezzi, D.; Pagano, P.; Perez‐Canhos, V.; Quevedo, F.; Rafanell, R.; Rebello, V.; Sousa‐Baena, M. S.; Torres, E. (2015). "Supporting biodiversity studies with the EUBrazilOpenBio Hybrid Data Infrastructure". Concurrency and Computation: Practice and Experience. 27 (2): 376–394. doi:10.1002/cpe.3238. hdl:10251/62543. S2CID 34424801.
  9. Assante, M.; Boizet, A.; Candela, L.; Castelli, D.; Cirillo, R.; Coro, G.; Fernández, E.; Filter, M.; Frosini, L.; Georgiev, T.; Kakaletris, G.; Katsivelis, P.; Knapen, R.; Lelii, L.; Lokers, R. M.; Mangiacrapa, F.; Manouselis, N.; Pagano, P.; Panichi, G.; Penev, L.; Sinibaldi, F. (2020). "Realizing virtual research environments for the agri‐food community: The AGINFRA PLUS experience". Concurrency and Computation: Practice and Experience (19): n.a. doi:10.1002/cpe.6087. S2CID 229459865.
  10. Grossi, V.; Giannotti, F.; Pedreschi, D.; Manghi, P.; Pagano, P.; Assante, M. (2021). "Data science: a game changer for science and innovation". International Journal of Data Science and Analytics. 11 (4): 263–278. doi:10.1007/s41060-020-00240-2. hdl:11384/137137.
  11. Jeffery, K.; Candela, L.; Glaves, E. (2020). "Virtual Research Environments for Environmental and Earth Sciences: Approaches and Experiences". In Zhao, Z.; Hellström, M. (eds.). Towards Interoperable Research Infrastructures for Environmental and Earth Sciences. Lecture Notes in Computer Science. Vol. 12003. pp. 272–289. doi:10.1007/978-3-030-52829-4_15. ISBN 978-3-030-52828-7. S2CID 220795026.
  12. Coro, G.; Gonzalez Vilas, L.; Magliozzi, C.; Ellenbroek, A.; Scarponi, P.; Pagano, P. (2018). "Forecasting the ongoing invasion of Lagocephalus sceleratus in the Mediterranean Sea". Ecological Modelling. 371: 37–49. doi:10.1016/j.ecolmodel.2018.01.007.