Data diplomacy can be defined in two ways: the use of data as a means and tool to conduct national diplomacy, [1] or the use of diplomatic actions and skills by various stakeholders to enable and facilitate data access, understanding, and use. [2] Data can inform and influence many aspects of the diplomatic process, such as information gathering, negotiations, consular services, humanitarian response and foreign policy development. [3] The second kind of data diplomacy challenges traditional models of diplomacy and can be conducted outside formal diplomatic tracks and without professional diplomats; drivers of change in diplomacy are also emerging from industry, academia and directly from the public. [2]
In recent years, as more governments have recognized the value and capabilities of data, the influence and application of data diplomacy have gradually expanded, making it a new kind of soft diplomatic power.
Building on the definition of science diplomacy given by the American Association for the Advancement of Science, an article by Andy Boyd, Jane Gatewood, Stuart Thorson and Timothy D. V. Dye identifies three subcategories of data diplomacy: data in diplomacy, diplomacy for data, and data for diplomacy. [2] The three classifications inevitably overlap.
Data in diplomacy refers to the infusion of data and data expertise into relationships between nation-states or other entities in ways that can affect policymaking, and it involves training in data understanding, use and management. [2] This type of diplomacy is not limited to diplomats; it can also occur between institutions and the public. Actors at both the macro and micro levels drive data diplomacy, which often sits at the blurred boundary between soft and hard power. [2] Applications include direct aid, access to international credit, and international negotiations. However, the process is also subject to unstructured and uncontrolled data, such as whistleblowing disclosures. [2]
Diplomacy for data typically refers to stakeholder interactions that advance international practices of data sharing, data use, and data interpretation. [2] It encompasses every stage of the data life cycle, from generation to sharing, use, and archiving or deletion. This helps standardize international data coding, facilitates international data sharing, increases public participation, protects user privacy and addresses environmental imbalances. [2] Applications include the International Classification of Diseases coding system established by the World Health Organization and the sharing of UN benchmark data. [2]
Data for diplomacy describes worldwide collaboration among data experts to build relationships: diplomats need to understand how data is used in order to facilitate such relationships, use diplomacy to gain access to data, and leverage the value of data as soft power. [2] Data can also be deployed on diplomatic occasions. Applications include UN Global Pulse, which uses big data in humanitarian programs. [2]
Science diplomacy is the use of scientific and diplomatic cooperation among nations to address common problems such as climate change, food security, poverty, energy consumption, nuclear energy use, and epidemic diseases. [4] Science diplomacy typically includes three main components: science in diplomacy, diplomacy for science, and science for diplomacy.
Data is the foundation of science; science without data is untenable. [5] From this perspective, data diplomacy can be considered a part or extension of science diplomacy, a more thoroughgoing diplomatic relationship built from raw data.
Digital diplomacy refers to the widespread use of information and communication technologies in diplomacy, particularly innovations on the Internet. This includes the use of social media, online virtual meetings, and other Internet tools by diplomats. [6] The impact of the internet on diplomatic relations is manifested in three main ways: increasing the number of voices and interests involved in international decision-making, accelerating and releasing the dissemination of information about any issue or event and enabling traditional diplomatic services to be delivered more quickly and cost-effectively. [6]
In contrast, data diplomacy focuses on data (including its creation, use, access, and interpretation) rather than on communication mechanisms or media. However, there is an interplay between data and digital communication mechanisms in the field of diplomacy; for example, data-driven artificial intelligence plays a key role in diplomacy. [2]
Scholars Simone Turchetti and Roberto Lalli developed the concept of science diplomacy 2.0 in their 2020 article "Envisioning a Science Diplomacy 2.0: On Data, Global Challenges and Multi-Layered Networks". The term "science diplomacy" broadly identifies the interaction between the scientific and foreign policy communities, which aims to promote international scientific exchange and to provide scientific advice on issues relevant to more than one country, such as climate change, food security, poverty, energy consumption, nuclear energy use and epidemic diseases. [7] In practice, however, science diplomacy does not provide ready-made solutions to global problems, and it has been argued to lack the benign character usually ascribed to it. [8]
Science diplomacy therefore faces a necessary transformation. A new science diplomacy program should follow the shift in academic analysis from production to circulation, addressing data access and integration at the global and regional levels. [7] With this aim in mind, science diplomacy 2.0 encompasses three levels: facilitating integrated data analysis based on interdisciplinary scientific work, facilitating the infrastructure needed to carry out such studies, and facilitating the implementation of programs prioritized by this interdisciplinary, integrated research. This entails both a new analytic approach and new infrastructure. The analytic approach integrates data and metadata from different disciplines to improve understanding of which research should be prioritized in science diplomacy initiatives. [7] The infrastructure, the Innovation Observatory, will bring together data and metadata from different disciplines and integrate them through a linked-data, multi-layered approach. [7] Science diplomacy 2.0 is thus a new, data-driven model of science diplomacy in which data and metadata play a key role.
Data diplomacy has also been widely applied in the areas of bioinformatics and public health. [9] Bioinformatics diplomacy addresses how to rapidly share critical biological information internationally to better manage deadly infectious disease outbreaks; its role was clearly demonstrated during the COVID-19 pandemic. In the past, scientists investigating outbreaks relied heavily on bringing back physical samples of pathogens from outbreak countries for examination, a process that was often time-consuming and economically costly. Advances in information technology and bioinformatics sharing now allow scientists to increasingly use digital sequence data of pathogens for rapid computation and analysis. [9]
However, the data open to bioinformatics diplomacy remain limited, and social scientists outside the community of sequencing professionals have little knowledge of the social processes involved in acquiring, exchanging and disseminating pathogen sequence data internationally.
Public diplomacy refers to the process and practice of expanding and strengthening relationships between governments and citizens in other parts of the world, and of engaging the global public in the service of national diplomatic interests. [10] [11] Public diplomacy is useful in dealing with international conflicts, community disputes, and interpersonal interactions. [12]
Social media has become an important tool for public diplomacy and diplomats. [13] In 2014, the U.S. Advisory Commission on Public Diplomacy released a report entitled "Data-Driven Public Diplomacy," highlighting the potential for incorporating data analytics into public diplomacy workflows, into the planning and design of public diplomacy programs, and into analyzing the appeal of those programs to foreign audiences. In 2018, Hamilton Bean and Edward Comor, in their article "Data-Driven Public Diplomacy: A Critical and Reflexive Assessment," argued that data analytics techniques do not create useful innovations for public diplomacy but rather support governments' strategic interests and ambitions, leading officials away from the human dimension of public diplomacy. [14]
The term "database diplomacy" first appeared as the title of a 2000 article by Amy Helen Johnson describing Metagon Technologies' DQpowersuite, a system that connected disparate databases, provided users with a single view across them, and allowed end users to build reports and queries. [15]
Today, the sharing of diplomatic databases has reached new heights. One representative database is the Freedom of Information Archive (FOIArchive), the first dataset to combine large numbers of machine-readable internal government communications for analysis. The database collects more than three million domestic diplomatic records from the United States, Great Britain and Brazil between 1620 and 2013, covering private (i.e., classified) and public information and actions, diplomacy at multiple levels of analysis, and communications on various topics. [16] New data is added each time a newly declassified document is released. Publicly available information includes previously classified (or publicly unavailable) internal government documents and the original (often full) text of these documents. Scholars can use the online platform to search, view, and download datasets and texts customized to their research needs. [16] In addition, scholars can access many types of metadata, including entities and topics identified using natural language processing and machine learning methods. [16]
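The kind of faceted search such a platform offers can be sketched in a few lines. The record structure and field names below are hypothetical illustrations, not FOIArchive's actual schema or API:

```python
# Sketch of filtering FOIArchive-style diplomatic records.
# The Record fields and sample data are invented for illustration.
from dataclasses import dataclass, field


@dataclass
class Record:
    country: str
    year: int
    classified: bool          # True if the document is still classified
    topics: list = field(default_factory=list)


RECORDS = [
    Record("US", 1975, True, ["trade"]),
    Record("UK", 1982, False, ["security"]),
    Record("Brazil", 1990, True, ["security", "trade"]),
]


def search(records, country=None, topic=None, declassified_only=False):
    """Return records matching the given filters, mimicking faceted search."""
    hits = []
    for r in records:
        if country and r.country != country:
            continue
        if topic and topic not in r.topics:
            continue
        if declassified_only and r.classified:
            continue
        hits.append(r)
    return hits
```

A real platform would add full-text search and NLP-derived entity and topic metadata on top of filters like these.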
Data evidence helped Jordan successfully enter into a bilateral cultural property protection agreement with the United States, allowing Jordan to use diplomatic measures to curb the illicit flow of looted and stolen materials. On January 31, 2019, the U.S. Department of State published an official notice in the Federal Register announcing a request by the Government of the Hashemite Kingdom of Jordan for a bilateral agreement with the United States to prevent the importation of illegally exported Jordanian cultural materials. [17] Under the U.S. Convention on Cultural Property Implementation Act of 1983 (CCPIA), any State Party to the 1970 UNESCO Convention on the Means of Prohibiting and Preventing the Illicit Import, Export and Transfer of Ownership of Cultural Property that can demonstrate that its cultural landscapes and artifacts are at risk due to U.S. demand may request a bilateral agreement from the United States. [17]
Jordan provided strong persuasive evidence to reach a bilateral agreement with the United States. The data came from the Jordan Antiquities Database and Information System (JADIS), MEGA-Jordan, aerial and satellite imagery, and Landscapes of the Dead (LOD). JADIS was the first computerized database of antiquities in an Arab country, created with the help of the American Center for Oriental Research (ACOR) and the U.S. Agency for International Development (USAID). The database, whose data are available to all, is primarily used to monitor and respond quickly to threats from exploitation, looting and other negative impacts on monitored sites, and it enables the production of "risk maps" containing information on archaeological sites. [17] In 2010, JADIS was replaced by MEGA-Jordan, a new geodatabase of Jordanian antiquities built in collaboration between Jordan's Department of Antiquities (DoA), the Getty Conservation Institute and the World Monuments Fund. Its objectives are to standardize and centralize data throughout the Kingdom and to report on threats, disturbances and legal violations involving illegal excavation of sites and removal of artifacts. [17]
Data from Landscapes of the Dead (LOD) combine spatial data from pedestrian and drone surveys with oral histories from multiple people who encountered the site and its artifacts, providing a more nuanced picture of the site's continued destruction.
Data from the databases, satellite imagery and LOD were persuasive factors in determining that Jordan's cultural heritage was at risk and that U.S. import restrictions would help stop severe looting. [17] These data provide evidence of landscape change over time at a range of resolutions, amply confirming the need for a bilateral agreement and illustrating how data can shape foreign policy. [17]
One of the risks and impediments to data diplomacy is the issue of sensitive data and data security. Some data, such as pathogen sequence data and genetic sequence data, are sensitive because they sit at the core of multiple global value production chains, and their disclosure and sharing can be subject to considerable political pressure. [9] Leaks of such data can have very serious consequences.
Since the public is a major contributor of data, it is critical to develop standards for ethical and human rights protection. Data collection needs to be ethical in its purpose and designed with the input of key stakeholders, including the public. However, confidentiality and ethical standards are complex issues, and absolute confidentiality is impossible to achieve. [2] This can create tensions between groups. Moreover, in some cases individuals cannot make fair, free or informed choices about how their data is used, which leads to the marginalization of some groups. [2] For example, a person who does not agree to the data terms proposed by Google cannot use its services.
Data diplomacy can raise issues of scientific fairness, intellectual property rights and future benefits. Scientists may be reluctant to share their data quickly because doing so would give competitors easier access to the information behind the data, affecting their career status and development. [9] For scientists in some low- and middle-income countries (LMICs), analyses of samples they shared in the past have subsequently been submitted to international conferences and scientific meetings without proper prior notice and without including those who shared the samples in authorship arrangements. [9] The equal treatment of scientists with respect to intellectual property rights and benefits is therefore a critical issue facing data diplomacy.
Inaccurate data, data manipulation, and the associated loss of diplomatic capital pose serious challenges to data diplomacy. Resisting data manipulation and establishing a good reputation for data use are important steps toward building healthy data diplomacy relationships. [2] Many countries are developing strategies to reduce the probability of data inaccuracy and the harm it causes; common strategies include increasing data transparency, pre-screening data before sharing, and checking the credibility of data sources. Full openness, however, is difficult to achieve: because states and other entities must classify certain data, restrict intellectual property rights to it, and protect it from misuse by well-resourced actors, some data cannot be made fully open and transparent. [2]
Data integration involves combining data residing in different sources and providing users with a unified view of them. This process is significant in a variety of situations, in both commercial and scientific domains. Data integration appears with increasing frequency as the volume and complexity of data, and the need to share it, grow. It has become the focus of extensive theoretical work, and numerous open problems remain unsolved. Data integration encourages collaboration between internal and external users. The data being integrated must be received from heterogeneous database systems and transformed into a single coherent data store that provides synchronous data across a network of files for clients. A common use of data integration is in data mining, when analyzing and extracting information from existing databases that can be useful for business information.
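The core move in data integration is mapping each heterogeneous source schema onto one unified target schema. A minimal sketch, with invented field names and sample data:

```python
# Two sources describe archaeological sites with different, incompatible
# schemas (the field names and values here are invented for illustration).
SOURCE_A = [{"name": "Amman Citadel", "lat": 31.95, "lon": 35.93}]
SOURCE_B = [{"site": "Petra", "coordinates": (30.33, 35.44)}]


def from_a(row):
    """Map a SOURCE_A row onto the unified schema."""
    return {"site_name": row["name"], "latitude": row["lat"], "longitude": row["lon"]}


def from_b(row):
    """Map a SOURCE_B row onto the unified schema."""
    lat, lon = row["coordinates"]
    return {"site_name": row["site"], "latitude": lat, "longitude": lon}


# The unified view: every record now shares one coherent structure.
unified = [from_a(r) for r in SOURCE_A] + [from_b(r) for r in SOURCE_B]
```

Real integration systems add schema matching, deduplication and conflict resolution on top of this basic mapping step.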
Agricultural Information Management Standards (AIMS) is a web site managed by the Food and Agriculture Organization of the United Nations (FAO) for accessing and discussing agricultural information management standards, tools and methodologies, connecting information workers worldwide to build a global community of practice. Information management standards, tools and good practices can be found on AIMS.
Semantic publishing on the Web, or semantic web publishing, refers to publishing information on the web as documents accompanied by semantic markup. Semantic publication provides a way for computers to understand the structure and even the meaning of the published information, making information search and data integration more efficient.
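One common form of semantic markup is JSON-LD using the schema.org vocabulary. The snippet below builds a small, machine-readable description of a published dataset; the vocabulary is real, but the specific values are illustrative only:

```python
import json

# A minimal JSON-LD description of a dataset using schema.org terms.
# The names and URLs in the values are invented examples.
doc = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Diplomatic cable counts by year",
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "creator": {"@type": "Organization", "name": "Example Archive"},
}

# Embedding this JSON in a web page lets machines read the document's
# structure and meaning, not just its text.
markup = json.dumps(doc, indent=2)
```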
Open data is data that is openly accessible, exploitable, editable and shared by anyone for any purpose. Open data is licensed under an open license.
GISAID, the Global Initiative on Sharing All Influenza Data, previously the Global Initiative on Sharing Avian Influenza Data, is a global science initiative established in 2008 to provide access to genomic data of influenza viruses. The database was expanded to include the coronavirus responsible for the COVID-19 pandemic, as well as other pathogens. The database has been described as "the world's largest repository of COVID-19 sequences". GISAID facilitates genomic epidemiology and real-time surveillance to monitor the emergence of new COVID-19 viral strains across the planet.
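Sequence exchange of this kind builds on simple plain-text formats such as FASTA. As an illustration of the format (a toy parser, not GISAID's actual submission pipeline or API), and with an invented example record:

```python
# Toy parser for FASTA, the plain-text format widely used to exchange
# genomic sequences. Real repository records carry far richer metadata.
def parse_fasta(text):
    """Return a dict mapping each header line (without '>') to its sequence."""
    records, header, chunks = {}, None, []
    for line in text.strip().splitlines():
        if line.startswith(">"):
            if header is not None:
                records[header] = "".join(chunks)
            header, chunks = line[1:].strip(), []
        else:
            chunks.append(line.strip())
    if header is not None:
        records[header] = "".join(chunks)
    return records


# A fabricated record in the hCoV-19 naming style, for illustration only.
example = """>hCoV-19/example/2020
ACGTACGT
ACGT
"""
```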
The International Generic Sample Number or IGSN is a persistent identifier for samples. As an active persistent identifier it can be resolved through the Handle System. The system is used in production by the System for Earth Sample Registration (SESAR), Geoscience Australia, Commonwealth Scientific and Industrial Research Organisation Mineral Resources, the Australian Research Data Commons (ARDC), the University of Bremen MARUM, the German Research Centre for Geosciences (GFZ), IFREMER (Institut Français de Recherche pour l'Exploitation de la Mer), the Korea Institute of Geoscience & Mineral Resources (KIGAM), and the University of Kiel. Other organisations are preparing to introduce the IGSN.
Kepler is a free software system for designing, executing, reusing, evolving, archiving, and sharing scientific workflows. Kepler's facilities provide process and data monitoring, provenance information, and high-speed data movement. Workflows in general, and scientific workflows in particular, are directed graphs where the nodes represent discrete computational components, and the edges represent paths along which data and results can flow between components. In Kepler, the nodes are called 'Actors' and the edges are called 'channels'. Kepler includes a graphical user interface for composing workflows in a desktop environment, a runtime engine for executing workflows within the GUI and independently from a command-line, and a distributed computing option that allows workflow tasks to be distributed among compute nodes in a computer cluster or computing grid. The Kepler system principally targets the use of a workflow metaphor for organizing computational tasks that are directed towards particular scientific analysis and modeling goals. Thus, Kepler scientific workflows generally model the flow of data from one step to another in a series of computations that achieve some scientific goal.
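The workflow-as-directed-graph idea can be sketched in a few lines: nodes ("actors") are computational steps, and edges ("channels") carry data between them. This is an illustration of the concept, not Kepler's actual API:

```python
# Three toy actors: a data source, a transformation, and an aggregation.
def read(_):
    return [1, 2, 3, 4]


def square(xs):
    return [x * x for x in xs]


def total(xs):
    return sum(xs)


# The workflow graph: each actor is paired with the name of the upstream
# actor that feeds it (None for a source actor).
actors = {"read": (read, None), "square": (square, "read"), "total": (total, "square")}


def run(actors, sink):
    """Execute the workflow by recursively pulling data through the channels."""
    fn, upstream = actors[sink]
    data = run(actors, upstream) if upstream is not None else None
    return fn(data)
```

A real workflow engine adds scheduling, provenance tracking, and distributed execution on top of this pull-based evaluation.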
Digital curation is the selection, preservation, maintenance, collection, and archiving of digital assets. Digital curation establishes, maintains, and adds value to repositories of digital data for present and future use. This is often accomplished by archivists, librarians, scientists, historians, and scholars. Enterprises are starting to use digital curation to improve the quality of information and data within their operational and strategic processes. Successful digital curation will mitigate digital obsolescence, keeping the information accessible to users indefinitely. Digital curation includes digital asset management, data curation, digital preservation, and electronic records management.
Metadata is "data that provides information about other data", but not the content of the data itself, such as the text of a message or the image itself. There are many distinct types of metadata, including descriptive, structural, administrative, reference and statistical metadata.
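The distinction between data and metadata can be made concrete with a small example. The fields below are typical descriptive metadata for an image file, chosen for illustration rather than drawn from a formal standard:

```python
# Descriptive metadata about an image: information about the file,
# not the image content itself. Field names and values are illustrative.
photo_metadata = {
    "title": "Signing ceremony",
    "creator": "Example Ministry",
    "date_created": "2019-01-31",
    "format": "image/jpeg",
    "pixel_dimensions": (4000, 3000),
}


def describe(meta):
    """Render the metadata as sorted 'key: value' lines."""
    return "\n".join(f"{k}: {v}" for k, v in sorted(meta.items()))
```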
AGRIS is a global public domain database with more than 12 million structured bibliographical records on agricultural science and technology. It became operational in 1975; the database is maintained by Coherence in Information for Agricultural Research for Development, and its content is provided by more than 150 participating institutions from 65 countries. The AGRIS search system allows scientists, researchers and students to perform sophisticated searches using keywords from the AGROVOC thesaurus, specific journal titles, or the names of countries, institutions and authors.
Science diplomacy is the use of scientific collaborations among nations to address common problems and to build constructive international partnerships. Science diplomacy is a form of new diplomacy and has become an umbrella term to describe a number of formal or informal technical, research-based, academic or engineering exchanges, within the general field of international relations and the emerging field of global policy making.
Open scientific data or open research data is a type of open data focused on publishing observations and results of scientific activities available for anyone to analyze and reuse. A major purpose of the drive for open data is to allow the verification of scientific claims, by allowing others to look at the reproducibility of results, and to allow data from many sources to be integrated to give new knowledge.
The iPlant Collaborative, renamed Cyverse in 2017, is a virtual organization created by a cooperative agreement funded by the US National Science Foundation (NSF) to create cyberinfrastructure for the plant sciences (botany). The NSF compared cyberinfrastructure to physical infrastructure, "... the distributed computer, information and communication technologies combined with the personnel and integrating components that provide a long-term platform to empower the modern scientific research endeavor". In September 2013 it was announced that the National Science Foundation had renewed iPlant's funding for a second 5-year term with an expansion of scope to all non-human life science research.
Enhanced publications or enhanced ebooks are a form of electronic publishing for the dissemination and sharing of research outcomes, whose first formal definition can be traced back to 2009. Like many forms of digital publication, they typically feature a unique identifier and descriptive metadata. Unlike traditional digital publications, enhanced publications are often tailored to specific scientific domains and are generally constituted by a set of interconnected parts corresponding to research assets of several kinds and to textual descriptions of the research. The nature and format of these parts, and of the relationships between them, depend on the application domain and may vary widely from case to case.
The High-performance Integrated Virtual Environment (HIVE) is a distributed computing environment used for healthcare IT and biological research, including analysis of Next Generation Sequencing (NGS) data, preclinical, clinical and post-market data, adverse events, and metagenomic data. It is currently supported and continuously developed by the US Food and Drug Administration, George Washington University, and DNA-HIVE, WHISE-Global and Embleema. HIVE operates fully functionally within the US FDA, supporting a wide variety (more than 60) of regulatory research and regulatory review projects, as well as MDEpiNet medical device postmarket registries. Academic deployments of HIVE are used for research activities and publications in NGS analytics, cancer research and microbiome research, and in educational programs for students at GWU. Commercial enterprises use HIVE for oncology, microbiology, vaccine manufacturing, gene editing, healthcare IT, harmonization of real-world data, preclinical research and clinical studies.
MetaboLights is a data repository founded in 2012 for cross-species and cross-platform metabolomic studies that provides primary research data and metadata for metabolomic studies as well as a knowledge base for properties of individual metabolites. The database is maintained by the European Bioinformatics Institute (EMBL-EBI) and its development is funded by the Biotechnology and Biological Sciences Research Council (BBSRC). As of July 2018, the MetaboLights browse functionality covered 383 studies across two analytical platforms, NMR spectroscopy and mass spectrometry.
BacDive is a bacterial metadatabase that provides strain-linked information about bacterial and archaeal biodiversity.
The BioCompute Object (BCO) project is a community-driven initiative to build a framework for standardizing and sharing computations and analyses generated from High-throughput sequencing. The project has since been standardized as IEEE 2791-2020, and the project files are maintained in an open source repository. The July 22nd, 2020 edition of the Federal Register announced that the FDA now supports the use of BioCompute in regulatory submissions, and the inclusion of the standard in the Data Standards Catalog for the submission of HTS data in NDAs, ANDAs, BLAs, and INDs to CBER, CDER, and CFSAN.
Originally started as a collaborative contract between the George Washington University and the Food and Drug Administration, the project has grown to include over 20 universities, biotechnology companies, public-private partnerships and pharmaceutical companies including Seven Bridges and Harvard Medical School. The BCO aims to ease the exchange of HTS workflows between various organizations, such as the FDA, pharmaceutical companies, contract research organizations, bioinformatic platform providers, and academic researchers. Due to the sensitive nature of regulatory filings, few direct references to material can be published. However, the project is currently funded to train FDA Reviewers and administrators to read and interpret BCOs, and currently has 4 publications either submitted or nearly submitted.
FAIR data are data which meet principles of findability, accessibility, interoperability, and reusability (FAIR). The acronym and principles were defined in a March 2016 paper in the journal Scientific Data by a consortium of scientists and organizations.
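A rough way to see what the principles demand of a dataset description is a checklist over its metadata. The proxy fields below are this sketch's own assumptions, not part of the formal FAIR specification:

```python
# Toy checklist scoring a metadata record against rough proxies for FAIR.
# The required fields per dimension are invented simplifications.
FAIR_PROXIES = {
    "findable": ["identifier", "title"],   # persistent ID plus rich description
    "accessible": ["access_url"],          # retrievable via a standard protocol
    "interoperable": ["format"],           # uses a shared, open format
    "reusable": ["license"],               # carries a clear usage license
}


def fair_report(meta):
    """Return which FAIR dimensions a metadata record appears to satisfy."""
    return {dim: all(f in meta for f in fields) for dim, fields in FAIR_PROXIES.items()}


# Example record: findable, accessible and interoperable, but it lacks
# a license, so the reusability check fails.
record = {"identifier": "doi:10.0000/example", "title": "Example",
          "access_url": "https://example.org/data.csv", "format": "text/csv"}
```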
Biocuration is the field of life sciences dedicated to organizing biomedical data, information and knowledge into structured formats, such as spreadsheets, tables and knowledge graphs. The biocuration of biomedical knowledge is made possible by the cooperative work of biocurators, software developers and bioinformaticians and is at the base of the work of biological databases.