Worldwide Atrocities Dataset

The Worldwide Atrocities Dataset is a dataset collected by the Computational Event Data System at Pennsylvania State University and sponsored by the Political Instability Task Force (PITF), which is in turn funded by the Central Intelligence Agency of the United States. [1]

Data

Unlike other datasets such as the Global Database of Events, Language, and Tone (GDELT) and the Integrated Conflict Early Warning System (ICEWS), data for the Worldwide Atrocities Dataset is entered and coded manually. [1] Data is available for download in two files. [1]

In addition to the datasets, a coding manual is available for download. [2]
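
As an illustration of how manually coded event records of this kind might be summarized once downloaded, the following is a minimal Python sketch. The column names (`event_id`, `country`, `start_date`, `deaths`), the sample rows, and the helper `deaths_by_country` are hypothetical and are not taken from the dataset or its coding manual; the actual field definitions should be checked against the coding manual before any real analysis.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows. The column names are assumptions for
# illustration only, not the dataset's actual schema.
SAMPLE = """event_id,country,start_date,deaths
1,Country A,2013-01-05,12
2,Country B,2013-02-11,7
3,Country A,2013-03-20,30
"""


def deaths_by_country(csv_text):
    """Sum the reported deaths per country from CSV text."""
    totals = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["country"]] += int(row["deaths"])
    return dict(totals)


print(deaths_by_country(SAMPLE))
```

To read a downloaded file rather than the inline sample, the same function could be given the text of that file, provided the column names are adjusted to match the real ones.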

Reception

Academic reception

The Worldwide Atrocities Dataset has been referenced in academic research on the impact of climate change on violence in Africa. [3] It has also been referenced alongside the ACLED dataset and the Peacekeeping Operations Locations and Event Dataset in a paper on the geography of conflict by Weidmann and Kuse (2009). [4] [5] A 2011 paper by Gold and Haer used the Worldwide Atrocities Dataset to understand the spatial dimension of refugee flows. [6]

Reception in blogs

Political scientist and forecasting expert Jay Ulfelder called the Worldwide Atrocities Dataset a "useful data set on political violence that almost no one is using." [7] It was also referenced by Patrick Meier while reviewing a paper that used the dataset. [5]

References

  1. "Political Instability Task Force Worldwide Atrocities Dataset". Computational Event Data System maintained by Parus Analytical Systems for Pennsylvania State University. June 5, 2014. Retrieved June 21, 2014.
  2. "Coding manual for Worldwide Atrocities Dataset" (PDF). Retrieved June 21, 2014.
  3. Busby, Joshua W.; Smith, Todd G.; White, Kaiba L.; Strange, Shawn M. (2012). "Locating Climate Insecurity: Where Are the Most Vulnerable Places in Africa?". Climate Change, Human Security and Violent Conflict, Hexagon Series on Human and Environmental Security and Peace. 8: 463–511. doi:10.1007/978-3-642-28626-1_23.
  4. Weidmann, Nils B.; Kuse, Doreen (2009). "WarViews: Visualizing and Animating Geographic Data on Civil War". International Studies Perspectives. 10: 36–48.
  5. Meier, Patrick (February 24, 2009). "NeoGeography and Crisis Mapping Analytics". Retrieved June 21, 2014.
  6. Gold, Valentin; Haer, Ross (May 2011). "The Diffusion of Atrocities: A Spatial Analysis of the Role of Refugees" (PDF). Network of European Peace Scientists. Retrieved June 21, 2014.
  7. Ulfelder, Jay (June 10, 2014). "A Useful Data Set on Political Violence that Almost No One Is Using". Retrieved June 21, 2014.
