China Biographical Database

China Biographical Database (CBDB)
Languages: English, Chinese
Access
Providers: Fairbank Center for Chinese Studies at Harvard University, Institute of History and Philology of Academia Sinica, Center for Research on Ancient Chinese History at Peking University
Coverage
Disciplines: Humanities, Social Science
Temporal coverage: 7th to 19th centuries
Geospatial coverage: China
No. of records: more than 360,000 individuals
Links
Website: projects.iq.harvard.edu/cbdb/home

The China Biographical Database (CBDB) is a relational database of Chinese historical figures from the 7th to the 19th centuries. [1] It provides biographical information (names, dates of birth and death, ancestral place, degrees and offices held, kinship and social associations, etc.) for approximately 360,000 individuals as of April 2015. [2]

History

CBDB was started by the late Robert M. Hartwell, a historian of China. [3] Hartwell first conceived of using a relational database to study the social and family networks of Song dynasty officials. Aware that research on the social and economic history of medieval China lacked large datasets, he took the first step of collecting large sets of data himself and using data analysis to answer questions about historical change. One important legacy of Professor Hartwell is a program of massive data, which he structured around

  1. people,
  2. places,
  3. a bureaucratic system,
  4. kinship structures and
  5. contemporary modes of social association.
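The five categories above map naturally onto tables in a relational database. The following is a minimal sketch of that idea using Python's built-in `sqlite3` module. All table and column names here are hypothetical, chosen only for illustration; they are not the schema of Hartwell's program or of CBDB itself.

```python
import sqlite3

# Hypothetical schema for illustration only; NOT the actual CBDB tables.
# Entity tables hold people, places, and offices; link tables record how
# people relate to places, offices, kin, and social associates.
SCHEMA = """
CREATE TABLE person (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    birth_year INTEGER,
    death_year INTEGER
);
CREATE TABLE place (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE office (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL
);
CREATE TABLE person_place (
    person_id INTEGER REFERENCES person(id),
    place_id INTEGER REFERENCES place(id),
    relation TEXT            -- e.g. ancestral place
);
CREATE TABLE posting (
    person_id INTEGER REFERENCES person(id),
    office_id INTEGER REFERENCES office(id),
    year INTEGER
);
CREATE TABLE kinship (
    person_id INTEGER REFERENCES person(id),
    kin_id INTEGER REFERENCES person(id),
    relation TEXT            -- e.g. father, brother
);
CREATE TABLE association (
    person_id INTEGER REFERENCES person(id),
    other_id INTEGER REFERENCES person(id),
    kind TEXT                -- e.g. teacher of, wrote preface for
);
"""


def build_demo_db():
    """Create an empty in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def table_names(conn):
    """Return the names of all tables, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]
```

Keeping kinship and social association in separate link tables, each pointing back to `person`, is one plausible way to let the same individual appear in many relationships without duplicating biographical data.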

Before his death, Professor Hartwell bequeathed the program, which by then consisted of more than 25,000 individuals, a bibliographic database of over 4,500 titles, and multiple geo-reference tools, to the Harvard-Yenching Institute. [citation needed]

Later, Michael A. Fuller, Professor of Chinese Literature at UC Irvine, began redesigning the application. Professor Peter K. Bol at Harvard has also disseminated extensive digital information for quantitative analysis. As a joint project of the Fairbank Center for Chinese Studies at Harvard University, the Institute of History and Philology of Academia Sinica (中研院歷史語言研究所), and the Center for Research on Ancient Chinese History at Peking University (北京大學中國古代史研究中心), the database has been greatly expanded in both temporal scope and coverage.

Sources

CBDB draws on a wide range of biographical sources to collect information about individuals. The main types of writing covered include biographical indexes, the biography sections of official histories, funerary essays, epitaphs, local gazetteers, prefaces, writings, letters, and colophons in personal writing collections, and other government-compiled records. [4]

CBDB is a long-term, open-ended project. It has incorporated sources from biographical indexes (傳記資料索引) for the Song 宋 (completed), Yuan 元 (completed), and Ming 明, birth and death dates for Qing 清 figures, and listings of Song local officials. CBDB also cooperates with other databases, such as Ming Qing Women's Writings (MQWW), Ming Qing Name Authority, and Pers-DB Knowledge Base of Tang Persons (Kyoto), to enrich its entries. [5]

Limitations

CBDB aims to extract large amounts of data from extant sources through data mining techniques. As a result, its records of social and kinship associations, such as might be known from an individual's literary collection and funerary biographies, are not exhaustive. Because of the nature of the sources, career data (e.g. ranks and positions a person held) are biased toward higher offices. Since the database does not require in-depth research into each individual, factual errors and contradictory information may also appear in the entries, as long as they come from the primary sources. [6]

Geo-reference tools

One area in which CBDB can be used is prosopographical research. [7] By combining geographic information system (GIS) software with CBDB, patterns can be mapped from queries run against large datasets: for instance, who came from a certain place, or what social and kinship connections existed among all those from a certain place who entered government through the civil service examination within a certain span of years. [8] One useful geo-reference tool for the study of Chinese history is the China Historical GIS (CHGIS) project, which makes freely available datasets of the administrative units between 221 BC and 1911 AD and of major towns for the 1820–1911 period. GIS software such as ArcGIS or MapInfo (or even Google Earth) is also compatible with CBDB output.
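The kind of query described above can be sketched with Python's built-in `sqlite3` module. This is a toy illustration under stated assumptions: the two tables, their columns, and the rows ("Person A", "Place X", and so on) are invented placeholders, not CBDB's actual schema or data.

```python
import sqlite3


def build_toy_db():
    """Create an in-memory database with invented placeholder rows."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical tables; not the real CBDB layout.
    conn.executescript("""
        CREATE TABLE person (
            id INTEGER PRIMARY KEY,
            name TEXT,
            place TEXT,
            entry_mode TEXT,
            entry_year INTEGER
        );
        CREATE TABLE kinship (
            person_id INTEGER,
            kin_id INTEGER,
            relation TEXT
        );
    """)
    conn.executemany("INSERT INTO person VALUES (?, ?, ?, ?, ?)", [
        (1, "Person A", "Place X", "examination", 1050),
        (2, "Person B", "Place X", "examination", 1065),
        (3, "Person C", "Place X", "hereditary privilege", 1060),
        (4, "Person D", "Place Y", "examination", 1055),
        (5, "Person E", "Place X", "examination", 1120),
    ])
    conn.executemany("INSERT INTO kinship VALUES (?, ?, ?)", [
        (1, 2, "father of"),
        (1, 4, "uncle of"),
    ])
    return conn


def exam_entrants(conn, place, start_year, end_year):
    """People from `place` who entered via examination in the year range."""
    return conn.execute(
        "SELECT id, name FROM person "
        "WHERE place = ? AND entry_mode = 'examination' "
        "AND entry_year BETWEEN ? AND ? ORDER BY id",
        (place, start_year, end_year),
    ).fetchall()


def kin_ties_among(conn, ids):
    """Kinship rows where both parties belong to the given id list."""
    ids = list(ids)
    if not ids:
        return []
    marks = ",".join("?" * len(ids))  # one placeholder per id
    sql = (
        "SELECT person_id, relation, kin_id FROM kinship "
        f"WHERE person_id IN ({marks}) AND kin_id IN ({marks}) "
        "ORDER BY person_id, kin_id"
    )
    return conn.execute(sql, ids + ids).fetchall()
```

A typical use would first call `exam_entrants(conn, "Place X", 1040, 1080)` to select the group, then pass the resulting ids to `kin_ties_among` to list the kinship links inside it. Joining such results to place coordinates is what would allow the patterns to be drawn in GIS software.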

Commercial tax quotas as of 1077 and the success of localities in the civil service examinations during Northern Song

References

  1. "China Biographical Database Project (CBDB)". projects.iq.harvard.edu. Retrieved 2020-10-19.
  2. "China Biographical Database Project (CBDB)". Projects.iq.harvard.edu. 2016-11-07. Retrieved 2016-12-11.
  3. Mostern, Ruth (2011). "Dividing the Realm in Order to Govern": The Spatial Organization of the Song State (960-1276 CE). Cambridge: Harvard University Press. p. 10.
  4. Fuller, Michael A. (February 28, 2015). "The China Biographical Database User's Guide" (PDF). China Biographical Database.
  5. Reviews of Internet resources for Asian Studies. "Resource: China Biographical Database Project (CBDB) [New Release]". No. Jan 2011, Vol. 18, No. 1, 320. The Asian Studies WWW Monitor.
  6. "New Approaches in Chinese Digital Humanities - CBDB and Digging into Data Workshop". Peking University. Office of International Relations. 2016-01-11. Archived from the original on 2016-07-01. Retrieved 2016-06-02.
  7. Gerritsen, Anne (2008). "Prosopography and its Potential for Middle Period Research (Workshop on the Prosopography of Middle Period China: Using the China Biographical Database)". Journal of Song-Yuan Studies. 38: 161–201. doi:10.1353/sys.2008.a380505.
  8. "China Biographical Database | Qing Studies". Qing_studies.press.jhu.edu. Archived from the original on 2017-01-10. Retrieved 2016-12-11.

Further reading

  1. Bol, Peter K.; Liu, Chao-Lin; Wang, Hongsu (2015). "Mining and Discovering Biographical Information in Difangzhi with a Language-Model-based Approach". arXiv: 1504.02148 [cs.CL].
  2. Peter Bol. "The Late Robert M. Hartwell 'Chinese Historical Studies, Ltd.' Software Project" (PDF). Pnclink.org. Retrieved 2016-12-11.
  3. Anne Gerritsen. "Using the CBDB for the study of women and gender? Some of the pitfalls" (PDF). Humanities.uci.edu. Archived from the original (PDF) on 2017-01-10. Retrieved 2016-12-11.
  4. Michael A. Fuller (February 28, 2015). "The China Biographical Database : User's Guide" (PDF). Projects.iq.harvard.edu. Retrieved 2016-12-11.
  5. "CBDB Querying and Reporting System - Online Help". Db1.ihp.sinica.edu.tw. Retrieved 2016-12-11.