GitHub repository with code for the CMGPD Public Release

We created a repository to share the Stata code that processes the original CMGPD data from the Excel spreadsheets produced by our coders and turns it into the working file that is the basis of our analysis and of the public release available at ICPSR. The repository is intended to help users of the data better understand how the data went from the spreadsheets transcribed by the coders to the datasets available at ICPSR. The code for linking individuals to their kin may be of particular interest.

Major phase of data entry for the China Government Employee Database-Qing Jinshenlu (CGED-Q JSL) completed

In November 2021, our coders completed the entry of virtually all the quarterly editions of the rosters of Qing civil officials 縉紳錄 and military officials 中樞備覧 available to the Lee-Campbell Group, including all the editions from the published Tsinghua University Library collection and other editions from the Columbia University and Harvard University libraries, as well as the National Library and Shanghai Library. We are grateful to the staff of all these libraries, in particular the Columbia University Library, for their cooperation in making their holdings available. We have also located a number of other editions in the Peking University Library and the Palace Museum Library, but do not yet have access to these data. We are not aware of any other readily accessible editions in other collections.

The CGED-Q JSL now consists of 4,433,600 records of 327,618 officials for the period between 1760 and 1912. 3,843,644 are records of civil offices in editions of the jinshenlu and 589,956 are records of military offices in editions of the zhongshubeilan. The data are most complete for the period 1830 to 1912. According to our analysis based on our most recent record linkage, of these officials, 261,451 were civil officials, 58,482 were military officials, and 7,685 made appearances as both civil and military officials. Please note that since these counts of numbers of officials are based on record linkage, they may change as we adjust our nominative linkage procedures.
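The totals above are internally consistent, and a breakdown into civil-only, military-only, and both can be derived from linked records by checking which kinds of rosters each official appears in. The following is a minimal Python sketch under stated assumptions: the `(person_id, source)` record layout, the function name, and the toy records are hypothetical illustrations, not the project's actual data structure or linkage code.

```python
from collections import defaultdict

# Totals as reported in the post.
assert 3_843_644 + 589_956 == 4_433_600      # civil + military records
assert 261_451 + 58_482 + 7_685 == 327_618   # civil-only + military-only + both


def classify_officials(records):
    """Count officials by the kinds of rosters they appear in.

    `records` is an iterable of (person_id, source) pairs, where source is
    "civil" or "military". This layout is an assumption for illustration.
    """
    sources = defaultdict(set)
    for person_id, source in records:
        sources[person_id].add(source)

    counts = {"civil_only": 0, "military_only": 0, "both": 0}
    for seen in sources.values():
        if seen == {"civil"}:
            counts["civil_only"] += 1
        elif seen == {"military"}:
            counts["military_only"] += 1
        elif seen == {"civil", "military"}:
            counts["both"] += 1
    return counts


# Toy records: official 1 appears twice in civil rosters, official 2 only in
# military rosters, official 3 in both.
toy_records = [(1, "civil"), (1, "civil"), (2, "military"),
               (3, "civil"), (3, "military")]
toy_counts = classify_officials(toy_records)
print(toy_counts)
```

Because the identifiers come from record linkage, any change in linkage rules changes how records are grouped into officials, which is why the counts of officials may shift while the counts of records do not.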

Figure 1 (below) summarizes the coverage of the entered 縉紳錄 editions by decade (black bar) and compares it to the potential coverage if all the editions in different collections were entered. In the 1840s, and then from 1870 to 1912, we have entered at least one edition per year. In the 1830s, and then from 1850 to 1869, we have at least one edition entered for 9 out of 10 years in each decade. Between 1800 and 1830, the coverage of our entered data is spottier. We have at least one edition in 7 out of 10 years in the 1800s, 4 out of 10 years in the 1810s, and 6 out of 10 years in the 1820s. From 1760 to 1800, our coverage is less complete, with at least one edition entered every 2 to 4 years per decade.
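A coverage measure of this kind, the number of years per decade with at least one entered edition, can be computed from a list of edition years. The sketch below is only an illustration in Python: the function name and the toy list of years are hypothetical and do not reproduce the actual edition inventory.

```python
from collections import defaultdict


def years_covered_by_decade(edition_years):
    """Return {decade_start: number of distinct years with >= 1 edition}."""
    covered = defaultdict(set)
    for year in edition_years:
        decade_start = year // 10 * 10   # e.g. 1815 -> 1810
        covered[decade_start].add(year)  # a set ignores repeat editions in a year
    return {decade: len(years) for decade, years in sorted(covered.items())}


# Toy input: one entry per entered edition, so a year can appear several times.
toy_edition_years = [1801, 1801, 1803, 1807, 1812, 1815, 1815]
coverage = years_covered_by_decade(toy_edition_years)
print(coverage)
```

Counting distinct years rather than editions matters here, because several quarterly editions from the same year should add only one covered year.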

Figure 1. Entered and Available Editions

Based on our review of the catalogs of other collections, it should still be possible to improve coverage of the last half of the 18th century and first half of the 19th century. The heights of the green bars represent the numbers of years for which at least one edition appears to exist in other collections. Most of these are in the Peking University Library and the Palace Museum. We hope very much to gain access to these collections at some point in the future.

Figure 2 presents a more detailed view of the coverage of the editions so far. From about 1865 onward, we have 3 or 4 editions per year entered all the way to 1911. From 1830 to 1865 or so, we have at least one or two editions per year entered, except for one year each in the 1850s and 1860s where we have no editions at all. Before 1830, it is more common to have one or two editions entered, or none at all.

Figure 2. Entered editions by year

For more details about the CGED, please see the project page.

Addendum – 30 April 2022

Since November 2021, we have found five more editions that had been entered but not added to our central work file. This post and the content of related pages have been updated accordingly.

CGED-Q Jinshenlu 1900-1912 Public Release Tabulation and Visualization Platform

Charlie Liu, an undergraduate in the Quantitative Social Analysis program at HKUST, created a platform for producing tabulations and visualizations with the CGED-Q Jinshenlu 1900-1912 Public Release. At the platform, users can explore the contents of the publicly released CGED-Q for the period 1900-1912 without having to download the data and open it in a statistical package such as R or Stata. Among the available variables are province and county of origin, location of current post, Banner status, and exam or purchase degree (出身). Here is the CGED-Q tabulation and visualization platform.

Our CGED-Q project page has more information about the CGED-Q itself, including links to sites where advanced users can download the data to be analyzed in a statistical package like R or Stata. These sites also include documentation.

As a reminder, if you’re looking for a specific official, the entire CGED-Q is searchable via this platform, originally created by Fi Siwei and housed on a server by the HKUST VisGroup.


New Lee-Campbell Group Dataverse at Harvard Dataverse, with CGED-Q Jinshenlu 1900-1912 Public Release

We have created a Lee-Campbell Group Dataverse at Harvard Dataverse to host the CGED-Q Jinshenlu 1900-1912 Public Release and future public releases. This will complement the sites that already host our public data at the HKUST Dataspace and the Renmin University Institute of Qing History. We hope that this will facilitate access to the data by users in North America and also increase the visibility of the publicly released data.

Lee-Campbell Group Dataverse at Harvard Dataverse

CGED-Q Jinshenlu 1900-1912 Public Release at Harvard Dataverse

Data will continue to be available at the existing sites. Please see the CGED-Q page for links.

China Multi-Generational Panel Dataset Wins a Prize

At the 2020 Chinese Digital Humanities Annual Meeting (中国数字人文年会), the China Multi-Generational Panel Dataset 中国多世代人口数据库 (CMGPD) was awarded the inaugural “最佳题材奖” (Best Subject Matter) Prize. This is the 14th prize or similar recognition awarded to Lee-Campbell Group research projects and our first for something other than a book or article. See Publications and Prizes for a complete list.

CGED-Q 1900-1912 Jinshenlu public release workshop held at Central China Normal University

Participants at the workshop

We held a workshop on July 20-22 at Central China Normal University to introduce the China Government Employee Database-Qing (CGED-Q) Jinshenlu 1900-1912 public release. The workshop was co-organized by the Renmin University Institute of Qing History, the Hong Kong University of Science and Technology Division of Social Science, and Central China Normal University, and the local organizer was the Central China Normal University School of History and Culture. Faculty and students from HKUST, Renmin University, Central China Normal University and other institutions made presentations introducing the public release and other major databases, providing examples of applications, and explaining how to load the data into major statistical packages. The participants included 34 postgraduate students from a variety of institutions in the mainland and elsewhere, and a number of guests from Central China Normal University and other institutions in Wuhan. The program is below.

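Time | Session | Presenter
July 19, 14:00-18:30 | Registration
July 20, 9:00 | Opening ceremony
9:30 | Group photo
9:45 | History, current status, and future of the China Government Employee Database-Qing (hereafter CGED-Q) project | 康文林, 任玉雪
10:15 | Introduction to the Digital Qing History Lab (Qing History Data Sharing Platform) of the Renmin University Institute of Qing History | 胡恒
10:45 | 15-minute break
11:00 | Origins of big-data historical research at Central China Normal University | 马敏, 付海晏
11:30 | Introduction to other related research projects of the Lee-Campbell Group | 李中清, 任韵竹
12:30 | Lunch
14:00 | Introduction to the CGED-Q project, part 1
15:00 | Introduction to the CGED-Q project, part 2
16:15 | Content sources and publication process of the Qing Jinshenlu (缙绅录) | 阚红柳
16:45 | Discussion
17:15 | End
July 21, 8:45 | How to use Stata for quantitative analysis of the CGED-Q | 陈必佳
9:45 | Using Python to analyze the possible career paths of civil officials in the CGED-Q and the common features of careers among civil officials as a group | 王彦邦, 陈煦萌
10:45 | 15-minute break
11:00 | Using RGIS to analyze county-level patterns and changes in the civil service system in the CGED-Q | 张梦迪
12:00 | Guidance on using R, Stata, and Python to analyze the data, with Q&A
12:30 | Lunch
14:00 | Record linkage in the CGED-Q and other related topics
15:00 | Officials with examination degrees and Banner officials in the civil service system | 陈必佳
15:45 | The Qing rule of avoidance | 任玉雪
16:15 | The chong-fan-pi-nan (冲繁疲难) post designations and the appointment of local officials | 胡恒
16:45 | Discussion
17:30 | End
July 22, 9:00 | Participants will be divided into small groups to carry out simple analyses using R, Python, or Stata
10:00 | 15-minute break
10:15 | Sharing and discussion
12:15 | Closing ceremony
12:30 | End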


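马敏      Central China Normal University
彭南生    Central China Normal University
李中清    Hong Kong University of Science and Technology
康文林    Hong Kong University of Science and Technology
付海晏    Central China Normal University
胡恒      Renmin University Institute of Qing History
阚红柳    Renmin University Institute of Qing History
胡迪      Nanjing Normal University
任玉雪    Shanghai Jiao Tong University
陈必佳    Hong Kong University of Science and Technology
任韵竹    Hong Kong University of Science and Technology
陈煦萌    Hong Kong University of Science and Technology
张梦迪    Hong Kong University of Science and Technology
王彦邦    Hong Kong University of Science and Technology
吴艺贝    Shanghai Jiao Tong University
杨莉      Shanghai Jiao Tong University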

Updated version of CGED-Q 1900-1912 Jinshenlu Public Release available for download

We have prepared a new version of the CGED-Q 1900-1912 Jinshenlu Public Release that removes leading and trailing blank spaces from all fields. The blank spaces were introduced during the data entry process and are unnecessary. Users previously had to remove them with the trim command in Stata or the equivalent in R or whatever other package they were using.
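For users working with an older copy of the files, the same cleanup can be approximated in a few lines. The following is a minimal Python sketch using only the standard library; the column names and the toy CSV content are hypothetical, and this is not the procedure we used to produce the new release.

```python
import csv
import io


def strip_fields(rows):
    """Yield each row with leading and trailing whitespace removed from every field."""
    for row in rows:
        # str.strip() with no arguments removes all Unicode whitespace,
        # including the full-width ideographic space (U+3000).
        yield [field.strip() for field in row]


# Toy CSV with stray spaces, including one full-width space after "Zhang San".
raw = io.StringIO(" name ,post \n Zhang San\u3000, magistrate\n")
cleaned = list(strip_fields(csv.reader(raw)))
print(cleaned)
```

Only whitespace at the edges of a field is removed; spaces inside a value, such as the one in "Zhang San", are left untouched.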

We have also prepared a version of the release in which all the column headings/variable names are in pinyin rather than Chinese characters. We learned that R and possibly some other packages have trouble with Unicode variable names.
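Users who start from the Chinese-character version can rename headers themselves before loading the data into a package that expects ASCII variable names. The sketch below is a Python illustration only: the three header names and their pinyin equivalents are hypothetical examples, not the actual variable names in the release, which are listed in the documentation.

```python
# Hypothetical mapping from Chinese-character headers to pinyin names.
# The real variable names are given in the release documentation.
HEADER_TO_PINYIN = {"姓": "xing", "名": "ming", "省": "sheng"}


def rename_headers(header, mapping):
    """Return the header row with mapped names replaced; others pass through."""
    return [mapping.get(name, name) for name in header]


def all_ascii(header):
    """Check that every header name uses ASCII characters only."""
    return all(name.isascii() for name in header)


original_header = ["姓", "名", "省", "id"]
renamed_header = rename_headers(original_header, HEADER_TO_PINYIN)
print(renamed_header, all_ascii(renamed_header))
```

Running the ASCII check after renaming is a quick way to spot any header that the mapping missed.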

The files are available at the usual download sites.


Renmin University:

China Government Employee Database – Qing (CGED-Q) 1900-1912 Jinshenlu records available for download

We have made available a ‘beta’ version of the China Government Employee Database – Qing (CGED-Q) 1900-1912 Jinshenlu public release that includes data and documentation. The release consists of 638,152 records of 50,049 officials (based on our linkage) recorded in 43 quarterly editions. For more details, including links for downloading the data, please visit our CGED-Q Project Page.

The final, formal release will be in October. Until then, we will be updating data and documentation as problems are identified.

2019 Summer Workshop Introducing the 1900-1912 Jinshenlu Public Release

The Lee-Campbell Group at HKUST, in cooperation with the Institute of Qing History at Renmin University and the Institute of History and Culture at Central China Normal University, is organizing a workshop to introduce the first public release from our China Government Employee Database-Qing (CGED-Q).

Civil officials according to whether they are Qiren or civilian, and serving in the capital or outside the capital, between 1900 and 1912. Constructed with the CGED-Q.

The initial release will consist of roughly 600,000 records of 60,000 civil officials who were recorded in the quarterly editions of the jinshenlu (缙绅录) between 1900 and 1912. Along with accompanying documentation, it will be available for download in May 2019 at sites at Renmin University and HKUST. In the coming years, the Lee-Campbell group plans to release all of the data, which at present consists of approximately 3.2 million records.

For additional information about the workshop, please see the announcement at the Renmin University Institute of Qing History website.

For an introduction to the CGED-Q, please see this project page at the Lee-Campbell Group website.

Construction and public release of the CGED-Q database has been supported by RGC GRF 16601718 and 16400114.