The UN Texts as Data Project

Project Description:

In April 2021, Professor Yordán received a small research grant from Drew University’s Digital Humanities Summer Institute to explore the UN General Debate Corpus using different “text-as-data” methodologies (Grimmer and Stewart 2013). Created by Alexander Baturo, Niheer Dasandi, and Slava Mikhaylov (2021), the corpus is a collection of all the statements delivered by the world’s leaders in the UN General Assembly’s General Debate from 1970 to 2020. 

The DHSI team used R and several text analysis packages, including tm, tidytext, and quanteda, to examine these speeches. We also used different topic modeling packages, including stm, which was created to fit structural topic models. The following figure summarizes the team’s work during the institute’s four-week period.

This work led to two important outcomes. Students currently enrolled in Drew’s Semester on the United Nations are using some text-analysis methodologies to analyze UN texts. Instead of using R, they are using the web-based Voyant Tools. Professor Yordán also wrote an academic paper that analyzes the UN General Debate Corpus to explain how states’ shifting attitudes towards terrorism after the attacks of September 11, 2001, changed their views about the UN’s role in counter-terrorism.

Professor Yordán received another small research grant from Drew’s DHSI for summer 2022. In addition to building the World Politics Data Lab’s website, his team is using these text analysis techniques to study not only the UN General Debate Corpus but also other UN texts, such as UN Secretary-General António Guterres’ tweets.

This project will continue exploring the UN General Debate Corpus as well as other UN documents and corpora, including the UN Security Council Corpus (Schoenfeld, Eckhard, Patz, van Meegdenburg, and Pires 2019).

Most Recent Analyses: