How to Create Maps in R with the ggplot2 Package – Part 1

Working with geographical data is not easy. But learning how to create choropleth maps is an important skill. In this tutorial, we will learn how to create maps in R with the ggplot2 package.

The ggplot2 package is informed by Leland Wilkinson’s grammar of graphics. To get familiarized with this philosophy and ggplot2’s layering system, you can read the following tutorial. This package includes different geometric layers to create maps.

In Part 1 of this tutorial, we will be working with the geom_polygon function. In the second tutorial, we will use the geom_sf function, and in the third, ggplot2’s geom_map function.

When creating maps, the main challenge is combining a dataset that holds the information we want to map with a dataset that holds the geographic information (i.e., spatial data). To put this challenge into perspective, in this tutorial I will explain how I created a map that describes the ratification status of the 1966 International Convention on the Elimination of All Forms of Racial Discrimination (ICERD). This map was part of a post I co-authored illustrating the number of United Nations (UN) member states that have ratified the core UN human rights treaties and their protocols. Once we developed the R code for this map, we reused it to create the other 17 maps. In other words, we created 18 dataframes with data on the ratification status of each treaty and protocol, but we used one dataframe with spatial data to develop all the maps.

The ability to reuse existing code for similar projects streamlines our workflows. This is one of R’s strengths.

Data on UN Member States’ Ratification Status

The data on the ratification status of the core UN human rights instruments can be found in the World Politics Data Lab’s GitHub repository. While we will be using the data on ICERD’s ratification status in this tutorial, all the datasets include the following variables:

  • iso2 – The ISO2 code for the UN member state.
  • member – The name of the UN member state, usually in English.
  • state party – A dichotomous variable: 1 means the member state has ratified the treaty.
  • year ratification or accession – The year the member state ratified or acceded to the convention.
  • signatory – A dichotomous variable that captures whether a state that has not ratified the treaty has signed it.
  • no action – A dichotomous variable: 1 means the state has neither signed nor ratified the convention.
  • status – The UN member state’s status regarding the convention. It takes one of three values: state party, signatory state, or no action.
  • notes – Text explaining a particular observation.

It is important to note that the “status” variable is the one that is coded according to tidy data principles and the one we will use to create our maps. The dummy variables are not needed for this analysis.
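
To see how the single “status” column relates to the three dummy variables, here is a minimal sketch. The three rows and the object name “toy” are made up for illustration and are not taken from the real dataset. The column names are written the way R’s read.csv function renders them, with spaces turned into dots.

```r
# Toy data: three hypothetical member states, one per possible status
toy <- data.frame(
  member      = c("A", "B", "C"),
  state.party = c(1, 0, 0),
  signatory   = c(0, 1, 0),
  no.action   = c(0, 0, 1)
)

# Collapse the three dummy variables into one categorical column
toy$status <- ifelse(toy$state.party == 1, "state party",
              ifelse(toy$signatory == 1, "signatory state", "no action"))

toy$status
```

Because each row has exactly one dummy set to 1, the single column carries the same information, which is why it is the only one the map needs.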

The ISO2 code was created by the International Organization for Standardization, which has also created a three-letter code for states.

Finally, I extracted the data for ICERD’s ratification status from the UN Treaty Collection’s webpage. The data for the other treaties and protocols were collected by Júlia Oliveira Souza and Jeehae G. Park. We used Google Sheets to construct these datasets and, once they were complete, we uploaded them to the GitHub repository.

From this point, we can go into RStudio to load the ICERD data and use various packages to create our map. Using R’s library function, let’s load the following packages. If you have not installed these packages, please do so using the install.packages function.

library (tidyverse)
library (countrycode)

We can now upload the data from the GitHub repository. We will create an object called “hr” for human rights and we will use the read.csv function to assign the data to this object.

hr <- read.csv ("https://raw.githubusercontent.com/world-politics-datalab/humanrights-treaties-ratification-status/main/hr_status_icerd.csv")

After some experimentation, I found that the original data did not include information about Greenland. Given that this is a Danish territory and that the map looked odd without this island, I decided to assign Denmark’s ratification status to this territory. It is important to note that this is not a necessary step.

I then created a new object called “hr_gl”, using the tibble package’s add_row function to add the following information.

hr_gl <- hr %>%
  add_row(iso2 = "GL", member= "Greenland", state.party = 1, year.ratification.or.accession= 1971, signatory=0, no.action=0,status="state party")

For the sake of simplicity, I assigned the “hr_gl” object back to the existing “hr” object. Then I used the view function to look at the dataframe, to make sure that it includes data for Greenland and that R read all the data correctly.

hr <- hr_gl
view (hr)

After reviewing the dataframe, I noticed that R is reading Namibia’s ISO2 code, which is NA, as “NA”, R’s marker for a missing value. We could change the information manually using the replace function. But I decided to use the following step.

hr$iso2 <- countrycode (hr$member, origin = "country.name", destination = "iso2c")

I used the countrycode package to regenerate the ISO2 code for all the countries listed in the “hr” dataframe from their names. This step makes sure that all the ISO2 codes are entered correctly, including Namibia’s. This is a very useful package and all international relations students should become familiar with its functions!
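
An alternative, which is not the approach used in this post, is to stop read.csv from treating the text NA as a missing value in the first place. The sketch below uses a small inline CSV with made-up rows rather than the real file, and it assumes that genuinely missing cells are left empty instead of being written as NA.

```r
# A tiny inline CSV standing in for the real file (hypothetical rows)
csv_text <- "iso2,member,status\nNA,Namibia,state party\nDK,Denmark,state party\n"

# Default behaviour: the text NA is read as a missing value
default_read <- read.csv(text = csv_text)
is.na(default_read$iso2[1])

# Treat only empty cells as missing, so Namibia's code survives as text
safe_read <- read.csv(text = csv_text, na.strings = "")
safe_read$iso2[1]
```

The na.strings argument is part of base R’s read.csv function. Whether this option suits the real dataset depends on how its empty cells are coded, so the countrycode step above remains the safer choice here.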

Creating a New Dataframe to Generate a Map of the World

We are going to create a new object called “wmap” which will include the spatial data to generate a basic map of the world. We can access this geographic information using ggplot2’s map_data function. The head function will show us the first rows of the dataset, which includes more than 99,000 observations.

wmap <- map_data ("world")
head (wmap)
       long      lat group order region subregion
      <dbl>    <dbl> <dbl> <dbl> <chr>  <chr>
1 -69.89912 12.45200     1     1 Aruba  NA
2 -69.89571 12.42300     1     2 Aruba  NA
3 -69.94219 12.43853     1     3 Aruba  NA
4 -70.00415 12.50049     1     4 Aruba  NA
5 -70.06612 12.54697     1     5 Aruba  NA
6 -70.05088 12.59707     1     6 Aruba  NA

It is worth noting that the region is basically the name of the territory or country. This dataset does not include data for Aruba’s internal divisions, thus it has put an “NA” value for subregion. To draw the map, ggplot2 will use the group and order variables to draw the lines between different points, determined by the spatial information stored in the longitude and latitude variables.

We can use ggplot2 to generate the map. We will call the map object “worldplot”. We will use the geom_polygon function to create the map and coord_fixed function to set the map’s aspect ratio.

worldplot <- ggplot()+
  geom_polygon(data=wmap, aes(x=long, y=lat, group = group))+
  coord_fixed(1.3)

worldplot

Usually, when we generate graphs in ggplot2, we enter the data and aesthetics in the first line, between the parentheses following “ggplot”. We will use this structure when we create the final map. For now, we entered them in the geometric layer. It is also important to highlight that when creating maps with geom_polygon we always need to specify “group = group”. The group aesthetic tells ggplot2 which points belong to the same polygon. Without it, the points of every country and island would be connected into a single shape.
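
The effect of the group aesthetic is easier to see on a tiny example. The following sketch uses two made-up squares instead of the world map. The dataframe “squares” and the plot names are illustrative assumptions.

```r
library(ggplot2)

# Two hypothetical squares; "group" says which rows form each polygon
squares <- data.frame(
  long  = c(0, 1, 1, 0,  2, 3, 3, 2),
  lat   = c(0, 0, 1, 1,  0, 0, 1, 1),
  group = c(1, 1, 1, 1,  2, 2, 2, 2)
)

# With the group aesthetic, each square is drawn as its own polygon
grouped <- ggplot(squares, aes(x = long, y = lat, group = group)) +
  geom_polygon(fill = "grey60", colour = "black")

# Without it, all eight points are joined into one shape
ungrouped <- ggplot(squares, aes(x = long, y = lat)) +
  geom_polygon(fill = "grey60", colour = "black")

grouped
```

Printing “ungrouped” instead shows a stray band connecting the two squares, which is the same distortion a world map suffers when the group aesthetic is left out.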

To generate our final map, we need to create a new dataframe with the spatial data and the data on ICERD’s ratification status. We also have to add borders to each country and then color each country according to its ratification status. Once we complete these steps, we can work on other features of the map’s theme, which will help us polish the illustration’s final “look”.

Merging the Spatial Data With Data on ICERD’s Ratification Status

We call the new dataframe “wmap_hr”. But before we create this new object, we will add a new column to the “wmap” dataframe. This column will include the ISO2 code for all the listed states.

wmap$iso2 <- countrycode (wmap$region, origin = "country.name", destination = "iso2c")

head(wmap)

Although not shown here, the head function shows the new column with the ISO2 codes.
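
It can be worth checking which region names countrycode could not match, since those rows end up without an ISO2 code. The sketch below uses a short made-up vector rather than the full map data. “Atlantis” is an invented name standing in for an unmatched region.

```r
library(countrycode)

# "Atlantis" is a made-up name that countrycode should not be able to match
regions <- c("Aruba", "Namibia", "Atlantis")

# warn = FALSE silences the warning about unmatched names
codes <- countrycode(regions, origin = "country.name",
                     destination = "iso2c", warn = FALSE)

# Names left without a code
regions[is.na(codes)]
```

Applied to the map data, the equivalent check would be unique(wmap$region[is.na(wmap$iso2)]).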

Now we can merge the data on ratification status in the “hr” dataframe into the “wmap” dataframe using the dplyr package’s left_join function. We store this new dataframe in the following object: “wmap_hr”. We will be using the “iso2” columns to join these two dataframes. This is the reason we added a new ISO2 column to the “wmap” dataframe.

wmap_hr <- wmap %>%
  left_join (hr, by = c("iso2" = "iso2"))

If we use the view function to explore this new dataframe, we can see that there are a lot of missing values. The map data includes many jurisdictions that have an ISO2 code but are not recognized as sovereign states or are territories of other states, such as Aruba. These rows have no match in the “hr” dataframe, so their ratification columns are filled with “NA”. We can use the following code to get rid of all the rows that have an “NA” value in the “status” column. As noted above, this variable describes the UN member state’s ratification status. This categorical variable includes three options: state party, signatory state, or no action.

wmap_hr <- filter(wmap_hr, !is.na(status))
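
The join and the filter can be traced on a toy example. In this sketch, the dataframes “toy_map” and “toy_hr” and their values are made up for illustration. They mimic the shape of “wmap” and “hr” but are much smaller.

```r
library(dplyr)

# Hypothetical map rows: one point for Aruba, two for Denmark
toy_map <- data.frame(
  region = c("Aruba", "Denmark", "Denmark"),
  iso2   = c("AW", "DK", "DK"),
  long   = c(-69.9, 9.5, 10.0),
  lat    = c(12.5, 55.0, 56.0)
)

# Hypothetical status data: only Denmark is listed
toy_hr <- data.frame(iso2 = "DK", status = "state party")

# left_join keeps every map row; rows without a match get NA in "status"
joined <- left_join(toy_map, toy_hr, by = "iso2")

# Keep only the rows that found a status
kept <- filter(joined, !is.na(status))

# Regions that the filter removes
dropped <- unique(joined$region[is.na(joined$status)])
dropped
```

Listing the dropped regions before filtering is a quick way to confirm that only non-member jurisdictions disappear from the map.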

Using ggplot2 to Create the Final Map

To learn ggplot2’s layering system, the code will be divided into different parts.

In order to generate a map of the world with borders, we will start with the following code. Note that we are calling this object: “final_map”.

final_map <- ggplot(wmap_hr, aes(long, lat, group = group)) +
  coord_fixed(1.3)+
  geom_polygon(data =wmap_hr, colour = "white", fill = NA)

The geom_polygon function is used to add the borders to the map, and I have painted them in white. Note that we will not add a fill at this point. We will do so with another geom_polygon layer. We will also add a title to the map, using the ggtitle function.

final_map<- ggplot(wmap_hr, aes(long, lat, group = group)) +
  coord_fixed(1.3)+
  geom_polygon(aes(fill = status))+
  geom_polygon(data =wmap_hr, colour = "white", fill = NA) +
  ggtitle("International Convention on the Elimination of All Forms\nof Racial Discrimination (ICERD)")

It is important to add that the geom_polygon layer that includes the data on “status” of ratification should go before the one that produces the borders. ggplot2 draws layers in the order they are added, so a fill layer added last would be painted over the white borders. Thus, the order of the layering can affect the look of a map, and this is one of the more frustrating aspects of creating maps with ggplot2.

While this new map contains all the necessary information, we can add new layers to improve the “look” of the final map.

I am not a big fan of this color scheme. To set my own colors, I created a named vector with a hex color code for each of the “status” variable’s three options. There are many websites that include a list of these codes. Here is a Wikipedia entry that I found useful with some sample hex codes.

cols <- c("state party" = "#0072B5FF", "signatory state" = "#E18727FF", "no action" = "#BC3C29FF")
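
When the vector passed to scale_fill_manual is named, ggplot2 matches its names against the values of the fill variable, so the names must be spelled exactly like the values in “status”. The following sketch shows a quick check. The vector “status_values” is a made-up stand-in for the real column.

```r
cols <- c("state party" = "#0072B5FF",
          "signatory state" = "#E18727FF",
          "no action" = "#BC3C29FF")

# Hypothetical stand-in for the values found in the status column
status_values <- c("state party", "no action", "state party", "signatory state")

# Look up the colour assigned to each value
unname(cols[status_values])

# Values without a colour; an empty result means every status is covered
uncovered <- setdiff(unique(status_values), names(cols))
uncovered
```

With the real data, the same check would use unique(wmap_hr$status) in place of “status_values”.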

Now we add the following layers to our existing code. The scale_fill_manual function added the information for colors and the theme_minimal function changed the background.

final_map<- ggplot(wmap_hr, aes(long, lat, group = group)) +
  coord_fixed(1.3)+
  geom_polygon(aes(fill = status))+
  geom_polygon(data =wmap_hr, colour = "white", fill = NA) +
  ggtitle("International Convention on the Elimination of All Forms\nof Racial Discrimination (ICERD)")+
  scale_fill_manual(values= cols)+
  theme_minimal ()

This map is looking better, but we can remove the grid of coordinates and change the position of the legend, the size of the fonts, the font style, and so forth. These changes were introduced using the theme layers.

ggplot(wmap_hr, aes(long, lat, group = group)) +
  coord_fixed(1.3)+
  geom_polygon(aes(fill = status))+
  geom_polygon(data =wmap_hr, colour = "white", fill = NA) +
  ggtitle("International Convention on the Elimination of All Forms\nof Racial Discrimination (ICERD)") +
  scale_y_continuous()+
  scale_fill_manual(values= cols) +
  theme_minimal ()+
  theme(axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks = element_blank(),
        rect = element_blank())+
  theme(plot.title = element_text(size = 12, family = "sans", face="bold", hjust=0.5))+
  theme(legend.position = "bottom",
        legend.title = element_text(color = "black", size = 10, family = "sans"),
        legend.text = element_text(color = "black", size = 8, family = "sans"),
        legend.key.size = unit(0.3, 'cm'),
        legend.key.height = unit(0.3, 'cm'),
        legend.key.width = unit(0.3, 'cm'))+
  labs(fill = "Status of Ratification:",
       x = "",
       y = "")+
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank())

This code generates the final map.

To save the final map, we can use the ggsave function. This function can save the image in different formats: PNG, JPEG, TIFF, PDF, SVG, etc. By default, it saves the last plot that was displayed. We will save this map using the following name: “icerd_map.png”. The other arguments set the image’s width, height, units, and resolution (dpi).

ggsave ("icerd_map.png", width = 6, height = 4, units = "in", dpi = 300)
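
Relying on the last displayed plot can save the wrong image if another plot was printed in between. A more explicit option is to pass the plot object through ggsave’s plot argument. The sketch below uses a made-up one-square plot called “toy_plot” and writes to a temporary folder. Both are assumptions for illustration.

```r
library(ggplot2)

# A hypothetical stand-in for the final map: a single square
toy_plot <- ggplot(data.frame(x = c(0, 1, 1, 0), y = c(0, 0, 1, 1)),
                   aes(x, y)) +
  geom_polygon()

# Save to a temporary folder so the working directory stays clean
out_file <- file.path(tempdir(), "toy_map.png")

# Name the plot explicitly instead of relying on the last one displayed
ggsave(out_file, plot = toy_plot, width = 6, height = 4,
       units = "in", dpi = 300)

file.exists(out_file)
```

For the map in this post, that would mean assigning the full ggplot call to an object first and then passing that object to ggsave.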

Concluding Thoughts

There are many ways to create maps in R with ggplot2. In Part 1 of this tutorial, we learned how to use ggplot2’s geom_polygon function to create a map showing the ratification status of ICERD. The code used in this post can be used to generate maps showing the ratification status of the other UN human rights instruments included in the GitHub repository. This example can also be applied to other treaties found in the UN Treaty Collection or any type of country-based data, such as the ones that inform the World Bank’s World Development Indicators (WDI).
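
One way to reuse the code across treaties is to wrap the join, the filter, and the core layers in a function. The sketch below is only an illustration: the function name “make_status_map” and the toy inputs are hypothetical and do not come from the original project. It assumes both dataframes have an “iso2” column and that the status dataframe has a “status” column, as in this post.

```r
library(ggplot2)
library(dplyr)

# Hypothetical helper: joins map data to status data and draws the core layers
make_status_map <- function(map_df, status_df, title) {
  map_df %>%
    left_join(status_df, by = "iso2") %>%
    filter(!is.na(status)) %>%
    ggplot(aes(long, lat, group = group)) +
    coord_fixed(1.3) +
    geom_polygon(aes(fill = status)) +              # fill first
    geom_polygon(colour = "white", fill = NA) +     # borders on top
    ggtitle(title)
}

# Toy inputs: one made-up square "country" and its status
toy_map <- data.frame(long = c(0, 1, 1, 0), lat = c(0, 0, 1, 1),
                      group = 1, iso2 = "DK")
toy_status <- data.frame(iso2 = "DK", status = "state party")

toy_plot <- make_status_map(toy_map, toy_status, "Toy treaty")
toy_plot
```

With the real data, each of the treaty dataframes could be passed in turn as the second argument, with “wmap” as the first, and the theme layers added to the returned plot.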

In the next tutorial, we will create maps using the geom_sf function.

To learn more about creating maps in R with ggplot2 using the geom_polygon function, I recommend the following two resources:

Elena Tuzhilina, “PSet 9: Making Maps with R” (October 19, 2021).

Paul Christiansen’s YouTube video: R Tutorial: Creating Maps and Mapping Data with ggplot2 (March 16, 2021)

About the author:

Carlos L. Yordán is an Associate Professor of International Relations at Drew University. He is also the director of the Semester on the United Nations.
