How to Create Maps in R with the ggplot2 Package – Part 2

In Part 1 of this tutorial, we learned how to use ggplot2’s geom_polygon function to create a map illustrating the ratification status of the 1966 International Convention on the Elimination of All Forms of Racial Discrimination (ICERD).

In Part 2 of this tutorial, I will explain how to use the geom_sf function. I will also use different packages to add different layers of information to the map. These techniques can be used in any map created with any of ggplot2’s map functions.

In this tutorial, I will recreate a map I developed for one of my recent articles on Puerto Rico following Hurricane Maria. During the review process, one of the reviewers requested the addition of a map showing the hurricane’s path across the island. The article describes the storm’s effects on many of Puerto Rico’s 78 municipalities. Thus, I wanted my final map to include Maria’s path and the borders separating these local jurisdictions.

Getting Data on Hurricane Maria’s Path

Data on Hurricane Maria and other tropical storms in the Atlantic can be found in the National Hurricane Center’s Best Track Data also known HURDAT2. This dataset includes spatial data tracking the location of the storm, maximum winds, central pressure, and so forth for all Atlantic hurricanes starting in 1851 and ending in 2021.

I downloaded this dataset into my working directory and uploaded it into R using the readr package. I then subsetted the dataset for entries connected to Hurricane Maria. Each hurricane has its own unique identifier and after some exploring, I realized that Maria’s “key” was AL152017, representing the 15th tropical storm of the 2017 Atlantic hurricane season.

When I uploaded HURDAT2 into R, it stored the data in an object called “AL”. Using the filter function, I stored all the rows connected to Hurricane Maria in an object I called “df” for dataframe. Before wrangling the data, I first loaded the tidyverse using the library function. The tidyverse has many useful packages, including ggplot2 and dplyr. If you have not installed this package or the other packages used in this tutorial remember that you can use the install.packages function, as discussed in earlier tutorials.

install.packages (tidyverse)

library (tidyverse)

df <- filter (AL, Key == "AL152017")

Because I wanted my map to show the storm’s path over Puerto Rico, I had to further subset the data using dplyr’s slice function. So, I created a new object – “df1” – with the data found in rows 18 through 22.

df1 <- df%>%

The next step was to get a new dataframe to create a map of Puerto Rico.

My First Attempt

To create my first map, I decided to access a map of Puerto Rico, using ggplot2’s map_data function, which allows users to search the package’s collection of maps.

pr <- map_data ("world", "puerto rico")

As described in Part 1 of this tutorial, I used ggplot2 to create this map and to also add a line depicting Hurricane Maria’s path.

map1 <- ggplot (pr, aes(x=long, y=lat, group=group))+
  geom_line (data = df1, mapping = aes (x=Lon, y=Lat, group=NA),
             color = "red", size = 1)+


In this example, we used the geom_polygon function and we added a red line using the geom_line function. The “look” of the map is set using theme_classic(). Also note, that I used the coord_fixed function to set the image’s dimensions.

This map was not bad, but it lacked a few important elements. First, it does not show the island’s municipalities which were the focus of my article. Second, the map is incomplete. While the island of Vieques is included in this map, Culebras, which should be north of Vieques and east of the main island, is not included in this illustration.

My Second Attempt

To get a better map of Puerto Rico, I decided to use the tigris package, which allows users to download “shape files” (e.g., sf) from the United States Census Bureau. After loading the package using the library function, I followed the package’s instructions to access a map of Puerto Rico with its municipalities.

library (tigris)

options(tigris_class = "sf")

pr_sf <- counties (state = "PR", cb = TRUE)

The options function informs R we are searching for a “shape file”. The counties function allows us to search for a map of Puerto Rico (PR) with all the municipalities. To get familiarized with the “pr_sf” dataframe, I recommend that you use the view function. It includes the name of each municipality. It is worth noting names of the municipalities are in Spanish, thus some include accents and tildes. The geometry variable includes all the spatial information to draw the map.

To create the map using ggplot2, we will use the geom_sf function. It is worth noting, that I used the fill function to color the map in grey, using the following hex number ‘#A9A9A9’ and I used the color function to draw the borders of the municipalities in white. Note the information we used to set the map’s geometry.

In Part 1 of this tutorial, I noted that ggplot2 needed to use the group variable in the dataframe the geographic data to draw the map. The “pr_sf” dataframe we generated with the tigris package lacks a “group” or “order” variable. These data are found in the “geometry” column. Thus, it is important to note the changes in the code below.

map2 <- ggplot (data = pr_sf)
  geom_sf (fill = '#A9A9A9', color = "white" , aes (geometry = geometry, fill=NA))


In comparison to the previous map, this map includes all the municipalities as well as the island of Culebras.

To draw a line depicting Hurricane Maria’s path, we use the geom_line function as we did in the first attempt. I added a few parameters to improve the line’s final look: “size” establishes the thickness of the line and “alpha” its transparency.

map2 <- ggplot (data = pr_sf)+
  geom_sf (fill = '#A9A9A9', color = "white" , aes (geometry = geometry, fill=NA))+
  geom_line (data = df1, mapping = aes (x=Lon, y=Lat, group=NA), color = "red", size = .80, alpha = .50)


Now we can go ahead and work on the map’s final look. For my article, I used the tufte theme for all my infographics, which is part of the ggthemes package and we need to load it using the library function.

library (ggthemes)

map2 <- ggplot (data = pr_sf)+
  geom_sf (fill = '#A9A9A9', color = "white" , aes (geometry = geometry, fill=NA))+
  geom_line (data = df1, mapping = aes (x=Lon, y=Lat, group=NA), color = "red", size = .80, alpha = .50)+
  labs (title = "Hurricane Maria's Path Across Puerto Rico",
        caption = "Source: National Hurricane Center, HURDAT2 (2021)" )+
   theme(plot.title = element_text(size = 12, family = "serif",      
                                   face="bold", hjust=0.0))+
  theme(axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks = element_blank(),
        rect = element_blank())


Notice how we added a labs layer, which allows us to set a “title” and a “caption”. The other theme elements erased the longitudinal and latitudinal data. We also set the font size and font type, while justifying the title on the left side of the illustration.

What if we wanted to add a few other layers to this map? What if we wanted to color the municipalities by the number of customers with electricity on March 1, 2018?

Adding New Information to the Second Map

Hurricane Maria destroyed Puerto Rico’s electrical system. Many of the island’s towns were without electricity for over 6 months. On March 1, 2018, officials working for the US Army Corps of Engineers prepared a map detailing the number of customers with electricity in each of the island’s municipalities. Using the second map as a starting point, we can use these data to create a choropleth map. The first step was to upload the data from one of my Google Sheets file into R using the readr package and then we assigned these data to a new dataframe I called “df_aee”.

df_aee <- USACE_AEE_ESTIMATE_March2018_municipio_region_estimate_2_

Then I used dplyr’s left_join function to merge the “pr_sf” dataframe we develop using the tigris pacakge with the “df_aee”. As noted in the code, we joined the files using the “NAME” column found “pr_sf” and the “municipio” column found in “df_aee”. It is important to note that the names of the municipalities in both dataframes were written the same. I called this new dataframe: “pr_sf_aee”.

pr_sf_aee <- left_join(pr_sf, df_aee, by = c("NAME" = "municipio"))

The code that we used for the second map was used for the third map. The only difference is that I used the “pr_sf_aee” dataframe instead of the “pr_sf” dataframe. I also changed the justification of the caption. The color scheme is set by “scale_fill_gradient2_tableu()”, which is part of the ggthemes pakcage, which we loaded already.

map3 <- ggplot (data = pr_sf_aee)+
  geom_sf (color = "white" , aes (geometry = geometry, fill= abonados_perc))+
  geom_line (data = df1, mapping = aes (x=Lon, y=Lat, group=NA), color = "red", size = .80, alpha = .5)+
  labs (title = "Percentage of Customers with Electricity\nBy Municipality on March 1, 2018",
        caption = ,
        fill= "Customers with\nElectricity")+
   theme(plot.title = element_text(size = 12, family = "serif",      
                                   face="bold", hjust=0.0))+
  theme(axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks = element_blank(),
        rect = element_blank())+
   theme (legend.position = "right",
         legend.title = element_text(color = "black", size=10, family = "serif", face="bold"),
         legend.text = element_text (color = "black", size =8, family = "serif"),
         legend.key.size = unit(0.5, 'cm'),
        legend.key.height = unit(0.5, 'cm'),
        legend.key.width = unit(0.5, 'cm'),
        plot.caption = element_text (size=8, family="serif", face= "italic", hjust=0)+


This map illustrates how the hardest hit municipalities were located on the southeastern coast, where Hurricane Maria made landfall and the island’s mountainous interior. Close to six months after Hurricane Maria made landfall, many towns on the island had more than 50% of customers without electricity.

Concluding thoughts

A final tutorial on creating maps in R with the ggplot2 package using the geom_map function will be uploaded in a few weeks. For now, it is important to understand that whether we are using the geom_polygon or geom_sf functions, creating maps with the ggplot2 package requires at least two dataframes. One with the spatial data to draw the boundaries of the map and the other one with the data we want to layer on the map. More dataframes can be used.

While working with geographical data is not easy, every student should learn how to create choropleth maps. R has many packages that can help us develop these skills. In this series of tutorials, I have explained some of the basic steps on how to create maps in R with the ggplot2 package. The code used in these examples can be appropriated, reworked, and used to develop other maps.

About the author:

Carlos L. Yordán is an Associate Professor of International Relations at Drew University. He is also the director of the Semester on the United Nations.

Leave a comment

Your email address will not be published. Required fields are marked *