Representing data on maps

There are many packages available in R to plot maps and plot data on these maps. This is actually an entire field of study, called GIS (Geographic Information System). Here, we will only scratch the surface of this field.

Let’s start by loading the packages that we will need (you might need to install some of these packages, depending on your current configuration). Here, we are using the package “usmap”. This package does not have the best maintenance istory and will serve as an illustration of what can happens when packages are updated over time…

library(usmap)
library(ggplot2)

Adding points based on geographical coordinates.

In this first exercise, we will add to a usmap a number of points defined by geographical coordinates (longitude and latitude). To illustrate this principle, I will plot the location of the 3 places that I’ve lived in (in the US) with dots of size proportional to the amount of time that I’ve spent in each city.

The three cities are:

  • Bloomington, IN (86.52 West, 39.15 North) - 7 years
  • Scottsdale, AZ (111.89 West, 33.46 North) - 1 year
  • Starkville, MS (88.73 West, 33.35 North) - 7 years

Let’s start by creating a data.frame containing all this information:

places <- data.frame(
  longitude = c(-86.52, -111.89, -88.73),
  latitude = c(39.15,  33.46, 33.35),
  years = c(7, 1, 7),
  labels = c("Bloomington, IN", "Phoenix, AZ", "Starkville, MS")
)

Note how longitudes West of the Greenwich meridian are negative values.
Unfortunately, we can’t simply use the latitude/longitude as our coordinates on the plot. We need to use the usmap_transform function to convert these coordinates into values to match on the usmap.
The “old” (as of 2021) way of using usmap_transform was:

# Note how the following call to the "usmap_transform" function returns an error message.
# Read the documentation for this function and find how to modify the code to make it work...
transformed_data <- usmap_transform(places)

This code used to work with older versions of the usmap package, but not anymore…
For some time, this solution worked:

transformed_data <- usmap_transform(places,
                                    input_names = c("longitude", "latitude"), 
                                    output_names = c("longitude.1", "latitude.1"))

Inspect the error and warning messages obtained from running this code and look into the documentation (or online) for a solution.

Look at the fields inside transformed_data. What do you notice?

Read the official documentation for the package usmap and find the code which will allow you to plot a map looking like this:

As you can see, updates to packages can “break” the code that you have previously written. This is more likely to happen with “non official” packages. Big packages used by many in the ocmmunity try to maintain as much backward compatibility as possible.

So, in this roadmap, we’ll see how to use the more “official” ggplot/maps/mapdata” combo to plot maps (although not as convinient/easy to use as usmap, it is certainly more powerful and less prone to bugs!)

Make sure you have the following packages installed and loaded:

library(ggplot2)
library(maps)
library(mapdata)
library(RColorBrewer)

Examples of maps:

Let’s start with a map of the entire world:

world_map <- map_data("world")
ggplot(world_map, aes(x = long, y = lat, group = group)) +
  geom_polygon(fill="lightgray", colour = "white")

And a map of the US, without any state/county borders drawn:

usa_map <- map_data("usa")
ggplot(usa_map, aes(x = long, y = lat, group = group)) +
  geom_polygon(fill="lightgray", colour = "white")

Adding the states borders and changing the aspect ratio a notch:

usa <- map_data("state")

usa_map <- ggplot(data=usa, mapping=aes(x=long, y=lat, group=group)) + 
  coord_fixed(1.3) + 
  geom_polygon(color="black", fill="gray") + 
  geom_polygon(data=usa, fill=NA, color="white") + 
  geom_polygon(color="black", fill=NA)

plot(usa_map)

Zooming on the state of Mississippi + county borders:

usa <- map_data('state')
ms <- subset(usa, region=="mississippi")

counties <- map_data("county")
ms_counties <- subset(counties, region=="mississippi")

myMap <- ggplot(data=ms, mapping=aes(x=long, y=lat, group=group)) + 
  coord_fixed(1.3) + 
  geom_polygon(color="black", fill="gray") + 
  geom_polygon(data=ms_counties, fill=NA, color="white") + 
  geom_polygon(color="black", fill=NA) + 
  ggtitle('Mississippi Map with Counties')
#  geom_point(data=tr, aes(x=longitude, y=latitude)) + 
#  theme(axis.title.x=element_blank(), axis.text.x=element_blank(), axis.ticks.x=element_blank(),
#        axis.title.y=element_blank(), axis.text.y=element_blank(), axis.ticks.y=element_blank())

plot(myMap)

Adding points based on geographical coordinates.

In this first exercise, we will add to a usmap a number of points defined by geographical coordinates (longitude and latitude). To illustrate this principle, I will plot the location of the 3 places that I’ve lived in (in the US) with dots of size proportional to the amount of time that I’ve spent in each city.

The three cities are:

  • Bloomington, IN (86.52 West, 39.15 North) - 7 years
  • Scottsdale, AZ (111.89 West, 33.46 North) - 1 year
  • Starkville, MS (88.73 West, 33.35 North) - 3 years

Let’s start by creating a data.frame containing all this information:

places <- data.frame(
  longitude = c(-86.52, -111.89, -88.73),
  latitude = c(39.15,  33.46, 33.35),
  years = c(7, 1, 7),
  labels = c("Bloomington, IN", "Phoenix, AZ", "Starkville, MS")
)

Note how longitudes West of the Greenwitch meridian are negative values.

Unfortunately, we can’t simply use the latitude/longitude as our coordinates on the plot. We need to use the usmap_transform function to convert these coordinates into values to match on the usmap.

This time, we do not need to transform the coordinates!

usa <- map_data("state")

usa_map <- ggplot(data=usa, mapping=aes(x=long, y=lat, group=group)) + 
  coord_fixed(1.3) + 
  geom_polygon(color="black", fill="gray") + 
  geom_polygon(data=usa, fill=NA, color="white") + 
  geom_polygon(color="black", fill=NA)

mapWithCities <- usa_map + geom_point(data=places, aes(x=longitude, y=latitude, size=years), col="red", inherit.aes = F)

mapWithCities + geom_text(data = places, aes(label = labels, x = longitude+2, y = latitude+1), inherit.aes = F, col="steelblue")

Take a minute to generate a map of the places you’ve lived in (or places you like most in the US). Try to adjust the location of labels so that they are nicely positioned relatively to the dots on the map.

Saguaro data

Let’s look at some more biologically-relevent data. The file saguaros.csv contains the list of saguaro cactus observations reported in the iNaturalist.org website.

A short note: iNaturalist.org is an app (with corresponding website) allowing users to report observations of fauna and flora. It is sometimes used by researchers.
If you are not familiar with saguaros, they are the most majestic type of cactus, frequently seen in the American South-West. Here is an example:

.

Download the file to your computer, load it in R and generate a map showing the location of every data point in this file. This is what the result should look like:

Notice how data point outside of the US (in Mexico) end up being drawn over Alaska a void region. It’s time to zoom out and plot maps of the entire world…

world_map <- map_data("world")
world_map <- ggplot(world_map, aes(x = long, y = lat, group = group)) +
  geom_polygon(fill="lightgray", colour = "white")

mapWithSaguaro <- world_map + geom_point(data=saguaro, aes(x=longitude, y=latitude), col="red", size=0.1, inherit.aes = F)

plot(mapWithSaguaro)

Exercise (Graded):

First part:

Submit an RMarkdown document with the solution to the first part of this road map. It must successfully plot a map of the US showing at least 2 cities (it can be the places where you have lived, or any other data you would like to show on the map).

Second part:

Download the file aligators-MS.csv which contains the location of all iNaturalist reported observations of alligators in the state of Mississippi. Write some R code to generate a map showing the location of these observations.

First, displaying observations as dots on the map:

And then with a spatial heatmap overlay:

## Warning: The dot-dot notation (`..level..`) was deprecated in ggplot2 3.4.0.
## ℹ Please use `after_stat(level)` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

END OF ROADMAP