Mapping In R Part 1

8 minute read

Published:

Making maps with ggplot2 and dggridr

Mapping in R

The purpose of this document is to help people get a jump start with mapping in R using dggridR and other helpful packages. If you have questions/comments, I would appreciate your input! btonelli (at) ucla [.] edu

First off, packages. You’ll need dplyr, devtools, and an older version of dggridR from the CRAN archive (the most recent version has some issues, or I don’t know how to use it correctly…). To do this, make sure you have devtools installed. You can skip this step once everything is installed.

library(devtools)
library(dplyr)
library(ggplot2)
library(maps)
library(mapproj)
#Uncomment and run the line immediately below if you haven't already!!
#install_version("dggridR", version = "2.0.4", repos = "http://cran.us.r-project.org")
library(dggridR)

Making up some data

Next up, we will make some fake spatial data to work with. This will be data frame with 4 columns: lon, lat, species and beak length. We’ll simulate location and trait data for 3 fake species in North America. Each species will have a different spatial distribution and beak length.

wp_lat <- rnorm(1000,40,4)
warbler_lon <- rnorm(500,-110,2)
fake_sp_data <- rbind(data.frame(lon=rnorm(1000,-100,6),
                           lat=wp_lat,
                           species = rep("Drab Woodpecker",1000),
                           beak_length = rnorm(1000,(5+(wp_lat-40)/4),.5)),
                      data.frame(lon=warbler_lon,
                           lat=rnorm(500,35,2),
                           species = rep("Plain Warbler",1000),
                           beak_length = rnorm(500,3+(warbler_lon+110)/10,.25)),
                      data.frame(lon=rnorm(100,-90,5),
                           lat=rnorm(100,45,2),
                           species = rep("Fire-breasted Trainbearer",1000),
                           beak_length = rnorm(100,1.5,.1)))

fake_sp_data %>% group_by(species) %>% summarise(mean_beak_length = mean(beak_length))
## # A tibble: 3 × 2
##   species                   mean_beak_length
##   <chr>                                <dbl>
## 1 Drab Woodpecker                       4.99
## 2 Fire-breasted Trainbearer             1.51
## 3 Plain Warbler                         2.99

OK, so now lets do some basic plots to see where these birds are. In base R, we can see what it looks like if we just plot these birds out.

plot(fake_sp_data$lon,fake_sp_data$lat)

Not super helpful! Lat and lon automatically scale, and it’s hard to get a sense of where these birds are across the continent. So next, let’s try to get the points plotted on a map. To do this, we will use ggplot2 and the function “map_data”

#First import country data. 
#Because this data is located in the western hemisphere, we will filter to countries with longitudes less than -50 degrees
countries <- map_data("world") %>% filter(long < -30)

#Now plot using ggplot
basic_map1 <- ggplot() +
    #This line adds the countries in, and colors the interior and border
    geom_polygon(data=countries, aes(x=long, y=lat, group=group), fill="grey90", color="darkgrey")
plot(basic_map1)

So now we have a basic map, but it looks…. bad. So let’s change it to a more attractive projection, narrow the spatial area and make the plot have less gridlines and such.

basic_map2 <- ggplot() + 
    # This will limit the range to reasonable coordinates and reproject to a more attractive projection
    coord_map("mollweide",xlim=c(-140,-55),ylim=c(10,70)) + 
    geom_polygon(data=countries, aes(x=long, y=lat, group=group), fill="grey90", color="darkgrey") +
    #All this stuff just makes things look neater
    theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
    theme(axis.ticks.x = element_blank(),axis.text.x = element_blank(),
          axis.ticks.y = element_blank(),axis.text.y = element_blank(),
          axis.title.x=element_blank(),axis.title.y=element_blank(),
          plot.margin = unit(c(0, 0, 0, 0), "null"),
          panel.spacing = unit(c(0, 0, 0, 0), "null")) +
    theme(panel.background = element_rect(fill = alpha("white",.5)))
print(basic_map2)

Wow! So much better. Now let’s plot our points

data_map1 <- ggplot() + 
    coord_map("mollweide",xlim=c(-140,-55),ylim=c(10,70)) + 
    geom_polygon(data=countries, aes(x=long, y=lat, group=group), fill="grey90", color="darkgrey") +
    #This line will add all of our points in. We will adapt it as we go along
    geom_point(data=fake_sp_data,aes(x=lon,y=lat)) +
    theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
    theme(axis.ticks.x = element_blank(),axis.text.x = element_blank(),
          axis.ticks.y = element_blank(),axis.text.y = element_blank(),
          axis.title.x=element_blank(),axis.title.y=element_blank(),
          plot.margin = unit(c(0, 0, 0, 0), "null"),
          panel.spacing = unit(c(0, 0, 0, 0), "null")) +
    theme(panel.background = element_rect(fill = alpha("white",.5)))
data_map1

Great! We’re on the right track. Now let’s try to make this a little more appealing by shrinking the points and adding colors that represent species.

#We have three groups so let's assign three colors to represent the species. (because the base colors are trash)
sp_colors <- c("tan4","skyblue2","orange2")
data_map2 <- ggplot() + 
    coord_map("mollweide",xlim=c(-140,-55),ylim=c(10,70)) + 
    geom_polygon(data=countries, aes(x=long, y=lat, group=group), fill="grey90", color="darkgrey") +
    #We add two more arguments now, color and size
    geom_point(data=fake_sp_data,aes(x=lon,y=lat,color=species),size=.25) +
    scale_color_manual(values = sp_colors) +
    theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
    theme(axis.ticks.x = element_blank(),axis.text.x = element_blank(),
          axis.ticks.y = element_blank(),axis.text.y = element_blank(),
          axis.title.x=element_blank(),axis.title.y=element_blank(),
          plot.margin = unit(c(0, 0, 0, 0), "null"),
          panel.spacing = unit(c(0, 0, 0, 0), "null")) +
    theme(panel.background = element_rect(fill = alpha("white",.5)))
data_map2

Note how we get a legend automatically too.

So now you are able to plot datapoints on a map with unique colors by group. Let’s try something else, plotting colors representing beak size.

You may have noticed I programmed in some variation to the fake data, so that beak size tends to vary for two of the species across space. Let’s focus on one species, and plot variation in beak size across space.

woodpecker_data <- fake_sp_data %>% filter(species == "Drab Woodpecker")
data_map3 <- ggplot() + 
    coord_map("mollweide",xlim=c(-140,-55),ylim=c(10,70)) + 
    geom_polygon(data=countries, aes(x=long, y=lat, group=group), fill="grey90", color="darkgrey") +
    #Here we will change the color argument to beak_length, a continuous variable
    geom_point(data=woodpecker_data,aes(x=lon,y=lat,color=beak_length),size=.25) +
    #Also note I removed the manual color scale here, and replaced it with a flexible
    #function to add a custom color scale. We can use this to set colors on a gradient,
    #customizing the colors at the low, mid and high points. We also want to set the midpoint
    #to the mean beak length for clarity.
    scale_color_gradient2(low="tan",mid="grey",high="forestgreen",midpoint = mean(woodpecker_data$beak_length)) +
    theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
    theme(axis.ticks.x = element_blank(),axis.text.x = element_blank(),
          axis.ticks.y = element_blank(),axis.text.y = element_blank(),
          axis.title.x=element_blank(),axis.title.y=element_blank(),
          plot.margin = unit(c(0, 0, 0, 0), "null"),
          panel.spacing = unit(c(0, 0, 0, 0), "null")) +
    theme(panel.background = element_rect(fill = alpha("white",.5)))
data_map3

That look’s ok, but let’s try something else to make things look really good!

Enter dggridR. This package will let us summarise spatial data on an equal-area hexagonal grid. This is helpful for a number of reasons. First off, the hexagons look cool. Also, it’s problematic to group spatial data by un-equal spatial units. Think 1 by 1 lat long grid cells. They are radically different in size depending on where you are on the Earth’s surface. So don’t do it!

We are going to group our woodpecker data into hexagonal grids and see if there are differences across space in the average beak length.

#This code creates grids according to a certain resolution
hexgrid6 <- dgconstruct(res = 6)

# Now let's take our woodpecker lat longs and figure out which cell they each belong to
# Just ignore how annoying and unintuitive this function is... Just save the code somewhere
woodpecker_data$cell <- dggridR::dgGEO_to_SEQNUM(hexgrid6, 
                                           in_lon_deg = woodpecker_data$lon, 
                                           in_lat_deg = woodpecker_data$lat)[[1]]

#Now let's find the average beak length in each cell
wp_by_cell <-  woodpecker_data %>% 
  group_by(cell) %>% 
  summarise(mean_bl = mean(beak_length))

#We're now going to take our average beak length data and merge it with the hexagon grids
# to get it in the right format for plotting.
grid  <- dgcellstogrid(hexgrid6,wp_by_cell$cell,frame=TRUE,wrapcells=TRUE)
grid  <- merge(grid,wp_by_cell,by.x="cell",by.y="cell")

#This format encodes the location of each point of the hexagon, so you'll see cells each have 6 rows
head(grid)
##   cell      long      lat order  hole piece  group  mean_bl
## 1 3671 -84.63027 29.04292     1 FALSE     1 3671.1 2.777139
## 2 3671 -85.51548 27.85954     2 FALSE     1 3671.1 2.777139
## 3 3671 -87.06069 28.20538     3 FALSE     1 3671.1 2.777139
## 4 3671 -87.77260 29.71763     4 FALSE     1 3671.1 2.777139
## 5 3671 -86.91910 30.91640     5 FALSE     1 3671.1 2.777139
## 6 3671 -85.31770 30.58166     6 FALSE     1 3671.1 2.777139
#Now let's take that data and plot it
data_map4 <- ggplot() + 
    coord_map("mollweide",xlim=c(-140,-55),ylim=c(10,70)) + 
    geom_polygon(data=countries, aes(x=long, y=lat, group=group), fill="grey90", color="darkgrey") +
    geom_polygon(data=grid,      aes(x=long, y=lat, group=group, fill=mean_bl), alpha=0.9)    +
    geom_path   (data=grid,      aes(x=long, y=lat, group=group), alpha=0.4, color="white") +
    #Not the subtle change here to "scale_fill_gradient2"
    scale_fill_gradient2(low="tan",mid="grey",high="forestgreen",midpoint = mean(grid$mean_bl)) +
    theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
    theme(axis.ticks.x = element_blank(),axis.text.x = element_blank(),
          axis.ticks.y = element_blank(),axis.text.y = element_blank(),
          axis.title.x=element_blank(),axis.title.y=element_blank(),
          plot.margin = unit(c(0, 0, 0, 0), "null"),
          panel.spacing = unit(c(0, 0, 0, 0), "null")) +
    theme(panel.background = element_rect(fill = alpha("white",.5)))
data_map4

Look at that gradient. We can clearly see that the beak length varies by latitude.

So that’s it. Hopefully, the above code can serve as a template for your next mapping-related project. If there is anything else you would like to see added, please let me know and I’ll update this document when I have the time!