Advancing the Grammar of Graphics: gg packages, maps, interactivity

SISBID 2025
https://github.com/dicook/SISBID

gg extensions

  • There are more than 150 extensions to ggplot2
  • Many adhere to the grammar, to define new types of displays, like
    • ggdist: including representation of uncertainty and error
    • gganimate: specifies animations as layers
  • Others are supporting packages, such as
    • patchwork for laying out multiple displays
    • ggthemes for styling plots

https://exts.ggplot2.tidyverse.org/gallery/
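As a small illustration of a supporting package, here is a minimal sketch of laying out two plots with patchwork. It uses the built-in mtcars data, and the object names are made up for this example.

```r
library(ggplot2)
library(patchwork)

# Two small plots built from a dataset that ships with R
p_scatter <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()
p_hist <- ggplot(mtcars, aes(x = mpg)) + geom_histogram(bins = 10)

# With patchwork loaded, `+` places plots side by side and `/` stacks them
p_combined <- p_scatter + p_hist
p_combined
```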

ggdist

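library(tidyverse)  # read_csv(), filter(), ggplot()
library(ggdist)     # stat_gradientinterval(), stat_halfeye(), theme_ggdist()

tb_inc_100k <- read_csv(here::here("data/TB_burden_countries_2025-07-22.csv")) |>
  filter(iso3 %in% c("USA", "AUS"))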
ggplot(tb_inc_100k, aes(y = iso3, 
                        x = e_inc_100k)) +
  stat_gradientinterval(fill = "darkorange") +
  ylab("") +
  xlab("Inc per 100k") +
  theme_ggdist()

ggplot(tb_inc_100k, aes(y = iso3, 
                        x = e_inc_100k)) +
  stat_halfeye(side = "right") +
  geom_dots(side = "left", 
            fill = "darkorange", color = "darkorange") +
  ylab("") +
  xlab("Inc per 100k") +
  theme_ggdist()

Your turn

Explore other possibilities in ggdist, for example, how stat_interval() would change a previous chart.
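One possible starting point for this exercise is sketched below: it swaps stat_gradientinterval() for stat_interval(). The data are simulated (sim_inc mirrors the column names of the TB data, but the numbers are invented), so the picture will not match the real chart.

```r
library(ggplot2)
library(ggdist)

# Simulated stand-in for the TB data: values are invented for illustration
set.seed(2025)
sim_inc <- data.frame(
  iso3 = rep(c("AUS", "USA"), each = 25),
  e_inc_100k = c(rnorm(25, mean = 6, sd = 0.5),
                 rnorm(25, mean = 4, sd = 1))
)

# stat_interval() draws nested intervals (50%, 80% and 95% by default)
# and maps the interval level to colour
p_interval <- ggplot(sim_inc, aes(y = iso3, x = e_inc_100k)) +
  stat_interval() +
  scale_colour_brewer() +
  ylab("") +
  xlab("Inc per 100k") +
  theme_ggdist()
p_interval
```

Compared with the gradient, the discrete bands make it easier to read off specific interval limits, but they hide the shape of the distribution within each band.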

Example: Monotremes in Australia

Where in Australia do the strange platypus and echidna live?

Monotremes in Australia

load(here::here("data/monotremes.rda"))
ggplot(data=monotremes) + 
  geom_point(aes(x = longitude, 
                 y = latitude, 
                 colour = family),
             alpha=0.5)

We can make the plot look more like a map with coord_map(), which applies a map projection to the coordinates.

If you are good at recognising the shape of Australia, you might realise that the sightings are primarily along the east coast and Tasmania, with fewer sightings in western Australia.

Monotremes in Australia

ggplot(data=monotremes) + 
  geom_point(aes(x=longitude, 
                 y=latitude, 
                 colour = common_name), 
             alpha=0.1) +
  scale_colour_brewer("", palette = "Dark2") +
  coord_map()

  • Adding a map layer would help!
  • Install the ggmap R package to use Google Maps as a background

Monotremes in Australia

library(ggmap)  # provides ggmap()
load(here::here("data/oz.rda"))
ggmap(oz) + 
  geom_point(data=monotremes, 
             aes(x=longitude, 
                 y=latitude, 
                 colour=common_name), 
              alpha=0.2) +
  scale_color_manual(values = c("#e66100", "#5d3a9b"))

Leaflet Monotreme Maps

library(leaflet)
monotremes |>
  filter(family == "Ornithorhynchus") |>
  filter(!is.na(latitude), 
         !is.na(longitude)) |>
  leaflet() |>
  addTiles() |>
  addCircleMarkers(
    radius=1, 
    opacity = 0.5, 
    color = "orange", 
    label = ~day,
    lat = ~latitude, lng = ~longitude) 

Temporal trend

Our goal is to examine how sightings change over time. During pre-processing, the original variable eventDate was renamed datetime and split into two variables: day, which is a date variable, and hour.

SOME STUFF HERE |>
  mutate(
    day = ymd(str_sub(datetime, 1, 10),
              tz="Australia/Sydney"), 
    hour = as.numeric(str_sub(datetime, 15, 16)))
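The format of datetime is not shown here, so the following is only a sketch under the assumption that timestamps look like "2019-03-05T14:30:00" (the example values are invented). It parses the full timestamp with lubridate and extracts the hour with hour(), which avoids counting character positions.

```r
library(lubridate)
library(stringr)

# Hypothetical timestamps; the real format of datetime is an assumption
datetime <- c("2019-03-05T14:30:00", "2020-11-21T06:05:00")

# Characters 1-10 hold the date part in this assumed format
day <- ymd(str_sub(datetime, 1, 10))

# Parse the full timestamp as Sydney local time, then pull out the hour
hr <- hour(ymd_hms(datetime, tz = "Australia/Sydney"))
```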

Temporal trend

Then we can simply plot occurrence over time.

monotremes |>
  group_by(day, common_name) |>
  summarise(n = n()) |>
  ungroup() |>
  ggplot(aes(x=day, y=n)) +
    geom_point() +
    facet_wrap(~common_name)

Temporal trend

Add a trend line.

monotremes |>
  group_by(day, common_name) |>
  summarise(n = n()) |>
  ungroup() |>
  ggplot(aes(x=day, y=n)) +
    geom_point() +
    geom_smooth(se = FALSE) +
    facet_wrap(~common_name) 

Interactivity!

library(plotly)
ggplotly(width=800, height=500)
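Called without a plot, ggplotly() converts the most recently created ggplot. A minimal sketch of the more explicit form, using simulated daily counts (sim_counts and the object names are made up for this example):

```r
library(ggplot2)
library(plotly)

# Simulated daily counts standing in for the monotreme summaries
set.seed(1)
sim_counts <- data.frame(
  day = seq(as.Date("2024-01-01"), by = "day", length.out = 60),
  n = rpois(60, lambda = 3)
)

p_trend <- ggplot(sim_counts, aes(x = day, y = n)) +
  geom_point()

# Passing the plot explicitly avoids relying on "the last plot drawn";
# print p_interactive to view the widget
p_interactive <- ggplotly(p_trend, width = 800, height = 500)
```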

Your turn

Make sure you can make all the plots shown in this session, and then tackle these tasks:

  • Change the scatterplot of sightings into a density plot. This changes the focus to be on the locations of the most frequent sightings.
  • Facetting by creatures might
  • How would the interpretation change using the density plot instead of the scatterplot?

Previously it looked like echidnas are found all over Australia. The density plot suggests they are found in the south-east similar to platypus. This possibly reveals a limitation of this data, that to be an occurrence requires a person to be present. There are a lot more people in Australia’s southeast than the rest of the country, so it is possibly just showing where people are!
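One way to approach the density plot is sketched below with geom_density_2d(). The coordinates and animal names are simulated for illustration, so the contours will not match the real sightings.

```r
library(ggplot2)

# Simulated sightings: coordinates and names are invented for illustration
set.seed(42)
sim_sightings <- data.frame(
  longitude = c(rnorm(200, 147, 2), rnorm(200, 140, 8)),
  latitude = c(rnorm(200, -37, 2), rnorm(200, -30, 5)),
  common_name = rep(c("Platypus", "Short-beaked Echidna"), each = 200)
)

# Contours of a 2D kernel density estimate replace the individual points
p_density <- ggplot(sim_sightings, aes(x = longitude, y = latitude)) +
  geom_density_2d(aes(colour = common_name)) +
  facet_wrap(~common_name) +
  theme(legend.position = "none")
p_density
```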

Your turn

Explore the temporal trend differently by making side-by-side boxplots of the occurrences by month. Note that you still need to summarise by day for this to be meaningful. Does this change the focus for the temporal trend?

The focus is now on the median and IQR of the daily counts in each month of the year.
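A minimal sketch of the monthly boxplots, assuming one row per sighting with a day column. The sightings here are simulated, and sim_sightings and daily are made-up names.

```r
library(dplyr)
library(ggplot2)
library(lubridate)

# Simulated sightings, one row per sighting; dates are invented
set.seed(7)
sim_sightings <- tibble(
  day = sample(seq(ymd("2024-01-01"), ymd("2024-12-31"), by = "day"),
               size = 2000, replace = TRUE)
)

# Summarise to one count per day, then label each day with its month
daily <- sim_sightings |>
  count(day) |>
  mutate(month = month(day, label = TRUE))

# One box per month, summarising the distribution of daily counts
p_box <- ggplot(daily, aes(x = month, y = n)) +
  geom_boxplot()
p_box
```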

Your turn

There is something quite wrong in the previous plots. Can you guess what it is?

Days without any observations are missing from these plots. This is something that would need to be created in the pre-processing.

What should it be?

If no observations were recorded on a given day, it would be reasonable to use 0 for that day. So we would need to complete the data to have a row for each day in the year, and fill the n column with 0.
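The completion step could be sketched with tidyr::complete(). The daily counts below are hypothetical, with deliberate gaps on 2 and 4 January.

```r
library(dplyr)
library(tidyr)

# Hypothetical daily counts with gaps: no rows for 2 and 4 January
daily <- tibble(
  day = as.Date(c("2024-01-01", "2024-01-03", "2024-01-05")),
  n = c(2L, 1L, 4L)
)

# complete() adds a row for every day in the range and fills n with 0
daily_full <- daily |>
  complete(
    day = seq(min(day), max(day), by = "day"),
    fill = list(n = 0L)
  )
```

For the faceted plots, common_name would also need to be listed in complete() so that each animal gets a row for every day.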

Resources