class: center, middle, inverse, title-slide

.title[
# Advancing the Grammar of Graphics
]
.subtitle[
## SISBID 2024
https://github.com/dicook/SISBID
]
.author[
### Di Cook (dicook@monash.edu)
Heike Hofmann (hhofmann4@unl.edu)
Susan Vanderplas (susan.vanderplas@unl.edu)
]
.date[
### 08/14-16/2024
]

---

# Example 2: Platypus in Australia

Where can you find the strange platypus in Australia?

<img src="https://upload.wikimedia.org/wikipedia/commons/7/7e/Platypus-sketch.jpg" width="60%" />

Source: https://en.wikipedia.org/wiki/File:Platypus-sketch.jpg

---

``` r
load(here::here("data/platypus.rda"))
platydata <- platypus
ggplot(data=platydata) +
  geom_point(aes(x=longitude, y=latitude))
```

<img src="index_files/figure-html/load the platypus obervation data-1.png" width="80%" style="display: block; margin: auto;" />

---

Add some transparency to see density of locations.

``` r
ggplot(data=platydata) +
  geom_point(aes(x=longitude, y=latitude), alpha=0.1)
```

<img src="index_files/figure-html/Add some transparency to see density of locations-1.png" width="80%" style="display: block; margin: auto;" />

---
class: middle

If you are good at recognising the shape of Australia, you might realise that the sightings are all along the east coast and Tasmania. One location is strange: it looks like someone saw a platypus in Antarctica! We might need to filter this observation out later, because it's extremely unlikely that a platypus was found that far south.

We can make the data look a bit more like it was collected in Australia by making a map projection, using `coord_map`.

---

.pull-left[

``` r
ggplot(data=platydata) +
  geom_point(aes(x=longitude, y=latitude), alpha=0.1) +
  coord_map()
```

<img src="index_files/figure-html/making a map projection-1.png" width="100%" style="display: block; margin: auto;" />

]

.pull-right[

The locations would be even more recognisable if we added a real map underneath. One way to do this is to use a Google map. The `ggmap` package provides an interface for extracting Google maps. Install it and then grab a map of Australia with this code.
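
A minimal sketch of such a download, not necessarily the code used to create `data/oz.rda`. It assumes you have a Google Maps API key; the helper `fetch_oz_map()`, the environment variable name `GOOGLE_MAPS_API_KEY`, the centre point and the zoom level are all illustrative choices.

``` r
# Hypothetical helper: download a Google base map roughly centred on Australia.
# Returns NULL (and downloads nothing) when no API key is available.
fetch_oz_map <- function(key = Sys.getenv("GOOGLE_MAPS_API_KEY")) {
  if (!nzchar(key)) {
    message("No API key found; skipping the map download.")
    return(NULL)
  }
  ggmap::register_google(key = key)
  ggmap::get_map(
    location = c(lon = 133.9, lat = -25.3),  # assumed centre of the map
    zoom = 4,                                # assumed zoom level
    source = "google",
    maptype = "terrain"
  )
}

oz <- fetch_oz_map()

# Save the map so it can be loaded later without another download
if (!is.null(oz)) save(oz, file = here::here("data/oz.rda"))
```

Saving the downloaded map to a file means the plotting code only needs `load()`, so it still works offline or without a key.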
]

---

``` r
load(here::here("data/oz.rda"))
ggmap(oz) +
  geom_point(data=platydata,
             aes(x=longitude, y=latitude),
             alpha=0.1, colour="orange")
```

<img src="index_files/figure-html/load the saved map data-1.png" width="80%" style="display: block; margin: auto;" />

---

.pull-left[

``` r
library(leaflet)
platydata |>
  filter(!is.na(latitude), !is.na(longitude)) |>
  leaflet() |>
  addTiles() |>
  addCircleMarkers(
    radius=1, opacity = 0.5, color = "orange", label = ~day,
    lat = ~latitude, lng = ~longitude)
```

]

.pull-right[
]

---

# Temporal trend

The date of the sighting is another variable in the data set. Let's make a plot of the sightings over time. The original variable is called `eventDate`. It has both date and time. This has been split into two variables: `day`, which R treats as a date, and `hour`.

``` r
library(lubridate)
platypus <- platypus |>
  rename(
    longitude = decimalLongitude,
    latitude = decimalLatitude
  ) |>
  mutate(
    day = ymd(str_sub(eventDate, 1, 10), tz="Australia/Sydney"),
    hour = as.numeric(str_sub(eventDate, 15, 16)))
```

---

Then we can simply plot occurrence by day.

``` r
ggplot(data=platydata) +
  geom_point(aes(x=day, y=1), alpha=0.2)
```

<img src="index_files/figure-html/show sightings over time-1.png" width="60%" style="display: block; margin: auto;" />

Some records date back before 1800. Records only go back as far as 1770, and these first records are in the database!

---

There are a LOT more sightings in recent years than in the 18th and 19th centuries, but this database has an amazing history!

``` r
library(ggbeeswarm)  # provides geom_quasirandom()
ggplot(data=platydata) +
  geom_quasirandom(aes(x=1, y=day), alpha=0.2)
```

<img src="index_files/figure-html/show jittered sightings over time-1.png" width="60%" style="display: block; margin: auto;" />

---

Let's focus on records since 1900, and count the number for each year.

``` r
platydata <- platydata |>
  mutate(year = year(day))
platydata1900 <- platydata |>
  filter(year>1900) |>
  count(year)
ggplot(data=platydata1900) +
  geom_point(aes(x=year, y=n))
```

<img src="index_files/figure-html/focus on records since 1900 and count the number for each year-1.png" width="60%" style="display: block; margin: auto;" />

---

Add a trend line.

``` r
ggplot(data=platydata1900, aes(x=year, y=n)) +
  geom_point() +
  geom_smooth(se=FALSE)
```

<img src="index_files/figure-html/add a trend line-1.png" width="60%" style="display: block; margin: auto;" />

---

Make it interactive so that we can investigate some observations.

``` r
library(plotly)
ggplotly()
```
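
The date handling and yearly counts used in this section can be tried out on a few made-up records. This is only a sketch: the `toy` data are invented, and the timestamps are assumed to be ISO 8601 strings, so the character positions used for the hour may differ from those needed for the real `eventDate` values.

``` r
library(dplyr)
library(stringr)
library(lubridate)

# Invented records (not the platypus data), assumed to be ISO 8601 strings
toy <- tibble(
  eventDate = c("1899-05-02T09:15:00Z", "1950-11-20T18:40:00Z",
                "1950-12-01T06:05:00Z", "2004-03-14T21:30:00Z")
)

toy <- toy |>
  mutate(
    day  = ymd(str_sub(eventDate, 1, 10)),          # characters 1-10 hold the date
    hour = as.numeric(str_sub(eventDate, 12, 13)),  # characters 12-13 hold the hour in this assumed format
    year = year(day)
  )

# Keep records after 1900 and count the sightings in each year
toy_counts <- toy |>
  filter(year > 1900) |>
  count(year)

toy_counts
```

Checking the split on a handful of rows like this makes it easy to confirm that `day`, `hour` and `year` hold what you expect before plotting the full data.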

---
class: inverse middle

# Your turn

Make sure you can make all the plots shown in this session, and then tackle the next set of tasks.

---
class: inverse middle

# Your turn

Change the dotplot into a density plot, to focus on the locations of frequent sightings. Do you learn anything different than from the scatterplot?

---
class: inverse middle

# Your turn

Platypus are mostly found on the east coast of the country, but there is a population close to Adelaide in South Australia. Why is this? (You might need to do some sleuthing with a web search engine to answer this.)

---
class: inverse middle

# Your turn

*Discussion question:* Was there a population explosion in 1980 and 2004? Has the platypus population been increasing since 1900, and decreasing in the last decade?

- Subset the data to 1950-2020
- Create a new variable for decade
- Make a map separately for each decade; the `facet_wrap` function can help here.

---
class: inverse middle

# Your turn

- Focusing again on TB, choose a different country to examine, for example, Australia.
- Create a similar sequence of plots for your chosen country's data, and feel free to experiment with new types.
- Write a data story describing what you have learned about TB in your chosen country based on a selection of your favorite plots, by putting the code and explanations into an Rmarkdown document, and compile to html.
- Feel free to share with your instructors.

---

# Resources

- [RStudio cheatsheets](https://www.rstudio.com/resources/cheatsheets/)
- [ggplot2: Elegant Graphics for Data Analysis, Hadley Wickham](https://ggplot2-book.org), [web site](https://ggplot2.tidyverse.org)
- [R Graphics Cookbook, Winston Chang](http://www.cookbook-r.com/Graphs/)
- [Data Visualization, Kieran Healy](https://socviz.co)
- [Data Visualization with R, Rob Kabacoff](https://rkabacoff.github.io/datavis/index.html)
- [Fundamentals of Data Visualization, Claus O. Wilke](https://serialmentor.com/dataviz/)
- [Leaflet in R](https://rstudio.github.io/leaflet/)

---

# Share and share alike

<a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License</a>.