Next Steps

CS&SS 508 • Lecture 10

28 May 2024

Victoria Sass

Roadmap

You’ve already learned SO MUCH in this class:

But if grad school teaches us anything, it’s that the more we learn, the more we realize how much more there is to learn. 🥴😑🫠

This can be freeing! And even fun?! Let your curiosity run wild!

Today, we’ll look at some of the ways you can extend your learning beyond the scope of this introductory course.

  • Tidy modeling
  • Even more visualizations
  • Creating web applications
  • Version control with Git/GitHub
  • Creating slides, articles, books, and/or websites

Tidy Modeling

Modeling in R

Modeling with tidymodels

library(tidymodels)
> ── Attaching packages ────────────────────────────────────── tidymodels 1.2.0 ──
> ✔ broom        1.0.6      ✔ rsample      1.2.1 
> ✔ dials        1.2.1      ✔ tune         1.2.1 
> ✔ infer        1.0.7      ✔ workflows    1.1.4 
> ✔ modeldata    1.3.0      ✔ workflowsets 1.1.0 
> ✔ parsnip      1.2.1      ✔ yardstick    1.3.1 
> ✔ recipes      1.0.10
> ── Conflicts ───────────────────────────────────────── tidymodels_conflicts() ──
> ✖ scales::discard() masks purrr::discard()
> ✖ dplyr::filter()   masks stats::filter()
> ✖ recipes::fixed()  masks stringr::fixed()
> ✖ dplyr::lag()      masks stats::lag()
> ✖ yardstick::spec() masks readr::spec()
> ✖ recipes::step()   masks stats::step()
> • Search for functions across packages at https://www.tidymodels.org/find/

tidymodels Packages

tidymodels Approach
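
To make the approach concrete, here is a minimal sketch of one pass through a tidymodels workflow: split the data, define preprocessing, specify a model, fit, and evaluate. The dataset (the built-in mtcars), the chosen predictors, and all object names are assumptions picked for illustration, not part of the lecture.

```r
library(tidymodels)

set.seed(508)

# 1. Split the data into training and testing sets (rsample)
car_split <- initial_split(mtcars, prop = 0.75)
car_train <- training(car_split)
car_test  <- testing(car_split)

# 2. Define preprocessing steps (recipes)
car_recipe <- recipe(mpg ~ wt + hp, data = car_train) |>
  step_normalize(all_numeric_predictors())

# 3. Specify the model and its engine (parsnip)
lm_spec <- linear_reg() |>
  set_engine("lm")

# 4. Bundle preprocessing and model together (workflows)
car_workflow <- workflow() |>
  add_recipe(car_recipe) |>
  add_model(lm_spec)

# 5. Fit on the training data, then predict on the held-out test data
car_fit <- fit(car_workflow, data = car_train)

car_preds <- predict(car_fit, new_data = car_test) |>
  bind_cols(car_test |> select(mpg))

# 6. Evaluate predictions (yardstick): returns rmse, rsq, and mae
car_metrics <- metrics(car_preds, truth = mpg, estimate = .pred)
car_metrics
```

Swapping in a different model should only require changing step 3; the rest of the pipeline stays the same, which is the main appeal of the framework.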


Packages Using Tidymodels Framework

tidymodels Resources

  1. The website for tidymodels has extensive documentation on all core tidymodels packages as well as related ones.
  2. There is an excellent book online by Max Kuhn and Julia Silge called Tidy Modeling with R that provides a great overview of how to approach modeling within a tidy framework as well as showing how to use the various packages within tidymodels.
  3. There are also several related online book resources:

Even More Visualizations

Spatial Data with ggmap

There are numerous ways to work with spatial data in R but you can use ggmap to visualize spatial data in a tidyverse framework.

source("stadia_api_key.R")
library(ggmap)
library(tidyverse) # dplyr verbs and forcats helpers used below
`%notin%` <- function(lhs, rhs) !(lhs %in% rhs)

violent_crimes <- crime |>
  filter(offense %notin% c("auto theft", "theft", "burglary"),
         between(lon, -95.39681, -95.34188),
         between(lat, 29.73631, 29.78400)) |> 
  mutate(offense = fct_drop(offense),
         offense = fct_relevel(offense, 
                               c("robbery", "aggravated assault", "rape", "murder")))
  1. source("stadia_api_key.R"): You need to register with Stadia Maps or Google Maps and get an API key in order to download their maps. This R script simply saves my API key and registers it in the current session so it’s not included in my code that’s accessible on GitHub.
  2. %notin%: A helper function that negates the %in% operator.
  3. crime: A built-in dataset in the ggmap package.
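
For reference, a key-registration script like the one sourced above might look something like this. This is only a sketch under assumptions: the environment variable name STADIA_API_KEY is hypothetical, and the actual contents of stadia_api_key.R are not shown in these slides.

```r
# Hypothetical contents of stadia_api_key.R (keep this file out of version control)
library(ggmap)

# Assumes the key was stored beforehand in an environment variable,
# e.g. in your .Renviron file; the variable name here is made up
stadia_key <- Sys.getenv("STADIA_API_KEY")

if (nzchar(stadia_key)) {
  # Register the key for the current session only
  register_stadiamaps(key = stadia_key, write = FALSE)
} else {
  message("No STADIA_API_KEY found; map downloads will fail until one is set.")
}
```

Keeping the key in an environment variable or a separate, untracked script means it never appears in the code you share.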

Making a Map

Once we have data we want to visualize we can call ggmap to visualize the spatial area and layer on any geoms/stats as you would with ggplot2.

bbox <- make_bbox(lon, lat, 
                  data = violent_crimes)
map <- get_stadiamap(bbox = bbox, 
                     maptype = "stamen_toner_lite", 
                     zoom = 14)

ggmap(map) + 
  geom_point(data = violent_crimes, 
             color = "red")
  4. make_bbox(): Creating the bounding box for the longitude and latitude.
  5. get_stadiamap(): Retrieving the map with specifications.
  6. geom_point(): The only difference with layers using ggmap is that (1) you need to specify the data arguments in the layers and (2) the spatial aesthetics x and y are set to lon and lat, respectively. (If they’re named something different in your dataset, just put mapping = aes(x = longitude, y = latitude), for example.)
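
As a small illustration of that last point, here is a sketch with a made-up data frame (sites) whose coordinate columns are called longitude and latitude instead of lon and lat. The data values and object names are hypothetical. The map download is left commented out because it requires a registered API key.

```r
library(ggplot2)
library(ggmap)

# Hypothetical data with differently named coordinate columns
sites <- data.frame(
  longitude = c(-95.39, -95.36, -95.35),
  latitude  = c(29.74, 29.76, 29.78)
)

# make_bbox() takes the column names you actually have
site_bbox <- make_bbox(longitude, latitude, data = sites)

# Because the columns are not called lon/lat, spell out the mapping in the layer
site_layer <- geom_point(
  data = sites,
  mapping = aes(x = longitude, y = latitude),
  color = "red"
)

# With a registered Stadia Maps key you could then draw the map:
# site_map <- get_stadiamap(bbox = site_bbox, maptype = "stamen_toner_lite", zoom = 14)
# ggmap(site_map) + site_layer
```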

Using Different Geoms

With ggmap you’re working with ggplot2, so you can add in other kinds of layers, use patchwork, etc. All the ggplot2 geoms are available.

library(patchwork)
library(ggdensity)
library(geomtextpath)

robberies <- violent_crimes |> filter(offense == "robbery")

points_map <- ggmap(map) + geom_point(data = robberies, color = "red")

hdr_map <- ggmap(map) + 
  geom_hdr(aes(lon, lat, fill = after_stat(probs)), 
           data = robberies,
           alpha = .5) +
  geom_labeldensity2d(aes(lon, lat, level = after_stat(probs)),
                      data = robberies, 
                      stat = "hdr_lines", 
                      size = 3, boxcolour = NA) +
  scale_fill_brewer(palette = "YlOrRd") +
  theme(legend.position = "none")

(points_map + hdr_map) & 
  theme(axis.title = element_blank(), axis.text = element_blank(), axis.ticks = element_blank())

Marginal Histogram

library(ggExtra)
data(mpg, package = "ggplot2")

mpg_select <- mpg |> 
  filter(hwy >= 35 & cty > 27)
g <- ggplot(mpg, aes(cty, hwy)) + 
  geom_count() + 
  geom_smooth(method = "lm", se = FALSE) + 
  theme_bw() 
ggMarginal(g, type = "histogram", fill = "transparent")
  7. ggMarginal(): Code that adds the marginal plot.

Marginal Boxplot

library(ggExtra)
data(mpg, package = "ggplot2")

mpg_select <- mpg |> 
  filter(hwy >= 35 & cty > 27)
g <- ggplot(mpg, aes(cty, hwy)) + 
  geom_count() + 
  geom_smooth(method = "lm", se = FALSE) + 
  theme_bw()
ggMarginal(g, type = "boxplot", fill = "transparent")
  7. ggMarginal(): Code that adds the marginal plot.

Marginal Density Curve

library(ggExtra)
data(mpg, package = "ggplot2")

mpg_select <- mpg |> 
  filter(hwy >= 35 & cty > 27)
g <- ggplot(mpg, aes(cty, hwy)) + 
  geom_count() + 
  geom_smooth(method = "lm", se = FALSE) + 
  theme_bw()
ggMarginal(g, type = "density", fill = "transparent")
  7. ggMarginal(): Code that adds the marginal plot.

Create Animations

library(gapminder)
library(gganimate)
library(gifski)

ggplot(gapminder, aes(gdpPercap, lifeExp, size = pop, colour = country)) +
  geom_point(alpha = 0.7, show.legend = FALSE) +
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  scale_x_log10() +
  facet_wrap(~continent) +
  labs(title = 'Year: {frame_time}', x = 'GDP per capita', y = 'life expectancy') +
  transition_time(year) +
  ease_aes('linear')
  8. library(gifski): You need this package to create a GIF of the gganimate output.
  9. gganimate-specific code (for example, transition_time() and ease_aes()).
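
If you want the animation as a file, one way is to render it explicitly and then save it. The sketch below uses a simplified version of the plot; the frame count, dimensions, and output path are assumptions chosen to keep rendering quick.

```r
library(ggplot2)
library(gganimate)
library(gifski)
library(gapminder)

# A pared-down version of the animated plot
p <- ggplot(gapminder, aes(gdpPercap, lifeExp, size = pop, colour = continent)) +
  geom_point(alpha = 0.7, show.legend = FALSE) +
  scale_x_log10() +
  labs(title = "Year: {frame_time}", x = "GDP per capita", y = "Life expectancy") +
  transition_time(year)

# Render the frames to a GIF using the gifski renderer
anim <- animate(p, nframes = 40, fps = 10, width = 600, height = 400,
                renderer = gifski_renderer())

# Save to a temporary location; change the path to keep the file
gif_path <- file.path(tempdir(), "gapminder.gif")
anim_save(gif_path, animation = anim)
```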

Data Visualization Resources

  1. Hadley Wickham’s ggplot2: Elegant Graphics for Data Analysis (Second Edition) is available through the UW Library and the forthcoming Third Edition is being written as we speak and will be available online soon!
  2. Spatial Data Science with Applications in R by Edzer Pebesma and Roger Bivand
  3. The sf package also works within the tidyverse framework and pairs very nicely with data from the census which can be easily accessed using tidycensus
  4. More ideas for visualizations can be found on this master list and by looking at ggplot2 extension packages.
  5. Think about taking CSSS 569 with Chris Adolph (offered in Winter quarter) if you want to learn more about how to create visualizations in R and gain more understanding about best practices for conveying your data and findings effectively.

Creating web applications

Shiny

Shiny is an open source R package that provides an elegant and powerful web framework for building web applications using R. Shiny helps you turn your analyses into interactive web applications without requiring HTML, CSS, or JavaScript knowledge.
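
To give a sense of how little code a basic app needs, here is a minimal sketch of a Shiny app with one input and one output. It uses the built-in faithful dataset; the input/output IDs and labels are arbitrary choices for illustration.

```r
library(shiny)

# User interface: a slider and a plot
ui <- fluidPage(
  titlePanel("Old Faithful waiting times"),
  sliderInput("bins", "Number of bins:", min = 5, max = 50, value = 20),
  plotOutput("hist")
)

# Server logic: redraw the histogram whenever the slider changes
server <- function(input, output) {
  output$hist <- renderPlot({
    hist(faithful$waiting,
         breaks = input$bins,
         main = NULL,
         xlab = "Waiting time between eruptions (minutes)")
  })
}

# Create the app object; run it interactively with runApp(app)
app <- shinyApp(ui = ui, server = server)
```

The ui describes what the page looks like, and the server describes how outputs react to inputs. Shiny handles the web plumbing in between.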

Shiny Examples

Version control with Git/GitHub

What is Version Control?

Version control allows you to work individually and/or collaboratively in a highly structured, documented way.

It’s basically like a robust save program for your project. You track and log changes you make over time and the version control system allows you to review or even restore earlier versions of your project.
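
Here is a rough sketch of that track, log, and restore cycle, driven from R by calling the git command line tool. It assumes git is installed and on your PATH; the helper function, file names, and commit messages are all made up for illustration. In practice you would more likely use the RStudio Git pane or a terminal.

```r
# Work in a fresh temporary folder so nothing real is touched
repo <- tempfile("demo-project-")
dir.create(repo)

# Hypothetical helper: run a git command inside the demo folder
git <- function(...) {
  system2("git", c("-C", shQuote(repo), ...), stdout = TRUE)
}

git("init")
git("config", "user.name", shQuote("Demo User"))
git("config", "user.email", "demo@example.com")

# Version 1: create a file, stage it, and commit it
script <- file.path(repo, "analysis.R")
writeLines("x <- 1", script)
git("add", "analysis.R")
git("commit", "-m", shQuote("Add first version of analysis"))

# Version 2: change the file and commit again
writeLines("x <- 2", script)
git("commit", "-am", shQuote("Change x to 2"))

# Review the log: one line per commit, newest first
history <- git("log", "--oneline")
history

# Restore the file as it was one commit ago
git("checkout", "HEAD~1", "--", "analysis.R")
restored <- readLines(script)
restored
```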


Originally meant for software developers, Git has been adopted by computational social scientists not only to manage source code but also to keep track of the whole collection of files that make up a research project.

Why Use Version Control?

What is GitHub?

Why Use GitHub?

Collaboration Made “Easier”

Git Integration in RStudio

Git/GitHub Resources

  1. Hands down the best introduction to git and using git/GitHub with RStudio Projects is Jennifer Bryan’s online book Happy Git and GitHub for the useR
  2. Software Carpentry has a nice beginner’s “class” that’ll help you learn the git basics.
  3. Here’s a user-contributed cheat-sheet for Using git and GitHub with RStudio.

Creating slides, articles, books, and/or websites

Quarto

That’s a Wrap!

Plug for CSSCR

Also, this lab is part of CSSCR (The Center for Social Science Computation and Research) where I’m also on staff as a consultant. CSSCR is a resource center for the social science departments at the University of Washington.


As you continue to learn R, feel free to drop by with any/all of your R coding questions.

Thanks for spending so much time this quarter learning with me 😎

Don’t forget to fill out the course evaluation that you received via email!