Homework 2

Instructions

In this homework, you’ll pose a question about the Gapminder dataset and investigate it graphically. Rather than using the gapminder package as we did in lecture, you’ll want to use the dslabs package [1], which has a larger subset of the Gapminder data (i.e., more observations and variables).

[1] Remember to install the package in your console first (not in your .qmd file) and then load it in your .qmd file with the library() function.

  • At the beginning of your document, write down a research question that is based on this larger version of the Gapminder dataset [2] (e.g., “How does population change over time in different countries?”).
  • Create 3-6 plots to answer/investigate your research question. Consider histograms (geom_histogram()), scatterplots (geom_point()), or line plots (geom_line()).
  • Be sure all titles, axes, and legends are clearly labelled (no raw variable names).
  • Include at least one plot with facet_wrap() or facet_grid().
  • You can use other geoms like bar charts or box plots, add meaningful vertical or horizontal lines, etc. You may find this data visualization cheat sheet helpful.
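
As one possible illustration of the requirements above (not a model answer), the sketch below draws a labelled, faceted scatterplot from the dslabs version of gapminder. The year (2010), the pair of variables, and the label wording are assumptions chosen for this example, and the object names gap_2010 and p are hypothetical.

```r
library(dslabs)   # provides the larger gapminder dataset
library(ggplot2)  # plotting functions

# Keep a single year so each country appears once (base R subsetting)
gap_2010 <- gapminder[gapminder$year == 2010, ]

p <- ggplot(gap_2010, aes(x = fertility, y = life_expectancy)) +
  geom_point(alpha = 0.6) +   # one point per country
  facet_wrap(~ continent) +   # one panel per continent
  labs(
    title = "Life expectancy versus fertility by continent, 2010",
    x = "Fertility (average children per woman)",
    y = "Life expectancy (years)"
  )

p
```

labs() replaces the raw variable names on the title and axes. If a plot maps a variable to colour or fill, the legend title can be set in the same call (for example, colour = "Continent").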

[2] Run ?dslabs::gapminder in the console to read descriptions of all available variables.
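
Before settling on a research question, it may help to look at what the dataset contains. A minimal sketch, assuming dslabs is already installed; it uses only base R inspection functions.

```r
library(dslabs)

# Column names available for plotting
names(gapminder)

# Type of each column plus the first few values
str(gapminder)

# First and last year covered by the data
range(gapminder$year)
```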

Your document should be well organized and pleasant for a peer to read. You must write up your observations in words as well as show the graphs. Upload both the .qmd file and the .html file to Canvas.

Optional: If you’d like to compare several specific countries, you can adapt the pseudocode below [3] to create a subset of the data with as many countries as you like. Replace "country1", "country2", and so on with the country names as they appear in the dataset. Use unique(gapminder$country) to see a complete list of the available countries.

[3] We’ll cover how to do this type of data manipulation in Week 4.

library(dplyr)   # provides filter()
library(dslabs)  # provides the gapminder dataset

subset <- gapminder |>
  filter(country %in% c("country1", "country2", "country3"))
filter() filters the rows of the dataset gapminder based upon a logical condition. %in% creates a logical vector the same length as country, evaluating whether each element in country matches any of the available values in the vector to its right (in this case, country1, country2, or country3).
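
To make the behaviour of %in% concrete, here is a small sketch. The country selection in wanted is hypothetical, and "Untied States" is a deliberate misspelling included to show how a name that does not appear in the dataset can be caught before filtering.

```r
library(dslabs)

# %in% returns one TRUE/FALSE per element of the left-hand vector
c("Chile", "Kenya", "Japan", "Chile") %in% c("Chile", "Japan")
#> [1]  TRUE FALSE  TRUE  TRUE

# Hypothetical selection of countries to compare
wanted <- c("Chile", "Japan", "Untied States")

# Names that are not in the dataset; filter() would match no rows for these
missing_names <- wanted[!wanted %in% gapminder$country]
missing_names
#> [1] "Untied States"
```

Because filter() does not raise an error when a name matches nothing, a check like this is one way to catch spelling mistakes early.
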
Before you submit:

Have you remembered to add embed-resources: true to your YAML?

Due Dates

#   Homework Due   Peer Review Due
1   7 October      12 October
2   14 October     19 October
3   21 October     26 October
4   28 October     2 November
5   11 November    16 November
6   18 November    23 November
7   25 November    30 November
8   2 December     7 December
9   9 December     14 December