8  Exercises (Chapter 9)

  1. Create a scatterplot of hwy vs. displ where the points are pink filled in triangles.

    Answer
    # Your R code here
  2. Why did the following code not result in a plot with blue points?

    ggplot(mpg) + 
      geom_point(aes(x = displ, y = hwy, color = "blue"))
    Answer
    # Proper R code here

    Your text answer here.

  3. What does the stroke aesthetic do? What shapes does it work with? (Hint: use ?geom_point)

    Answer
    mpg |>
      ggplot(aes(x = displ, y = hwy)) +
        geom_point(shape = 21, stroke = 0.5) -> p1
    mpg |>
      ggplot(aes(x = displ, y = hwy)) +
        geom_point(shape = 21, stroke = 1) -> p2
    mpg |>
      ggplot(aes(x = displ, y = hwy)) +
        geom_point(shape = 21, stroke = 2) -> p3
    library(patchwork)
    p1 / p2 / p3

    Your text answer here.

  4. What happens if you map an aesthetic to something other than a variable name, like aes(color = displ < 5)? Note, you’ll also need to specify x and y.

    Answer
    mpg |>
      ggplot(aes(x = displ, y = hwy, color = displ < 5)) +
        geom_point()

    Your text answer here.

  5. What geom would you use to draw a line chart? A boxplot? A histogram? An area chart?

    Answer

    Your text answer here.

    # R Code here
    Answer

    Your text answer here.

    # R code here
    Answer

    Your text answer here.

    # R code here
    Answer

    Your text answer here.

    # Youe R code here
  6. Earlier in this chapter we used show.legend without explaining it:

    ggplot(mpg, aes(x = displ, y = hwy)) +
      geom_smooth(aes(color = drv), show.legend = FALSE)

    What does show.legend = FALSE do here? What happens if you remove it? Why do you think we used it earlier?

    Answer

    Your text answer here.

    ggplot(mpg, aes(x = displ, y = hwy)) +
      geom_smooth(aes(color = drv), show.legend = FALSE) -> p1
    ggplot(mpg, aes(x = displ, y = hwy)) +
      geom_smooth(aes(color = drv), show.legend = TRUE) -> p2
    p1 / p2

  7. What does the se argument to geom_smooth() do?

    Answer

    Your text answer here.

    ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
      geom_smooth(se = FALSE)

  8. Recreate the R code necessary to generate the following graphs. Note that wherever a categorical variable is used in the plot, it’s drv.

    Answer

    The code for each of the plots is given below.

    # Your R code here
  9. What happens if you facet on a continuous variable?

    Answer

    Your text answer here.

    mpg |> 
      ggplot(aes(x = drv, y = cyl)) + 
      geom_point() + 
      facet_wrap(~hwy)

  10. What do the empty cells in the plot above with facet_grid(drv ~ cyl) mean? Run the following code. How do they relate to the resulting plot?

    ggplot(mpg) + 
      geom_point(aes(x = drv, y = cyl))
    Answer
    ggplot(mpg) + 
      geom_point(aes(x = drv, y = cyl)) +
      facet_grid(drv ~ cyl)

    Your text answer here.

  11. What plots does the following code make? What does . do?

    ggplot(mpg) + 
      geom_point(aes(x = displ, y = hwy)) +
      facet_grid(drv ~ .)
    ggplot(mpg) + 
      geom_point(aes(x = displ, y = hwy)) +
      facet_grid(. ~ cyl)
    Answer
    ggplot(mpg) + 
      geom_point(aes(x = displ, y = hwy)) +
      facet_grid(drv ~ .)

    Your text answer here.

    ggplot(mpg) + 
      geom_point(aes(x = displ, y = hwy)) +
      facet_grid(. ~ cyl)

    Your text answer here.

  12. Take the first faceted plot in this section:

    ggplot(mpg) + 
      geom_point(aes(x = displ, y = hwy)) + 
      facet_wrap(~ cyl, nrow = 2)

    What are the advantages to using faceting instead of the color aesthetic? What are the disadvantages? How might the balance change if you had a larger dataset?

    Answer

    Your text answer here.

    # facet
    ggplot(mpg) + 
     geom_point(aes(x = displ, y = hwy)) + 
      facet_wrap(~ class, nrow = 2)

    # color
    ggplot(mpg) + 
      geom_point(aes(x = displ, y = hwy, color = class))

    # both
    ggplot(mpg) + 
      geom_point(
        aes(x = displ, y = hwy, color = class), 
        show.legend = FALSE) + 
     facet_wrap(~ class, nrow = 2)

    # highlighting
    ggplot(mpg, aes(x = displ, y = hwy)) + 
      geom_point(color = "gray") +
      geom_point(
        data = mpg |> filter(class == "compact"),
        color = "pink"
      )

  13. Read ?facet_wrap. What does nrow do? What does ncol do? What other options control the layout of the individual panels? Why doesn’t facet_grid() have nrow and ncol arguments?

    Answer

    Your text answer here.

  14. Which of the following plots makes it easier to compare engine size (displ) across cars with different drive trains? What does this say about when to place a faceting variable across rows or columns?

    ggplot(mpg, aes(x = displ)) + 
      geom_histogram() + 
      facet_grid(drv ~ .)
    ggplot(mpg, aes(x = displ)) + 
      geom_histogram() +
      facet_grid(. ~ drv)
    Answer
    ggplot(mpg, aes(x = displ)) + 
      geom_histogram() + 
      facet_grid(drv ~ .)
    `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

    ggplot(mpg, aes(x = displ)) + 
      geom_histogram() +
      facet_grid(. ~ drv)
    `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

    Your text answer here.

  15. Recreate the following plot using facet_wrap() instead of facet_grid(). How do the positions of the facet labels change?

    ggplot(mpg) + 
      geom_point(aes(x = displ, y = hwy)) +
      facet_grid(drv ~ .)
    Answer
    ggplot(mpg) + 
      geom_point(aes(x = displ, y = hwy)) +
      facet_grid(drv ~ .) -> p1
    ggplot(mpg) + 
      geom_point(aes(x = displ, y = hwy)) +
      facet_wrap(~drv, nrow = 3) -> p2
    p1 + p2

    Your text answer here.

  16. What is the default geom associated with stat_summary()? How could you rewrite the previous plot to use that geom function instead of the stat function?

    ggplot(diamonds) + 
      stat_summary(
        aes(x = cut, y = depth),
        fun.min = min,
        fun.max = max,
        fun = median
      )

    Answer

    Your text answer here.

    diamonds |>
      group_by(cut) |>
      summarize(
        lower = min(depth),
        upper = max(depth),
        midpoint = median(depth)
      ) |>
      ggplot(aes(x = cut, y = midpoint)) +
      geom_pointrange(aes(ymin = lower, ymax = upper))

  17. What does geom_col() do? How is it different from geom_bar()?

    Answer

    Your text answer here.

  18. Most geoms and stats come in pairs that are almost always used in concert. Make a list of all the pairs. What do they have in common? (Hint: Read through the documentation.)

    Answer

    Geoms and stats that are almost always used in concert are listed below:

    geom stat
    geom_bar() stat_count()
    geom_bin2d() stat_bin_2d()
    geom_boxplot() stat_boxplot()
    geom_contour_filled() stat_contour_filled()
    geom_contour() stat_contour()
    geom_count() stat_sum()
    geom_density_2d() stat_density_2d()
    geom_density() stat_density()
    geom_dotplot() stat_bindot()
    geom_function() stat_function()
    geom_sf() stat_sf()
    geom_sf() stat_sf()
    geom_smooth() stat_smooth()
    geom_violin() stat_ydensity()
    geom_hex() stat_bin_hex()
    geom_qq_line() stat_qq_line()
    geom_qq() stat_qq()
    geom_quantile() stat_quantile()
  19. What variables does stat_smooth() compute? What arguments control its behavior?

    Answer

    stat_smooth() computes the following variables:

    • y or x: Predicted value
    • ymin or xmin: Lower pointwise confidence interval around the mean
    • ymax or xmax: Upper pointwise confidence interval around the mean
    • se: Standard error
  20. In our proportion bar chart, we needed to set group = 1. Why? In other words, what is the problem with these two graphs?

    ggplot(diamonds, aes(x = cut, y = after_stat(prop))) + 
      geom_bar()
    ggplot(diamonds, aes(x = cut, fill = color, y = after_stat(prop))) + 
      geom_bar()
    Answer

    Your text answer here.

    # one variable
    ggplot(diamonds, aes(x = cut, 
                         y = after_stat(prop))) + 
      geom_bar()
    ggplot(diamonds, aes(x = cut, 
                         y = after_stat(prop), 
                         group = 1)) + 
      geom_bar()
    # two variables
    ggplot(diamonds, aes(x = cut, 
                         fill = color, 
                         y = after_stat(prop))) + 
      geom_bar()
    ggplot(diamonds, aes(x = cut, 
                         fill = color, 
                         y = after_stat(prop), 
                         group = color)) + 
      geom_bar()

  21. What is the problem with the following plot? How could you improve it?

    ggplot(mpg, aes(x = cty, y = hwy)) + 
      geom_point()
    Answer

    Your text answer here.

    ggplot(mpg, aes(x = cty, y = hwy)) + 
      geom_point()
    ggplot(mpg, aes(x = cty, y = hwy)) + 
      geom_jitter()

  22. What, if anything, is the difference between the two plots? Why?

    ggplot(mpg, aes(x = displ, y = hwy)) +
      geom_point()
    ggplot(mpg, aes(x = displ, y = hwy)) +
      geom_point(position = "identity")
    Answer

    Your text answer here.

    # Your R code here
  23. What parameters to geom_jitter() control the amount of jittering?

    Answer

    Your text answer here.

    set.seed(321)
    ggplot(mpg, aes(x = displ, y = hwy)) +
      geom_point(color = "gray") +
      geom_jitter(height = 1, width = 1)
    ggplot(mpg, aes(x = displ, y = hwy)) +
      geom_point(color = "gray") +
      geom_jitter(height = 1, width = 5)
    ggplot(mpg, aes(x = displ, y = hwy)) +
      geom_point(color = "gray") +
      geom_jitter(height = 5, width = 1)

  24. Compare and contrast geom_jitter() with geom_count().

    Answer

    Your text answer here.

    ggplot(mpg, aes(x = displ, y = hwy)) +
      geom_jitter()
    ggplot(mpg, aes(x = displ, y = hwy)) +
      geom_count()

  25. What’s the default position adjustment for geom_boxplot()? Create a visualization of the mpg dataset that demonstrates it.

    Answer

    Your text answer here.

    ggplot(mpg, aes(x = drv, y = displ)) +
      geom_boxplot()
    ggplot(mpg, aes(x = drv, y = displ)) +
      geom_boxplot(position = "dodge2")

  26. Turn a stacked bar chart into a pie chart using coord_polar().

    Answer

    Your text answer here.

    # Your R code here  
  27. What’s the difference between coord_quickmap() and coord_map()?

    Answer

    Your text answer here.

  28. What does the following plot tell you about the relationship between city and highway mpg? Why is coord_fixed() important? What does geom_abline() do?

    ggplot(data = mpg, mapping = aes(x = cty, y = hwy)) +
      geom_point() + 
      geom_abline() +
      coord_fixed()
    Answer

    Your text answer here.

    ggplot(data = mpg, mapping = aes(x = cty, y = hwy)) +
      geom_point() + 
      geom_abline() +
      coord_fixed()