8  Exercises (Chapter 9)

  1. Create a scatterplot of hwy vs. displ where the points are pink, filled-in triangles.

    # Your R code here

  2. Why did the following code not result in a plot with blue points?

    ggplot(mpg) + 
      geom_point(aes(x = displ, y = hwy, color = "blue"))

    # Proper R code here

    Your text answer here.

  3. What does the stroke aesthetic do? What shapes does it work with? (Hint: use ?geom_point)

    library(patchwork) # provides the `/` operator used below to stack plots

    mpg |>
      ggplot(aes(x = displ, y = hwy)) +
        geom_point(shape = 21, stroke = 0.5) -> p1
    mpg |>
      ggplot(aes(x = displ, y = hwy)) +
        geom_point(shape = 21, stroke = 1) -> p2
    mpg |>
      ggplot(aes(x = displ, y = hwy)) +
        geom_point(shape = 21, stroke = 2) -> p3
    p1 / p2 / p3

    Your text answer here.

  4. What happens if you map an aesthetic to something other than a variable name, like aes(color = displ < 5)? Note, you’ll also need to specify x and y.

    mpg |>
      ggplot(aes(x = displ, y = hwy, color = displ < 5)) +
        geom_point()

    Your text answer here.

  5. What geom would you use to draw a line chart? A boxplot? A histogram? An area chart?


    Your text answer here.

    # R code here

    Your text answer here.

    # R code here

    Your text answer here.

    # R code here

    Your text answer here.

    # Your R code here

  6. Earlier in this chapter we used show.legend without explaining it:

    ggplot(mpg, aes(x = displ, y = hwy)) +
      geom_smooth(aes(color = drv), show.legend = FALSE)

    What does show.legend = FALSE do here? What happens if you remove it? Why do you think we used it earlier?


    Your text answer here.

    ggplot(mpg, aes(x = displ, y = hwy)) +
      geom_smooth(aes(color = drv), show.legend = FALSE) -> p1
    ggplot(mpg, aes(x = displ, y = hwy)) +
      geom_smooth(aes(color = drv), show.legend = TRUE) -> p2
    p1 / p2

  7. What does the se argument to geom_smooth() do?


    Your text answer here.

    ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
      geom_smooth(se = FALSE)

  8. Recreate the R code necessary to generate the following graphs. Note that wherever a categorical variable is used in the plot, it’s drv.


    The code for each of the plots is given below.

    # Your R code here

  9. What happens if you facet on a continuous variable?


    Your text answer here.

    mpg |> 
      ggplot(aes(x = drv, y = cyl)) + 
      geom_point() +
      facet_wrap(~ hwy) # hwy is used here as an example of a continuous variable

  10. What do the empty cells in the plot above with facet_grid(drv ~ cyl) mean? Run the following code. How do they relate to the resulting plot?

    ggplot(mpg) + 
      geom_point(aes(x = drv, y = cyl))
    ggplot(mpg) + 
      geom_point(aes(x = drv, y = cyl)) +
      facet_grid(drv ~ cyl)

    Your text answer here.

  11. What plots does the following code make? What does . do?

    ggplot(mpg) + 
      geom_point(aes(x = displ, y = hwy)) +
      facet_grid(drv ~ .)
    ggplot(mpg) + 
      geom_point(aes(x = displ, y = hwy)) +
      facet_grid(. ~ cyl)

    ggplot(mpg) + 
      geom_point(aes(x = displ, y = hwy)) +
      facet_grid(drv ~ .)

    Your text answer here.

    ggplot(mpg) + 
      geom_point(aes(x = displ, y = hwy)) +
      facet_grid(. ~ cyl)

    Your text answer here.

  12. Take the first faceted plot in this section:

    ggplot(mpg) + 
      geom_point(aes(x = displ, y = hwy)) + 
      facet_wrap(~ cyl, nrow = 2)

    What are the advantages to using faceting instead of the color aesthetic? What are the disadvantages? How might the balance change if you had a larger dataset?


    Your text answer here.

    # facet
    ggplot(mpg) + 
      geom_point(aes(x = displ, y = hwy)) + 
      facet_wrap(~ class, nrow = 2)

    # color
    ggplot(mpg) + 
      geom_point(aes(x = displ, y = hwy, color = class))

    # both
    ggplot(mpg) + 
      geom_point(
        aes(x = displ, y = hwy, color = class), 
        show.legend = FALSE) + 
      facet_wrap(~ class, nrow = 2)

    # highlighting
    ggplot(mpg, aes(x = displ, y = hwy)) + 
      geom_point(color = "gray") +
      geom_point(
        data = mpg |> filter(class == "compact"),
        color = "pink"
      )

  13. Read ?facet_wrap. What does nrow do? What does ncol do? What other options control the layout of the individual panels? Why doesn’t facet_grid() have nrow and ncol arguments?


    Your text answer here.

  14. Which of the following plots makes it easier to compare engine size (displ) across cars with different drive trains? What does this say about when to place a faceting variable across rows or columns?

    ggplot(mpg, aes(x = displ)) + 
      geom_histogram() + 
      facet_grid(drv ~ .)
    ggplot(mpg, aes(x = displ)) + 
      geom_histogram() +
      facet_grid(. ~ drv)

    ggplot(mpg, aes(x = displ)) + 
      geom_histogram() + 
      facet_grid(drv ~ .)
    `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

    ggplot(mpg, aes(x = displ)) + 
      geom_histogram() +
      facet_grid(. ~ drv)
    `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

    Your text answer here.

  15. Recreate the following plot using facet_wrap() instead of facet_grid(). How do the positions of the facet labels change?

    ggplot(mpg) + 
      geom_point(aes(x = displ, y = hwy)) +
      facet_grid(drv ~ .)

    ggplot(mpg) + 
      geom_point(aes(x = displ, y = hwy)) +
      facet_grid(drv ~ .) -> p1
    ggplot(mpg) + 
      geom_point(aes(x = displ, y = hwy)) +
      facet_wrap(~drv, nrow = 3) -> p2
    p1 + p2

    Your text answer here.

  16. What is the default geom associated with stat_summary()? How could you rewrite the previous plot to use that geom function instead of the stat function?

    ggplot(diamonds) + 
      stat_summary(
        aes(x = cut, y = depth),
        fun.min = min,
        fun.max = max,
        fun = median
      )


    Your text answer here.

    diamonds |>
      group_by(cut) |>
      summarize(
        lower = min(depth),
        upper = max(depth),
        midpoint = median(depth)
      ) |>
      ggplot(aes(x = cut, y = midpoint)) +
      geom_pointrange(aes(ymin = lower, ymax = upper))

  17. What does geom_col() do? How is it different from geom_bar()?


    Your text answer here.

  18. Most geoms and stats come in pairs that are almost always used in concert. Make a list of all the pairs. What do they have in common? (Hint: Read through the documentation.)


    Geoms and stats that are almost always used in concert are listed below:

    geom                     stat
    geom_bar()               stat_count()
    geom_bin2d()             stat_bin_2d()
    geom_boxplot()           stat_boxplot()
    geom_contour_filled()    stat_contour_filled()
    geom_contour()           stat_contour()
    geom_count()             stat_sum()
    geom_density_2d()        stat_density_2d()
    geom_density()           stat_density()
    geom_dotplot()           stat_bindot()
    geom_function()          stat_function()
    geom_sf()                stat_sf()
    geom_smooth()            stat_smooth()
    geom_violin()            stat_ydensity()
    geom_hex()               stat_bin_hex()
    geom_qq_line()           stat_qq_line()
    geom_qq()                stat_qq()
    geom_quantile()          stat_quantile()
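
    The pairing can also be checked from R itself. The following is a minimal sketch, assuming ggplot2 is installed; the object names (bar_layer, count_layer, p_geom, p_stat, same_counts) are illustrative and not part of the original exercise. It looks up the default stat stored in a geom layer and the default geom stored in a stat layer, then checks that both routes compute the same counts.

```r
library(ggplot2)

# A geom constructor returns a layer that records its default stat,
# and a stat constructor returns a layer that records its default geom.
bar_layer   <- geom_bar()
count_layer <- stat_count()

class(bar_layer$stat)[1]   # class name of the default stat of geom_bar()
class(count_layer$geom)[1] # class name of the default geom of stat_count()

# Build the same bar chart from either side of the pair.
p_geom <- ggplot(diamonds, aes(x = cut)) + geom_bar()
p_stat <- ggplot(diamonds, aes(x = cut)) + stat_count()

# Compare the counts each plot computes for its single layer.
same_counts <- identical(layer_data(p_geom)$count, layer_data(p_stat)$count)
same_counts
```

    The same lookup should work for the other rows in the list, which is one way to see what the pairs have in common: each geom has a default stat, and each stat has a default geom.
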
  19. What variables does stat_smooth() compute? What arguments control its behavior?


    stat_smooth() computes the following variables:

    • y or x: Predicted value
    • ymin or xmin: Lower pointwise confidence interval around the mean
    • ymax or xmax: Upper pointwise confidence interval around the mean
    • se: Standard error
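
    These computed variables can be inspected directly. The sketch below is an illustration rather than part of the original exercise: it assumes ggplot2 is installed, picks method = "lm" with an explicit formula only to keep the fit simple and avoid the default-method message, and uses the hypothetical names p and smooth_data.

```r
library(ggplot2)

# Fit a linear smoother so the layer computes y, ymin, ymax, and se.
p <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_smooth(method = "lm", formula = y ~ x)

# layer_data() returns the data frame a layer draws from,
# including the variables computed by stat_smooth().
smooth_data <- layer_data(p)

names(smooth_data)
head(smooth_data[, c("x", "y", "ymin", "ymax", "se")])
```

    Setting se = FALSE in geom_smooth() should drop the interval columns, which ties the computed variables back to the arguments that control the stat.
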
  20. In our proportion bar chart, we needed to set group = 1. Why? In other words, what is the problem with these two graphs?

    ggplot(diamonds, aes(x = cut, y = after_stat(prop))) + 
      geom_bar()
    ggplot(diamonds, aes(x = cut, fill = color, y = after_stat(prop))) + 
      geom_bar()

    Your text answer here.

    # one variable
    ggplot(diamonds, aes(x = cut, 
                         y = after_stat(prop))) + 
      geom_bar()
    ggplot(diamonds, aes(x = cut, 
                         y = after_stat(prop), 
                         group = 1)) + 
      geom_bar()
    # two variables
    ggplot(diamonds, aes(x = cut, 
                         fill = color, 
                         y = after_stat(prop))) + 
      geom_bar()
    ggplot(diamonds, aes(x = cut, 
                         fill = color, 
                         y = after_stat(prop), 
                         group = color)) + 
      geom_bar()

  21. What is the problem with the following plot? How could you improve it?

    ggplot(mpg, aes(x = cty, y = hwy)) + 
      geom_point()

    Your text answer here.

    # original plot
    ggplot(mpg, aes(x = cty, y = hwy)) + 
      geom_point()
    # one possible improvement: jitter the points
    ggplot(mpg, aes(x = cty, y = hwy)) + 
      geom_jitter()

  22. What, if anything, is the difference between the two plots? Why?

    ggplot(mpg, aes(x = displ, y = hwy)) +
      geom_point()
    ggplot(mpg, aes(x = displ, y = hwy)) +
      geom_point(position = "identity")

    Your text answer here.

    # Your R code here

  23. What parameters to geom_jitter() control the amount of jittering?


    Your text answer here.

    ggplot(mpg, aes(x = displ, y = hwy)) +
      geom_point(color = "gray") +
      geom_jitter(height = 1, width = 1)
    ggplot(mpg, aes(x = displ, y = hwy)) +
      geom_point(color = "gray") +
      geom_jitter(height = 1, width = 5)
    ggplot(mpg, aes(x = displ, y = hwy)) +
      geom_point(color = "gray") +
      geom_jitter(height = 5, width = 1)

  24. Compare and contrast geom_jitter() with geom_count().


    Your text answer here.

    ggplot(mpg, aes(x = displ, y = hwy)) +
      geom_jitter()
    ggplot(mpg, aes(x = displ, y = hwy)) +
      geom_count()

  25. What’s the default position adjustment for geom_boxplot()? Create a visualization of the mpg dataset that demonstrates it.


    Your text answer here.

    ggplot(mpg, aes(x = drv, y = displ)) +
      geom_boxplot()
    ggplot(mpg, aes(x = drv, y = displ)) +
      geom_boxplot(position = "dodge2")

  26. Turn a stacked bar chart into a pie chart using coord_polar().


    Your text answer here.

    # Your R code here

  27. What’s the difference between coord_quickmap() and coord_map()?


    Your text answer here.

  28. What does the following plot tell you about the relationship between city and highway mpg? Why is coord_fixed() important? What does geom_abline() do?

    ggplot(data = mpg, mapping = aes(x = cty, y = hwy)) +
      geom_point() + 
      geom_abline() +
      coord_fixed()

    Your text answer here.

    ggplot(data = mpg, mapping = aes(x = cty, y = hwy)) +
      geom_point() + 
      geom_abline() +
      coord_fixed()