Exploratory Analysis II

Data Visualization, part 2. Code for Quiz 8

  1. Load the packages we will use.
  2. Quiz questions
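The setup chunk for step 1 is not shown in this post. A minimal sketch of what it presumably contains follows. This is an assumption: tidyverse and patchwork are inferred from the functions used below (read_csv, filter, %>%, ggplot, the / and | operators, plot_annotation). The here package is only called as here::here(), so it needs to be installed but not attached.

```r
# Assumed setup, inferred from the functions used in the quiz code
library(tidyverse)  # ggplot2, dplyr (filter, %>%), readr (read_csv)
library(patchwork)  # combine plots with |, / and &; plot_annotation()
```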

Question: modify slide 51

ggplot(data = mpg) +
  geom_point(aes(x = displ, y = hwy)) +
  facet_wrap(facets = vars(manufacturer))


Question: modify facet-ex-2

ggplot(mpg) + 
  geom_bar(aes(y = manufacturer)) + 
  facet_grid(vars(class), scales = "free_y", space = "free_y")


Question: spend_time

Download the file spend_time.csv from Moodle into the directory for this post (or read it in directly), then read it into spend_time.

spend_time  <- read_csv("spend_time.csv")

– assign activity to the x-axis

– assign avg_hours to the y-axis

– assign activity to fill

– set subtitle to Avg hours per day: 2018

– set x and y to NULL so they won’t be labeled

p1 <- spend_time %>%
  filter(year == "2018") %>%
  ggplot() +
  geom_col(aes(x = activity, y = avg_hours, fill = activity)) +
  scale_y_continuous(breaks = seq(0, 6, by = 1)) +
  labs(subtitle = "Avg hours per day: 2018", x = NULL, y = NULL)

p1


– assign year to the x-axis

– assign avg_hours to the y-axis

– assign activity to fill

p2 <- spend_time %>%
  ggplot() +
  geom_col(aes(x = year, y = avg_hours, fill = activity)) +
  labs(subtitle = "Avg hours per day: 2010-2019", x = NULL, y = NULL)

p2
ggsave(filename = "preview.png", 
       path = here::here("_posts", "2022-03-28-exploratory-analysis-ii"))

p_all  <-  p1 / p2 

p_all


p_all_no_legend  <- p_all & theme(legend.position = 'none')
p_all_no_legend
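The legend is removed with & rather than +. In patchwork, & applies a theme or other element to every plot in the composition, while + only modifies the last plot added. The sketch below illustrates the difference with two stand-in plots built from ggplot2's built-in mpg data; the names a, b, last_only and everywhere are hypothetical and not part of the quiz.

```r
library(ggplot2)
library(patchwork)

# Two stand-in plots that each draw a legend
a <- ggplot(mpg) + geom_point(aes(x = displ, y = hwy, color = drv))
b <- ggplot(mpg) + geom_bar(aes(x = class, fill = drv))

# `+` adds the theme to the last plot only, so `a` keeps its legend
last_only <- (a / b) + theme(legend.position = "none")

# `&` adds the theme to every plot, so neither plot shows a legend
everywhere <- (a / b) & theme(legend.position = "none")
```

Printing last_only and everywhere side by side shows which panels keep their legends.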


Start with p_all_no_legend

– set title to “How much time Americans spent on selected activities”

– set caption to “Source: American Time of Use Survey, https://data.bls.gov/cgi-bin/surveymost?tu”

p_all_no_legend +
  plot_annotation(
    title = "How much time Americans spent on selected activities",
    caption = "Source: American Time of Use Survey, https://data.bls.gov/cgi-bin/surveymost?tu"
  )


Question: Patchwork 2

Use spend_time from the last question (see the patchwork slides).

Start with spend_time

– assign avg_hours to the y-axis

– ADD line with geom_smooth

– assign year to the x-axis

– set x and y to NULL so x and y axes won’t be labeled

p4 <- spend_time %>%
  filter(activity == "food prep") %>%
  ggplot() +
  geom_point(aes(x = year, y = avg_hours)) +
  geom_smooth(aes(x = year, y = avg_hours)) +
  scale_x_continuous(breaks = seq(2010, 2019, by = 1)) +
  labs(subtitle = "Avg hours per day: food prep", x = NULL, y = NULL)

p4


Start with p4

p5 <-  p4 + coord_cartesian(ylim = c(0, 6))
p5
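p5 changes the visible y range with coord_cartesian rather than with a scale limit. The two are not equivalent in general: coord_cartesian zooms the view while every observation still feeds geom_smooth, whereas ylim() treats values outside the range as missing before the smooth is fitted. The sketch below uses ggplot2's built-in mpg data to show the contrast; the names base, zoomed and clipped are hypothetical.

```r
library(ggplot2)

base <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  geom_smooth(method = "lm")

# Zoom only: all rows are kept, so the fitted line is unchanged
zoomed <- base + coord_cartesian(ylim = c(20, 30))

# Scale limit: rows with hwy outside 20-30 are dropped (with a warning
# when printed), so the line is fitted to fewer points
clipped <- base + ylim(20, 30)
```

Printing zoomed and clipped shows two different fitted lines over the same visible range.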


Start with spend_time

– assign avg_hours to the y-axis

– assign activity to color

– assign activity to group

p6 <- spend_time %>%
  ggplot() +
  geom_point(aes(x = year, y = avg_hours, color = activity, group = activity)) +
  geom_smooth(aes(x = year, y = avg_hours, color = activity, group = activity)) +
  scale_x_continuous(breaks = seq(2010, 2019, by = 1)) +
  coord_cartesian(ylim = c(0, 6)) +
  labs(x = NULL, y = NULL)

p6


Use patchwork to display p4 and p5 on top of p6

( p4 | p5 ) / p6
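In the layout above, | places plots side by side, / stacks them, and the parentheses group the top row. The sketch below shows the same pattern on three stand-in plots built from ggplot2's built-in mpg data, plus an optional plot_layout() call to change the row heights. The plot names and the 1:2 height ratio are hypothetical and not part of the quiz.

```r
library(ggplot2)
library(patchwork)

# Three stand-in plots
top_left  <- ggplot(mpg) + geom_point(aes(x = displ, y = hwy))
top_right <- ggplot(mpg) + geom_point(aes(x = displ, y = cty))
bottom    <- ggplot(mpg) + geom_bar(aes(x = class))

# Two plots in the top row, one plot spanning the bottom row
layout <- (top_left | top_right) / bottom

# Optional: make the bottom row twice as tall as the top row
layout_tall <- layout + plot_layout(heights = c(1, 2))
```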