Pitch sequence aggregation

Author

Tom Mock

Published

November 27, 2022

Generate pitches

library(tidyverse)
── Attaching packages ────────────────────────────────── tidyverse 1.3.2.9000 ──
✔ ggplot2   3.4.0           ✔ dplyr     1.0.99.9000
✔ tibble    3.1.8           ✔ stringr   1.4.1      
✔ tidyr     1.2.1           ✔ forcats   0.5.1      
✔ readr     2.1.3           ✔ lubridate 1.8.0      
✔ purrr     0.3.5           
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
pitches <- c(
  "Two Seam FB",
  "Cutter",
  "Slider",
  "Knuckleball",
  "Change Up"
)
set.seed(37)
fake_pitch <- tibble(
  at_bat = rep(1:5, each = 3)
) %>%
  mutate(pitch_type = sample(pitches, size = 15, replace = TRUE))

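# Keep the 1st pitch of each at-bat: partial windows are allowed by default
fake_pitch %>%
  group_by(at_bat) %>%
  mutate(pitch_n = row_number()) %>%
  mutate(
    # Window = current pitch plus the one before it (.before = 1L).
    # pitch_n is passed as .y, but the lambda only uses .x (pitch_type).
    pitch_seq = slider::slide2_chr(
      pitch_type,
      pitch_n,
      ~ paste(.x, collapse = "-"),
      .before = 1L
    )
  ) %>%
  ungroup()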
# A tibble: 15 × 4
   at_bat pitch_type  pitch_n pitch_seq              
    <int> <chr>         <int> <chr>                  
 1      1 Cutter            1 Cutter                 
 2      1 Slider            2 Cutter-Slider          
 3      1 Change Up         3 Slider-Change Up       
 4      2 Knuckleball       1 Knuckleball            
 5      2 Two Seam FB       2 Knuckleball-Two Seam FB
 6      2 Knuckleball       3 Two Seam FB-Knuckleball
 7      3 Knuckleball       1 Knuckleball            
 8      3 Change Up         2 Knuckleball-Change Up  
 9      3 Cutter            3 Change Up-Cutter       
10      4 Knuckleball       1 Knuckleball            
11      4 Slider            2 Knuckleball-Slider     
12      4 Two Seam FB       3 Slider-Two Seam FB     
13      5 Knuckleball       1 Knuckleball            
14      5 Slider            2 Knuckleball-Slider     
15      5 Slider            3 Slider-Slider          
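
For comparison, the same two-pitch pairing can be sketched without a sliding window by looking back one row with `dplyr::lag()`. This is an illustrative alternative, not the approach used above, and the small `at_bat_pitches` table is made up for the example.

```r
library(dplyr)

# Hypothetical mini data set: two at-bats, five pitches
at_bat_pitches <- tibble(
  at_bat     = c(1L, 1L, 1L, 2L, 2L),
  pitch_type = c("Cutter", "Slider", "Change Up", "Knuckleball", "Slider")
)

pairs <- at_bat_pitches %>%
  group_by(at_bat) %>%
  mutate(
    # previous pitch within the same at-bat (NA for the 1st pitch)
    prev_pitch = lag(pitch_type),
    # 1st pitch stands alone; later pitches are joined to the one before
    pitch_seq = if_else(
      is.na(prev_pitch),
      pitch_type,
      paste(prev_pitch, pitch_type, sep = "-")
    )
  ) %>%
  ungroup()

pairs$pitch_seq
```

The lag version is fine for pairs, but it needs one extra `lag()` per additional pitch, which is where the window arguments of `slider` become more convenient.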
# Return NA for the 1st pitch of each at-bat
fake_pitch %>%
  group_by(at_bat) %>%
  mutate(pitch_n = row_number()) %>%
  mutate(
    pitch_seq = slider::slide2_chr(
      pitch_type,
      pitch_n,
      ~ paste(.x, collapse = "-"),
      .before = 1L,
      .complete = TRUE # only evaluate complete windows (2 pitches), else NA
    )
  ) %>%
  ungroup()
# A tibble: 15 × 4
   at_bat pitch_type  pitch_n pitch_seq              
    <int> <chr>         <int> <chr>                  
 1      1 Cutter            1 <NA>                   
 2      1 Slider            2 Cutter-Slider          
 3      1 Change Up         3 Slider-Change Up       
 4      2 Knuckleball       1 <NA>                   
 5      2 Two Seam FB       2 Knuckleball-Two Seam FB
 6      2 Knuckleball       3 Two Seam FB-Knuckleball
 7      3 Knuckleball       1 <NA>                   
 8      3 Change Up         2 Knuckleball-Change Up  
 9      3 Cutter            3 Change Up-Cutter       
10      4 Knuckleball       1 <NA>                   
11      4 Slider            2 Knuckleball-Slider     
12      4 Two Seam FB       3 Slider-Two Seam FB     
13      5 Knuckleball       1 <NA>                   
14      5 Slider            2 Knuckleball-Slider     
15      5 Slider            3 Slider-Slider          
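
The window size is the only thing that limits these sequences to pairs. As a minimal sketch, assuming a single at-bat stored in a plain character vector (the `one_at_bat` vector is made up here), `slider::slide_chr()`, the single-input sibling of `slide2_chr()`, can build a running sequence with `.before = Inf` or strict three-pitch sequences with `.before = 2L`.

```r
library(slider)

# Hypothetical single at-bat
one_at_bat <- c("Cutter", "Slider", "Change Up")

# Running sequence: every pitch so far in the at-bat
running_seq <- slide_chr(
  one_at_bat,
  ~ paste(.x, collapse = "-"),
  .before = Inf
)

# Three-pitch sequences only: NA until three pitches are available
three_seq <- slide_chr(
  one_at_bat,
  ~ paste(.x, collapse = "-"),
  .before = 2L,
  .complete = TRUE
)

running_seq
three_seq
```

Inside a grouped `mutate()`, either call could be dropped in where the two-pitch version is used, since the grouping already restricts each window to one at-bat.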

Plot it

set.seed(37)

fake_pitch <- tibble(
  at_bat = rep(c(1:5), each = 9),
  batter = rep(rep(c("A.Name", "B.Name", "C.Name"), each = 3), 5)
) %>%
  mutate(pitch_type = sample(pitches, size = 45, replace = TRUE))

fake_pitch %>%
  group_by(at_bat) %>%
  mutate(pitch_n = row_number()) %>%
  mutate(
    pitch_seq = slider::slide2_chr(
      pitch_type,
      pitch_n,
      ~ paste(.x, collapse = "-"),
      .before = 1L,
      .complete = TRUE
    )
  ) %>%
  ungroup() %>%
  # count each two-pitch sequence within each at-bat
  count(at_bat, pitch_seq) %>%
  # drop the NA produced for the 1st pitch of each at-bat
  filter(!is.na(pitch_seq)) %>%
  # running total of each sequence across at-bats
  group_by(pitch_seq) %>%
  mutate(roll_n = cumsum(n)) %>%
  ggplot(aes(x = at_bat, y = roll_n, group = 1)) +
  geom_step(aes(group = 1), direction = "hv") +
  geom_point() +
  facet_wrap(~pitch_seq) +
  scale_y_continuous(limits = c(0, 5))
`geom_path()`: Each group consists of only one observation.
ℹ Do you need to adjust the group aesthetic?
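
The `geom_path()` message most likely comes from facets where a pitch sequence occurs in only one at-bat, so the step line has a single point to draw. One possible way around it, sketched here under the assumption that a missing at-bat should count as zero occurrences, is to fill in the absent combinations with `tidyr::complete()` before taking the cumulative sum. The `seq_counts` table is a made-up stand-in for the output of `count(at_bat, pitch_seq)`.

```r
library(dplyr)
library(tidyr)

# Hypothetical counts: "Cutter-Slider" appears in a single at-bat only
seq_counts <- tibble(
  at_bat    = c(1L, 2L, 3L, 2L),
  pitch_seq = c("Slider-Slider", "Slider-Slider", "Slider-Slider", "Cutter-Slider"),
  n         = c(1L, 2L, 1L, 1L)
)

filled <- seq_counts %>%
  # add a row with n = 0 for every at_bat/pitch_seq pair that is missing
  complete(at_bat, pitch_seq, fill = list(n = 0L)) %>%
  arrange(pitch_seq, at_bat) %>%
  group_by(pitch_seq) %>%
  # running total now has one value per at-bat in every sequence
  mutate(roll_n = cumsum(n)) %>%
  ungroup()

filled
```

With every sequence present in every at-bat, each facet would have more than one observation, so the same `ggplot()` call should be able to draw a step line in all panels.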