A question arose about how to convert a continuous variable into categories. The question concerned creating a vessel length class from a continuous variable of vessel length. Two methods are shown here, one using the case_when function in dplyr, the other the base cut function. We use the length measurements in the minke dataset.

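library(tidyverse)
# read the minke dataset
d <- 
  read_csv("ftp://ftp.hafro.is/pub/data/csv/minke.csv")
d <- 
  d %>% 
  select(id, length) %>% 
  # case_when: conditions are checked in order, the first match sets the class;
  #   lengths of 1000 or more (and missing lengths) end up as NA
  mutate(length.class1 = case_when(length < 600 ~ "0 - 599",
                                   length >= 600 & length < 800 ~ "600 - 800",
                                   length >= 800 & length < 1000 ~ "800 - 1000",
                                   TRUE ~ NA_character_),
         # cut: here only two intervals, closed on the right by default,
         #   i.e. (0,600] and (600,1000]
         length.class2 = cut(length, breaks = c(0, 600, 1000)))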
glimpse(d)
## Rows: 190
## Columns: 4
## $ id            <dbl> 1, 690, 926, 1333, 1334, 1335, 1336, 1338, 1339, 1341, 1…
## $ length        <dbl> 780, 793, 858, 567, 774, 526, 809, 820, 697, 777, 739, 5…
## $ length.class1 <chr> "600 - 800", "600 - 800", "800 - 1000", "0 - 599", "600 …
## $ length.class2 <fct> "(600,1e+03]", "(600,1e+03]", "(600,1e+03]", "(0,600]", …
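
Note that the two columns above do not describe the same classes: cut was given only two intervals, and its intervals are closed on the right by default, so a length of exactly 600 lands in (0,600] whereas case_when puts it in the "600 - 800" class. Below is a minimal sketch of how cut could be set up to mirror the case_when conditions. The vector `len` is made up for illustration (it is not the minke data) and the labels are just one possible choice.

```r
# Hypothetical lengths, chosen to sit on and around the class boundaries
len <- c(599, 600, 799, 800, 999, 1000)

# right = FALSE gives intervals of the form [lower, upper), which matches
# the `>=` / `<` conditions used in case_when()
cls <- cut(len,
           breaks = c(0, 600, 800, 1000),
           labels = c("0 - 599", "600 - 799", "800 - 999"),
           right  = FALSE)

data.frame(len, cls)
```

With these settings a length of 1000 falls outside the last interval and becomes NA, the same as the TRUE ~ NA_character_ fallback in case_when. The result is a factor whose levels follow the order of the breaks.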

A quick visual of the number of observations in each length class:

d %>% 
  # count() with a column name groups and tallies in one step
  count(length.class1) %>% 
  ggplot() +
  geom_col(aes(length.class1, n)) +
  coord_flip()
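
One thing to keep in mind when counting or plotting: a character column like length.class1 only knows about the classes that actually occur, while a factor from cut keeps all its levels, in break order, even when a class is empty. The sketch below illustrates this with base table() on a made-up vector `len` (again not the minke data). The dig.lab = 4 argument is only there so that the label reads 1000 rather than 1e+03.

```r
# Hypothetical lengths with no values between 600 and 999
len <- c(550, 580, 1050, 1100, 1150)

# Factor version: three levels, the middle one is empty
cls_fct <- cut(len, breaks = c(0, 600, 1000, 1200),
               right = FALSE, dig.lab = 4)

# Character version of the same classes
cls_chr <- as.character(cls_fct)

table(cls_fct)  # lists all three levels, the empty one with a count of 0
table(cls_chr)  # lists only the two classes that occur
```

Whether the empty class should show up in a table or plot depends on the question being asked, so this is a design choice rather than a rule.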