A question arose about how to convert a continuous variable into categories. It concerned creating a vessel length class from a continuous vessel length variable. Two methods are shown here: one uses the case_when function in dplyr, the other the base cut function. As example data we use the length measurements in the minke dataset.
library(tidyverse)
d <- read_csv("ftp://ftp.hafro.is/pub/data/csv/minke.csv")
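d <-
  d %>%
  select(id, length) %>%
  # case_when: classes are left-closed, e.g. "600 - 800" covers 600 <= length < 800;
  # lengths of 1000 or more (and missing lengths) become NA
  mutate(length.class1 = case_when(length < 600 ~ "0 - 599",
                                   length >= 600 & length < 800 ~ "600 - 800",
                                   length >= 800 & length < 1000 ~ "800 - 1000",
                                   TRUE ~ NA_character_),
         # cut: intervals are right-closed by default, here (0,600] and (600,1000]
         length.class2 = cut(length, breaks = c(0, 600, 1000)))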
glimpse(d)
## Rows: 190
## Columns: 4
## $ id <dbl> 1, 690, 926, 1333, 1334, 1335, 1336, 1338, 1339, 1341, 1…
## $ length <dbl> 780, 793, 858, 567, 774, 526, 809, 820, 697, 777, 739, 5…
## $ length.class1 <chr> "600 - 800", "600 - 800", "800 - 1000", "0 - 599", "600 …
## $ length.class2 <fct> "(600,1e+03]", "(600,1e+03]", "(600,1e+03]", "(0,600]", …
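Note that the two columns above do not describe the same classes: cut() is given only the breaks 0, 600 and 1000, and its intervals are closed on the right by default, so a length of exactly 600 lands in (0,600], whereas case_when() puts it in the class starting at 600. The following is a minimal sketch of how the two approaches could be made to agree. The toy data frame, the column names class_cw and class_cut, and the labels "600 - 799" and "800 - 999" are made up for illustration and are not part of the minke data.

```r
library(dplyr)

# Hypothetical lengths chosen to sit on and around the class boundaries
toy <- tibble(length = c(526, 599, 600, 780, 800, 858, 999))

toy <- toy %>%
  mutate(
    # case_when() takes the first condition that is TRUE,
    # so the lower bounds do not need to be repeated
    class_cw = case_when(
      length < 600  ~ "0 - 599",
      length < 800  ~ "600 - 799",
      length < 1000 ~ "800 - 999",
      TRUE          ~ NA_character_
    ),
    # right = FALSE gives left-closed intervals [0,600), [600,800), [800,1000)
    class_cut = cut(length,
                    breaks = c(0, 600, 800, 1000),
                    right  = FALSE,
                    labels = c("0 - 599", "600 - 799", "800 - 999"))
  )

# Compare the two columns row by row
all(toy$class_cw == as.character(toy$class_cut))
```

One difference remains: cut() returns a factor whose levels follow the order of the breaks, while case_when() here returns a plain character vector.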
A quick visual of the number of observations in each length class:
d %>%
  group_by(length.class1) %>%
  count() %>%
  ggplot() +
  geom_col(aes(length.class1, n)) +
  coord_flip()
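Because length.class1 is a character column, ggplot2 orders the bars alphabetically. That happens to work for the three labels used here, but it would break as soon as a class such as "1000 - 1199" is added. A possible way around this, sketched below with made-up data, is to turn the labels into a factor with explicit levels before counting. The names toy, ordered_levels and counts are hypothetical.

```r
library(dplyr)

# Hypothetical class labels, including one that sorts badly as text
toy <- tibble(class = c("800 - 999", "0 - 599", "1000 - 1199",
                        "600 - 799", "600 - 799"))

# Sorted as text, "1000 - 1199" comes before "600 - 799"
sort(unique(toy$class))

# Explicit levels fix the order; count() keeps the factor and its levels
ordered_levels <- c("0 - 599", "600 - 799", "800 - 999", "1000 - 1199")
counts <- toy %>%
  mutate(class = factor(class, levels = ordered_levels)) %>%
  count(class)

levels(counts$class)
```

Passing a table like counts to the same geom_col() call should then draw the bars in the order of the factor levels rather than alphabetically.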