This post originated because I wanted to check whether the code ‘x = lon %/% dx * dx + dx/2’ gives the same results as CSquare gridding as implemented in {vmstools}. The short answer is no. The former is more consistent: a value that lies exactly on a grid boundary is always included in the cell if it is on the lower boundary and excluded if it is on the upper boundary. In the {vmstools} csquare case this depends on the global quarter. Now, somebody may argue that the practical consequences are minuscule, and I would fully concur :-) I say so because only a very small portion of real numeric data will lie exactly on a grid boundary, which is the case focused on here. Leaving the devil in the details aside, the main message is that ‘x = lon %/% dx * dx + dx/2’ is OK to use within an R, arrow or duckdb workflow. Why is this important? Because in the latter two workflows a csquare algorithm is not available.
Categories: code, rtip
Author: Einar Hjörleifsson
Published: June 18, 2025
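The boundary rule described in the introduction can be checked by hand with a few values. The following is only a minimal sketch: `bin_mid()` is a hypothetical helper name, not something from {vmstools}, and `dx = 0.5` is an assumption chosen so that the cell boundaries are exactly representable in floating point (the post itself uses 0.05).

```r
# Hypothetical helper (not part of {vmstools}): cell midpoint via integer division
bin_mid <- function(x, dx) x %/% dx * dx + dx / 2

# Assumption: dx = 0.5 is exact in binary floating point, so boundaries are exact too
dx <- 0.5

# Values on cell boundaries (-0.5, 0, 0.5) and on cell midpoints (-0.25, 0.25)
lon <- c(-0.5, -0.25, 0, 0.25, 0.5)

# Each boundary value gets the midpoint of the cell above it,
# i.e. the cells behave as half-open intervals [lower, upper)
data.frame(lon = lon, x = bin_mid(lon, dx))
```

Here -0.5 and 0 land in the cells with midpoints -0.25 and 0.25, while 0.5 is pushed into the next cell (midpoint 0.75), and the rule is the same on both sides of zero. With a resolution such as 0.05, which has no exact binary representation, a value that prints as a boundary may fall on either side of it because of rounding.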
Some synthetic data
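library(tidyverse)
library(duckdbfs)
library(vmstools)
library(patchwork)

# Grid resolution in decimal degrees
dx <- dy <- 0.05

# Synthetic points on a regular 0.01-degree lattice
g <-
  expand_grid(lon = seq(-1, 1, by = 0.01),
              lat = seq(-1, 1, by = 0.01)) |>
  mutate(cs = vmstools::CSquare(lon, lat, dx),   # csquare code
         x = lon %/% dx * dx + dx/2,             # midpoints via integer division
         y = lat %/% dy * dy + dy/2)

# Midpoints decoded back from the csquare codes
g <-
  bind_cols(g, vmstools::CSquare2LonLat(g$cs, dx)) |>
  rename(lon2 = SI_LONG, lat2 = SI_LATI)

g |> glimpse()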