Benchmarks

Using tidylog adds a small overhead to each function call. For instance, tidylog::filter has to work out how many rows were dropped, so it is a bit slower than calling dplyr::filter directly. The overhead is usually not noticeable, but it can become noticeable on larger datasets, especially with joins. The benchmarks below give an impression of how large the overhead is.
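To make the source of the overhead concrete, here is a minimal sketch of a logging wrapper around dplyr::filter. The function `logged_filter` is hypothetical and written only for illustration; it is not tidylog's actual implementation. It only shows that counting rows before and after the call is extra work on top of the filter itself.

```r
library(dplyr)

# Hypothetical wrapper for illustration; NOT tidylog's actual implementation.
logged_filter <- function(.data, ...) {
  n_before <- nrow(.data)                 # bookkeeping before the call
  result <- dplyr::filter(.data, ...)     # the real work
  n_removed <- n_before - nrow(result)    # bookkeeping after the call
  message(sprintf("logged_filter: removed %d of %d rows", n_removed, n_before))
  result
}

out <- logged_filter(mtcars, cyl == 4)
```

The wrapper returns the same data frame as dplyr::filter; the only difference is the extra row counting and the message.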

library("dplyr")
library("tidylog", warn.conflicts = FALSE)
library("bench")
library("knitr")

filter

On a small dataset:

bench::mark(
  dplyr::filter(mtcars, cyl == 4),
  tidylog::filter(mtcars, cyl == 4),
  iterations = 100
) %>%
  dplyr::select(expression, min, median, n_itr) %>%
  kable()

| expression | min | median | n_itr |
|:---|---:|---:|---:|
| dplyr::filter(mtcars, cyl == 4) | 281μs | 289μs | 99 |
| tidylog::filter(mtcars, cyl == 4) | 633μs | 665μs | 98 |

On a larger dataset:

df <- tibble(x = rnorm(100000))

bench::mark(
  dplyr::filter(df, x > 0),
  tidylog::filter(df, x > 0),
  iterations = 100
) %>%
  dplyr::select(expression, min, median, n_itr) %>%
  kable()

| expression | min | median | n_itr |
|:---|---:|---:|---:|
| dplyr::filter(df, x > 0) | 636.32μs | 762.6μs | 96 |
| tidylog::filter(df, x > 0) | 1.08ms | 1.2ms | 96 |

mutate

On a small dataset:

bench::mark(
  dplyr::mutate(mtcars, cyl = as.factor(cyl)),
  tidylog::mutate(mtcars, cyl = as.factor(cyl)),
  iterations = 100
) %>%
  dplyr::select(expression, min, median, n_itr) %>%
  kable()

| expression | min | median | n_itr |
|:---|---:|---:|---:|
| dplyr::mutate(mtcars, cyl = as.factor(cyl)) | 322μs | 335μs | 99 |
| tidylog::mutate(mtcars, cyl = as.factor(cyl)) | 766μs | 798μs | 97 |

On a larger dataset:

df <- tibble(x = round(runif(10000) * 10))

bench::mark(
  dplyr::mutate(df, x = as.factor(x)),
  tidylog::mutate(df, x = as.factor(x)),
  iterations = 100
) %>%
  dplyr::select(expression, min, median, n_itr) %>%
  kable()

| expression | min | median | n_itr |
|:---|---:|---:|---:|
| dplyr::mutate(df, x = as.factor(x)) | 2.59ms | 2.64ms | 99 |
| tidylog::mutate(df, x = as.factor(x)) | 3.03ms | 3.1ms | 97 |

joins

Joins are the most expensive operation, as tidylog has to do two additional joins behind the scenes.
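As a rough illustration of why logging a join costs extra joins, the sketch below counts unmatched rows on each side with two anti joins. The function `logged_inner_join` is hypothetical, and using anti_join for the bookkeeping is an assumption made for this example; tidylog's internals may work differently.

```r
library(dplyr)

# Hypothetical wrapper for illustration; NOT tidylog's actual implementation.
logged_inner_join <- function(x, y, by) {
  result <- inner_join(x, y, by = by)            # the join the user asked for
  only_in_x <- nrow(anti_join(x, y, by = by))    # extra join 1: rows of x without a match
  only_in_y <- nrow(anti_join(y, x, by = by))    # extra join 2: rows of y without a match
  message(sprintf(
    "logged_inner_join: %d rows only in x, %d rows only in y, %d rows in result",
    only_in_x, only_in_y, nrow(result)
  ))
  result
}

joined <- logged_inner_join(band_members, band_instruments, by = "name")
```

Each logged call here performs three joins instead of one, which is why the relative overhead of a logged join tends to be larger than that of a logged filter or mutate.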

On a small dataset:

bench::mark(
  dplyr::inner_join(band_members, band_instruments, by = "name"),
  tidylog::inner_join(band_members, band_instruments, by = "name"),
  iterations = 100
) %>%
  dplyr::select(expression, min, median, n_itr) %>%
  kable()

| expression | min | median | n_itr |
|:---|---:|---:|---:|
| dplyr::inner_join(band_members, band_instruments, by = "name") | 418.32μs | 432.7μs | 98 |
| tidylog::inner_join(band_members, band_instruments, by = "name") | 2.64ms | 2.7ms | 91 |

On a larger dataset (with many row duplications):

N <- 1000
df1 <- tibble(x1 = rnorm(N), key = round(runif(N) * 10))
df2 <- tibble(x2 = rnorm(N), key = round(runif(N) * 10))

bench::mark(
  dplyr::inner_join(df1, df2, by = "key"),
  tidylog::inner_join(df1, df2, by = "key"),
  iterations = 100
) %>%
  dplyr::select(expression, min, median, n_itr) %>%
  kable()
 #> Warning in dplyr::inner_join(df1, df2, by = "key"): Detected an unexpected many-to-many relationship between `x` and `y`.
 #> i Row 1 of `x` matches multiple rows in `y`.
 #> i Row 23 of `y` matches multiple rows in `x`.
 #> i If a many-to-many relationship is expected, set `relationship =
 #> "many-to-many"` to silence this warning.
| expression | min | median | n_itr |
|:---|---:|---:|---:|
| dplyr::inner_join(df1, df2, by = "key") | 6.24ms | 6.42ms | 79 |
| tidylog::inner_join(df1, df2, by = "key") | 3.42ms | 3.56ms | 88 |
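The many-to-many warning in the output above appears because `key` takes only a handful of distinct values, so most keys are duplicated on both sides. As the warning text suggests, declaring the relationship explicitly silences it. The sketch below uses plain dplyr and assumes dplyr 1.1.0 or later, where the `relationship` argument is available; the seed is added here only to make the example reproducible.

```r
library(dplyr)

set.seed(1)  # added for reproducibility; not part of the original benchmark
N <- 1000
df1 <- tibble(x1 = rnorm(N), key = round(runif(N) * 10))
df2 <- tibble(x2 = rnorm(N), key = round(runif(N) * 10))

# Declaring the expected relationship avoids the warning.
joined <- inner_join(df1, df2, by = "key", relationship = "many-to-many")

# Each key contributes (rows in df1 with that key) * (rows in df2 with that key) rows.
counts1 <- table(df1$key)
counts2 <- table(df2$key)
shared <- intersect(names(counts1), names(counts2))
expected_rows <- sum(as.numeric(counts1[shared]) * as.numeric(counts2[shared]))
```

With so few distinct keys, the result is much larger than either input, which is what makes this a useful stress case for a join benchmark.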
