Adds support for ALTREP contiguous views, i.e. `vec_view(x, start, size)`, for the major R types.

Here are important notes on the internal structure:
/*
 Structure of a `view`:
 - `data1` is:
   - the original vector, before materialization
   - the materialized vector, after materialization
 - `data2` is a RAWSXP holding a `r_view_metadata`
*/
struct r_view_metadata {
  // Whether or not the ALTREP view has been materialized.
  bool materialized;

  // The offset into the original data to start at.
  r_ssize start;

  // The size of the view.
  r_ssize size;

  // A read only pointer into the data, to save indirection costs. We typically
  // set this upon view creation, unless `x` is ALTREP, in which case we delay
  // setting it until the first `DATAPTR_RO()` request, to be friendly to
  // ALTREP types that we wrap. After materialization, it is always set.
  const void* v_data_read;

  // A write pointer into the data, to save indirection costs.
  // Always `NULL` before materialization, and set at materialization time.
  // Never set for lists or character vectors, as write pointers are unsafe.
  void* v_data_write;
};
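To make the fast path concrete, here is a minimal sketch (not the actual rlang implementation) of how an integer view's `Elt` method could consult this metadata. The helper names `view_metadata()` and `int_view_Elt()` are illustrative, and it assumes the cached `v_data_read` pointer already points at the view's first element.

#include <stdbool.h>
#include <Rinternals.h>
#include <R_ext/Altrep.h>

// Stand-in for rlang's internal size typedef, plus the metadata struct
// repeated from above so the sketch is self-contained.
typedef R_xlen_t r_ssize;

struct r_view_metadata {
  bool materialized;
  r_ssize start;
  r_ssize size;
  const void* v_data_read;
  void* v_data_write;
};

// `data2` is the RAWSXP that owns the metadata.
static struct r_view_metadata* view_metadata(SEXP x) {
  return (struct r_view_metadata*) RAW(R_altrep_data2(x));
}

static int int_view_Elt(SEXP x, R_xlen_t i) {
  struct r_view_metadata* metadata = view_metadata(x);

  // Fast path: use the cached read-only pointer, set at view creation or at
  // materialization time.
  const int* v_data_read = (const int*) metadata->v_data_read;
  if (v_data_read != NULL) {
    return v_data_read[i];
  }

  // Slow path: the wrapped vector is itself ALTREP and we haven't asked it
  // for a pointer yet, so defer to its own `Elt` method, offset by `start`.
  return INTEGER_ELT(R_altrep_data1(x), metadata->start + i);
}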
Benchmarking - The most notable result is that extracting a large contiguous slice can be up to 2x slower than with a native vector, because `ExtractSubset()` in base R uses the `*_Elt()` ALTREP method to pull out each element. I think it could be much more efficient: use `Dataptr_or_null()` to first check whether a cheap read-only data pointer to the ALTREP data is available (it is here, since it's a view), and if so use it directly. I plan to submit that patch or talk to Luke about it; with that change, this difference should go away entirely.

Also note that the difference really only shows up when a contiguous slice is made with `ExtractSubset()`, where native vectors likely have close to zero cache misses while we go through an ALTREP method for every element. When you extract a random slice of the same size, the timings tend to converge, since cache misses start to dominate.
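For reference, here is a hedged sketch of the kind of change I have in mind, using the integer case. It is not the actual `ExtractSubset()` code, and it skips the `NA` index and bounds handling the real function needs.

#include <Rinternals.h>

// Sketch: grab a cheap read-only pointer once (free for a view), and only
// fall back to the per-element ALTREP Elt method when that isn't available.
static SEXP extract_subset_int_sketch(SEXP x, SEXP indx) {
  R_xlen_t n = XLENGTH(indx);
  SEXP out = PROTECT(Rf_allocVector(INTSXP, n));
  int* v_out = INTEGER(out);

  const int* v_x = (const int*) DATAPTR_OR_NULL(x);

  for (R_xlen_t i = 0; i < n; ++i) {
    R_xlen_t loc = (R_xlen_t) INTEGER_ELT(indx, i) - 1;
    v_out[i] = (v_x != NULL) ? v_x[loc] : INTEGER_ELT(x, loc);
  }

  UNPROTECT(1);
  return out;
}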
# R-devel at the time
getRversion()
#> [1] '4.5.0'
# load rlang
devtools::load_all(path = "~/files/r/packages/rlang/")
# Identical test
# Currently `identical()` forces materialization due to usage of `INTEGER()`
# rather than `INTEGER_RO()`, booooo.
x <- c(1, 2, 3, 4)
y <- vec_view(x, start = 1, size = 4)
identical(x, y)
#> [1] TRUE
view_is_materialized(y)
#> [1] TRUE
# Object size test
# I think lobstr is more accurate here; there is a small fixed amount of
# overhead from the metadata. Nicely, neither call materializes the object.
x <- c(1, 2, 3, 4)
y <- vec_view(x, start = 1, size = 4)
object.size(x)
#> 80 bytes
object.size(y)
#> 80 bytes
lobstr::obj_size(x)
#> 80 B
lobstr::obj_size(y)
#> 776 B
view_is_materialized(y)
#> [1] FALSE
# Small view, 100 elements
base <- sample(1e6)
x <- base[1:100]
y <- vec_view(base, start = 1L, size = 100L)
is_altrep(y)
#> [1] TRUE
view_is_materialized(y)
#> [1] FALSE
# One element
bench::mark(
x[[25L]],
y[[25L]],
iterations = 1000000
)
#> # A tibble: 2 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 x[[25L]] 0 41ns 23280892. 0B 69.8
#> 2 y[[25L]] 0 41ns 18932329. 0B 56.8
# A slice
bench::mark(
x[10:60],
y[10:60],
iterations = 1000000
)
#> # A tibble: 2 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 x[10:60] 123ns 246ns 3228295. 512B 25.8
#> 2 y[10:60] 287ns 369ns 2216498. 512B 15.5
# Still not materialized!
view_is_materialized(y)
#> [1] FALSE
# Duplication currently forces materialization; I think this is something
# we can fix in base R. They use `INTEGER()` where I think they should use
# `INTEGER_RO()`. The returned object is not ALTREP.
z <- duplicate(y)
view_is_materialized(y)
#> [1] TRUE
is_altrep(z)
#> [1] FALSE
# A slice after materialization: still takes the same amount of time as before
# materialization (in both cases we use a cached read-only pointer to the data)
bench::mark(
x[10:60],
y[10:60],
iterations = 1000000
)
#> # A tibble: 2 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 x[10:60] 123ns 246ns 3180695. 512B 22.3
#> 2 y[10:60] 287ns 369ns 2222617. 512B 13.3
# Large view, 1,000,000 elements
base <- sample(1e7)
x <- base[seq(from = 1000, length.out = 1000000)]
y <- vec_view(base, start = 1000, size = 1000000)
# Large slice (still roughly 2x slower, same as with small slice)
bench::mark(
x[1:500000],
y[1:500000],
iterations = 1000
)
#> # A tibble: 2 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 x[1:5e+05] 909.54μs 1.17ms 822. 3.81MB 24.5
#> 2 y[1:5e+05] 2.19ms 2.42ms 413. 3.81MB 10.2
# Large slice, random index
# The difference goes away as cache misses tend to dominate instead
idx <- sample(1e7, 1e6)
bench::mark(
x[idx],
y[idx],
iterations = 1000
)
#> # A tibble: 2 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 x[idx] 2.19ms 2.4ms 408. 3.81MB 10.5
#> 2 y[idx] 2.47ms 2.69ms 357. 3.81MB 8.78
# Still not materialized!
view_is_materialized(y)
#> [1] FALSE
Remaining TODOs:

- `DATAPTR()` usage in character implementation
- `Extract_subset` implementation? Or is the fallback good enough? (New approach is much faster even without `Extract_subset` due to faster `Elt` access with cached pointers, we really want base R to handle all the intricacies of this.)
- `#define`s to section off code that doesn't work in older R versions, possibly bump minimal rlang R version (see the sketch after this list)
- `r_init_library_with_dll()` helper with Lionel

Then:

- `r_vec_resize()` to `r_vec_grow()` and remove support for shrinking
- `r_dyn_unwrap()` in this PR
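As a hedged illustration of the `#define` item above, a guard along these lines could work; the `RLANG_HAS_ALTREP_VIEWS` name and the version cutoff are purely illustrative, not decided.

#include <Rversion.h>

// Hypothetical feature macro; the real name and the minimal R version it
// gates on would be settled in this PR.
#if defined(R_VERSION) && R_VERSION >= R_Version(3, 6, 0)
# define RLANG_HAS_ALTREP_VIEWS 1
#else
# define RLANG_HAS_ALTREP_VIEWS 0
#endif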
A few notes:
- `x` gets marked as not mutable on the way in
- `x` right now
- `r_init_library_with_dll()` helper that goes in your `R_init_<pkg>()` function (good to know if you utilize the rlang C API). The ALTREP classes are registered with the `package` name, so if vctrs uses the rlang C API then it will register its own ALTREP view classes under the `"vctrs"` package name. I think this is a good thing. (See the sketch below.)
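A rough sketch of where that helper gets called from. The signature of `r_init_library_with_dll()` is an assumption on my part here; the real declaration lives in the rlang C API.

#include <R_ext/Rdynload.h>

// Assumed signature, for illustration only.
extern void r_init_library_with_dll(DllInfo* dll, const char* package);

void R_init_vctrs(DllInfo* dll) {
  // Registers the rlang C library for this DLL, including the ALTREP view
  // classes, under the "vctrs" package name.
  r_init_library_with_dll(dll, "vctrs");
}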
Otherwise I think it's pretty darn good!