I am trying to compare any performance gain(s) from moving a current offset-based pagination endpoint to a cursor based one.
I have the following ruby code:
@limit = 25
@max_pages = 200
def query
@q ||= MyDataBaseClass.where(status: "open").order(:id)
end
def offset_paginatiion
# count is currently an implementation for this appraoch
total = query.count
ctr = 0
page_size = 1
while page_size < @max_pages
query.limit(@limit).offset(ctr).load
ctr += @limit
page_size += 1
end
end
def keyset_pagination
current_page = query.limit(@limit)
page_size = 1
while true
break if page_size >= @max_pages
page_size += 1
last_id = current_page.last.id
current_page = query.where("id > ?", last_id).limit(@limit).load
end
end
require 'benchmark/ips'
Benchmark.ips do |x|
x.report("offset_pagination") { offset_paginatiion }
x.report("keyset_pagination") { keyset_pagination }
x.compare!
end
I've seeded the database with 1 million rows but you can see here I'm only trying to page up to 200 pages (or 5,000 results).
The two methods are returning the exact same data in the same order so they are functionally equivalent and after running the benchmark are performing nearly the same:
│Calculating -------------------------------------
│ offset_pagination 0.009 (± 0.0%) i/s - 1.000 in 110.100327s
│ keyset_pagination 0.009 (± 0.0%) i/s - 1.000 in 112.732942s
│
│Comparison:
│ offset_pagination: 0.0 i/s
│ keyset_pagination: 0.0 i/s - 1.02x slower
I was under the impression performance (from numerous articles) was a benefit to moving to cursor pagination but have yet to been able to reproduce - what am I doing worng?
1 Answer 1
I'm not surprised at your results, bar a few minor details that any decent database will optimize away your implementations are the same. You are also basically just re-implementing the find_each
method.
Note, you might find a real difference if you used database cursors but that is a completely different thing.
The real difference between the two is that cursor based pagination responds better when the underlying data changes. i.e. using offset based pagination you might have a record included twice or skipped if a record is inserted or deleted by another process.
Explore related questions
See similar questions with these tags.