Streaming performance #272

Unanswered
maximeg asked this question in Q&A

Hi @jcupitt,

I saw the promising streaming features from the 8.9 release 🥳.
So I tested whether they could speed up one of my uses of vips.

The result is not what I hoped for.

Here is a synthetic test I used:

puts "Ruby version: #{RUBY_VERSION}"

require "bundler/inline"

gemfile(true) do
  source "https://rubygems.org"
  gem "benchmark-ips"
  gem "down"
  gem "http"
  gem "ruby-vips"
end

require "benchmark/ips"
require "down/http"

IMAGE_URL = "https://images.unsplash.com/photo-1491933382434-500287f9b54b?q=80&w=5000"
SIZE = 512

def calc_shrink(image)
  image_size = image.size.min
  exp = Math.log2(image_size / (SIZE * 2.0)).floor
  [2 ** exp, 8].min # max for jpeg is 8x8
end

def vips_ops(image)
  image = image.thumbnail_image(SIZE, height: SIZE, crop: "centre")
  image = image.sharpen(sigma: 1, x1: 2, y2: 10, y3: 20, m1: 0, m2: 3)
  image
end

def old_way
  buffer = HTTP.get(IMAGE_URL).to_s
  image = Vips::Image.new_from_buffer(buffer, "", access: :sequential)
  shrink = calc_shrink(image)
  image = Vips::Image.new_from_buffer(buffer, "", access: :sequential, shrink: shrink) if shrink > 1
  image = vips_ops(image)
  # simulate old process then write to file
  result = image.jpegsave_buffer(Q: 95, interlace: true, strip: true)
  File.write("old_way.jpg", result)
end

def stream_way
  remote = Down::Http.open(IMAGE_URL)
  source = Vips::SourceCustom.new
  source.on_read { |length| remote.read(length) }
  source.on_seek { |offset, whence| remote.seek(offset, whence) }
  image = Vips::Image.new_from_source(source, "", access: :sequential)
  shrink = calc_shrink(image)
  image = Vips::Image.new_from_source(source, "", access: :sequential, shrink: shrink) if shrink > 1
  image = vips_ops(image)
  image.write_to_file("stream_way.jpg", Q: 95, interlace: true, strip: true)
  remote.close
end

Benchmark.ips do |x|
  x.config(time: 10, warmup: 2)
  x.report("old_way") { old_way }
  x.report("stream_way") { stream_way }
end

Result:

Ruby version: 2.6.5
Fetching gem metadata from https://rubygems.org/............
Resolving dependencies...
Using rake 13.0.1
Using public_suffix 4.0.3
Using addressable 2.7.0
Using benchmark-ips 2.7.2
Using bundler 1.17.3
Using unf_ext 0.0.7.6
Using unf 0.1.4
Using domain_name 0.5.20190701
Using down 5.1.0
Using ffi 1.11.3
Using ffi-compiler 1.0.1
Using http-cookie 1.0.3
Using http-form_data 2.2.0
Using http-parser 1.2.1
Using http 4.3.0
Using ruby-vips 2.0.17
Warming up --------------------------------------
 old_way 1.000 i/100ms
 stream_way 1.000 i/100ms
Calculating -------------------------------------
 old_way 3.326 (± 0.0%) i/s - 33.000 in 10.105056s
 stream_way 3.230 (± 0.0%) i/s - 32.000 in 10.044809s

Do you have any insight? Maybe I missed something.

Note: when I don't implement on_seek, I get a bunch of "read: fiber called across threads" and "VipsJpeg: Premature end of JPEG file" errors.


Replies: 3 comments


Hello @maximeg,

That's very interesting, thank you for testing this.

I think what's happening is that your processing is relatively quick, so being able to do it during the download does not save a great deal of time.

I tried:

$ time vips jpegload 'photo-1491933382434-500287f9b54b?q=80&w=5000' x.jpg --shrink 8
real	0m0.099s
user	0m0.102s
sys	0m0.005s

That's only about 100 ms of CPU time that can be overlapped with the download.
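To make that bound concrete, here is a rough model, not a measurement. It assumes the download and the processing could overlap perfectly, so the best-case time is whichever stage is slower. The 0.1 s CPU figure is from the timing above; the download times are hypothetical values chosen only for illustration.

```ruby
# Without overlap, download and processing run back to back.
def sequential_time(download_s, cpu_s)
  download_s + cpu_s
end

# With perfect overlap, the slower stage sets the total time.
def overlapped_time(download_s, cpu_s)
  [download_s, cpu_s].max
end

cpu_s = 0.1 # from the `time vips jpegload` run above

# Hypothetical download times, for illustration only.
[0.2, 1.0, 5.0].each do |download_s|
  seq = sequential_time(download_s, cpu_s)
  ovl = overlapped_time(download_s, cpu_s)
  puts format("download %.1fs: sequential %.2fs, overlapped %.2fs, saved %.2fs",
              download_s, seq, ovl, seq - ovl)
end
```

Under this model the saving can never exceed the shorter stage, which here is the 0.1 s of CPU work.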

It should save you a useful chunk of memory -- you're no longer buffering the entire source image.

I don't know why implementing seek makes a difference; that seems very curious (and inefficient). I'll investigate.


I had a thought: how about adding upload as well, to simulate writing back to S3?

The stream version should be able to overlap download and upload, and that ought to give a nice speedup.
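As a toy illustration of why overlapping the two transfers should help, here is a sketch that uses only stdlib threads and `sleep` to stand in for network latency. It does not touch vips or S3. The chunk count and per-chunk delays are made-up numbers, and `run_sequential` / `run_pipelined` are hypothetical names.

```ruby
require "benchmark"

CHUNKS = 5
DOWNLOAD_S = 0.02 # hypothetical per-chunk download latency
UPLOAD_S = 0.02   # hypothetical per-chunk upload latency

# Buffer-style: fetch every chunk first, then upload every chunk.
def run_sequential
  chunks = Array.new(CHUNKS) do |i|
    sleep(DOWNLOAD_S)
    i
  end
  chunks.each { |_chunk| sleep(UPLOAD_S) }
  chunks.size
end

# Stream-style: a producer thread downloads while the main thread uploads.
def run_pipelined
  queue = Queue.new
  producer = Thread.new do
    CHUNKS.times do |i|
      sleep(DOWNLOAD_S)
      queue << i
    end
    queue << :done
  end

  uploaded = 0
  while (chunk = queue.pop) != :done
    sleep(UPLOAD_S)
    uploaded += 1
  end
  producer.join
  uploaded
end

sequential_s = Benchmark.realtime { run_sequential }
pipelined_s = Benchmark.realtime { run_pipelined }
puts format("sequential %.3fs, pipelined %.3fs", sequential_s, pipelined_s)
```

With equal per-chunk delays, the pipelined run should take roughly one stage plus one extra chunk, rather than the sum of both stages.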


I spent a little time on this again. Here's a revised version of that nice benchmark:

puts "Ruby version: #{RUBY_VERSION}"

require "bundler/inline"

gemfile(true) do
  source "https://rubygems.org"
  gem "benchmark-ips"
  gem "down"
  gem "http"
  gem "ruby-vips"
end

require "benchmark/ips"
require "down/http"

IMAGE_URL = "https://images.unsplash.com/photo-1491933382434-500287f9b54b?q=80&w=5000"
SIZE = 512

def vips_ops(image)
  image = image.sharpen(sigma: 1, x1: 2, y2: 10, y3: 20, m1: 0, m2: 3)
  image
end

def old_way
  buffer = HTTP.get(IMAGE_URL).to_s
  image = Vips::Image.thumbnail_buffer(buffer, SIZE, crop: "centre")
  image = vips_ops(image)
  image.write_to_file("old_way.jpg", Q: 95, strip: true)
end

def stream_way
  remote = Down::Http.open(IMAGE_URL)
  source = Vips::SourceCustom.new
  source.on_read { |length| remote.read(length) }
  # fails if we don't implement seek() ... why?
  source.on_seek { |offset, whence| remote.seek(offset, whence) }
  image = Vips::Image.thumbnail_source(source, SIZE, crop: "centre")
  image = vips_ops(image)
  image.write_to_file("stream_way.jpg", Q: 95, strip: true)
  remote.close
end

def download_and_stream_way
  tmpfile = Down::Http.download(IMAGE_URL)
  source = Vips::Source.new_from_memory tmpfile.read
  tmpfile.unlink
  image = Vips::Image.thumbnail_source(source, SIZE, crop: "centre")
  image = vips_ops(image)
  image.write_to_file("download_and_stream_way.jpg", Q: 95, strip: true)
end

Benchmark.ips do |x|
  x.config(time: 10, warmup: 2)
  # streaming is slower overall, perhaps because of the number of callbacks we
  # end up triggering ... try buffering reads?
  x.report("old_way") { old_way }
  x.report("stream_way") { stream_way }
  x.report("download_and_stream_way") { download_and_stream_way }
end

Changes:

  • don't write with interlace since that will force the whole image to buffer
  • don't use thumbnail_image since that will stop shrink-on-load
  • add a third path: download to memory, then stream from memory

I see:

Calculating -------------------------------------
 old_way 2.412 (± 0.0%) i/s - 25.000 in 10.377908s
 stream_way 2.349 (± 0.0%) i/s - 24.000 in 10.247770s
download_and_stream_way
 2.426 (± 0.0%) i/s - 25.000 in 10.315862s

Notes:

  • stream_way is slower overall by ~3%
  • downloading and then streaming from the buffer is the same speed as the old way
  • Source is generating ~220 callbacks for each resize

So I think the slowdown is most likely just the Ruby callback overhead. Next steps:

  • investigate and find out why the seek needs to be implemented (this would save 20 pointless callbacks)
  • see if we can make the JPEG reader fetch the image in fewer pieces (it's averaging 2.3 KB per read at the moment)
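
One way to experiment with the "try buffering reads?" idea on the Ruby side is a small wrapper that pulls large chunks from the remote and serves the small reads out of memory. This is only a sketch: `ChunkedReader` and its 64 KB default are hypothetical, it is exercised against a `StringIO` rather than a real download, and it assumes the consumer tolerates short reads. It also does not handle seek, so any seek on the underlying stream would have to discard the buffer. Note that it only cuts calls to the remote, not the number of callbacks coming from libvips.

```ruby
require "stringio"

# Wraps any IO-like object that responds to read(length) and serves small
# reads from a larger in-memory chunk, so the upstream sees fewer calls.
class ChunkedReader
  attr_reader :upstream_reads

  def initialize(io, chunk_size: 64 * 1024)
    @io = io
    @chunk_size = chunk_size
    @buffer = String.new # empty binary string
    @upstream_reads = 0
  end

  # Returns up to `length` bytes, or nil at end of stream (like IO#read).
  def read(length)
    refill if @buffer.empty?
    return nil if @buffer.empty?

    @buffer.slice!(0, length)
  end

  private

  # Fetch one large chunk from the wrapped object and count the call.
  def refill
    @upstream_reads += 1
    chunk = @io.read(@chunk_size)
    @buffer << chunk if chunk
  end
end

# Stand-in for a remote file: 500,000 bytes read in 2,300-byte pieces.
data = "x" * 500_000
reader = ChunkedReader.new(StringIO.new(data))
total = 0
while (piece = reader.read(2300))
  total += piece.bytesize
end
puts "#{total} bytes delivered using #{reader.upstream_reads} upstream reads"
```

In the benchmark this would sit between the remote and the source, for example `source.on_read { |length| reader.read(length) }` with `reader` wrapping `remote`.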

This discussion was converted from issue #220 on January 31, 2021 13:10.
