Huge memory consumption while using ruby-vips with shrine/image_processing gem #395

Unanswered
qdegraeve asked this question in Q&A

Hi there,

I'm forwarding the issue I opened in the Shrine repo, as it may belong here. I was considering using ruby-vips instead of ImageMagick in my Rails projects to resize images, but while it is super fast, the memory consumption is not at all what I expected. I'm not sure what this issue is related to, so maybe somebody here will have an idea.

My leads are:

  • Debian image is not well suited for libvips
  • Memory is ok but GC is not releasing the memory
  • Some mean black magick is trying to turn me mad

The original discussion is here => shrinerb/shrine#686
I'm copying the question here:

Recently we tried to move from ImageMagick to Vips to improve the performance of derivative processing, as we use it a lot.
We had great expectations but unfortunately, the result is not at all what we expected in terms of memory consumption.

While processing is between 5 and 10 times faster, memory use also doubles.

I do not have a precise measurement, but I have graphs that represent the exact same flow, with the same pictures (all around 5 mb) to be processed.

In the example we process each image with sidekiq and a throttle configured with a concurrency of 1. There are 19 images to be processed, and here are the graphs:

With VIPS

image

With ImageMagick

image

Maybe there is something I am missing or do not understand.

My configuration is a container built with ruby-3.2-bookworm-slim, libvips 8.15, rails 6.1, shrine 3.4 and sidekiq 7.2.

For now I will stay with ImageMagick but if you have any clue on what is happening that would be great.

Ask me if you need further details
Regards
Quentin

Comment options

Hello @qdegraeve,

Could you share some details about the sample images you used? For example, progressive JPEGs will need a lot of memory to process.

Comment options

Oh, and the processing you are doing for each image, of course.

Comment options

Hi @jcupitt,

Thanks for your prompt answer.
The image I used is the one below.
I'm processing all my images with sidekiq without concurrency, using this code in my shrine uploader:

 derivation :thumbnail do |file, width, height|
   # convert width and height to integers only if present
   width = width.to_i if width.present?
   height = height.to_i if height.present?
   ImageProcessing::Vips
     .source(file)
     .resize_to_fill!(width, height)
 end

SampleJPGImage_5mbmb

It's not really easy to monitor this, as it is part of the whole application, but I will try to run your script in my docker image and get back to you with the results.

Comment options

What are typical values for width and height? How many derivatives do you make from each image?

Comment options

This project had one derivation at [910, 650] and one at [200, 150] per image. There were 15 images, processed one at a time.

Comment options

I tried:

#!/usr/bin/ruby
require 'vips'
ARGV.each do |filename|
  puts "processing #{filename} ..."
  dirname = File.dirname filename
  basename = File.basename filename, ".*"
  Vips::Image
    .thumbnail(filename, 910)
    .write_to_file("#{dirname}/910-#{basename}.jpg")
  Vips::Image
    .thumbnail(filename, 200)
    .write_to_file("#{dirname}/200-#{basename}.jpg")
end

Using your sample image I see:

$ /usr/bin/time -f %M:%e ./thumb3.rb sample/*
processing sample/10.jpg ...
processing sample/11.jpg ...
processing sample/12.jpg ...
processing sample/13.jpg ...
processing sample/14.jpg ...
processing sample/15.jpg ...
processing sample/1.jpg ...
processing sample/2.jpg ...
processing sample/3.jpg ...
processing sample/4.jpg ...
processing sample/5.jpg ...
processing sample/6.jpg ...
processing sample/7.jpg ...
processing sample/8.jpg ...
processing sample/9.jpg ...
232252:4.27

So 230mb of memory and 4.2s of runtime.

Setting VIPS_CONCURRENCY to 1 (or adding Vips::concurrency_set 1 near the start of the program) helps a bit on this PC, though it may not do much on your machine.

Comment options

I made a tiny test program:

#!/usr/bin/ruby
require 'vips'
target_width = ARGV[0].to_i
ARGV[1..].each do |filename|
  thumb = Vips::Image.thumbnail(filename, target_width)
  dirname = File.dirname filename
  basename = File.basename filename, ".*"
  output_filename = "#{dirname}/thumb_#{basename}.jpg"
  puts "writing #{output_filename} ..."
  thumb.write_to_file output_filename
end

I used this test image (6k x 4k JPEG, 1.2mb):

nina

I made a test dataset like this:

$ mkdir sample
$ cd sample
$ for i in {1..19}; do cp ~/pics/nina.jpg $i.jpg; done

Then ran the thumbnailer like this:

$ /usr/bin/time -f %M:%e ./thumb3.rb 200 sample/*
writing sample/thumb_10.jpg ...
writing sample/thumb_11.jpg ...
writing sample/thumb_12.jpg ...
writing sample/thumb_13.jpg ...
writing sample/thumb_14.jpg ...
writing sample/thumb_15.jpg ...
writing sample/thumb_16.jpg ...
writing sample/thumb_17.jpg ...
writing sample/thumb_18.jpg ...
writing sample/thumb_19.jpg ...
writing sample/thumb_1.jpg ...
writing sample/thumb_2.jpg ...
writing sample/thumb_3.jpg ...
writing sample/thumb_4.jpg ...
writing sample/thumb_5.jpg ...
writing sample/thumb_6.jpg ...
writing sample/thumb_7.jpg ...
writing sample/thumb_8.jpg ...
writing sample/thumb_9.jpg ...
97800:0.89

So it made the 19 thumbnails in 900ms and needed a peak of 98mb of memory.

There's little concurrency in thumbnailing, so you can turn off the libvips threadpool and save some time and memory. This makes a relatively big difference on this PC since it has 32 hardware threads. You'll see less of a change on typical machines:

$ VIPS_CONCURRENCY=1 /usr/bin/time -f %M:%e ./thumb3.rb 200 sample/*
writing sample/thumb_10.jpg ...
...
writing sample/thumb_9.jpg ...
78792:0.59

78mb peak and 600ms.

If you run with no images, you can see the startup cost:

$ VIPS_CONCURRENCY=1 /usr/bin/time -f %M:%e ./thumb3.rb 200 
53756:0.11

So starting ruby and libvips needs 54mb and 110ms. Subtracting the two, the actual image processing is around 24mb and 500ms.
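
The subtraction above can be checked with a few lines of Ruby. This is only a sketch: `parse_time_output` and `net_cost` are hypothetical helper names, and it assumes the `%M:%e` format used in the runs above (peak resident set size in KB, then elapsed seconds). Subtracting two peaks is only a rough estimate of the processing cost, not an exact measurement.

```ruby
# Parse one "%M:%e" line from /usr/bin/time, e.g. "78792:0.59".
# %M is the peak resident set size in KB, %e is elapsed wall-clock seconds.
def parse_time_output(line)
  kb, seconds = line.strip.split(":")
  { peak_kb: Integer(kb), seconds: Float(seconds) }
end

# Subtract the startup-only run from the full run (hypothetical helper).
def net_cost(full_line, startup_line)
  full = parse_time_output(full_line)
  startup = parse_time_output(startup_line)
  {
    peak_kb: full[:peak_kb] - startup[:peak_kb],
    seconds: (full[:seconds] - startup[:seconds]).round(2)
  }
end

cost = net_cost("78792:0.59", "53756:0.11")
puts "processing: #{cost[:peak_kb]} KB, #{cost[:seconds]} s"
# prints "processing: 25036 KB, 0.48 s"
```

That is roughly 25,000 KB and half a second, in line with the rounded figures above.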

Comment options

I looked at the original issue more carefully and I think your graphs might not be including subprocesses, could that be correct?

mini_magick works by generating imagemagick command lines and executing them in subprocesses. It's called mini_magick because this approach minimises the memory use of the calling ruby process. Of course you still need memory for processing, it's just less visible.

You can make a little shell script that reproduces this behaviour, for example:

#!/bin/bash
for filename in "$@"; do
  echo "processing $filename ..."
  for size in 910x650 200x150; do
    convert "$filename" -resize "$size" "$filename-$size.jpg"
  done
done

The bash process will see little memory use, it'll all be in the convert process that's being run. I think this is what your minimagick graph is showing.

If I run that script ^^ on 15 of your test images I see:

$ /usr/bin/time -f %M:%e ../thumb3.sh *
processing 10.jpg ...
...
processing 1.jpg ...
processing 2.jpg ...
processing 3.jpg ...
processing 4.jpg ...
processing 5.jpg ...
processing 6.jpg ...
processing 7.jpg ...
processing 8.jpg ...
processing 9.jpg ...
344616:18.53

So it takes 18.5s and needs a peak of 350mb of memory (/usr/bin/time will include memory used in subprocesses). The ruby-vips benchmark above was 24mb peak and 500ms runtime, so memory use is a lot lower overall.

You could speed up the convert script using a define to exploit libjpeg shrink on load, though I don't know if mini_magick does this, it'd be interesting to see the exact command it's generating.

One drawback of ruby-vips is that the heavy processing is happening directly in your web server process. While it's quick, this has two big downsides:

  1. Security. If there's a bad bug in one of the image load libraries that libvips uses (for example, libpng), your web server process could be attacked with a crafted file.

    To mitigate this, I would set the VIPS_BLOCK_UNTRUSTED env var. You could also consider a separate ruby process for image handling, but perhaps that's only worthwhile for very large sites.

  2. Memory fragmentation. A lot of active low level threaded code in your process will cause memory fragmentation with many malloc implementations. I would either live with it, or look into something like jemalloc.

Comment options

There are some notes about blocking of untrusted loaders here:

https://www.libvips.org/2022/05/28/What's-new-in-8.13.html

Comment options

Hi @jcupitt,

Thanks a lot for your help and all the tips you gave me. I tried building a docker image with your script inside and ran it, and there was no issue at all. The memory usage graph stayed very flat.

In the meantime I saw another GitHub thread where you said that on Debian the build-essential package was mandatory, and I did not have it installed. Do you think this could be the cause of the memory peaks?

Comment options

You need build-essential for pyvips, but ruby-vips is fine with just the library binary.

Comment options

Hi again,

I did some more tests and I fear that the problem is still here. I think I either tested badly last time or got a false positive.
I did a lot of tests, both with ruby-vips directly using the script you gave me, and with the image_processing gem as in my rails application.

There is no difference in memory usage between the two approaches: in the end they both peak at around 500 mb when processing 60 images of 3 mb each.

I noticed something interesting. When I run the script on its own I see no memory issue. When I run the same code in a rails console, memory usage increases and never seems to be released until I shut down the console, as you can see in the graph below.

image

Could there be a memory leak in the ruby-vips gem? I doubt it, but I'm asking anyway ...

Comment options

Just to be sure of what I'm saying, I ran a test with the image_processing gem using mini_magick; the result is shown in the graph below. Execution is extremely slow (~ 3/4 min vs ~ 30 secs with vips) and uses much more CPU than vips, but memory remains quite low and is released after use without having to kill the rails console.
For information, the script starts at 17:19 and ends at 17:23, and I killed the rails console at 17:29.

image
Comment options

For jemalloc, MALLOC_CONF=narenas:2 would be the equivalent of the glibc-specific MALLOC_ARENA_MAX=2 environment variable. You could see useful stats of jemalloc by defining MALLOC_CONF=stats_print:true in the environment, which will cause jemalloc to dump statistics to stderr just before program exit.

Since you mentioned Docker, you might consider switching to an Alpine-based Docker image, as the memory allocator in musl, upon which Alpine Linux is based, is generally considered very good in terms of lack of fragmentation and returning freed memory.
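
As a rough illustration of how the settings mentioned in this thread could be passed to a separate worker process, here is a minimal sketch. `WORKER_ENV` and `run_worker` are hypothetical names, the values are only examples, and in a real deployment these variables would more likely be set in the Dockerfile or the pod spec. The child here is just a stand-in Ruby one-liner, not an actual image worker.

```ruby
require "open3"
require "rbconfig"

# Environment for an image-processing worker, using only variables
# mentioned in this thread. The values are illustrative assumptions.
WORKER_ENV = {
  "VIPS_CONCURRENCY"     => "1", # turn off the libvips threadpool
  "VIPS_BLOCK_UNTRUSTED" => "1", # block untrusted loaders (see the 8.13 notes)
  "MALLOC_ARENA_MAX"     => "2"  # glibc only; with jemalloc use MALLOC_CONF=narenas:2
}.freeze

# Run a command with WORKER_ENV added to its environment and return its stdout.
def run_worker(*command)
  stdout, status = Open3.capture2(WORKER_ENV, *command)
  raise "worker failed: #{status}" unless status.success?

  stdout
end

# Stand-in worker: a child Ruby process that prints what it sees.
puts run_worker(RbConfig.ruby, "-e", 'print ENV["VIPS_CONCURRENCY"]')
```

Because the variables are only set for the child, the parent process keeps its own allocator and threadpool settings.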

Comment options

Perhaps the problem is the thing making the memory graph? I'm not clear exactly what it's measuring.

What if you watch RES in top? Does that give a different number? RES is usually the best guide.
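
If watching top by hand is awkward, something like the following could be pasted into the rails console to log the same number. It is a sketch that assumes Linux, where `VmRSS` in `/proc/<pid>/status` corresponds to the RES column; `rss_kb` and `rss_report` are hypothetical helper names, and the block passed in is a stand-in for the real image processing.

```ruby
# Read the resident set size (the RES column in top) of a process, in KB.
# Linux-only: returns nil where /proc/<pid>/status cannot be read.
def rss_kb(pid = Process.pid)
  status = File.read("/proc/#{pid}/status")
  match = status.match(/^VmRSS:\s+(\d+)\s+kB/)
  match && match[1].to_i
rescue SystemCallError
  nil
end

# Report RSS before a block, after it, and after a forced GC run.
def rss_report
  before = rss_kb
  yield
  after = rss_kb
  GC.start
  after_gc = rss_kb
  { before: before, after: after, after_gc: after_gc }
end

report = rss_report do
  # Stand-in workload; in a rails console this would be the image processing.
  Array.new(100_000) { |i| "item #{i}" }
end
p report
```

A forced GC only shows whether Ruby has let go of its objects; whether the allocator then returns the pages to the OS is a separate question.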

Comment options

This graph measures the total memory of the kubernetes pod that is running my sidekiq process. The pod is only running this process and no one is using the app at the moment, so the total memory is:

  • ~ 250 mb for the sidekiq process running in background
  • ~ 250 mb for the rails console used to run the script
  • the rest is the memory used by the processing

I had a look at top and it gives similar results: RES goes up to 660164 and never comes down until I kill the rails console.
I tried MALLOC_CONF=narenas:2, with no effect.

This is really frustrating, as I really want to use vips and I feel I'm the only one with memory issues. But the comparison with ImageMagick makes me think that the problem is not in rails or in my code. I can't tell if it comes from my docker image, from debian, from vips or from anything else in between.

Comment options

I'd guess this is memory fragmentation. Perhaps something is allocated on the heap after libvips runs and before the next GC, and that's preventing jemalloc from shrinking the heap again?

Fragmentation will stabilise over time and unused pages will get swapped out by the OS, so I'd think this is not a serious issue unless you have very tight memory constraints.

I'd be very concerned if memory use continued to grow without bounds, of course.
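
One way to tell the two cases apart is to record RES after each batch of images and check whether the readings level off. The sketch below is a hypothetical heuristic, not anything built into ruby-vips: `plateaued?`, its `window` and `tolerance` parameters, and the sample readings are all made up for illustration.

```ruby
# Hypothetical heuristic: compare the average of the last `window` samples
# with the average of the `window` samples before them. Growth below
# `tolerance` (a fraction) is treated as a plateau.
def plateaued?(samples_kb, window: 3, tolerance: 0.05)
  return false if samples_kb.size < window * 2

  recent = samples_kb.last(window)
  previous = samples_kb[-(window * 2), window]
  recent_avg = recent.sum.to_f / window
  previous_avg = previous.sum.to_f / window
  (recent_avg - previous_avg) / previous_avg < tolerance
end

# Made-up RSS readings (KB) taken after each batch of images.
stabilising = [250_000, 480_000, 610_000, 650_000, 658_000, 660_000, 661_000, 660_500]
growing     = [250_000, 400_000, 550_000, 700_000, 850_000, 1_000_000]

puts plateaued?(stabilising) # prints "true"
puts plateaued?(growing)     # prints "false"
```

Readings that level off would be consistent with fragmentation; readings that keep climbing batch after batch would point towards a real leak.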
