I recently discovered that there was no free read-write lock implementation for Ruby available on the Internet, so I built one myself. I intend to keep it posted on GitHub indefinitely for the Ruby community at large to use.
The code is at https://github.com/alexdowad/showcase/blob/master/ruby-threads/read_write_lock.rb. For convenience, it is also repeated below. (There is another version which goes further to avoid reader starvation; it is currently at https://github.com/alexdowad/showcase/blob/fair-to-readers/ruby-threads/read_write_lock.rb.)
Alex Kliuchnikau very kindly reviewed the first version and found a serious bug. To fix that bug, I had to rethink the implementation and make some major changes. I am hoping someone can review the new version, first for correctness, and then for any ways to increase performance. Also, if you have a multi-core processor, please try running the file (it has a built-in test script). Problems which don't show up on my single-core machine may show up much more easily with multiple cores.
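For anyone willing to try it: the file depends only on the atomic gem, so (assuming that gem is installed and the file is saved as read_write_lock.rb) something like the following should run the built-in test under either MRI or JRuby:

gem install atomic
ruby read_write_lock.rb
jruby read_write_lock.rb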
# Ruby read-write lock implementation
# Allows any number of concurrent readers, but only one concurrent writer
# (And if the "write" lock is taken, any readers who come along will have to wait)
# If readers are already active when a writer comes along, the writer will wait for
# all the readers to finish before going ahead
# But any additional readers who come along while a writer is already waiting will
# also wait (so writers are not starved)
# Written by Alex Dowad
# Bug fixes contributed by Alex Kliuchnikau
# Thanks to Doug Lea for java.util.concurrent.ReentrantReadWriteLock (used for inspiration)
# Usage:
# lock = ReadWriteLock.new
# lock.with_read_lock { data.retrieve }
# lock.with_write_lock { data.modify! }
# Implementation notes:
# A goal is to make the uncontended path for both readers/writers lock-free
# Only if there is reader-writer or writer-writer contention, should locks be used
# Internal state is represented by a single integer ("counter"), and updated
# using atomic compare-and-swap operations
# When the counter is 0, the lock is free
# Each reader increments the counter by 1 when acquiring a read lock
# (and decrements by 1 when releasing the read lock)
# The counter is increased by (1 << 15) for each writer waiting to acquire the
# write lock, and by (1 << 30) if the write lock is taken
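# For illustration only (these lines are not part of the lock's code): given the
# encoding above, a counter value can be picked apart like this. Suppose 2 readers
# hold read locks and 1 writer is waiting:
#   c = WAITING_WRITER + 2                  # => 32770
#   c & MAX_READERS                         # => 2     (readers holding the lock)
#   (c % RUNNING_WRITER) / WAITING_WRITER   # => 1     (writers waiting)
#   c >= RUNNING_WRITER                     # => false (no writer holds the write lock)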
require 'atomic'
require 'thread'
class ReadWriteLock
def initialize
@counter = Atomic.new(0) # single integer which represents lock state
@reader_q = ConditionVariable.new # queue for waiting readers
@reader_mutex = Mutex.new # to protect reader queue
@writer_q = ConditionVariable.new # queue for waiting writers
@writer_mutex = Mutex.new # to protect writer queue
end
WAITING_WRITER = 1 << 15
RUNNING_WRITER = 1 << 30
MAX_READERS = WAITING_WRITER - 1
MAX_WRITERS = RUNNING_WRITER - MAX_READERS - 1
def with_read_lock
 acquire_read_lock
 begin
 yield
 ensure
 # release the lock even if the block raises an exception
 release_read_lock
 end
end
def with_write_lock
 acquire_write_lock
 begin
 yield
 ensure
 # release the lock even if the block raises an exception
 release_write_lock
 end
end
def acquire_read_lock
while(true)
c = @counter.value
raise "Too many reader threads!" if (c & MAX_READERS) == MAX_READERS
# If a writer is running OR waiting, we need to wait
if c >= WAITING_WRITER
# But it is possible that the writer could finish and decrement @counter right here...
@reader_mutex.synchronize do
# So check again inside the synchronized section
@reader_q.wait(@reader_mutex) if @counter.value >= WAITING_WRITER
end
else
break if @counter.compare_and_swap(c,c+1)
end
end
end
def release_read_lock
while(true)
c = @counter.value
if @counter.compare_and_swap(c,c-1)
# If one or more writers were waiting, and we were the last reader, wake a writer up
if c >= WAITING_WRITER && (c & MAX_READERS) == 1
@writer_mutex.synchronize { @writer_q.signal }
end
break
end
end
end
def acquire_write_lock
while(true)
c = @counter.value
raise "Too many writers!" if (c & MAX_WRITERS) == MAX_WRITERS
if c == 0 # no readers OR writers running
# if we successfully swap the RUNNING_WRITER bit on, then we can go ahead
break if @counter.compare_and_swap(0,RUNNING_WRITER)
elsif @counter.compare_and_swap(c,c+WAITING_WRITER)
while(true)
# Now we have successfully incremented, so no more readers will be able to increment
# (they will wait instead)
# However, readers OR writers could decrement right here, OR another writer could increment
@writer_mutex.synchronize do
# So we have to do another check inside the synchronized section
# If a writer OR reader is running, then go to sleep
c = @counter.value
@writer_q.wait(@writer_mutex) if (c >= RUNNING_WRITER) || ((c & MAX_READERS) > 0)
end
# We just came out of a wait
# If we successfully turn the RUNNING_WRITER bit on with an atomic swap,
# Then we are OK to stop waiting and go ahead
# Otherwise go back and wait again
c = @counter.value
break if (c < RUNNING_WRITER) && @counter.compare_and_swap(c,c+RUNNING_WRITER-WAITING_WRITER)
end
break
end
end
end
def release_write_lock
while(true)
c = @counter.value
if @counter.compare_and_swap(c,c-RUNNING_WRITER)
if (c & MAX_WRITERS) > 0 # if any writers are waiting...
@writer_mutex.synchronize { @writer_q.signal }
else
@reader_mutex.synchronize { @reader_q.broadcast }
end
break
end
end
end
end
if __FILE__ == 0ドル
# for performance comparison with ReadWriteLock
class SimpleMutex
def initialize; @mutex = Mutex.new; end
def with_read_lock
@mutex.synchronize { yield }
end
alias :with_write_lock :with_read_lock
end
# for seeing whether my correctness test is doing anything...
# and for seeing how great the overhead of the test is
# (apart from the cost of locking)
class FreeAndEasy
def with_read_lock
yield # thread safety is for the birds... I prefer to live dangerously
end
alias :with_write_lock :with_read_lock
end
require 'benchmark'
TOTAL_THREADS = 40 # set this number as high as practicable!
def test(lock)
puts "READ INTENSIVE (80% read, 20% write):"
single_test(lock, (TOTAL_THREADS * 0.8).floor, (TOTAL_THREADS * 0.2).floor)
puts "WRITE INTENSIVE (80% write, 20% read):"
single_test(lock, (TOTAL_THREADS * 0.2).floor, (TOTAL_THREADS * 0.8).floor)
puts "BALANCED (50% read, 50% write):"
single_test(lock, (TOTAL_THREADS * 0.5).floor, (TOTAL_THREADS * 0.5).floor)
end
def single_test(lock, n_readers, n_writers, reader_iterations=50, writer_iterations=50, reader_sleep=0.001, writer_sleep=0.001)
puts "Testing #{lock.class} with #{n_readers} readers and #{n_writers} writers. Readers iterate #{reader_iterations} times, sleeping #{reader_sleep}s each time, writers iterate #{writer_iterations} times, sleeping #{writer_sleep}s each time"
mutex = Mutex.new
bad = false
data = 0
result = Benchmark.measure do
readers = n_readers.times.collect do
Thread.new do
reader_iterations.times do
lock.with_read_lock do
mutex.synchronize { bad = true } if (data % 2) != 0
sleep(reader_sleep)
mutex.synchronize { bad = true } if (data % 2) != 0
end
end
end
end
writers = n_writers.times.collect do
Thread.new do
writer_iterations.times do
lock.with_write_lock do
# invariant: other threads should NEVER see "data" as an odd number
value = (data += 1)
# if a reader runs right now, this invariant will be violated
sleep(writer_sleep)
# this looks like a strange way to increment twice;
# it's designed so that if 2 writers run at the same time, at least
# one increment will be lost, and we can detect that at the end
data = value+1
end
end
end
end
readers.each { |t| t.join }
writers.each { |t| t.join }
puts "BAD!!! Readers+writers overlapped!" if mutex.synchronize { bad }
puts "BAD!!! Writers overlapped!" if data != (n_writers * writer_iterations * 2)
end
puts result
end
test(ReadWriteLock.new)
test(SimpleMutex.new)
test(FreeAndEasy.new)
end
-
And if someone were to test it on JRuby on an 864-core Azul Vega system, I'm sure Alex wouldn't complain, either :-) (Jörg W Mittag, Feb 20, 2012)
-
Actually, anybody at all who can just run the test script and post the results is highly appreciated. I would feel irresponsible putting this up on the web for people to use unless it has been tested on a variety of systems. (Alex D, Feb 20, 2012)
1 Answer
Have you considered joining the reader and writer queues into a single waiting queue? That may make the code easier to synchronize, and it would also solve the problem of readers starving when writes are performed constantly. Something like this (a rough sketch follows the list):
- If the write lock is taken, or the queue is not empty, any new operation (read or write) goes into the queue.
- If the write lock is not taken and the queue is empty, a new read operation runs immediately. A new write operation goes into the queue and waits for the active read operations to finish.
- When the write lock is released, run the next operation from the queue. If it is a read, keep running operations from the queue until a write operation is encountered or the queue is empty. If a write operation is encountered, wait for the reads to finish (see item 2).
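To make that concrete, here is a rough, monitor-based sketch of the scheme (using a Mutex plus per-waiter ConditionVariables rather than your lock-free counter; the class and method names are illustrative, and for brevity the sketch assumes ConditionVariable#wait does not wake spuriously):

require 'thread'

class QueuedReadWriteLock
 def initialize
 @mutex = Mutex.new
 @queue = [] # a single FIFO of [kind, condition_variable] pairs
 @active_readers = 0 # readers currently inside a read block
 @writer_active = false # is a writer currently inside a write block?
 end

 def with_read_lock
 @mutex.synchronize do
 if @writer_active || !@queue.empty?
 cv = ConditionVariable.new
 @queue << [:read, cv]
 cv.wait(@mutex) # wake_next increments @active_readers on our behalf
 else
 @active_readers += 1
 end
 end
 begin
 yield
 ensure
 @mutex.synchronize do
 @active_readers -= 1
 wake_next if @active_readers == 0 && !@writer_active
 end
 end
 end

 def with_write_lock
 @mutex.synchronize do
 if @writer_active || @active_readers > 0 || !@queue.empty?
 cv = ConditionVariable.new
 @queue << [:write, cv]
 cv.wait(@mutex) # wake_next sets @writer_active on our behalf
 else
 @writer_active = true
 end
 end
 begin
 yield
 ensure
 @mutex.synchronize do
 @writer_active = false
 wake_next
 end
 end
 end

 private

 # Wake the next batch of waiters: either a single writer (handing it the lock
 # once no readers remain), or every queued reader up to the next queued writer.
 def wake_next
 return if @queue.empty?
 if @queue.first[0] == :write
 if @active_readers == 0
 @writer_active = true
 @queue.shift[1].signal
 end
 else
 while !@queue.empty? && @queue.first[0] == :read
 @active_readers += 1
 @queue.shift[1].signal
 end
 end
 end
end

This trades your lock-free fast path for simplicity, so I would expect it to be slower under low contention, but it keeps arrivals in FIFO order so neither readers nor writers can be starved indefinitely.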
I believe you should encapsulate the operations on the counter into a separate abstraction, say a ReadWriteCounter class. It would hold the Atomic @counter instance internally and perform the checks and updates through methods like new_writer, writer_running?, etc. This should improve the code's readability.
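For example (just a rough sketch; apart from writer_running?, the method names here are ones I am making up for illustration, not taken from your code):

require 'atomic'

class ReadWriteCounter
 WAITING_WRITER = 1 << 15
 RUNNING_WRITER = 1 << 30
 MAX_READERS = WAITING_WRITER - 1

 def initialize
 @counter = Atomic.new(0) # same single-integer encoding as in your lock
 end

 def value
 @counter.value
 end

 def writer_running?(c = value)
 c >= RUNNING_WRITER
 end

 def writers_waiting?(c = value)
 (c % RUNNING_WRITER) >= WAITING_WRITER
 end

 def active_readers(c = value)
 c & MAX_READERS
 end

 # compare-and-swap helpers for the lock-free fast paths;
 # each returns true only if the counter was still equal to c
 def try_add_reader(c)
 @counter.compare_and_swap(c, c + 1)
 end

 def try_remove_reader(c)
 @counter.compare_and_swap(c, c - 1)
 end

 def try_add_waiting_writer(c)
 @counter.compare_and_swap(c, c + WAITING_WRITER)
 end
end

acquire_read_lock and friends would then loop over these predicates and CAS helpers instead of manipulating the raw integer directly.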
I ran the code on a Pentium(R) Dual-Core CPU E5400 @ 2.70GHz machine. (In each timing line below, the columns are user CPU time, system CPU time, total CPU time, and, in parentheses, elapsed wall-clock time, as printed by Benchmark.measure.)
MRI Ruby 1.9.2-p290:
READ INTENSIVE (80% read, 20% write):
Testing ReadWriteLock with 32 readers and 8 writers. Readers iterate 50 times, sleeping 0.001s each time, writers iterate 50 times, sleeping 0.001s each time
0.030000 0.030000 0.060000 ( 0.499945)
WRITE INTENSIVE (80% write, 20% read):
Testing ReadWriteLock with 8 readers and 32 writers. Readers iterate 50 times, sleeping 0.001s each time, writers iterate 50 times, sleeping 0.001s each time
0.030000 0.010000 0.040000 ( 1.761913)
BALANCED (50% read, 50% write):
Testing ReadWriteLock with 20 readers and 20 writers. Readers iterate 50 times, sleeping 0.001s each time, writers iterate 50 times, sleeping 0.001s each time
0.030000 0.010000 0.040000 ( 1.121450)
READ INTENSIVE (80% read, 20% write):
Testing SimpleMutex with 32 readers and 8 writers. Readers iterate 50 times, sleeping 0.001s each time, writers iterate 50 times, sleeping 0.001s each time
0.010000 0.020000 0.030000 ( 2.125943)
WRITE INTENSIVE (80% write, 20% read):
Testing SimpleMutex with 8 readers and 32 writers. Readers iterate 50 times, sleeping 0.001s each time, writers iterate 50 times, sleeping 0.001s each time
0.010000 0.010000 0.020000 ( 2.114639)
BALANCED (50% read, 50% write):
Testing SimpleMutex with 20 readers and 20 writers. Readers iterate 50 times, sleeping 0.001s each time, writers iterate 50 times, sleeping 0.001s each time
0.010000 0.010000 0.020000 ( 2.114838)
READ INTENSIVE (80% read, 20% write):
Testing FreeAndEasy with 32 readers and 8 writers. Readers iterate 50 times, sleeping 0.001s each time, writers iterate 50 times, sleeping 0.001s each time
BAD!!! Readers+writers overlapped!
BAD!!! Writers overlapped!
0.000000 0.000000 0.000000 ( 0.058086)
WRITE INTENSIVE (80% write, 20% read):
Testing FreeAndEasy with 8 readers and 32 writers. Readers iterate 50 times, sleeping 0.001s each time, writers iterate 50 times, sleeping 0.001s each time
BAD!!! Readers+writers overlapped!
BAD!!! Writers overlapped!
0.010000 0.010000 0.020000 ( 0.060335)
BALANCED (50% read, 50% write):
Testing FreeAndEasy with 20 readers and 20 writers. Readers iterate 50 times, sleeping 0.001s each time, writers iterate 50 times, sleeping 0.001s each time
BAD!!! Readers+writers overlapped!
BAD!!! Writers overlapped!
0.000000 0.000000 0.000000 ( 0.057053)
JRuby 1.6.5.1:
READ INTENSIVE (80% read, 20% write):
Testing ReadWriteLock with 32 readers and 8 writers. Readers iterate 50 times, sleeping 0.001s each time, writers iterate 50 times, sleeping 0.001s each time
0.875000 0.000000 0.875000 ( 0.875000)
WRITE INTENSIVE (80% write, 20% read):
Testing ReadWriteLock with 8 readers and 32 writers. Readers iterate 50 times, sleeping 0.001s each time, writers iterate 50 times, sleeping 0.001s each time
1.880000 0.000000 1.880000 ( 1.880000)
BALANCED (50% read, 50% write):
Testing ReadWriteLock with 20 readers and 20 writers. Readers iterate 50 times, sleeping 0.001s each time, writers iterate 50 times, sleeping 0.001s each time
1.201000 0.000000 1.201000 ( 1.201000)
READ INTENSIVE (80% read, 20% write):
Testing SimpleMutex with 32 readers and 8 writers. Readers iterate 50 times, sleeping 0.001s each time, writers iterate 50 times, sleeping 0.001s each time
2.196000 0.000000 2.196000 ( 2.196000)
WRITE INTENSIVE (80% write, 20% read):
Testing SimpleMutex with 8 readers and 32 writers. Readers iterate 50 times, sleeping 0.001s each time, writers iterate 50 times, sleeping 0.001s each time
2.219000 0.000000 2.219000 ( 2.219000)
BALANCED (50% read, 50% write):
Testing SimpleMutex with 20 readers and 20 writers. Readers iterate 50 times, sleeping 0.001s each time, writers iterate 50 times, sleeping 0.001s each time
2.229000 0.000000 2.229000 ( 2.229000)
READ INTENSIVE (80% read, 20% write):
Testing FreeAndEasy with 32 readers and 8 writers. Readers iterate 50 times, sleeping 0.001s each time, writers iterate 50 times, sleeping 0.001s each time
BAD!!! Readers+writers overlapped!
BAD!!! Writers overlapped!
0.074000 0.000000 0.074000 ( 0.074000)
WRITE INTENSIVE (80% write, 20% read):
Testing FreeAndEasy with 8 readers and 32 writers. Readers iterate 50 times, sleeping 0.001s each time, writers iterate 50 times, sleeping 0.001s each time
BAD!!! Readers+writers overlapped!
BAD!!! Writers overlapped!
0.060000 0.000000 0.060000 ( 0.060000)
BALANCED (50% read, 50% write):
Testing FreeAndEasy with 20 readers and 20 writers. Readers iterate 50 times, sleeping 0.001s each time, writers iterate 50 times, sleeping 0.001s each time
BAD!!! Readers+writers overlapped!
BAD!!! Writers overlapped!
0.105000 0.000000 0.105000 ( 0.104000)
-
Alex, encapsulating operations on the counter is a great idea. I'm not sure if I should use a separate ReadWriteCounter class, or just some private methods on the ReadWriteLock; I think private methods may actually be enough. I really like the idea of using only 1 queue, primarily because it would reduce the memory footprint of each ReadWriteLock and make it more feasible to use very large numbers of them. (Have you seen the ConcurrentHash which I am working on? It's in my GitHub repo. I'm concerned about the memory used by ReadWriteLocks for that one.) (Alex D, Feb 29, 2012)
-
I originally decided to use 2 queues because then I can release all readers with @read_q.broadcast. Since any number of readers can run concurrently, if you release one, it's best to release them all. But the scheme you described is workable and has definite advantages too. I think I should try another implementation using the scheme you described, and compare with the current implementation... (Alex D, Feb 29, 2012)
-
I tried a new implementation with only a single queue, and refactored for readability at the same time. It looks much nicer than before, but performance is terrible. I haven't done any testing to figure out why yet. It's the new 'alexs-ideas' branch on my GitHub repo. (Alex D, Feb 29, 2012)
-
Thanks for your input! I think I'm going to stick with 2 queues, but factoring out the operations on the counter is a great idea. BTW, I have a bounty which is expiring tomorrow; if you want it, please comment on this question: codereview.stackexchange.com/questions/9583/… (Alex D, Mar 9, 2012)