8
\$\begingroup\$

I came across the question about creating memory-efficient Ruby pipe class with lazy evaluation. Some code is given to effectively create a pipeline of commands using lazy enumerators. I have been messing round with it and have implemented a tee command shown below. I feel like there is a better way to do it though.

class Pipe
 class Tee
 def initialize(pipe)
 @pipe = pipe
 @buffer = []
 @pipe.source = self.to_enum
 end
 def <<(item)
 @buffer << item
 # puts "Added '#{item}' to buffer: #{@buffer.inspect}"
 end
 def to_enum
 Enumerator.new do |yielder|
 item = @buffer.shift
 yielder << item if item
 end.lazy
 end
 end
 attr_writer :source
 def initialize
 @commands = []
 @source = nil
 end
 def add(command, opts = {})
 @commands << [command, opts]
 self
 end
 def run
 enum = @source
 @commands.each do |command, options|
 enum = method(command).call(enum, options)
 end
 enum.each {}
 enum
 end
 def cat(enum, options)
 Enumerator.new do |yielder|
 enum.map { |line| yielder << line } if enum
 options[:input].tap do |ios|
 ios.each { |line| yielder << line }
 end
 end.lazy
 end
 def out(enum, options)
 Enumerator.new do |yielder|
 enum.each do |line|
 puts line
 yielder << line
 end
 end.lazy
 end
 def cut(enum, options)
 Enumerator.new do |yielder|
 enum.each do |line|
 fields = line.chomp.split(%r{#{options[:delimiter]}})
 yielder << fields[options[:field]]
 end
 end.lazy
 end
 def upcase(enum, options)
 Enumerator.new do |yielder|
 enum.each do |line|
 yielder << line.upcase
 end
 end
 end
 def tee(enum, options)
 teed = Tee.new(options.fetch(:other))
 Enumerator.new do |yielder|
 enum.each do |line|
 yielder << line
 teed << line
 end
 end
 end
 def grep(enum, options)
 Enumerator.new do |yielder|
 enum.each do |line|
 yielder << line if line.match(options[:pattern])
 end
 end.lazy
 end
end
another_pipe = Pipe.new
another_pipe.add(:grep, pattern: /testing/i)
another_pipe.add(:out)
pipe = Pipe.new
pipe.add(:cat, :input => StringIO.new("testing\na new\nline"))
# pipe.add(:grep, pattern: /line/)
pipe.add(:tee, :other => another_pipe)
pipe.add(:upcase)
pipe.add(:out)
pipe.run
puts "================================================="
another_pipe.run

Here's the output:

TESTING
A NEW
LINE
=================================================
testing

Does anyone see a smarter way of doing this?

asked Jun 10, 2014 at 21:57
\$\endgroup\$
2
  • \$\begingroup\$ Is there a reason you don't call Enumerator#lazy in the pipe or tee methods when you do in all of the others? \$\endgroup\$ Commented Oct 15, 2014 at 20:08
  • \$\begingroup\$ @Jordan I think I left it out by mistake. They should probably both use Enumerator#lazy \$\endgroup\$ Commented Oct 15, 2014 at 20:18

1 Answer 1

6
\$\begingroup\$

I know it's an old question, but it's an interesting one.

First, a couple small things:

  1. In Pipe#run:

    enum = method(command).call(enum, options)
    

    Calling method has a lot of overhead, and send(command, enum, options) does the same thing.

  2. Also in Pipe#run:

    enum.each {}
    

    The purpose of this is to cause the final object to be enumerated. We can do the same thing cheaper (even an empty block takes CPU cycles) with enum.to_a, which is also aliased as enum.force.

Those things aside, the first thing I noticed is that there's a lot of repetition. grep and upcase, for example, only differ on one line. We can extract the shared code out into another method, which I'll call enum_lazy:

def upcase(enum, options)
 enum_lazy(enum) do |line, yielder|
 yielder << line.upcase
 end
end
def grep(enum, options)
 enum_lazy(enum) do |line, yielder|
 yielder << line if line.match(options[:pattern])
 end
end
private
def enum_lazy(enum, &block)
 Enumerator.new do |yielder|
 enum.each do |line|
 yield(line, yielder)
 end
 end.lazy
end

...but this made me realize that some of these methods don't need the yielder pattern at all. For example, upcase takes one element at a time, performs an operation on it, and yields one value each time—it's just a lazy map, and this is equivalent:

def upcase(enum, options)
 enum.lazy.map(&:upcase)
end

Since we called lazy, this will return an Enumerator::Lazy as expected. We can simplify cut and out in the same way:

def out(enum, options)
 enum.lazy.map do |line|
 puts line
 line
 end
end
def cut(enum, options)
 enum.lazy.map do |line|
 fields = line.chomp.split(options[:delimiter])
 fields[ options[:field] ]
 end
end

grep is different, because we only don't want output for every input line. As it turns out, though, grep is just a lazy select:

def grep(enum, options)
 enum.lazy.select {|line| line =~ options[:pattern] }
end

That just leaves cat. I have a strong inkling that there's a similar optimization we could do for cat, but I haven't yet been able to actually figure it out. Perhaps someone else can help figure it out.

Here's the final code:

class Pipe
 class Tee
 # (no changes here)
 end
 attr_writer :source
 def initialize
 @commands = []
 @source = nil
 end
 def add(command, opts = {})
 @commands << [command, opts]
 self
 end
 def run
 enum = @source
 @commands.each do |command, options|
 enum = send(command, enum, options)
 end
 enum.force
 enum
 end
 def cat(enum, options)
 Enumerator.new do |yielder|
 enum.map { |line| yielder << line } if enum
 options[:input].tap do |ios|
 ios.each { |line| yielder << line }
 end
 end.lazy
 end
 def out(enum, options)
 enum.lazy.map do |line|
 puts line
 line
 end
 end
 def cut(enum, options)
 enum.lazy.map do |line|
 fields = line.chomp.split(/#{options[:delimiter]}/)
 fields[ options[:field] ]
 end
 end
 def upcase(enum, options)
 enum.lazy.map(&:upcase)
 end
 def tee(enum, options)
 teed = Tee.new(options.fetch(:other))
 enum_lazy(enum) do |line, yielder|
 yielder << line
 teed << line
 end
 end
 def grep(enum, options)
 enum.lazy.select {|line| line =~ options[:pattern] }
 end
 private
 def enum_lazy(enum, &block)
 Enumerator.new do |yielder|
 enum.each do |line|
 yield(line, yielder)
 end
 end.lazy
 end
end
answered Oct 15, 2014 at 22:02
\$\endgroup\$
1
  • \$\begingroup\$ Since I just remembered that your real question was whether or not your Tee class could be improved... somewhat anti-climactically, I can't think of any obvious improvements. Good work. :) \$\endgroup\$ Commented Oct 15, 2014 at 22:11

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.