I wrote a Ruby gem to analyse strings for word length, word density, and a few more hand methods...
The Counter
class accepts a filter (String). The class will remove any words from the original string that you include in the filter. Spec:
it "removes words passed in the filter" do
counter = Counter.new("That was magnificent, Trevor.", filter: "magnificent")
expect(counter.words).to eq(%w[That was Trevor])
end
I added a feature that allows you to send in a Regexp filter instead of a string. It's my first use of Procs, and I'm sure it can be done better. It's not very readable, and probably not efficient.
it "it accepts a regexp filter" do
counter = Counter.new("That was magnificent, Trevor.", filter: /magnificent/i)
expect(counter.words).to eq(%w[That was Trevor])
end
def initialize(string, options = {})
@options = options
@char_count = string.length
# Here I split the wrongs from the string, then call reject to
# remove any words that meet the filter criteria
@words = string.scan(regex).reject { |word| filter.call(word) }
# ...
end
# The filter method checks to see if the filter is a string or a Regexp
# It returns a proc object that acts according.
def filter
filters = @options[:filter]
if filters.is_a?(String)
f = filters.split.collect { |word| word.downcase }
Proc.new { |w| f.include?(w.downcase) }
elsif filters.is_a?(Regexp)
Proc.new { |w| w =~ filters }
else
Proc.new {}
end
end
As you can see, it looks a bit messy compared to the original methods:
def initialize(string, options = {})
@options = options
@char_count = string.length
@words = string.scan(regex).reject { |word| filter.include? word.downcase }
# snip...
end
def filter
if filters = @options[:filter]
filters.split.collect { |word| word.downcase }
else
[]
end
end
For the sake of brevity I have not included the enter class, but you can see it here (65 lines). And here is a gist of my proposed changes. Only the initialise
and filter
methods change.
2 Answers 2
Moving a decision into code that gets executed earlier, with later code merely telling an object to "do whatever it is we decided earlier" is one of the techniques of confident programming. It's often a good idea, and I think it is well applied here.
Is there a better name for the filter option?
There may be a more descriptive name for the option than "filter." The trouble is that "filter" might be interpreted as "here are the things I want", instead of "here are the things I don't want."
A less ambiguous name for this option might be "except," "reject", or "exclude". You get the idea.
Similarly, you might consider renaming the instance method @filter.
Set proc in constructor
Uri is correct that it is better, if you are going to do this, to create the proc just once. He does so using lazy evaluation. You might also consider just setting it in #initialize:
def initialize(string, options = {})
@filter = filter_proc(options[:filter])
...
end
# Your filter method, renamed, and made to take an argument
def exclude_proc(filter)
if filter.is_a?(String)
...
end
Use type coercion methods
Rather than testing the option's class, consider testing whether the object responds to a coercion method that converts it to that class. So, instead of:
if filter.is_a?(String)
# the code that converts a string to a proc
you would do this:
if filter.respond_to?(:to_str)
filter = filter.to_str
# the code that converts a string to a proc
String responds to #to_str, so filters of those types will continue to work fine. But now the caller can pass in, as well as a String, any object that responds to #to_str.
By the way: We use #to_str instead of #to_s because #to_s will convert pretty much anything to a string. #to_str only converts to a string objects which wish to be treated like strings.
You can use the same idea for regular expressions, but you cannot do it the same way. The convention is that an object that responds to #to_regexp can be treated as a Regexp, but for some strange reason Regexp does not respond to #to_regexp. Instead, you have to use Regexp.try_convert:
elsif Regexp.try_convert(filters)
filters = Regexp.try_convert(filters)
Proc.new { |w| w =~ filters }
If given a Regexp, .try_convert will return it. If given an object that responds to #to_regexp, .try_convert will return the result of that method. Otherwise, it returns nil.
Consider allowing procs
It's trivial to allow the caller to pass in a callable object, such as a method:
counter = Counter.new("That was magnificent, Trevor.",
filter: method(:exclude_word?))
or a lambda:
counter = Counter.new("That was magnificent, Trevor.",
filter: ->(word) { foo(word) || bar(word) })
In the code that makes the proc:
...
elsif filter.respond_to?(:to_proc)
filter.to_proc
...
By allowing this, you give the caller the ability to make decisions that don't fit comfortably in a list of words or a regular expression.
Consider allowing arrays
Once you've made method filter take an argument, you can use a bit of recursion to allow it to take arrays of filters:
def filter_proc(filter)
if filter.respond_to?(:to_a)
filter_procs = Array(filter).map(&method(:filter_proc))
Proc.new do |w|
filter_procs.any? { |p| p.call(w) }
end
...
end
Interestingly, nil responds to #to_a with an empty array. That works out fine, because the the result of #any? on an empty array is false.
The caller may now pass in an array of words:
Counter.new("That was magnificent, Trevor.",
filter: ["That", "was"])
An array of regular expressions sill also work:
Counter.new("That was magnificent, Trevor.",
filter: [/That/, /was/])
Also, since Enumerable responds to #to_a, the possibilities are endless. This makes the API flexible enough that the caller can do anything needed, but still rational enough to understand and reason about. Best of all, it's done pretty simply.
Catch incorrect filter types
If the caller passes in a filter type that the code does not understand, it acts as though there is no filter. Consider raising an exception instead:
def filter_proc(filter)
if ...
...
else
raise ArgumentError, "Incorrect filter type"
end
end
Use require_relative for spec_helper
In the spec, consider using require_relative to load the spec_helper:
require_relative '../spec_helper'
This will allow the spec to be run no matter what the current directory is.
Consider using lambdas instead of procs
Everything you are doing here with procs can be done with lambdas, and more concisely. For example, this:
Proc.new { |w| w =~ filter }
could be expressed as:
lambda { |w| w =~ filter }
or, using the "stabby" operator:
->(w) { w =~ filter }
-
\$\begingroup\$ Thank you for once more taking the time to critique my code. Because I work alone I have idiosyncratic and narrow coding habits. This type of information is invaluable to me. To be honest, some of this went over my head, and I still don't understand having reread it several times (the part about sending in a proc/array). This reflects my lack of experience, and not your excellent code review. Having not worked with Procs and lambdas before I have a hard time connecting the dots... or working the implementation details. I will comment again once I start implementing your concepts. \$\endgroup\$Mohamad– Mohamad2014年05月12日 00:17:58 +00:00Commented May 12, 2014 at 0:17
-
\$\begingroup\$ Regarding names:
reject
is Ruby standard, I believe.(1..10).reject(&:even?)
gives[1, 3, 5, 7, 9]
. \$\endgroup\$wchargin– wchargin2014年05月12日 00:23:08 +00:00Commented May 12, 2014 at 0:23 -
\$\begingroup\$ @Mohamad You are most welcome. Thank you for the very fun question! I had a blast answering it. If you have specific questions, then please ask here, or on SO, or ping me in chat--whichever is most appropriate. I don't think there's anything very far over your head here--a little explanation should be all that's needed. \$\endgroup\$Wayne Conrad– Wayne Conrad2014年05月12日 03:59:32 +00:00Commented May 12, 2014 at 3:59
-
\$\begingroup\$ @WChargin Enumerable#reject does increase the goodness of :reject as a name for that option. \$\endgroup\$Wayne Conrad– Wayne Conrad2014年05月12日 04:22:39 +00:00Commented May 12, 2014 at 4:22
Since you call filter
within each element in your word array, there is no point in each time building a Proc
object and returning it - it would be easier to simply return whether the word should be rejected or not:
def initialize(string, options = {})
# ...
@words = string.scan(regex).reject { |word| filter?(word) }
# ...
end
# The filter method checks to see if the filter is a string or a Regexp
# It returns a proc object that acts according.
def filter?(w)
filters = @options[:filter]
if filters.is_a?(String)
f = filters.split.collect { |word| word.downcase }
f.include?(w.downcase)
elsif filters.is_a?(Regexp)
w =~ filters
end
end
If you do want to use Proc
you can create it once, and use that:
def initialize(string, options = {})
# ...
@words = string.scan(regex).reject(&filter)
# ...
end
# The filter method checks to see if the filter is a string or a Regexp
# It returns a proc object that acts according.
def filter
@filter ||= begin
filters = @options[:filter]
if filters.is_a?(String)
f = filters.split.collect { |word| word.downcase }
Proc.new { |w| f.include?(w.downcase) }
elsif filters.is_a?(Regexp)
Proc.new { |w| w =~ filters }
else
Proc.new {}
end
end
end
-
\$\begingroup\$ Thank you for taking the time to review my code. Indeed a proc is not necessary here. \$\endgroup\$Mohamad– Mohamad2014年05月12日 00:19:19 +00:00Commented May 12, 2014 at 0:19