I solved this problem in Ruby:
Write a utility that takes 3 command-line parameters P1, P2 and P3. P3 is OPTIONAL (see below). P1 is always a file path/name. P2 can take the values:
- "lines"
- "words"
- "find"
P3 is relevant/needed only if P2 is "find"; otherwise it is not.
So, the utility does the following:
- If P2 is "lines" it says how many lines the file has
- If P2 is "words" it says how many words it has (the complete file)
- If P2 is "find" it prints out the lines where P3 is present
My solution looks like this:
#!/usr/bin/env ruby
def print_usage
  puts "Usage: #{$0} <file> words|lines"
  puts "       #{$0} <file> find <what-to-find>"
end
class LineCounter
  # Initialize instance variables
  def initialize
    @line_count = 0
  end
  def process(line)
    @line_count += 1
  end
  def print_result
    puts "#{@line_count} lines"
  end
end
class WordCounter
  # Initialize instance variables
  def initialize
    @word_count = 0
  end
  def process(line)
    @word_count += line.scan(/\w+/).size
  end
  def print_result
    puts "#{@word_count} words"
  end
end
class WordMatcher
  # Initialize instance variables, using constructor parameter
  def initialize(word_to_find)
    @matches = []
    @word_to_find = word_to_find
  end
  def process(line)
    if line.scan(/#{@word_to_find}/).size > 0
      @matches << line
    end
  end
  def print_result
    @matches.each { |line|
      puts line
    }
  end
end
# Main program
if __FILE__ == $PROGRAM_NAME
  processor = nil
  # Try to find a line-processor
  if ARGV.length == 2
    if ARGV[1] == "lines"
      processor = LineCounter.new
    elsif ARGV[1] == "words"
      processor = WordCounter.new
    end
  elsif ARGV.length == 3 && ARGV[1] == "find"
    word_to_find = ARGV[2]
    processor = WordMatcher.new(word_to_find)
  end
  if not processor
    # Print usage and exit if no processor found
    print_usage
    exit 1
  else
    # Process the lines and print result
    File.readlines(ARGV[0]).each { |line|
      processor.process(line)
    }
    processor.print_result
  end
end
My questions are:
- Is there a more Ruby-esque way of solving it?
- More compact, but still readable / elegant?
It seems checking for correct command-line parameter combinations takes up a lot of space...
Contrast it to the Scala version found here:
https://gist.github.com/anonymous/93a975cb7aba6dae5a91#file-counting-scala
- If you are satisfied with any of the answers, you should select the one that was most helpful to you. – Cary Swoveland, Feb 27, 2014 at 20:26
3 Answers
Some notes:
- Those counter classes are probably overkill, keep it simple.
- Ruby is an OOP language, but it's not necessary to create a bunch of classes for simple scripts like this.
- Idiomatic: if not x -> if !x
- Idiomatic: { ... } for one-line blocks, do/end for multi-line.
I'd write:
fail("Usage: #{$0} PATH (lines|words|find REGEXP)") unless ARGV.size >= 2
path, mode, optional_regexp = ARGV
# File.open avoids Kernel#open's special handling of paths starting with "|"
File.open(path) do |fd|
  case mode
  when "lines"
    # IO#lines was removed in Ruby 3.0; each_line is its replacement
    puts(fd.each_line.count)
  when "words"
    puts(fd.each_line.map { |line| line.split.size }.reduce(0, :+))
  when "find"
    if optional_regexp
      fd.each_line { |line| puts(line) if line.match(optional_regexp) }
    else
      fail("mode find requires a REGEXP argument")
    end
  else
    fail("Unknown mode: #{mode}")
  end
end
- Thanks for the tips about idiomatic Ruby code. And thanks for the example. I knew there was a "Ruby way" of doing it... short, compact, pragmatic, to the point, yet readable. – Sebastian N., Feb 13, 2014 at 8:48
- Upvoted. Great answer. One small suggestion for an improvement: put all the argument checking and fail statements at the top. Then the program reads: 1. data validation, 2. actual content. It has the added benefit of getting rid of all the "if... else" statements. – Jonah, Feb 25, 2014 at 7:29
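The suggestion in the last comment (validate first, then do the work) could look roughly like the sketch below. It is only an illustration, not tokland's code: the names USAGE, parse_args, run and main are hypothetical, and the sketch assumes that raising ArgumentError is an acceptable way to report bad arguments.

```ruby
# Hypothetical usage text; a real script would interpolate $0 here.
USAGE = "Usage: PROGRAM PATH (lines|words|find REGEXP)"

# Hypothetical helper: all argument checks live here, up front.
# Returns [path, mode, pattern] or raises ArgumentError.
def parse_args(argv)
  path, mode, pattern = argv
  raise ArgumentError, USAGE if path.nil? || mode.nil?
  raise ArgumentError, "Unknown mode: #{mode}" unless %w[lines words find].include?(mode)
  raise ArgumentError, "mode find requires a REGEXP argument" if mode == "find" && pattern.nil?
  [path, mode, pattern]
end

# Hypothetical helper: the actual work, free of validation branches.
# io is anything that responds to each_line (File, StringIO, String).
def run(io, mode, pattern = nil)
  case mode
  when "lines" then io.each_line.count
  when "words" then io.each_line.sum { |line| line.split.size }
  when "find"  then io.each_line.grep(Regexp.new(pattern))
  end
end

# Hypothetical entry point: validation first, then content.
def main(argv)
  path, mode, pattern = parse_args(argv)
  File.open(path) { |fd| puts run(fd, mode, pattern) }
rescue ArgumentError => e
  abort(e.message)
end
```

A script would end with a call to main(ARGV). Because run only needs each_line, it can be exercised with a plain string instead of a file.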
Formatting
Most Rubyists favor some white space between methods, such as:
class LineCounter

  # Initialize instance variables
  def initialize
    @line_count = 0
  end

  def process(line)
    @line_count += 1
  end

  def print_result
    puts "#{@line_count} lines"
  end

end
{...} vs do...end
For multi-line blocks, prefer do...end:
File.readlines(arguments.path).each do |line|
  arguments.processor.process(line)
end
Comments
Comments, when used, should say something the code doesn't already say. This comment, and some of the others, can be eliminated without injuring the reader's ability to understand the code:
# Initialize instance variables
def initialize
  @line_count = 0
end
Argument parsing
You are correct that argument parsing in this script has the potential to be improved. There are a few different ideas that could help here.
Separate class
I usually like to put argument parsing in its own class:
class Arguments

  attr_reader :path
  attr_reader :processor

  def initialize(argv)
    @path = argv[0]
    if argv.length == 2
      if argv[1] == "lines"
        @processor = LineCounter.new
      elsif argv[1] == "words"
        @processor = WordCounter.new
      end
    elsif argv.length == 3 && argv[1] == "find"
      word_to_find = argv[2]
      @processor = WordMatcher.new(word_to_find)
    end
    if not @processor
      print_usage
      exit 1
    end
  end

  private

  def print_usage
    puts "Usage: #{$0} <file> words|lines"
    puts "       #{$0} <file> find <what-to-find>"
  end

end
The main program becomes:
if __FILE__ == $PROGRAM_NAME
  arguments = Arguments.new(ARGV)
  File.readlines(arguments.path).each { |line|
    arguments.processor.process(line)
  }
  arguments.processor.print_result
end
I had more I was going to write, but after seeing the simplicity of @tokland's answer, I think the approaches I was going to take are not so good.
- Thanks for the tips. Interesting approach with your Arguments class... Have you considered using a special library for command-line argument validation? – Sebastian N., Feb 13, 2014 at 8:49
- @Sebastian Yes, I did. optparse, of course, only takes care of switch (--foo) arguments, so it would be no help. I have often looked for libraries which do good handling of non-switch arguments; I am not aware of one that just parses arguments. The ones I've seen have strong opinions on parts of your program that are not argument parsing. – Wayne Conrad, Feb 13, 2014 at 13:30
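To illustrate the point about optparse, here is a small sketch of what it does and does not do for a script like this one. The --ignore-case switch and the split_args helper are hypothetical and not part of the original task; they exist only to show that optparse consumes switches and hands back the positional arguments unvalidated.

```ruby
require "optparse"

# Hypothetical helper: separates switches from positional arguments.
# The -i/--ignore-case switch is invented purely for this illustration.
def split_args(argv)
  options = { ignore_case: false }
  parser = OptionParser.new do |opts|
    opts.on("-i", "--ignore-case", "Match case-insensitively") do
      options[:ignore_case] = true
    end
  end
  # permute parses a copy of argv and returns the non-switch arguments,
  # wherever the switches appeared on the command line.
  positional = parser.permute(argv)
  [options, positional]
end
```

The returned positional list (file, mode and optional pattern) still has to be checked by hand, which is the part that takes up the space in the original script.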
As you have not indicated whether you are looking for a quick and dirty--possibly one-off--solution, or production code, and have said nothing of file size, I decided to suggest something you could employ for the former purpose, when the file is not humongous (because I read it all into a string):
fname, op, regex = ARGV
s = File.read(fname)
case op
when 'rows'
  puts s[-1] == $/ ? s.count($/) : s.count($/) + 1
when 'words'
  puts s.split.size
when 'find'
  regex = /#{regex}/
  s.each_line {|l| puts l if l =~ regex}
end
where $/ is the end-of-line character(s). Let's create a file for demonstration purposes:
text =<<_
Now is the time
for all good
Rubiests to
spend some
time coding.
_
File.write('f1', text)
If the above code is in the file 'file_op.rb', we get these results:
ruby 'file_op.rb' 'f1' 'rows' #=> 5
ruby 'file_op.rb' 'f1' 'words' #=> 13
ruby 'file_op.rb' 'f1' 'find' 'time'
#=> Now is the time
# time coding.
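The 'rows' expression above counts separators and adds one when the last line is not terminated. A minimal sketch of that logic in isolation follows; count_rows is a hypothetical name, and the sketch assumes a single-character separator such as the default "\n" (String#count treats its argument as a set of characters, so a two-character separator like "\r\n" would be miscounted).

```ruby
# Hypothetical helper mirroring the 'rows' branch: count the separators and
# add one if the text does not end with a separator (unterminated last line).
def count_rows(s, sep = "\n")
  s.end_with?(sep) ? s.count(sep) : s.count(sep) + 1
end

# Alternative under the same assumption: let each_line do the splitting.
def count_rows_each_line(s)
  s.each_line.count
end
```

Both helpers agree on non-empty input, with or without a trailing newline. They differ on an empty string: count_rows reports 1, as the original expression does, while each_line reports 0.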
- Thanks for the super-compact solution. It is a good example and serves me well; however, I would like to show a "usage" text in case of missing / incorrect arguments. But please don't change your example! I like that it's so short. – Sebastian N., Feb 13, 2014 at 8:47
- I think you can remove the + [nil]. Unlike Python, you can destructure even if sizes do not match. – tokland, Feb 13, 2014 at 10:04
- Sebastian, I figured you could add whatever data checks you wanted. @tokland, thank you, good to know; I've edited my answer. I'd also like to thank Ruby. – Cary Swoveland, Feb 13, 2014 at 17:48