2
\$\begingroup\$

This is a simple Ruby program to analyze the number of lines characters speak in Shakespeare's Macbeth using Nokogiri and open-uri to parse a given url containing xml. I wrote this as practice getting used to Ruby and TDD.

I'm new to Ruby so I'm wondering if anything looks strange to more experienced Rubyists.

Note: I had trouble comparing hashes in the spec "correctly analyzes character line counts". Even after calling to_s on both, it looks equal but the test fails. I had trouble with that. I'm wondering if this is even a good test to make with large hashes?

Also: I was wondering if I should be testing the calls using open-uri open? I didn't want to download from the webpage in the tests. I'd appreciate it if someone might demonstrate how to do this. I tried using a Mock with something like expect(Kernel).to receive(:open).with(MACBETH_URL) but had trouble with it.

Given:

Write a command-line program that prints the number of lines spoken by each character in the play.

Sample usage/output (using made-up numbers):

$ ruby macbeth_analyzer.rb
 543 Macbeth
 345 Banquo
 220 Duncan
 (etc.)

You can find an XML-encoded version of Macbeth here: http://www.ibiblio.org/xml/examples/shakespeare/macbeth.xml. Your program should download and parse this file at runtime.

Your solution must be tested, preferably via TDD. Running your tests should not download the play from the ibiblio.org server.

Note: some lines are attributed to a speaker called "ALL". Your program should ignore these.

macbeth_analyzer.rb

#!/usr/bin/env ruby
require_relative 'parser'
require_relative 'printer'
class MacbethAnalyzer
 attr_reader :speaker_line_counts
 MACBETH_URL = "http://www.ibiblio.org/xml/examples/shakespeare/macbeth.xml"
 IGNORE_SPEAKERS = %w(ALL)
 def initialize
 @parser = Parser.new(MACBETH_URL)
 @speaker_line_counts = analyze_speaker_line_counts
 end
 def analyze_speaker_line_counts
 speaker_line_counts = Hash.new(0)
 speakers = @parser.parse(query: '//SPEAKER')
 speakers.map do |speaker|
 unless IGNORE_SPEAKERS.include? speaker.text
 speaker_line_counts[speaker.text] += @parser.parse(
 node: speaker, query: '../LINE').count
 end
 end
 speaker_line_counts
 end
end
if __FILE__ == 0ドル
 analyzer = MacbethAnalyzer.new
 Printer.print_hash_val_capitalized_key(
 hash: analyzer.speaker_line_counts, filter: 'ALL')
end

macbeth_analyzer_spec.rb

require 'macbeth_analyzer'
describe MacbethAnalyzer do
 describe "#analyze_speaker_line_counts" do
 it "analyzes speaker line counts and stores in hash" do
 SPEAKER_HASH_COUNT = 40
 analyzer = MacbethAnalyzer.new
 result = analyzer.analyze_speaker_line_counts
 expect(result.count).to eq(SPEAKER_HASH_COUNT)
 expect(result["MACBETH"]).to eq(719)
 expect(result["BANQUO"]).to eq(113)
 expect(result["DUNCAN"]).to eq(70)
 end
 it "should ignore speaker title 'ALL'" do
 analyzer = MacbethAnalyzer.new
 result = analyzer.analyze_speaker_line_counts
 expect(result["ALL"]).to eq(0)
 end
 xit "correctly analyzes character line counts" do
 macbeth_line_counts = {
 "First Witch": 62,
 "Second Witch": 27,
 "Third Witch": 27,
 "DUNCAN": 70,
 "MALCOLM": 212,
 "Sergeant": 35,
 "LENNOX": 74,
 "ROSS": 135,
 "MACBETH": 719,
 "BANQUO": 113,
 "ANGUS": 21,
 "LADY MACBETH": 265,
 "Messenger": 23,
 "FLEANCE": 2,
 "Porter": 46,
 "MACDUFF": 180,
 "DONALBAIN": 10,
 "Old Man": 11,
 "ATTENDANT": 1,
 "First Murderer": 30,
 "Second Murderer": 15,
 "Both Murderers": 2,
 "Servant": 5,
 "Third Murderer": 8,
 "Lords": 3,
 "HECATE": 39,
 "Lord": 21,
 "First Apparition": 2,
 "Second Apparition": 4,
 "Third Apparition": 5,
 "LADY MACDUFF": 41,
 "Son": 20,
 "Doctor": 45,
 "Gentlewoman": 23,
 "MENTEITH": 12,
 "CAITHNESS": 11,
 "SEYTON": 5,
 "SIWARD": 30,
 "Soldiers": 1,
 "YOUNG SIWARD": 7
 }
 macbeth = MacbethAnalyzer.new
 result = macbeth.analyze_speaker_line_counts
 # TODO: how to make this pass? tried to_s both
 expect(result).to eq(macbeth_line_counts)
 end
 end
end

parser.rb

require 'open-uri'
require 'nokogiri'
class Parser
 def initialize(url)
 @doc = Nokogiri::XML(open(url))
 end
 def parse(node: @doc, query:)
 node.xpath(query)
 end
end

parser_spec.rb

require 'parser'
describe Parser do
 describe "#parse" do
 it "parses speaker nodes" do
 SPEAKER_COUNT = 650
 parser = Parser.new('spec/fixtures/macbeth.xml')
 result = parser.parse(query: '//SPEAKER')
 expect(result.count).to eq(SPEAKER_COUNT)
 end
 end
end

printer.rb

class Printer
 def self.print_hash_val_capitalized_key(hash:, filter: nil)
 hash.each do |key, value|
 puts "#{value} #{key.capitalize}" unless key.to_s === filter
 end
 end
end

printer_spec.rb

require 'printer'
describe Printer do
 describe "#print_hash_val_capitalized_key" do
 it "prints a hash of name keys capitalized and count values" do
 test_hash = {
 "LENNOX": 74,
 "ROSS": 135,
 "MACBETH": 719,
 }
 expect{Printer.print_hash_val_capitalized_key(hash: test_hash)}.to output(
 "74 Lennox\n135 Ross\n719 Macbeth\n").to_stdout
 end
 it "excludes a given filter from output" do
 test_hash = {
 "LENNOX": 74,
 "ALL": 24,
 "ROSS": 135,
 "MACBETH": 719,
 }
 expect{Printer.print_hash_val_capitalized_key(hash: test_hash, filter: "ALL")}.to output(
 "74 Lennox\n135 Ross\n719 Macbeth\n").to_stdout
 end
 end
end
asked Oct 25, 2016 at 6:13
\$\endgroup\$
0

1 Answer 1

1
\$\begingroup\$

Naming

Choose Names at the Appropriate Level of Abstraction

I noticed an issue that reminded me of something I read in Clean Code, which says:

"Don’t pick names that communicate implementation; choose names the reflect the level of abstraction of the class or function you are working in." (Martin, 311)

Martin gives an example of a Modem interface that contains:

boolean dial(String phoneNumber);

He points out that in some applications, modems aren't connected by dailing but rather hardwiring or sending a port number, etc. The point is that using phoneNumber is the wrong level of abstraction. He suggests changing to:

boolean connect(String connectionLocator);

This better communicates the ways a consumer could use the interface.

The relation to my application is in Parser:

class Parser
 def initialize(url)
 @doc = Nokogiri::XML(open(url))
 end
 ...

open-uri's open() takes a uri (who would've thought?). A uri is a level of abstraction above a url. In the tests, I initialize a Parser object with a local xml fixture uri (not a url):

parser = Parser.new('spec/fixtures/macbeth.xml')

Therefore, Parser should become:

class Parser
 def initialize(uri)
 @doc = Nokogiri::XML(open(uri))
 end
 ...

So this may seem rather trivial in the context of my small application but it does reflect a valuable lesson from Clean Code in choosing names at the appropriate level of abstraction.

Make Meaningful Distinctions

When I started working on this problem, early on I decided what the responsibilities would be. I reasoned that we'd need a concrete runner class MacbethAnalyzer that's responsible for handling the specifics of the problem statement, a Parser, who's responsible for parsing, and a Printer to handle print.

Having completed a working implementation, I notice that Parser is only responsible for parsing Xml, which has a clear dependency in the constructor:

class Parser
 def initialize(uri)
 @doc = Nokogiri::XML(open(uri))
 end

The given problem is small, but it's easy to imagine a larger project where we parse from several sources of varying formats beyond XML. It seems clear that class Parser isn't specific enough and should be more meaningfully distinguished by renaming to XmlParser.


Dependency Inversion Principle

The constructor of MacbethAnalzyer has tight coupling in the dependency to the low-level implementation details of XmlParser:

class MacbethAnalyzer
 @parser = XmlParser.new(MACBETH_URL)
 @speaker_line_counts = analyze_speaker_line_counts
 end

What if XmlParser changes? What if we wanted to use a different kind of Parser? What if we wanted to construct a Fake object for testing and pass that to MacbethAnalyzer? These are issues related to the Dependency Inversion Principle.

The solution here is to use "dependency injection" to inject Parser into MacbethAnalyzer:

class MacbethAnalyzer
 attr_reader :speaker_line_counts
 MACBETH_URL = "http://www.ibiblio.org/xml/examples/shakespeare/macbeth.xml"
 IGNORE_SPEAKERS = %w(ALL)
 def initialize(parser)
 @parser = parser
 @speaker_line_counts = analyze_speaker_line_counts
 end
 def analyze_speaker_line_counts
 speaker_line_counts = Hash.new(0)
 speakers = parser.parse(query: '//SPEAKER')
 speakers.map do |speaker|
 unless IGNORE_SPEAKERS.include? speaker.text
 speaker_line_counts[speaker.text] += parser.parse(
 node: speaker, query: '../LINE').count
 end
 end
 speaker_line_counts
 end
 private
 attr_reader :parser
end
if __FILE__ == 0ドル
 analyzer = MacbethAnalyzer.new(XmlParser.new(MACBETH_URL))
 Printer.print_hash_val_capitalized_key(
 hash: analyzer.speaker_line_counts, filter: 'ALL')
end

Now MacbethAnalyzer doesn't depend on a lower-level implementation; we use duck typing to make things flexible, removing the dependency and resulting in much more loosely coupled code.

answered Oct 28, 2016 at 6:51
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.