16
\$\begingroup\$

I'd like to hear some thoughts on my simple anagram finder, written in Scala. I mainly wrote this as a first go at writing some serious Scala, and I wanted to try out the Scalaz library.

I'm coming from a Java background, with minimal experience with functional languages. Things I'd ideally like feedback on:

  • Any improvements to make it less imperative
  • Any other uses of library code, like Scalaz

I know that I've not covered all test cases, but this was enough to get me up and running.

General algorithm:

  • Parse a list of words. Turn each word into lower case, sort alphabetically and treat that as the 'signature' of the word.
  • Insert signature and word into a map, with the key being the signature, and the value being a set (or list) of words with the same signature
  • Retrieval is simply looking up a word based on its signature

Signature class

case class Stem(word: String) {
 val stem: List[Char] = word.toLowerCase.trim.toList.sortWith((e1, e2) => (e1 < e2))
 override def equals(p1: Any) = p1 match {
 case s: Stem => s.stem == stem
 case _ => false
 }
 override def hashCode = stem.hashCode
 override def toString = stem.mkString("Stem(", "", ")")
}

Notes:

I'm sure I'd be able to take the word as a parameter and manipulate it to be in alphabetical order, rather than having a stem field. I'm sure I'd have been able to omit the equals and hashCode methods then, but I couldn't work out how to do that. I think maybe because I'm using a case class here?

Anagram Finder class

import scalaz._
import Scalaz._
class AnagramFinder(words: List[String]) {
 
 implicit def string2Stem(s: String): Stem = Stem(s)
 
 val mapping = createMapping(words)
 
 def createMapping(words: List[String]) = {
 var anagrams: Map[Stem, Set[String]] = Map()
 
 words.foreach(s => anagrams = anagrams |+| Map(Stem(s) -> Set(s)))
 
 anagrams
 }
 
 def find(word: String): Option[Set[String]] = mapping.get(word)
}

Notes:

I really wanted to just have anagrams |+| Map(Stem(s) -> Set(s)) in my foreach, but that required (a) using a mutable map instead, which I'd be fine with, but (b) having to write my own semigroup implementation for the mutable Map, which I struggled with.

Also, anagrams |+| Map(Stem(s) -> Set(s)) isn't picking up my implicit String to Stem method. Why?

Anagrams object

object Anagrams {
 
 def apply(wordListFile: String, word: String) = {
 val finder = new AnagramFinder(scala.io.Source.fromFile(wordListFile).mkString.split('\n').toList)
 finder.find(word)
 }
}

Notes: I know I should initialise the word list into a val field on the object which can be reused between calls, but this was really just to get me started.

Appendix: Tests

class StemTest extends org.scalatest.FunSuite {
 
 test("Stem of noel is e,l,n,o") {
 assert(Stem("noel").stem == List('e','l','n','o'))
 }
 
 test("Stem of Hello is H,e,l,l,o") {
 assert(Stem("Hello").stem == List('H', 'e', 'l', 'l', 'o'))
 }
 
 test("Stems of abc and cba are equal") {
 assert(Stem("abc") == Stem("cba"))
 }
}
class AnagramFinderTest extends FunSuite {
 
 test("List of words should create appropriate map") {
 val testMap:Map[Stem, Set[String]] = Map(Stem("abc") -> Set("cba"), Stem("def") -> Set("def"), Stem("ggz") -> Set("zgg"))
 val givenList = List("cba", "def", "zgg")
 
 val mapping = new AnagramFinder(givenList).mapping
 assert(mapping == testMap)
 }
 
 test("Same stemmed words should both exist in the map") {
 val testMap:Map[Stem, Set[String]] = Map(Stem("abc") -> Set("abc", "cba"))
 val givenList = List("abc", "cba")
 
 val mapping = new AnagramFinder(givenList).mapping
 assert(mapping == testMap)
 }
 
 test("Finding words") {
 val givenList = List("abc", "cba", "bob", "bca")
 
 val anagramFinder = new AnagramFinder(givenList)
 
 val foundAnagrams = anagramFinder.find("abc").get
 
 val expected = Set("abc", "bca", "cba")
 
 assert(foundAnagrams.forall(s => expected contains s))
 assert(expected.forall(s => foundAnagrams contains s))
 }
 
 test("Full test") {
 
 val expected = Some(Set("act", "cat", "Cat"))
 val actual = Anagrams("/usr/share/dict/words", "cat")
 
 assert(expected == actual)
 }
}
asked Sep 8, 2012 at 17:10
\$\endgroup\$

1 Answer 1

14
\$\begingroup\$

Some suggestions to improve your code:

xs.sortWith((e1, e2) => (e1 < e2)) is identical to xs.sorted

If you are not interested in Stem.word you can place it into companions' apply-method:

object Stem {
 def apply(word: String): Stem = {
 val stem: List[Char] = word.toLowerCase.trim.toList.sorted
 new Stem(stem)
 }
}
case class Stem private(stem: List[Char])

source.mkString.split('\n') is identical to source.getLines


Whenever you want to sum-up something or use an accumulator, fold is what you are searching for:

(Map.empty[Stem, Set[String]] /: words) {
 case (map, word) => map + (Stem(word) -> Set(word))
}

With help of scalaz:

words.map(word => Map(Stem(word) -> Set(word))) reduceLeft { _|+|_ }

reduce is a fold: xs.tail.fold(xs.head)(f)

/: is a synonym for foldLeft: (init /: xs)(f) == (xs foldLeft init)(f)


Your implicit conversion String => Stem doesn't work in string -> set because -> already needs an implicit conversion and you are not allowed to apply multiple ones at once. When you use tuple notation (string, set), your implicit conversion is applied.

Donald.McLean
4,76732 silver badges51 bronze badges
answered Sep 9, 2012 at 11:02
\$\endgroup\$
0

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.