I want to apply the functions from my function library (a Map named chain) to an input string str. Each of those functions (Twitter.removeRT, ...) is a regex-based replacement that rewrites substrings of str. I think it is better to keep these regex functions in a Map, as in my example.
So, the code:
Tail recursion variant
def filterTwitter2(str: String): String = {
  @scala.annotation.tailrec
  def recFilter(str: String, chain: Map[String, (String) => String]): String = {
    chain.headOption match {
      case Some(v) =>
        val filteredString = v._2(str)
        recFilter(filteredString, chain.tail)
      case None => str
    }
  }
  val chain = Map[String, (String) => String](
    "f1" -> Twitter.removeRT,
    "f2" -> Twitter.removeNickName,
    "f3" -> Twitter.removeURL,
    "f6" -> Emoticons.removePunctRepetitions,
    "f7" -> Emoticons.removeHorizontalEmoticons,
    "f9" -> Emoticons.normalizeEmoticons,
    "f10" -> Beautify.removeCharRepetitions,
    "f12" -> Beautify.removeNSpaces
  )
  recFilter(str, chain)
}
foreach variant
def filterTwitter(str: String): String = {
  var tmp = str
  val chain = Map[String, (String) => String](
    "f1" -> Twitter.removeRT,
    "f2" -> Twitter.removeNickName,
    "f3" -> Twitter.removeURL,
    "f6" -> Emoticons.removePunctRepetitions,
    "f7" -> Emoticons.removeHorizontalEmoticons,
    "f9" -> Emoticons.normalizeEmoticons,
    "f10" -> Beautify.removeCharRepetitions,
    "f12" -> Beautify.removeNSpaces
  )
  chain.foreach {
    case (name, func) => tmp = func(tmp)
  }
  tmp
}
So, the questions:
- Is it normal to store functions in a Map? What would be a better fit?
- Which is better: the tail-recursive variant or the foreach variant?
- Is there a better solution to this problem?
1 Answer
You never use your Map to look up a filter function by its key. If you don't need any of the keys, then you don't really need a Map.
def filterTwitter(str :String) :String =
  List(Twitter.removeRT
      ,Twitter.removeNickName
      ,Twitter.removeURL
      ,Emoticons.removePunctRepetitions
      ,Emoticons.removeHorizontalEmoticons
      ,Emoticons.normalizeEmoticons
      ,Beautify.removeCharRepetitions
      ,Beautify.removeNSpaces
      ).foldRight(str)(_(_))
EXPLANATION
The 1st underscore is an element from the List that is being folded. Because this is a foldRight, it will start with the last element (removeNSpaces) and work toward the head (removeRT).
The 2nd underscore is the result from the previous invocation and it is being passed as an argument to the filter function. (Actually it's a little more complicated than that, but this is an easy way to think about it.)
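To make the two placeholders concrete, here is a minimal sketch that writes the same fold once with underscores and once with named parameters. The filters trimEnds and shout are hypothetical stand-ins, not the real Twitter, Emoticons, or Beautify functions.

```scala
object PlaceholderDemo {
  // Hypothetical stand-in filters, used only for illustration.
  val trimEnds: String => String = _.trim
  val shout: String => String = _.toUpperCase

  val filters: List[String => String] = List(trimEnds, shout)

  // Placeholder form: the 1st _ is the list element (a function),
  // the 2nd _ is the string accumulated so far.
  def short(str: String): String = filters.foldRight(str)(_(_))

  // The same fold with both parameters named.
  def named(str: String): String = filters.foldRight(str)((f, acc) => f(acc))
}
```

Both methods return the same result for any input; PlaceholderDemo.short("  hi ") and PlaceholderDemo.named("  hi ") each give "HI".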
So this is what's going down:
removeNSpaces(str) ===> resStr1
removeCharRepetitions(resStr1) ===> resStr2
normalizeEmoticons(resStr2) ===> resStr3
. . .
removeRT(prevResStr) ===> finalResStr
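Note that this order is the reverse of the f1, f2, ... numbering in the question, and that a default immutable Map does not guarantee iteration in insertion order, so the question's variants do not pin the order down either. If the filters should run in the order they are written, a foldLeft or the standard library's Function.chain would do that. A minimal sketch, using hypothetical stand-in filters (addA, addB, addC) chosen so that the order shows up in the result:

```scala
object OrderDemo {
  // Hypothetical stand-in filters; each appends one letter so order is visible.
  val addA: String => String = _ + "a"
  val addB: String => String = _ + "b"
  val addC: String => String = _ + "c"

  val filters: List[String => String] = List(addA, addB, addC)

  // foldRight: the last element runs first (addC, then addB, then addA).
  def rightToLeft(str: String): String = filters.foldRight(str)(_(_))

  // foldLeft: the head runs first (addA, then addB, then addC).
  // Here the accumulator is the first parameter, so the shorthand _(_) does not apply.
  def leftToRight(str: String): String = filters.foldLeft(str)((acc, f) => f(acc))

  // Function.chain composes a sequence of T => T functions, applying them left to right.
  def chained(str: String): String = Function.chain(filters)(str)
}
```

With the input "x", rightToLeft gives "xcba", while leftToRight and chained both give "xabc". Whether the order matters for the real filters depends on what their regexes do, which is not shown here.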
- Yes, you are right about using a List here. I will also look at foldRight(str)(_(_)) to understand how it works. (Gudsaf, Mar 1, 2020 at 4:44)
- It's a syntactic abbreviation of: foldRight(str){case (f, s) => f(s)} (jwvh, Mar 1, 2020 at 6:33)
- Can you explain what the 1st underscore in the brackets is, and what the 2nd underscore in the inner brackets is? As I understand it, the 1st is an element of the list, which in my case is a function (like Twitter.removeRT), but I can't understand what that function is applied to in place of the 2nd underscore. (Gudsaf, Mar 1, 2020 at 21:18)
- @Gudsaf: See my update. (jwvh, Mar 2, 2020 at 7:30)
- foreach is very easy to understand, so use it. But it would be clearer if you defined var tmp one line before chain.foreach.