Question
I wrote a function which will take a collection of strings — and return a comma-separated string which doesn't exceed the specified character limit.
It essentially works by joining the collection of strings over and over again, and removing the last item in the collection each time, until the comma-separated string doesn't exceed the specified character limit.
private static string Join(IEnumerable<string> collection, int characterLimit)
{
var values = new List<string>(collection);
var joined = string.Join(", ", values);
while (joined.Length > characterLimit)
{
values.RemoveAt(values.Count - 1);
joined = string.Join(", ", values);
}
return joined;
}
I feel like there must more efficient way of doing this ... and that's what I'd love some help with.
Example
private string[] words =
{
"able", "acid", "angry", "automatic", "awake", "bad", "beautiful", "bent", "bitter",
"black", "blue", "boiling", "bright", "broken", "brown", "certain", "cheap", "chemical", "chief", "clean",
"clear", "cold", "common", "complete", "complex", "conscious", "cruel", "cut", "dark", "dead", "dear",
"deep", "delicate", "dependent", "different", "dirty", "dry", "early", "elastic", "electric", "equal",
"false", "fat", "feeble", "female", "fertile", "first", "fixed", "flat", "foolish", "free", "frequent",
"full", "future", "general", "good", "gray", "great", "green", "hanging", "happy", "hard", "healthy", "high",
"hollow", "ill", "important", "kind", "last", "late", "left", "like", "living", "long", "loose", "loud",
"low", "male", "married", "material", "medical", "military", "mixed", "narrow", "natural", "necessary",
"new", "normal", "old", "open", "opposite", "parallel", "past", "physical", "political", "poor", "possible",
"present", "private", "probable", "public", "quick", "quiet", "ready", "red", "regular", "responsible",
"right", "rough", "round", "sad", "safe", "same", "second", "secret", "separate", "serious", "sharp",
"short", "shut", "simple", "slow", "small", "smooth", "soft", "solid", "special", "sticky", "stiff",
"straight", "strange", "strong", "sudden", "sweet", "tall", "thick", "thin", "tight", "tired", "true",
"violent", "waiting", "warm", "wet", "white", "wide", "wise", "wrong", "yellow", "young"
};
private void joinButton_Click(object sender, EventArgs e)
{
var characterLimit = 500; // the joined string's length will be <= 500
var joined = Join(words, characterLimit); // the joined string
Console.WriteLine("joined: " + joined);
Console.WriteLine("joined.Length: " + joined.Length);
}
Output
joined: able, acid, angry, automatic, awake, bad, beautiful, bent, bitter, black, blue, boiling, bright, broken, brown, certain, cheap, chemical, chief, clean, clear, cold, common, complete, complex, conscious, cruel, cut, dark, dead, dear, deep, delicate, dependent, different, dirty, dry, early, elastic, electric, equal, false, fat, feeble, female, fertile, first, fixed, flat, foolish, free, frequent, full, future, general, good, gray, great, green, hanging, happy, hard, healthy, high, hollow, ill
joined.Length: 495
-
\$\begingroup\$ Yes that is inefficient \$\endgroup\$paparazzo– paparazzo2017年05月19日 15:44:44 +00:00Commented May 19, 2017 at 15:44
-
1\$\begingroup\$ @Paparazzi That is precisely why I came to Code Review. :P \$\endgroup\$Owen– Owen2017年05月19日 16:59:33 +00:00Commented May 19, 2017 at 16:59
4 Answers 4
Like @Heslacher already suggested you should start by first getting only as many words as you need and you can express the same algorithm with two different LINQ expressions by making it even easier to understand.
Additionaly you should name the parameters appropriately. The upper limit is exclusive so add this to its name.
LINQ query syntax
The first alternative uses the query syntax:
private static IEnumerable<string> JoinableValues(this IEnumerable<string> values, int separatorLength, int maxLengthExclusive)
{
var currentLength = 0;
return
from x in values
where (currentLength += x.Length + separatorLength) < maxLengthExclusive
select x;
}
The other method then just consumes the result.
public static string Join(this IEnumerable<string> values, string separator, int maxLengthExclusive)
{
return string.Join(separator, values.JoinableValues(separator.Length, maxLengthExclusive));
}
Usually you would make only the Join
public
becasue the other one is just its helper.
There is still place for improvment and with C# 7 local functions you can encapsulate even the last part of it:
private static IEnumerable<string> JoinableValues(this IEnumerable<string> values, int separatorLength, int maxLengthExclusive)
{
var currentLength = 0;
return
from x in values
where IsJoinable(x)
select x;
bool IsJoinable(string value) => (currentLength += value.Length + separatorLength) < maxLengthExclusive;
}
LINQ with extensions
The second alternative can be implemented with TakeWhile
which agian, uses the same algorithm.
var maxLengthExclusive = 100;
var currentLength = 0;
var separator = ", ";
var joinableWords = words.TakeWhile(w => (currentLength += w.Length + separator.Length) < maxLengthExclusive);
The same as above encapsulated and with the new helper fuction:
private static IEnumerable<string> JoinableValues(this IEnumerable<string> values, int separatorLength, int maxLengthExclusive)
{
var currentLength = 0;
return values.TakeWhile(IsJoinable);
bool IsJoinable(string value) => (currentLength += value.Length + separatorLength) < maxLengthExclusive;
}
-
\$\begingroup\$ I thought extension methods had to be public. Is private a C# 7.0 feature? \$\endgroup\$Rick Davin– Rick Davin2017年05月19日 21:24:31 +00:00Commented May 19, 2017 at 21:24
-
\$\begingroup\$ @RickDavin they have to be static and the class has to be static too but they can have any access modifier you need them to have. It's not C# 7 ;-) \$\endgroup\$t3chb0t– t3chb0t2017年05月19日 21:25:48 +00:00Commented May 19, 2017 at 21:25
Although @TopinFrassi's solution is death plain and simple, I would separate the fetching of the values which fit in the limit and the actual joining with the separator like so
private static IEnumerable<string> FetchInRange(IEnumerable<string> collection, int characterLimit, int separationLength)
{
int currentlength = -separationLength;
foreach (var value in collection)
{
int length = value.Length + separationLength;
currentlength += length;
if (currentlength > characterLimit)
{
yield break;
}
yield return value;
}
}
private static string Join(IEnumerable<string> collection, int characterLimit, string separator = ", ")
{
return string.Join(separator, FetchInRange(collection, characterLimit, separator.Length));
}
Each method has only one responsibility and they can be used for a wider range of separators as well.
-
\$\begingroup\$ I really like this answer but you are low key on some improvements it has over original. For one, yours is more flexible since 'separator' is now an argument instead of hardcoded at ", ". Plus your method eliminates 2 variables: the 'values' list of string, as well 'joined'. The OP seems to be a newbie (no offense intended) so bringing up these points helps teach him. \$\endgroup\$Rick Davin– Rick Davin2017年05月19日 14:15:25 +00:00Commented May 19, 2017 at 14:15
-
\$\begingroup\$ mover currentlength += length; and use it in the if \$\endgroup\$paparazzo– paparazzo2017年05月19日 15:46:26 +00:00Commented May 19, 2017 at 15:46
-
\$\begingroup\$ @Paparazzi good idea. fixed. \$\endgroup\$Heslacher– Heslacher2017年05月19日 15:50:59 +00:00Commented May 19, 2017 at 15:50
I feel like you're doing a lot of work for a quite simple problem.
Using String.Join
is kind of an expensive operation and you call it a lot. Plus, you don't really need to remove values from your "original" list.
Let's try and "translate" your problem into simple lines of code.
I wrote a function which will take a collection of strings — and return a comma-separated string which doesn't exceed the specified character limit.
Let's try to write code that fits exactly that sentence, and see how it goes.
private static string Join(IEnumerable<string> collection, int characterLimit)
{
//Takes the collection of strings
var joined = string.Join(", ", collection.ToArray());
//Returns a comma-separated string
joined = joined.Substring(0,characterLimit);
//Trim the remaining characters
joined = joined.Substring(0, joined.LastIndexOf(','));
return joined;
}
This way, there's no loop and we join the strings only once.
The problem with your code is that it's, as you said, not the most efficient way to do what you want. When you write code, always try to be conscious of the "cost" of the methods you use, and the amount of loops you use. Though I guess it comes with experience!
Lastly, I don't think Join
is a good method name, considering the method doesn't only join strings. It's kind of hard for me to come up with a better name considering I don't have the context of your application. But you should try to find something more representative!
-
3\$\begingroup\$ You missed one edge case. Assume that excatly at the
characterLimit
a word ends, then your solution will cut the last word. By changing the firstSubtring()
tojoined = joined.Substring(0, characterLimit + 1);
and adding anif
before the secondSubstring()
likeif (joined[characterLimit] == ',') { return joined.TrimEnd(','); }
will fix that issue. \$\endgroup\$Heslacher– Heslacher2017年05月19日 05:39:55 +00:00Commented May 19, 2017 at 5:39 -
4\$\begingroup\$ There is no need to call
ToArray()
on the passed collection.String.Join()
has an overload which takes anIEnumerable<string>
as well. \$\endgroup\$Heslacher– Heslacher2017年05月19日 06:00:10 +00:00Commented May 19, 2017 at 6:00 -
\$\begingroup\$ Unfortunatelly it will crash if the character limit is less then the length of the first word. \$\endgroup\$t3chb0t– t3chb0t2017年05月19日 20:45:41 +00:00Commented May 19, 2017 at 20:45
Not much to say about the current code. It is clean enough but very inefficient. Under the covers Join uses StringBuilder.
private static string Join(IEnumerable<string> collection, int characterLimit)
{
StringBuilder sb = new StringBuilder(characterLimit);
foreach(string s in collection)
{
if (sb.Length + 2 + s.Length > characterLimit)
break;
if (sb.Length > 0)
sb.Append(", ");
sb.Append(s);
}
return sb.ToString();
//var values = new List<string>(collection);
//var joined = string.Join(", ", values);
//while (joined.Length > characterLimit)
//{
// values.RemoveAt(values.Count - 1);
// joined = string.Join(", ", values);
//}
//return joined;
}