I want to pass time interval as a string (to program args) and then I want to parse it to TimeSpan
. I created simple class:
public static class TimeSpanConverter
{
public static TimeSpan Convert(string input)
{
var units = new Dictionary<string, int>()
{
{@"(\d+)(ms|mili[|s]|milisecon[|s])", 1 },
{@"(\d+)(s|sec|second[|s])", 1000 },
{@"(\d+)(m|min[|s])", 60000 },
{@"(\d+)(h|hour[|s])", 3600000 },
{@"(\d+)(d|day[|s])", 86400000 },
{@"(\d+)(w|week[|s])", 604800000 },
};
var timespan = new TimeSpan();
foreach(var x in units)
{
var matches = Regex.Matches(input, x.Key);
foreach(Match match in matches)
{
var amount = System.Convert.ToInt32(match.Groups[1].Value);
timespan = timespan.Add(TimeSpan.FromMilliseconds(x.Value * amount));
}
}
return timespan;
}
}
Which is working, and I can easly parse like this:
var interval = TimeSpanConverter.Convert("1day13hour2min12s52ms");
What do you think about that? How can I improve that? I have a feeling, that it is not well coded.
3 Answers 3
public static class TimeSpanConverter { public static TimeSpan Convert(string input)
To people who use WPF a lot, the name hints at this class being a TypeConverter
or an IValueConverter
. I would say that it's really a parser. Similarly, I think that the method should be called Parse
.
var units = new Dictionary<string, int>() { {@"(\d+)(ms|mili[|s]|milisecon[|s])", 1 }, {@"(\d+)(s|sec|second[|s])", 1000 }, {@"(\d+)(m|min[|s])", 60000 }, {@"(\d+)(h|hour[|s])", 3600000 }, {@"(\d+)(d|day[|s])", 86400000 }, {@"(\d+)(w|week[|s])", 604800000 }, };
The prefix milli has two ls. millisecond ends in a d.
[|s]
is more idiomatically written as s?
.
Given the way the regexes are used (searching for matches without anchoring), most of them could be simplified. E.g. any match for min
would also match m
, so @"(\d+)m"
will find exactly the same multiples of a minute.
The numbers on the right are magic numbers, and at a glance I can't be completely certain that they're correct. I would prefer to use TimeSpan
instances and take advantage of TimeSpan.FromMilliseconds(1)
, TimeSpan.FromSeconds(1)
, etc.
I think that a real world example of a human-readable string would be "1 day, 13 hours, 2 minutes, 12 seconds, and 52 ms"
. At the very least, I would add \s*
between the digits and the units.
Another quite realistic example would be "2 minutes and 13.5 seconds"
.
It's also plausible that people will use :
as a separator: "2 days, 11:04:20"
Then you get ugly stuff like months and years, which don't have fixed lengths.
If you want to refuse to parse some of these, that's fine. But in that case, I think the method still needs to handle them by throwing an exception. At present they would all return TimeSpan.Zero
, which is misleading.
Have you given any thought to localisation? If you only want to use this in an English-language context, wouldn't [0-9]
make more sense than \d
? If you're matching all digits, you should have test cases for things like "1day١hour1minute1second"
.
var timespan = new TimeSpan();
Can you think of a more descriptive name for this variable?
var matches = Regex.Matches(input, x.Key); foreach(Match match in matches)
If the string has more than one match for a given regex, isn't that rather worrying? I wouldn't consider "4hours2minutes1hour"
to be a very realistic input.
var amount = System.Convert.ToInt32(match.Groups[1].Value); timespan = timespan.Add(TimeSpan.FromMilliseconds(x.Value * amount));
Again, amount
isn't a very descriptive name.
TimeSpan
supports the operator +
, and IMO total += addend;
is more readable than total = total.Add(addend);
.
-
\$\begingroup\$ I would add \s* between the digits and the units. - In such cases I usually normalize a string first by replacing all multiple-whitespaces by a single one to avoid using the
*
. \$\endgroup\$t3chb0t– t3chb0t2019年04月10日 08:02:57 +00:00Commented Apr 10, 2019 at 8:02
Adding to Peter Taylors answer:
@"(\d+)(ms|mili[|s]|milisecon[|s])"
@"(\d+)(m|min[|s])"
"5ms"
will be caught by both patterns above.
var amount = System.Convert.ToInt32(match.Groups[1].Value);
For weeks > 3
this will make an overflow when multiplied with 604800000
.
You should use double.Parse()
, because TimeSpan.FromMilliseconds()
takes a double as argument.
I don't have experience in programming in C#, but I do have some regex advice to share.
You can DRY your pattern by
(\d+)(?:ms|mili(?:secon)?s?)
(\d+)(?:s(?:ec)?|seconds?)
(\d+)(?:m|mins?)
(\d+)(?:h|hours?)
(\d+)(?:d|days?)
(\d+)(?:w|weeks?)
I am encouraging:
- non-capturing groups when there is no use for the matched substrings.
- I am using the zero or more quantifier (
?
) to reduce repetition and alternation (pipes). - the omission of pipes in your character classes because they are useless in your expected input.
s?
to make pluralization optional.
If you need to ensure that m
is not followed by s
when executing the minutes pattern, you will need to extend the pattern. Perhap m(?!s)|mins?
.
If you are in control of the incoming strings, please correct the spelling of milliseconds