I need to parse only valid timestamp formats:
- Hours followed by minutes followed by seconds:
t=1h2m3s
- Minutes followed by seconds:
t=2m3s
- Only hours:
t=3h
- Only minutes:
t=4m
- Only seconds:
t=5s
and reject other timestamp formats such as t=2m1h, t=3s2m, or t=3s1h2m.
To get the hours, minutes, and seconds, I use the regex below. Please let me know of any improvements that would make it simpler.
function calculateInSeconds(timeStamp) {
    var timeInSeconds = 0;
    if (timeStamp.match(/t=[0-9]*h?[0-9]*m?[0-9]*s?/g).toString() == timeStamp) {
        timeStamp.replace(/([0-9]+)[h|m|s]/g, function(match, value) {
            if (match.indexOf("h") > -1) {
                timeInSeconds += value * 60 * 60;
            } else if (match.indexOf("m") > -1) {
                timeInSeconds += value * 60;
            } else if (match.indexOf("s") > -1) {
                timeInSeconds += value * 1;
            }
        });
    }
    console.log("timeInSeconds" + timeInSeconds);
}
calculateInSeconds("t=3m59s1h");//invalid
calculateInSeconds("t=1m59s");//valid
calculateInSeconds("t=1h");//valid
4 Answers
First, you can tidy up your regexes a bit by using character classes: replace [0-9] with \d.
On the other hand, you may improve performance, and shorten the code, by:
- defining an object with the unit values:
var units = {h: 3600, m: 60, s: 1};
- modifying the 2nd regexp to directly provide the current unit:
timeStamp.replace(/(\d+)(\w)/g, function(match, value, unit) {
(note that \w is enough, since your previous test eliminated any wrong value)
- then using a single computation in the function:
timeInSeconds += value * units[unit];
So the entire snippet becomes:
function calculateInSeconds(timeStamp) {
    var timeInSeconds = 0,
        units = {
            h: 3600,
            m: 60,
            s: 1,
        };
    if (timeStamp.match(/t=\d*h?\d*m?\d*s?/g).toString() == timeStamp) {
        timeStamp.replace(/(\d+)(\w)/g, function(match, value, unit) {
            timeInSeconds += value * units[unit];
        });
    }
    console.log(timeStamp + ' -> ' + timeInSeconds + ' seconds');
}
calculateInSeconds("t=3m59s1h"); //invalid
calculateInSeconds("t=1m59s"); //valid
calculateInSeconds("t=1h"); //valid
BTW I like the idea of your first test to isolate valid timeStamps.
- This was my first thought; I went down a different path to eliminate two separate Regex parses on the same string, but I think your conciseness wins. (Gallant, Jul 22, 2015 at 17:21)
Your regex matches things that make no sense, like t=9999 or t=00m9. Try this:
/t=(?=.)(?:\d+h)?(?:\d+m)?(?:\d+s)?$/
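As a rough check of the pattern above, the sketch below runs it against a few of the inputs mentioned in this thread. The helper name looksValid is made up for illustration, and the pattern is copied unchanged, so it is still not anchored at the start of the string.

```javascript
// Pattern copied unchanged from the answer above.
var pattern = /t=(?=.)(?:\d+h)?(?:\d+m)?(?:\d+s)?$/;

// Hypothetical helper, named here only for illustration.
function looksValid(timeStamp) {
    return pattern.test(timeStamp);
}

console.log(looksValid("t=1h2m3s"));  // true
console.log(looksValid("t=9999"));    // false: digits without a unit
console.log(looksValid("t=00m9"));    // false: trailing digit without a unit
console.log(looksValid("t=3m59s1h")); // false: units out of order
console.log(looksValid("t="));        // false: the lookahead needs one more character
console.log(looksValid("xt=1h"));     // true: nothing anchors the start of the string
```

If the string is supposed to contain only the timecode, putting ^ in front of t= would close that last gap.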
Your first regex allows things like "t=hms" with no digits at all, or "t=123" with no letters. The next regex, however, expects at least one digit followed by a unit letter.
Your regex isn't anchored either, so it'll match stuff anywhere in the string. I assume the string is supposed to be just the t=... timecode in order to be valid.
There's also the question of whether you should allow second/minute values greater than 59. Your current code allows "t=61s" just fine, though it'd be more natural to say "1m1s". But saying "61s" shouldn't cause trouble in the calculation, so let's keep it.
Structure-wise, I'd get by with one match call. But more importantly, I'd make the function return the number of seconds - not log it. The function's purpose is to parse the timecode, nothing else. If you want to log it, you can do that elsewhere; it's not the function's purpose.
function timecodeInSeconds(timecode) {
    // Declare match locally; the original assignment created an implicit global.
    var seconds = 0,
        match = timecode.match(/^t=(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?$/);
    if (match) {
        seconds += match[1] ? match[1] * 3600 : 0;
        seconds += match[2] ? match[2] * 60 : 0;
        seconds += match[3] ? match[3] * 1 : 0;
        return seconds;
    }
    return null;
}
The only issue with the above is that it accepts "t=". That can be handled with a simple if, though:
if(!match[1] && !match[2] && !match[3]) return null;
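Putting the two pieces together, one possible arrangement is sketched below. Where the guard goes and the early return on a failed match are assumptions of this sketch, not something spelled out above.

```javascript
function timecodeInSeconds(timecode) {
    // Keep match local so it does not leak into the global scope.
    var match = timecode.match(/^t=(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?$/);

    // No match at all: wrong order, stray characters, or a missing "t=".
    if (!match) return null;

    // Matched, but every group is empty: the input was just "t=".
    if (!match[1] && !match[2] && !match[3]) return null;

    var seconds = 0;
    seconds += match[1] ? match[1] * 3600 : 0;
    seconds += match[2] ? match[2] * 60 : 0;
    seconds += match[3] ? match[3] * 1 : 0;
    return seconds;
}

console.log(timecodeInSeconds("t=1h2m3s"));  // 3723
console.log(timecodeInSeconds("t=61s"));     // 61
console.log(timecodeInSeconds("t=3m59s1h")); // null
console.log(timecodeInSeconds("t="));        // null
```

Returning null for invalid input lets the caller tell "invalid" apart from a legitimate zero such as "t=0s".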
- You can simplify those ternaries with a logical OR fallback (e.g., seconds += match[1] * 3600 || 0;). (Gallant, Jul 22, 2015 at 18:42)
Your Regex accepts the formats t=1h2s and t=1h2m. I'm not sure if this is intentional or not.
Your Regex also accepts t=1hms, t=h1ms, and t=hm1s. Technically, t=hms passes your matching Regex as well, though it will output 0 like an invalid match. These cases can be eliminated by changing your 0-or-more * symbols to 1-or-more + symbols. Since this is Code Review, I'll retain your original functionality in my answer.
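To illustrate the difference between the two quantifiers, here is a small sketch. Both patterns are anchored variants written only for this comparison (the anchors and the non-capturing groups are assumptions, not the original regex).

```javascript
// Same structure, differing only in the quantifier after \d.
var zeroOrMore = /^t=(?:\d*h)?(?:\d*m)?(?:\d*s)?$/;
var oneOrMore = /^t=(?:\d+h)?(?:\d+m)?(?:\d+s)?$/;

var samples = ["t=1hms", "t=h1ms", "t=hm1s", "t=hms", "t=1h2m3s"];

samples.forEach(function (sample) {
    console.log(sample, zeroOrMore.test(sample), oneOrMore.test(sample));
});
// Only "t=1h2m3s" passes both; the other four pass zeroOrMore alone.
```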
Finally, you're using pipes within your character set [h|m|s] when you shouldn't be: as characters within a character set are treated as literals, this includes | as a valid matching character (i.e., \d*[h|m|s] would successfully match 65|). It should just be [hms].
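A quick way to see this is to test both character sets against a string that ends in a pipe. The variable names and the added anchors are illustrative only.

```javascript
// Inside [...], the pipe is a literal character, not alternation.
var withPipes = /^\d+[h|m|s]$/;
var withoutPipes = /^\d+[hms]$/;

console.log(withPipes.test("65|"));    // true: "|" is part of the set
console.log(withoutPipes.test("65|")); // false
console.log(withoutPipes.test("65m")); // true
```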
timeStamp.match(...) == timeStamp seems to be used to make sure the expression matches the whole string, but there are better ways to accomplish this. In Regex, you can specify ^ and $ as the start and end of the string respectively.
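As a sketch of that idea, the check below anchors the question's pattern instead of comparing the match result with the input. The function name matchesWholeString is hypothetical.

```javascript
// The question's pattern, with ^ and $ added and the g flag dropped.
var anchored = /^t=[0-9]*h?[0-9]*m?[0-9]*s?$/;

// Hypothetical helper: true only when the pattern spans the entire string.
function matchesWholeString(timeStamp) {
    return anchored.test(timeStamp);
}

console.log(matchesWholeString("t=1m59s"));   // true
console.log(matchesWholeString("t=3m59s1h")); // false: "1h" is left over
console.log(matchesWholeString("hello"));     // false, and no exception
```

Unlike timeStamp.match(...).toString(), which throws when match returns null, test simply returns false for a string that does not match.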
You can combine the match and replace into a single expression, since the replace won't do anything if it fails to match. Then you can use Regex groups to capture the hours, minutes, and seconds.
Finally, you can replace [0-9] in your regular expression with the digit character class (\d), which represents the same thing. This is only really useful if you're interested in shortening the pattern; I don't think it improves readability.
Putting everything together, we get the following Regex pattern:
/^t=(?:(\d*)h)?(?:(\d*)m)?(?:(\d*)s)?$/
Breaking it down:
/^t=           ## The start of the string, followed by a literal "t="
  (?:          ## Non-capturing group
    (\d*)h     ## Capturing hours in a group
  )?           ## Specifying the group as optional
  (?:(\d*)m)?  ## Same for minutes
  (?:(\d*)s)?  ## Same for seconds
$/             ## The end of the string
Resulting in the following JavaScript code:
function calculateInSeconds(timeStamp) {
    var timeInSeconds = 0;
    timeStamp.replace(/^t=(?:(\d*)h)?(?:(\d*)m)?(?:(\d*)s)?$/, function(match, hours, minutes, seconds) {
        var hoursInSeconds, minutesInSeconds;
        hoursInSeconds = hours * 60 * 60 || 0;
        minutesInSeconds = minutes * 60 || 0;
        seconds = seconds * 1 || 0;
        timeInSeconds = hoursInSeconds + minutesInSeconds + seconds;
    });
    console.log("timeInSeconds" + timeInSeconds);
}