Python process tracklist, get cumulative timestamp of each track

Question 1

The code below parses a tracklist (sample input below) and generates a six-member tuple for each track, including the cumulative timestamp in mm:ss (sample output at the bottom).

I couldn't easily see how to rewrite it from a for-loop to a generator get_track_with_cumulative_timestamp(itt).
I think we don't need a custom object, i.e. avoid this approach.
The generator will need to internally store state (tmm, tss), the end time of each song. So we yield each result with the start time of each song; the time arithmetic comes after that.
Are there better idioms for the (mm,ss) modulo-60 counters? decimal.Decimal seems overkill. I suppose we could always pull a trick internally using one float to represent both (mm,ss).

Current not-so-great procedural code:

tracklist = """
1. Waiting For A Miracle 5:02
2. Bedroom Eyes 5:01
3. In Over My Head 4:31
4. Higher Ground / Written-By – S. Wonder* 3:38
5. Hot Blood 4:15
6. Running Away 4:28
7. I've Had Enough 3:47
8. Blacklisted / Guitar [Other] – Jonny Fean* 4:11
9. Last Thing At Night 2:49"""
tracklist = [t for t in tracklist.split('\n') if t]
import re
pat = re.compile(r'(?P<no>[0-9]+)\. (?P<name>.*) (?P<mm>[0-9]+):(?P<ss>[0-9]+)')
(tmm,tss) = (0,0)
result = []
for t in tracklist:
    m = pat.match(t)
    #print(m.groups())
    lmm, lss = int(m['mm']), int(m['ss'])
    result.append((int(m['no']), tmm, tss, lmm, lss, m['name']))
    #yield (int(m['no']), tmm, tss, lmm, lss, m['name']))
    tss += lss
    tmm += lmm
    if tss >= 60:
        tmm += 1
        tss -= 60
# Output looks like this:
for _ in result: print('{} | {:02d}:{:02d} | {}:{:02d} | {}'.format(*_))

1 | 00:00 | 5:02 | Waiting For A Miracle
2 | 05:02 | 5:01 | Bedroom Eyes
3 | 10:03 | 4:31 | In Over My Head
4 | 14:34 | 3:38 | Higher Ground / Written-By – S. Wonder*
5 | 18:12 | 4:15 | Hot Blood
6 | 22:27 | 4:28 | Running Away
7 | 26:55 | 3:47 | I've Had Enough
8 | 30:42 | 4:11 | Blacklisted / Guitar [Other] – Jonny Fean*
9 | 34:53 | 2:49 | Last Thing At Night

Question 2

Please do not edit your question with modified code. Once you've gathered sufficient feedback from this question, CRSE policy is that you open a new question.

Question 3

@Reinderien: I didn't gather 'sufficient feedback'; your answer did not actually address how to find a better idiom for the (mm,ss) modulo-60 counter for this use-case. We're not talking about generalized timedeltas of anything from milliseconds to years.

Question 4

I'm not suggesting that my answer was conclusive... I'm suggesting that you leave this open until you're satisfied with my (or someone else's) feedback. Patience.

Question 5

I've rolled this back per codereview.meta.stackexchange.com/questions/1763/…

Question 6

@Reinderien: ok I wasn't aware of the no-code-revisions policy. And yes you're right about datetime.timedelta

Question 7

Implicit tuples

(tmm,tss) = (0,0)

This shouldn't need any parens.

Generator

couldn't easily see how to rewrite it from a for-loop to a generator

It actually is quite easy. Make your code into a function, delete result, and replace result.append with yield.

Time spans

are there better idioms for the (mm,ss) modulo-60 counters?

Yes! Use datetime.timedelta, which does the carry arithmetic for you.

Custom objects

I think we don't need a custom object

Named tuples take one line to declare, and suck less than unstructured data. So do that, at least.

Underscores

I just noticed that you're looping with an underscore as your loop variable. By convention this means "I'm not going to use this value"... but then you used it anyway. Give this variable a meaningful name.

Example

import re
from collections import namedtuple
from datetime import timedelta
TRACK_PAT = re.compile(r'(?P<no>[0-9]+)\. (?P<name>.*) (?P<mm>[0-9]+):(?P<ss>[0-9]+)$', re.M)
Track = namedtuple('Track', ('number', 'time', 'length', 'name'))
def parse_track(body):
    t_total = timedelta()
    for match in TRACK_PAT.finditer(body):
        length = timedelta(minutes=int(match['mm']), seconds=int(match['ss']))
        yield Track(int(match['no']), t_total, length, match['name'])
        t_total += length
for track in parse_track(
"""
1. Waiting For A Miracle 5:02
2. Bedroom Eyes 5:01
3. In Over My Head 4:31
4. Higher Ground / Written-By – S. Wonder* 3:38
5. Hot Blood 4:15
6. Running Away 4:28
7. I've Had Enough 3:47
8. Blacklisted / Guitar [Other] – Jonny Fean* 4:11
9. Last Thing At Night 2:49"""
):
    print(f'{track.number} | {track.time} | {track.length} | {track.name}')
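
One caveat with the example above: str() on a timedelta prints H:MM:SS (for example 0:05:02), not the mm:ss layout of the original output. Below is a minimal sketch of a formatting helper. The name format_mmss is hypothetical, and it assumes non-negative spans whose minutes may run past 59.

```python
from datetime import timedelta


def format_mmss(span: timedelta) -> str:
    """Format a non-negative timedelta as mm:ss (minutes are not wrapped at 60)."""
    minutes, seconds = divmod(int(span.total_seconds()), 60)
    return f'{minutes:02d}:{seconds:02d}'


# The default string form includes an hours field.
print(str(timedelta(minutes=5, seconds=2)))            # 0:05:02
print(format_mmss(timedelta(minutes=34, seconds=53)))  # 34:53
```

Calling format_mmss(track.time) inside the f-string would then reproduce the two-digit mm:ss column.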

Question 8

1) Yes I know parens are not needed around implicit tuples, I wanted to be explicit that it's an mm:ss counter 2) Done 3) Time spans with datetime.timedelta seem to be overkill for this use-case like I said: can you show equivalent code that is <=8 lines long? I doubt it.

Question 9

Given that your code is currently about 33 lines long, 8 lines seems like an unrealistic request. Regardless, I'll post example code.

Question 10

You're missing my point that datetime.timedelta is not good for this particular use-case and will be longer, clunkier and less readable than the current code.

Question 11

You asked for a better idiom. Doing one's own time math is a foul code smell. If you don't like it, don't listen to me ¯_(ツ)_/¯

Question 12

Ok thanks you were right and I was wrong, that is much more compact and elegant. Mind you the datetime.timedelta doc doesn't showcase that. As to the need for an object/ namedtuple Track, really we just want some sort of object or tuple which can take a custom __str__() method.
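
On the custom __str__() point: a typing.NamedTuple subclass can carry methods, so one possible sketch is the following. The field names mirror the Track tuple from the example; format_mmss is a hypothetical helper, and padding both time columns to two digits is an assumption rather than the exact layout of the original output.

```python
from datetime import timedelta
from typing import NamedTuple


def format_mmss(span: timedelta) -> str:
    """Format a non-negative timedelta as mm:ss."""
    minutes, seconds = divmod(int(span.total_seconds()), 60)
    return f'{minutes:02d}:{seconds:02d}'


class Track(NamedTuple):
    number: int
    time: timedelta    # cumulative start time of the track
    length: timedelta  # duration of the track
    name: str

    def __str__(self) -> str:
        # Overrides the default tuple-style string used by print().
        return (f'{self.number} | {format_mmss(self.time)} | '
                f'{format_mmss(self.length)} | {self.name}')


track = Track(2, timedelta(minutes=5, seconds=2),
              timedelta(minutes=5, seconds=1), 'Bedroom Eyes')
print(track)  # 2 | 05:02 | 05:01 | Bedroom Eyes
```

The generator can keep yielding Track(...) as before, and the caller's loop body shrinks to print(track).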

Question 13

This part is not that compact in terms of code lines, which you seek to decrease:

tss += lss
tmm += lmm
if tss >= 60:
    tmm += 1
    tss -= 60

One solution is to keep track of t (total seconds) instead of tss and tmm.

t = 0  # running total in seconds
result = []
for line in tracklist:  # loop variable renamed so it no longer shadows t
    m = pat.match(line)
    lmm, lss = int(m['mm']), int(m['ss'])
    result.append((int(m['no']), t // 60, t % 60, lmm, lss, m['name']))
    t += 60 * lmm + lss
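
As a small variation on the total-seconds idea, the built-in divmod returns the quotient and the remainder in one call. This is a sketch only: start_times is a hypothetical name, and the input is assumed to be already-parsed (minutes, seconds) pairs.

```python
def start_times(lengths):
    """Yield the mm:ss start time of each track, given (minutes, seconds) lengths."""
    total = 0  # running total in seconds
    for minutes, seconds in lengths:
        mm, ss = divmod(total, 60)  # same result as (total // 60, total % 60)
        yield f'{mm:02d}:{ss:02d}'
        total += 60 * minutes + seconds


# First three tracks of the sample tracklist.
print(list(start_times([(5, 2), (5, 1), (4, 31)])))  # ['00:00', '05:02', '10:03']
```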

Question 14

Also a good approach (to the arithmetic). Yes I get that it can equally be done in a generator.

Question 15

It can easily be transformed into a generator if you want. My point was only about that modulo logic :)

Question 16

Here is my updated v0.02 written as a generator. (It can be combined with either Reinderien's datetime.timedelta approach or dfhwze's modulo code.)

import re
#tracklist = """... as defined above ..."""
tracklist = iter(t for t in tracklist.split('\n') if t)
pat = re.compile(r'(?P<no>[0-9]+)\. (?P<name>.*) (?P<mm>[0-9]+):(?P<ss>[0-9]+)')
def gen_cumulative_timestamp(itracklist):
    tmm, tss = 0, 0
    for t in itracklist:
        m = pat.match(t)
        lmm, lss = int(m['mm']), int(m['ss'])
        yield (int(m['no']), tmm, tss, lmm, lss, m['name'])
        tss += lss
        tmm += lmm
        if tss >= 60:
            tmm += 1
            tss -= 60
for t in gen_cumulative_timestamp(tracklist):
    # Ideally should have a custom object/NamedTuple which has a custom __str__() method
    print('{} | {:02d}:{:02d} | {}:{:02d} | {}'.format(*t))
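
If the track lengths are already available as plain seconds, the running total can also come from itertools.accumulate instead of a hand-written counter. This is a different technique from the generator above, shown only as a sketch: cumulative_starts is a hypothetical name, and the initial= keyword assumes Python 3.8 or later.

```python
from itertools import accumulate


def cumulative_starts(lengths_in_seconds):
    """Return the start offset in seconds of each track."""
    # accumulate(..., initial=0) yields 0 followed by every running total;
    # the final total is the end of the last track, so it is dropped.
    return list(accumulate(lengths_in_seconds, initial=0))[:-1]


# 5:02, 5:01, 4:31 and 3:38 expressed in seconds.
print(cumulative_starts([302, 301, 271, 218]))  # [0, 302, 603, 874]
```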

Accepted Answer: Reinderien, 2019-09-03 03:16:46Z (the review beginning with "Implicit tuples")

