\$\begingroup\$

I've been working on a tracklist generator in Python that scrapes an Amazon Music page, given its URL, for albums with a single artist. This is what I came up with; I'm really new to requests and beautifulsoup4. Is there a way to improve it and make it more efficient?

import requests
from bs4 import BeautifulSoup
Amazon=str(input("Please enter an Amazon music url:"))
r=requests.get(Amazon)
soup = BeautifulSoup(r.text,'html.parser')
name=soup.find_all('a', attrs={'class':'a-link-normal a-color-base TitleLink a-text-bold'}) #find out the names of the track title
time=soup.find_all('td',attrs={'class':'a-text-right a-align-center'}) #find the duration of the track
artist= soup.find('a', attrs={'id':'ProductInfoArtistLink'}) #find the creator of the track, which for now can only take one
for i in range(1,len(name),2):
 print(str(int(i/2+1))+'. '+name[int(i)].text+' - '+ artist.text + ' (' + time[int((i-1)/2)].text[12:16] + ')') 
#first int produces a placeholder number for the track e.g 1., 2.
#second int produces track name, which len shows twice of number of tracks
#artist text gives artist name
#time gives time and puts it in brackets
301_Moved_Permanently
asked Nov 8, 2018 at 15:51
\$\endgroup\$
  • \$\begingroup\$ Keep to the 80-character line limit, and put comments on their own line above the line they describe. I'm not listing this as an answer because these two things are literally the only things I can point out. Check out PEP 8. \$\endgroup\$ Commented Nov 8, 2018 at 22:08
  • \$\begingroup\$ Why do you need to do these shenanigans with twice the length and halving the integer? Is each track included twice in the results? \$\endgroup\$ Commented Nov 9, 2018 at 10:43
  • \$\begingroup\$ Yeah, the stuff is doubled. \$\endgroup\$ Commented Nov 10, 2018 at 12:28

1 Answer

\$\begingroup\$

Whenever you find yourself writing long (or sometimes even short) comments explaining a single line or a block of lines, you should ask yourself whether that code would be better placed in a function. Functions can be given a meaningful name, and you can add a docstring to them (which can be considerably longer than a comment practically can). It also gives you one obvious place to change if, for example, the Amazon Music website changes at some point.
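
As a small illustration of that point, a commented formatting line can be turned into a named function with a docstring. This is only a sketch: `format_track` is a hypothetical helper made up for this example, and the sample values are placeholder data.

```python
def format_track(number, title, artist, duration):
    """Return one tracklist line in the form 'No. Title - Artist (mm:ss)'.

    A docstring like this can hold the explanation that would otherwise
    be squeezed into a trailing comment.
    """
    return f"{number}. {title} - {artist} ({duration})"


# Sample values only; nothing is fetched from the network here.
print(format_track(1, "Planet Telex", "Radiohead", "4:21"))
```

The name and docstring carry the explanation, so the call site needs no comment.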

IMO, the function names in the code below are already self-explanatory enough, so I did not add any docstrings.

from itertools import count

import requests
from bs4 import BeautifulSoup


def get_soup(url):
    r = requests.get(url)
    # fail early on HTTP errors instead of parsing an error page
    r.raise_for_status()
    return BeautifulSoup(r.text, 'lxml')


def track_titles(soup):
    attrs = {'class': 'a-link-normal a-color-base TitleLink a-text-bold'}
    # every title link shows up twice, so keep only every second match
    return [a.text for a in soup.find_all('a', attrs=attrs)[::2]]


def track_durations(soup):
    attrs = {'class': 'a-text-right a-align-center'}
    return [td.text.strip() for td in soup.find_all('td', attrs=attrs)]


def track_artist(soup):
    return soup.find('a', attrs={'id': 'ProductInfoArtistLink'}).text


if __name__ == "__main__":
    url = input("Please enter an Amazon music url:")
    soup = get_soup(url)
    titles = track_titles(soup)
    durations = track_durations(soup)
    artist = track_artist(soup)
    for i, title, duration in zip(count(1), titles, durations):
        print(f"{i}. {title} - {artist} ({duration})")

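Other things I changed:

  • Called r.raise_for_status() so that a failed request raises an exception instead of being parsed.
  • Used the lxml parser instead of html.parser.
  • Dropped the str() around input(), which already returns a string.
  • Replaced the index arithmetic with a [::2] slice for the doubled titles and zip with itertools.count for the numbering.
  • Used str.strip() on the durations instead of the hard-coded [12:16] slice.
  • Built the output line with an f-string.
  • Put the calling code under an if __name__ == "__main__": guard.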
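
Two of the changes, the `[::2]` slice and the `zip(count(1), ...)` numbering, can be tried without hitting the network. The sketch below uses plain lists as a stand-in for what `find_all` returns; the track names and durations are invented for illustration, and it assumes that both copies of each doubled title carry the same text.

```python
from itertools import count

# Made-up stand-in for the scraped links: every title appears twice.
raw_titles = ["Track A", "Track A", "Track B", "Track B", "Track C", "Track C"]
durations = ["3:05", "4:10", "2:58"]

# [::2] keeps positions 0, 2, 4, ... and so drops the second copy of each title.
titles = raw_titles[::2]

# count(1) yields 1, 2, 3, ... and zip stops at the shortest input.
lines = [f"{i}. {title} ({duration})"
         for i, title, duration in zip(count(1), titles, durations)]

# enumerate(..., start=1) is an equivalent way to number the pairs.
same_lines = [f"{i}. {title} ({duration})"
              for i, (title, duration) in enumerate(zip(titles, durations), start=1)]

print("\n".join(lines))
```

Note that `[::2]` keeps the even positions, while the loop in the question read the odd ones (`range(1, len(name), 2)`). The two agree only if both copies of each link have identical text, so that is worth checking against the real page.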

answered Nov 9, 2018 at 10:49
\$\endgroup\$
  • \$\begingroup\$ I wanted to make something that parses Amazon Music and generates a tracklist for MusicBrainz, to make it easier to add tracklists to their database. The format goes like this: \$\endgroup\$ Commented Nov 10, 2018 at 12:26
  • \$\begingroup\$ No. Title - Artist (mm:ss) (as in 1. Planet Telex - Radiohead (4:21)) \$\endgroup\$ Commented Nov 10, 2018 at 12:27
  • \$\begingroup\$ @DurianJaykin I have never used Amazon Music. Is it possible to post an example link or would I need an account for it to work properly? \$\endgroup\$ Commented Nov 10, 2018 at 13:18
  • \$\begingroup\$ amazon.com/End-Time-EP-Jim-Yosef/dp/B01DPY73E8 \$\endgroup\$ Commented Nov 11, 2018 at 14:35
  • \$\begingroup\$ There is no need for an account; just view the page source and you should find the HTML needed. \$\endgroup\$ Commented Nov 11, 2018 at 14:35
