I am writing a Python script which queries the database for a URL string. Below is my snippet.
db.execute('select sitevideobaseurl,videositestring '
'from site, video '
'where siteID =1 and site.SiteID=video.VideoSiteID limit 1')
result = db.fetchall()
filename = '/home/Site_info'
output = open(filename, "w")
for row in result:
    videosite = row[0:2]
    link = videosite[0].format(videosite[1])
    full_link = link.replace("http://","https://")
    print full_link
    output.write("%s\n"%str(full_link))
output.close()
The query basically returns a URL. It takes the base URL from one table and the video site string from another table.
output: https://www.youtube.com/watch?v=uqcSJR_7fOc
SiteID is the primary key; it is an int and its values are not sequential.
I want to loop this SQL query so that it picks a new SiteID on every execution, giving me a unique site URL each time, and write all the results to a file.
desired output: https://www.youtube.com/watch?v=uqcSJR_7fOc
https://www.dailymotion.com/video/hdfchsldf0f
There are about 1178 records.
Thanks for your time and help in advance.
1 Answer
I'm not sure if I completely understand what you're trying to do. I think your goal is to get a list of all links to videos. You get a link to a video by joining the sitevideobaseurl from site and videositestring from video.
In my experience it's much easier to let the database do the heavy lifting; it's built for that. It should be more efficient to join the tables, return all the results, and then loop through them instead of making a separate query to the database for each row.
The code should look something like this: (Be careful, I didn't test this)
query = """
select s.sitevideobaseurl,
v.videositestring
from video as v
join site as s
on s.siteID = v.VideoSiteID
"""
db.execute(query)
result = db.fetchall()
filename = '/home/Site_info'
output = open(filename, "w")
for row in result:
    link = "%s%s" % (row[0],row[1])
    full_link = link.replace("http://","https://")
    print full_link
    output.write("%s\n" % str(full_link))
output.close()
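If you want to try the join before pointing it at the real database, here is a minimal sketch that runs against an in-memory SQLite database. The sample rows, the helper names build_links and write_links, the temporary output path, and the assumption that sitevideobaseurl is a plain prefix (so simple concatenation works) are all mine, not taken from the question. It is written for Python 3 and uses a with block so the file is closed even if something fails.

```python
import os
import sqlite3
import tempfile


def build_links(conn):
    """Return one https link per video row, using a single joined query."""
    query = """
        select s.sitevideobaseurl, v.videositestring
        from video as v
        join site as s on s.SiteID = v.VideoSiteID
        order by s.SiteID, v.videositestring
    """
    links = []
    for base_url, video_string in conn.execute(query):
        # Assumption: the base URL is a prefix, so concatenation is enough.
        link = "%s%s" % (base_url, video_string)
        links.append(link.replace("http://", "https://", 1))
    return links


def write_links(links, path):
    """Write one link per line; the with block closes the file for us."""
    with open(path, "w") as output:
        for link in links:
            output.write("%s\n" % link)


# Hypothetical sample data standing in for the real site and video tables.
conn = sqlite3.connect(":memory:")
conn.execute("create table site (SiteID integer primary key, sitevideobaseurl text)")
conn.execute("create table video (VideoSiteID integer, videositestring text)")
conn.executemany(
    "insert into site values (?, ?)",
    [(3, "http://www.youtube.com/watch?v="),
     (17, "http://www.dailymotion.com/video/")],
)
conn.executemany(
    "insert into video values (?, ?)",
    [(3, "uqcSJR_7fOc"), (17, "hdfchsldf0f")],
)

links = build_links(conn)

# A temporary file stands in for /home/Site_info here.
path = os.path.join(tempfile.mkdtemp(), "Site_info")
write_links(links, path)

for link in links:
    print(link)
```

With the real database you would keep your existing connection and only swap in the query and the loop.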
If you have other reasons for wanting to fetch these one by one, an idea might be to fetch all SiteIDs and store them in a list. Afterwards you loop over that list and insert each ID into the query via a parameterized query.
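A rough sketch of that one-by-one variant, again untested against your schema: it uses an in-memory SQLite database with made-up rows, and the function name links_one_site_at_a_time is hypothetical. Note that the ? placeholder is what sqlite3 expects; other drivers use a different placeholder (MySQLdb, for example, uses %s), so check the one you are using. Written for Python 3.

```python
import sqlite3


def links_one_site_at_a_time(conn):
    """Fetch all SiteIDs first, then run one parameterized query per ID."""
    cur = conn.cursor()
    cur.execute("select SiteID from site order by SiteID")
    site_ids = [row[0] for row in cur.fetchall()]

    links = []
    for site_id in site_ids:
        # The ID is passed as a parameter, never formatted into the SQL text.
        cur.execute(
            "select s.sitevideobaseurl, v.videositestring "
            "from site as s "
            "join video as v on s.SiteID = v.VideoSiteID "
            "where s.SiteID = ? "
            "order by v.videositestring "
            "limit 1",
            (site_id,),
        )
        row = cur.fetchone()
        if row is None:
            continue  # this site has no videos, so there is no link to build
        # Assumption: the base URL is a prefix, so concatenation is enough.
        link = "%s%s" % (row[0], row[1])
        links.append(link.replace("http://", "https://", 1))
    return links


# Hypothetical sample data; the IDs are deliberately not sequential.
conn = sqlite3.connect(":memory:")
conn.execute("create table site (SiteID integer primary key, sitevideobaseurl text)")
conn.execute("create table video (VideoSiteID integer, videositestring text)")
conn.executemany(
    "insert into site values (?, ?)",
    [(3, "http://www.youtube.com/watch?v="),
     (17, "http://www.dailymotion.com/video/"),
     (42, "http://example.com/v/")],  # no videos for this site
)
conn.executemany(
    "insert into video values (?, ?)",
    [(3, "uqcSJR_7fOc"), (17, "hdfchsldf0f")],
)

one_by_one = links_one_site_at_a_time(conn)
for link in one_by_one:
    print(link)
```

This issues one query per SiteID, so with about 1178 records it makes far more round trips than the single join; it is mainly useful if you need to do something per site between queries.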
select distinct...?