I have a list like this:
[(ip1, video1, 12345.00000),(ip1, video1, 12346.12362),(ip1, video1, 12347.12684),(ip1, video2,12367.12567),(ip2, video1, 14899.93736), (ip2,video1, 24566.12345).....]
It records the video id and the time when the video was requested by each user.
Now I want to go through the list and calculate the time interval between the first and the last request for each video. My list is already sorted by IP address.
The result I want to get is like:
ip1, video1, 2.12684
ip1, video2, 0
0 means the request was never repeated.
Can anyone help?
The following is the code I use to create the dictionary:
# Build client_dict[addr][hash] -> list of hits for that client and video
for line in fd_in.readlines():
    (time, addr, iptp, userag, usertp, hash, vlanid) = line.split()
    if addr not in client_dict:
        client_dict[addr] = {}
    hash_dict = client_dict[addr]
    if hash not in hash_dict:
        hash_dict[hash] = []
    hash_dict[hash].append((float(time), addr, iptp, userag, usertp, hash, vlanid))

for addr, hash_dict in client_dict.items():
    for hash, hits_list in hash_dict.items():
        # Sort the hits of one video by request time
        hits_list_sorted = sorted(hits_list, key=lambda item: item[0])
        for (time, addr, iptp, userag, usertp, hash, vlanid) in hits_list_sorted:
            pass  # xxxxxxxx [don't know how to do the calculation]
        fd_out.write("%s\t%s\t%f\n" % (addr, hash, timeinternal))
asked Feb 20, 2012 at 9:26
manxing
1 Answer
Something like this:
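from itertools import groupby

# groupby only merges adjacent items, so sort by the same key (the video id) first
for video, group in groupby(sorted(data, key=lambda x: x[1]), key=lambda x: x[1]):
    times = [x[2] for x in group]
    # Interval between the first and the last request; 0 for a single request
    print('Video: %s, interval: %f' % (video, max(times) - min(times)))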
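Note that grouping on x[1] alone merges requests for the same video from different IPs, while the question asks for one row per IP and video. A minimal sketch of the same groupby idea keyed on both fields, assuming data is the list of (ip, video, time) tuples from the question; the function name request_intervals and the string sample values are made up for illustration:

```python
from itertools import groupby
from operator import itemgetter


def request_intervals(data):
    """Return one (ip, video, interval) tuple per (ip, video) pair."""
    key = itemgetter(0, 1)  # group on (ip, video) rather than video alone
    results = []
    # groupby only merges adjacent items, so sort by the same key first
    for (ip, video), group in groupby(sorted(data, key=key), key=key):
        times = [t for _, _, t in group]
        results.append((ip, video, max(times) - min(times)))
    return results


# Sample data mirroring the question (ip and video ids as plain strings)
data = [
    ("ip1", "video1", 12345.00000),
    ("ip1", "video1", 12346.12362),
    ("ip1", "video1", 12347.12684),
    ("ip1", "video2", 12367.12567),
    ("ip2", "video1", 14899.93736),
    ("ip2", "video1", 24566.12345),
]

for ip, video, interval in request_intervals(data):
    print("%s, %s, %f" % (ip, video, interval))
```

Using max(times) - min(times) avoids having to sort each group by time; a video requested only once yields an interval of 0.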
answered Feb 20, 2012 at 9:33
DrTyrsa
Comments
client_dict can be filled with client_dict.setdefault(addr, {}).setdefault(hash, []).append((float(time), addr, iptp, userag, usertp, hash, vlanid)). Also, hash is a built-in function, so don't overwrite it with your variables.
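The setdefault suggestion can be sketched roughly as follows, assuming each log line has the seven whitespace-separated fields unpacked in the question. The function name intervals_from_lines, the variable video_hash (used instead of hash to avoid shadowing the built-in), and the sample field values are hypothetical:

```python
def intervals_from_lines(lines):
    """Return (addr, video_hash, interval) rows from raw log lines."""
    client_dict = {}
    for line in lines:
        time, addr, iptp, userag, usertp, video_hash, vlanid = line.split()
        # Create the nested dict and list on first use, then record the time
        client_dict.setdefault(addr, {}).setdefault(video_hash, []).append(float(time))

    rows = []
    for addr, hash_dict in sorted(client_dict.items()):
        for video_hash, times in sorted(hash_dict.items()):
            # Last request minus first request; 0 if requested only once
            rows.append((addr, video_hash, max(times) - min(times)))
    return rows


# Hypothetical log lines: time addr iptp userag usertp hash vlanid
sample = [
    "12345.00000 ip1 tp ua ut video1 10",
    "12347.12684 ip1 tp ua ut video1 10",
    "12367.12567 ip1 tp ua ut video2 10",
]

for addr, video_hash, interval in intervals_from_lines(sample):
    print("%s\t%s\t%f" % (addr, video_hash, interval))
```

Only the timestamps are stored here because nothing else is needed for the interval; keep the full tuples if the other fields matter later.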