fastest method

david.garvey at gmail.com
Wed Jun 20 14:17:52 EDT 2012


I am looking for the fastest way to parse a log file.
Currently I have this... Can I speed it up at all? The script is written to
be a generic log file parser, so I can't rely on any predictable pattern.
def check_data(data, keywords):
    # get rid of duplicates
    unique_list = list(set(data))
    string_list = ' '.join(unique_list)
    # print string_list
    for keyword in keywords:
        if keyword in string_list:
            return True
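A possible variant of check_data (a sketch, not a measured speedup): scanning the lines directly with any() lets the search short-circuit on the first hit, instead of first deduplicating and joining everything into one string. Whether it is actually faster will depend on the data, so treat it as something to benchmark.

```python
def check_data(data, keywords):
    """Return True if any keyword appears in any line.

    Sketch variant of the function above: iterates over the
    deduplicated lines and returns as soon as one keyword matches,
    avoiding the intermediate joined string. Note it returns False
    (not None) when nothing matches.
    """
    return any(keyword in line
               for line in set(data)
               for keyword in keywords)
```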
I am currently using file seek and maintaining a file that records the last byte offset:
with open(filename) as f:
    print "Here is filename:%s" % filename
    f.seek(0, 2)
    eof = f.tell()
    print "Here is eof:%s" % eof
    if last is not None:
        print "Here is last:%s" % last
        # if last is less than current
        last = int(last)
        if (eof - last > 0):
            offset = eof - last
            offset = offset * -1
            print "Here is new offset:%s" % offset
            f.seek(offset, 2)
            mylist = f.readlines()
    else:
        # if last doesn't exist or is greater than current
        f.seek(0)
        bof = f.tell()
        print "Here is bof:%s" % bof
        mylist = f.readlines()
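The seek-and-resume idea above can be condensed into a helper (a sketch only; the function name and the (lines, offset) return shape are my own, not from the original). It seeks forward to the saved offset when the file has grown, and falls back to re-reading from the start when there is no saved offset or the file has shrunk (e.g. after rotation), which also covers the case where mylist would otherwise be left unset:

```python
def read_new_lines(filename, last):
    """Read only the lines appended since byte offset `last`.

    Returns (lines, new_offset). `last` may be None on the first run;
    new_offset should be persisted and passed back in on the next run.
    """
    with open(filename) as f:
        f.seek(0, 2)            # jump to end of file to learn its size
        eof = f.tell()
        if last is not None and 0 <= int(last) <= eof:
            f.seek(int(last))   # resume where the previous run stopped
        else:
            f.seek(0)           # first run, or file shrank: start over
        lines = f.readlines()
        return lines, eof
```

On each run you would feed the returned offset back in, so only the newly appended tail of the log is read and checked against the keywords.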
Thanks,
-- 
David Garvey

