The purpose of this cluster of functions is to tail the log file and kill/restart processes from shell (the latter is not yet implemented).
import sys
import time
import datetime
import os
import re


def read_log(f):
    """
    Basically tail -f
    """
    logfile = open(f)
    logfile.seek(0, os.SEEK_END)
    while True:
        new_line = logfile.readline()
        if new_line:
            yield new_line
        else:
            time.sleep(0.1)


def find_error(line, base_time, line_time):
    """
    Search for ERROR keyword in log line. Return True if found, False if not.
    """
    match = re.search('.*ERROR.*', line)
    if match and line_time < base_time:
        return True
    return False


def parse_time(line):
    """
    Parse string to datetime
    """
    date_regex = r'\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}'
    date_string = re.search(date_regex, line).group(0)
    return datetime.datetime.strptime(date_string, "%Y-%m-%d %H:%M:%S")


def increase_base_time(line_time):
    """
    Return a datetime that is line_time plus one minute
    """
    return line_time + datetime.timedelta(minutes=1)


def monitor(filename):
    """
    Read from a log file.
    For each line check if log line time is later than base comparison time.
    If so, update the base time and set error count to 0.
    Check for errors in line. Increment error count if any are found.
    Check the total error count. If it's greater than five, restart the logged
    process.
    Check if process has been restarted before. If so, kill the logged process.
    """
    count = 0
    base_time = datetime.datetime.min
    log = read_log(filename)
    restarted = False
    for line in log:
        line_time = parse_time(line)
        if line_time > base_time:
            base_time = increase_base_time(line_time)
            count = 0
        if find_error(line, base_time, line_time):
            count += 1
        if count >= 5:
            if restarted:
                print("Kill the process")  # A placeholder, sorry
            else:
                print("Restart the process")  # Also a placeholder, sorry
            count = 0
            restarted = True


monitor(sys.argv[1])
There are a lot of if-statements for doing checks. I thought about making this class-based, but it really doesn't make sense to do it (the functions don't need to share state and it is a rather small program). Is the number of ifs acceptable? How could I refactor the code to make it flow a bit better?
EDIT: As this was brought up: is it necessary that the tailing function be non-blocking? As I understand it, I won't be doing anything until a line is yielded. What would be the benefits of going one way or the other?
3 Answers
Using a regex for such a simple task as checking whether a line contains a string is not worth it. The built-in `str.__contains__`, which is called by the `in` keyword, is faster in this case.
Here are some worst-case timings (i.e. on a string that does not contain the string `"ERROR"`) for the two functions:
import re

def using_in(s):
    return "ERROR" in s

def using_re(s):
    return bool(re.search('.*ERROR.*', s))
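The original timing plot does not survive in text form, but here is a sketch of how such worst-case timings can be reproduced with the stdlib `timeit` module (the line length and repetition counts are arbitrary choices; absolute numbers will vary by machine):

```python
import re
import timeit

def using_in(s):
    return "ERROR" in s

def using_re(s):
    return bool(re.search('.*ERROR.*', s))

# Worst case: a long line that does not contain "ERROR",
# so both functions must scan the whole string.
line = "x" * 1_000

t_in = timeit.timeit(lambda: using_in(line), number=50)
t_re = timeit.timeit(lambda: using_re(line), number=50)
print(f"in: {t_in:.6f}s  re: {t_re:.6f}s")
```

The gap grows with line length, since the `.*ERROR.*` pattern forces the regex engine to backtrack while `in` does a simple substring scan.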
Using `re` is always slower in this case. It does, of course, have its uses for more complex patterns. Note that the `re` module compiles the regex automatically (and caches it), so precompiling is not a way to make it faster here, since it is always the same regex being used.
Let's just hope your lines are less than 10,000 characters long.
In general, I think you still have to find a good balance between not having enough functions and having too many. IMO, instead of having an `increase_base_time` function, I would have a global constant and inline the addition:
INCREMENT = datetime.timedelta(minutes=1)

...
for line in log:
    ...
    if line_time > base_time:
        base_time = line_time + INCREMENT
    ...
If `read_log` is a `tail -f` equivalent, you could just call it `tail_f`. In any case, I would make the sleep time an optional parameter.
Note that your function does not ensure that the file is properly closed, which might cause issues. You should always use the `with` keyword to avoid this.
def read_log(f, sleep=0.1):
    """
    Basically tail -f with a configurable sleep
    """
    with open(f) as logfile:
        logfile.seek(0, os.SEEK_END)
        while True:
            new_line = logfile.readline()
            if new_line:
                yield new_line
            else:
                time.sleep(sleep)
This way the file is closed even if the user presses Ctrl+C.
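To make that concrete, here is a small sketch showing that closing the generator triggers the `with` block's cleanup, which closes the file (the `holder` dict and the temp file are illustrative, not part of the original code):

```python
import os
import tempfile

holder = {}  # illustrative: lets us inspect the file object afterwards

def read_lines(path):
    with open(path) as logfile:
        holder["fh"] = logfile
        for line in logfile:
            yield line

# Create a small temporary "log" to read from.
tmp = tempfile.NamedTemporaryFile("w", suffix=".log", delete=False)
tmp.write("first\nsecond\n")
tmp.close()

gen = read_lines(tmp.name)
first = next(gen)           # the with-block is entered, the file is open
assert not holder["fh"].closed

gen.close()                 # raises GeneratorExit inside the generator...
assert holder["fh"].closed  # ...so the with-block closes the file

os.unlink(tmp.name)
```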
You should only call your function under an `if __name__ == "__main__":` guard, to allow importing from this script without it trying (and failing) to run:
if __name__ == "__main__":
    monitor(sys.argv[1])
- Gloweye: No `with open(f) as logfile:` for file handling? (Nov 8, 2019)
- Gloweye: Dunno where you live, but over here it's 5 PM on a Friday. It happens, *shrug*. (Nov 8, 2019)
- Graipher: @Gloweye It is here as well. And just before the Friday beer. I blame it on that. (Nov 8, 2019)
- 301_Moved_Permanently: I wonder if calling `select.select([logfile], [], [])` would work and be better than this ugly `sleep`... (Nov 8, 2019)
- 301_Moved_Permanently: Lol, guess I will never fully understand backtracking then. (Nov 8, 2019)
... about making this class based, but it really doesn't make sense to do it ...
I'd not agree with that, because:

- the same arguments `filename`/`f` and `line` are passed around across multiple functions
- with a custom class approach, things can be initialized/defined at once (at class definition or instantiation time)
- with an OOP approach I'm confident that functions like `parse_time(line)` or `find_error(line, ...)` are not passed an undesirable or "tricky" `line` argument, as the line currently being read can be encapsulated.
List of issues (with support of the OOP approach):

- pattern `date_regex = r'\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}'`.
  Instead of generating the regex pattern on each call of the `parse_time` function, we'll make it a `LogMonitor` class constant called `DATE_REGEX`.
  Furthermore, we can precompile the regex pattern with the `re.compile` function:

  > using `re.compile()` and saving the resulting regular expression object for reuse is more efficient when the expression will be used several times in a single program

      DATE_REGEX = re.compile(r'\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}')

- `increase_base_time` function. `datetime.timedelta(minutes=1)` is easily extracted to another constant: `INC_MINUTE = datetime.timedelta(minutes=1)`

- `find_error(line, base_time, line_time)` function.
  Besides simplifying the `ERROR` pattern search (as @Graipher mentioned), the function has another issue: it introduces an unnecessary coupling/dependency on two external identifiers, `base_time` and `line_time`, whereas the function's sufficient responsibility is "check if `ERROR` occurs within the current log line" (returning a boolean result).
  That dependency should be eliminated.
  To "beat" that, the crucial conditional within the `monitor` function that calls the `find_error` function can be restructured to present a natural order of mutually exclusive conditions:

      ...
      if line_time > base_time:
          base_time = self.increase_base_time(line_time)
          count = 0
      elif self.find_error():
          count += 1
In the OOP implementation below, the input filename is passed once to the `LogMonitor` constructor as `LogMonitor(sys.argv[1])`, the functions are logically renamed, the needed state/behavior is encapsulated, and the above-mentioned OOP benefits/arguments are included.
import sys
import time
import datetime
import os
import re


class LogMonitor:
    DATE_REGEX = re.compile(r'\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}')
    INC_MINUTE = datetime.timedelta(minutes=1)

    def __init__(self, filename):
        self._fname = filename
        self._log = self._read_log()  # generator
        self._curr_line = ''

    def _read_log(self):
        """Basically tail -f"""
        with open(self._fname) as logfile:
            logfile.seek(0, os.SEEK_END)
            while True:
                new_line = logfile.readline()
                if new_line:
                    self._curr_line = new_line
                    yield self._curr_line
                else:
                    time.sleep(0.1)

    def _has_error(self):
        """Search for ERROR keyword in log line. Returns boolean result"""
        return 'ERROR' in self._curr_line

    def _parse_datetime(self):
        """Parse string to datetime"""
        date_string = self.DATE_REGEX.search(self._curr_line).group(0)
        return datetime.datetime.strptime(date_string, "%Y-%m-%d %H:%M:%S")

    def increase_base_time(self, line_time):
        """Return a datetime that is line_time plus one minute"""
        return line_time + self.INC_MINUTE

    def run(self):
        """
        Read from a log file.
        For each line check if log line time is later than base comparison time.
        If so, update the base time and set error count to 0.
        Check for errors in line. Increment error count if any are found.
        Check the total error count. If it's greater than five, restart the logged
        process.
        Check if process has been restarted before. If so, kill the logged process.
        """
        count = 0
        base_time = datetime.datetime.min
        restarted = False
        for line in self._log:
            line_time = self._parse_datetime()
            if line_time > base_time:
                base_time = self.increase_base_time(line_time)
                count = 0
            elif self._has_error():
                count += 1
            if count >= 5:
                if restarted:
                    print("Kill the process")  # A placeholder, sorry
                else:
                    print("Restart the process")  # Also a placeholder, sorry
                count = 0
                restarted = True


if __name__ == "__main__":
    monitor = LogMonitor(sys.argv[1])
    monitor.run()
- user212865: Thank you! Those are good points and I think I am actually switching over to using a class. Just to clarify the `line_time` and `base_time` thing in `find_error`: the idea was to look for errors within a set time frame (like if there have been five or more errors within a minute). I really should have made it more clear, sorry. Also a question: could I call `read_log` inside the `run` function instead, because that seems to be the only one using it? Or is there an important reason to call it in the class variables (`self._log = self._read_log()`)? Thanks! (Nov 9, 2019)
- RomanPerekhrest: @sjne, `self._read_log()` is just scheduling a generator. Making it at class instantiation time is a generic way. If your implementation is expected to be simplified and never extended, you may move it. (Nov 9, 2019)
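The "just scheduling a generator" point in the comment above can be shown in a few lines: calling a generator function only creates the generator object, and the body does not run until the first `next()` (the `calls` list here is purely illustrative):

```python
calls = []  # illustrative: records when the generator body actually runs

def read_log():
    calls.append("body started")
    yield "line 1"
    yield "line 2"

gen = read_log()    # only creates the generator object...
assert calls == []  # ...the body has not run yet

first = next(gen)   # now the body runs up to the first yield
assert calls == ["body started"]
assert first == "line 1"
```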
Here is my contribution.
You should not use the `while True` polling loop; instead, you should use inotify so that you are notified only when changes happen to the file you are monitoring. Here is some short code:
import inotify.adapters

def _read_log(self):
    i = inotify.adapters.Inotify()
    with open(self._fname, 'a'):  # ensure the file exists before watching it
        pass                      # (the original used 'w', which truncates an existing log)
    i.add_watch(self._fname)
    for event in i.event_gen(yield_nones=False):
        (_, type_names, path, filename) = event
        do_work()  # placeholder for reading the newly written lines
Basically your process will stay blocked until there are some changes to the file that you are monitoring.
- user212865: Thank you! Just started to look into it. As far as I understand it at the moment, I'd be using `inotify` to track an event and then read the log file lines if there's a change? So `do_work()` would mean something like yielding or returning `logfile.readline()`? Sorry, I'm a bit lost as to how to actually get the changed line (cause `inotify` only returns a generator about the event?). Could you please give me some hints? (Nov 10, 2019)
- user212865: For example, could I just replace `do_work()` with `yield logfile.readline()`? And can I add `logfile.seek(0, os.SEEK_END)` after `with open(f, 'w') as logfile`? (Nov 10, 2019)
- user212865: I tried it as described above (plus replaced the `pass` with the `for event` block and used `add_watch()` with the constant `IN_MODIFY`). A loop was logging two lines every 30 seconds to the log file and inotify did pick it up, but often it would just skip one of the lines and print it 30 seconds later. (Nov 11, 2019)