I am running an IoT fleet, and the need for checking the empty disk usage space has increased.
So I am using the following python code.
In addition to best practices, efficiency, readability, pythonic, etc. I would love you to share your thoughts about: Is it compliance with: "Robert Martin Single Responsibility Principle" I tried to be consistent and divide my script functions in a logical matter.
import os
from psutil import disk_usage
import sys
import logging
import subprocess
THRESHOLD_PERCENTAGE = 50
# Defining logger
def make_logger():
log = logging.getLogger(__name__)
log.setLevel(logging.INFO)
formatter = logging.Formatter('%(filename)s - %(asctime)s - %(levelname)s - %(message)s')
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(formatter)
log.addHandler(handler)
return log
def display_disk_usage():
disk = disk_usage(os.path.realpath('/'))
logger.info("Total disk space: %d GiB" % (disk.total // (2 ** 30)))
logger.info("Used disk space: %d GiB" % (disk.used // (2 ** 30)))
logger.info("Free disk space: %d GiB" % (disk.free // (2 ** 30)))
logger.info("Used disk percentage: %d" % disk.percent + '%')
return disk.percent
def disk_usage_threshold(in_use_disk_percentage):
if in_use_disk_percentage < THRESHOLD_PERCENTAGE:
return 'Disk usage is under the threshold level'
logger.warning(f'\nWarning your disk usage above the threshold it is {in_use_disk_percentage}% full')
ask_user_for_action = input('If you wish to delete logs and local postgres data print "yes"\n')
if ask_user_for_action == 'yes':
clean_disk_usage()
logger.info('clean disk usage')
else:
logger.info(f'Be careful your disk usage is {in_use_disk_percentage}%')
def clean_disk_usage():
# Stop docker and PM2
logger.info('\nStopping Dockers and PM2 processes\n')
subprocess.run('sudo sh /home/pi/Desktop/speedboatBox/scripts/stop_speedboatbox.sh', shell=True, capture_output=True,
check=True)
logger.info('\nDeleting log.out directory\'s content ...\n')
subprocess.run('sudo truncate -s 0 /home/pi/Desktop/log.out', shell=True, capture_output=True, check=True)
logger.info('\nDeleting local postgres data-base ...\n')
try:
subprocess.run('sudo sh /home/pi/Desktop/speedboatBox/postgresql_service/delete_db.sh', shell=True,
capture_output=True)
except subprocess.CalledProcessError as e:
logger.info(e)
display_disk_usage()
# Start dockers and PM2 processes
logger.info('\nStarting dockers and PM2 processes ...\n')
subprocess.run('sudo sh /home/pi/Desktop/speedboat/scripts/start_speedboatbox.sh', shell=True, capture_output=True,
check=True)
if __name__ == '__main__':
logger = make_logger()
in_use_disk_percentage = display_disk_usage()
disk_usage_threshold(in_use_disk_percentage)
```
-
3\$\begingroup\$ FYI, python-2.x says: "Do not mix this tag with the python-3.x tag" and python-3.x says: "Do not mix this tag with the python-2.x tag." Please pick a version, or if the program is intended to work in both environments, remove both and maybe mention that intent in the post text. Thanks. \$\endgroup\$ggorlen– ggorlen2021年06月06日 18:38:55 +00:00Commented Jun 6, 2021 at 18:38
2 Answers 2
First, some minor issues:
logger
is a global variable, and all your functions are completely dependent on it being present -- and if youimport
this file it won't be. You may want to create the logger outside theif __name__ == '__main__':
block, or pass the logger around as a parameter, or have your functions calllogging.getLogger(__name__)
themselves- When logging disk usage percentage, you're using the
%
operator to format the message but then concatenating another string onto it. Including the percent sign in the format string like"Used disk percentage: %d%%"
feels more natural - How does the stop script act if the processes it's supposed to stop aren't running? Is it correct to restart them even if they weren't running before you did the cleanup?
- In case truncating the log file fails, the restart script doesn't run. Maybe you should use
try
/finally
to make sure the restart script runs even if one of the cleanup actions throws an exception? - Why
sudo sh {script}
instead of justsudo {script}
? Are you sure the scripts will always be sh scripts? - I don't think you need to have "Warning" at the start of the warning message -- you seem to have configured the logger to print the message level before each message anyway, so you'd end up with "WARNING - Warning your disk usage..."
When disk_usage_threshold
returns early, it returns a string, which is then discarded. When it doesn't, it returns None
. This feels weird -- returning nothing is fine, returning some useful information is good. But pick an approach and commit to it so the caller has some idea what to do
Your display_disk_usage
function is a bit odd. It fetches all kinds of disk usage information, but discards almost all of it -- what if the caller wants more of it? It also kind of has two responsibilities -- to get disk usage information, and to log it. Those feel like two separate steps, and should be handled accordingly -- in one place you call that function to get the data, and the logging is a nice side benefit, in another you call it only for the logging and discard the data entirely, and that makes its usage feel inconsistent
clean_disk_usage
is very tightly coupled to your system, with hard-coded absolute paths into a user's home folder. If that's where those scripts live you might need to have those paths in the code somewhere, sure, but if you turn those paths into parameters of the clean_disk_usage
function (and maybe even command-line parameters of the program itself), you get a lot more flexibility
On a related note, your start script isn't in the same location as the stop script. Maybe there's a reason for that, but it looks like it might be a typo, so I'm pointing it out just in case
Finally, your subprocesses raise some questions about your file structure in general
- Who is meant to run this program anyway? If it's
pi
, why do they needsudo
to access files in their own home directory? And why is some other user's processes modifying those files -- or ifpi
owns those processes, why do they usesudo
to stop and restart them? If this program isn't meant to be run bypi
at all, well that raises different questions - Is this data even user-specific? If so, why are we only concerned with
pi
's copy? If not, why is it in a user's home directory? - Even if this data really is user-specific, who's putting it on the desktop? They seem to be generated by other programs, and I'd usually expect those to put things somewhere like
$XDG_CACHE_DIR
or$XDG_DATA_HOME
, and those typically don't point to locations under~/Desktop
-
4\$\begingroup\$
logger
being a global upon which all of a module's functions depend is pretty typical. So yes: I think the one option you proposed, initializing it outside of the main guard, is the right way to go. \$\endgroup\$Reinderien– Reinderien2021年06月07日 16:45:12 +00:00Commented Jun 7, 2021 at 16:45 -
\$\begingroup\$ would be happy to know if I got the point of
Your display_disk_usage function is a bit odd.
would that be a better way to approach it?def get_disk_usage_info(mount_point_path='/'): return disk_usage(os.path.realpath(mount_point_path)) def log_disk_usage_info(): logger.info("Total disk space: %d GiB" % (disk_info.total // (2 ** 30))) logger.info("Used disk space: %d GiB" % (disk_info.used // (2 ** 30))) logger.info("Free disk space: %d GiB" % (disk_info.free // (2 ** 30))) logger.info("Used disk percentage: %d%%" % disk_info.percent)
\$\endgroup\$Monty_emma– Monty_emma2021年06月07日 20:01:34 +00:00Commented Jun 7, 2021 at 20:01
Overall not bad! You've managed to avoid many of the "usual suspects" when it comes to Python scripting practices.
make_logger
could use a return typehintdisplay_disk_usage
, trivially modified to accept a mountpoint path, would be a more flexible implementationdisplay_disk_usage
is doing two things: displaying (what it says on the tin), and calculating and returning (not what it says on the tin). Separate out thedisk_usage
call, and instead accept a disk instance.disk_usage_threshold
either returns what looks to be a warning string, orNone
. This function has a number of problems:- The return value, as a string, is not useful - replace it with a boolean and let the caller decide whether to print a warning. Or better yet return nothing at all, since you don't actually use the return value.
- Be careful your disk usage is sounds like a
logger.warning
, notlogger.info
- The function would be better-called
check_disk_usage_threshold
- there needs to be a verb somewhere
This:
logger.info('\nDeleting log.out directory\'s content ...\n')
should use a double-quoted string so that you don't have to escape the inner single quotes.
This:
except subprocess.CalledProcessError as e:
logger.info(e)
should likely be logged at level error
.
This:
subprocess.run('sudo sh /home/pi/Desktop/speedboatBox/scripts/stop_speedboatbox.sh', shell=True, capture_output=True,
is troublesome for a number of reasons:
- You're asking to capture output, but then discarding it. Consider at least logging it. If it's too noisy, demote it to logging level
debug
- Don't hard-code absolute paths. This path should be parametric or at least extracted to a constant.
- System-level scripts should NOT exist in the
/home
directory. shell=True
has vulnerabilities. Best to useshell=False
and call into a specific executable instead.
This:
/home/pi/Desktop/log.out
similarly should not be hard-coded and should not live in /home
, instead going in /var/log
.
This:
/home/pi/Desktop/speedboatBox/postgresql_service
is a thoroughly ruinous place to install PostgreSQL, if that's what this is.
Broadly: think of /home
as a human person's desktop. If all of the files in it were to be deleted, the computer - with no human files - should still be able to run your software. Unix has a standard directory layout - /var
for logs and runtime files, /etc
for config, etc. Part of making this script robust and secure will be following this layout.
-
\$\begingroup\$ Truly beneficial !!!! I would be happy if you could elaborate for me on those issues: 1. what is the proper type hint return value for
make_looger
function. should it looks like that:# Defining logger def make_logger(): -> str
Don't hard-code absolute paths. This path should be parametric or at least extracted to a constant.
I have to know why it is a bad thing. Is it for keeping the code DRY? or for some other reason ??shell=True has vulnerabilities.
Isn't just another way to use subprocess and not to pass the arguments has a list, what is the core diff \$\endgroup\$Monty_emma– Monty_emma2021年06月07日 19:29:10 +00:00Commented Jun 7, 2021 at 19:29
Explore related questions
See similar questions with these tags.