I'm creating own script for backup user's directories.
CentOS 6.5; Python 2.6.
Directories looks like:
# tree -L 2 -d /var/www/vhosts/
/var/www/vhosts/
├── freeproxy
│ └── freeproxy.in.ua
├── hudeem
│ └── hudeem.kiev.ua
├── profy
│ └── profy.kiev.ua
├── rtfm
│ └── rtfm.co.ua
├── setevoy
│ ├── forum.setevoy.kiev.ua
│ ├── postfixadmin.setevoy.org.ua
│ ├── setevoy.org.ua
│ └── webmail.setevoy.org.ua
├── worlddesign
│ └── worlddesign.org.ua
└── zabbix
└── zabbix.setevoy.org.ua
As script big enought, I'll post only three 'most important' functions. So please note, that they using some additional functions.
First - function to create backup directories:
# global const for user's directory
VHOSTSPATH = '/var/www/vhosts/'
def back_dir_create(user, day):
backdir = '/home/setevoy/backups/temp_new_test/'
# weekly are kept 4 last, daily - 7 copies
# thus - better have separate directories
if day == 'Sun':
type = '/weekly/'
else:
type = '/daily/'
dirname = ('%s%s%s%s-files/' % (backdir, user, type, time.strftime('%Y-%m-%d')))
if not os.path.exists(dirname):
print('Creating directory: %s' % dirname)
os.makedirs(dirname)
return(dirname)
Second - 'incremental' backup, for files changed last twenty-four hours:
def inc_backup(dir_to_backup, backupname):
now = time.time()
cutoff = 86400
print('Creating archive %s...' % backupname)
out = tarfile.open(backupname, mode='w:bz2')
# start walk via all files to find changed last 24 hours
for root, dirs, files in os.walk(dir_to_backup, followlinks=True):
for file in files:
file = os.path.join(root, file)
try:
filemodtime = os.stat(file).st_mtime
if now - filemodtime < cutoff:
if os.path.isfile(file):
print('Adding file: %s...' % file)
out.add(file)
print('File modified: %s' % time.ctime(os.path.getmtime(file)))
except OSError as error:
print('ERROR: %s' % error)
print 'Closing archive.'
out.close()
And both this function used in 'main' function:
def run_backup():
# here is two functions, not listed here
# both rerurns lists[] - with users
# like just username
# and dirs
# like /var/www/vhosts/username
userlist = arch_users_list(VHOSTSPATH)
userpaths = arch_users_path(VHOSTSPATH, userlist)
# day of week
curday = time.strftime('%a')
for hostdir, username in zip(userpaths, userlist):
os.chdir(hostdir)
print('\nWorking in: %s' % os.getcwd())
print('Under user: %s' % username)
# call first function, listed above
backup_dir = back_dir_create(username, curday)
if backup_dir:
virtual_hosts = arch_vhosts_names(hostdir)
for host in virtual_hosts:
archname = (backup_dir + host + '.tar.bz2')
# once in week I want have full backup
# other time - 'incremental'
if curday == 'Sun':
full_backup(host, archname)
else:
inc_backup(host, archname)
else:
print('Backup already present, skip.')
It works fine for now, but I'm newbie in Python, so - what can be improved here or fixed?
1 Answer 1
Just like you have defined the VHOSTSPATH
global variable at the very top,
it will make sense to do the same for this too:
backdir = '/home/setevoy/backups/temp_new_test/'
So that if you move your script to another machine, all the local customizations will be in one very visible place.
For even more benefits, move these files to a dedicated settings.py
module and make your script import these values from it.
then it will be really perfectly clear where to customize the script parameters,
and it will be separated from the rest of the code, which will be easily redistributable.
This could be done better:
if day == 'Sun': type = '/weekly/' else: type = '/daily/' dirname = ('%s%s%s%s-files/' % (backdir, user, type, time.strftime('%Y-%m-%d')))
type
is a keyword in Python. Intelligent code editors should highlight it for you.
It's better to avoid variable names that overlap with keywords. How about subdir
?
The way you assign dirname
is a bit hard to read,
and just by reading that line it's not clear that this is intended as a nested directory layout. If I look at the type
variable I see that all possible values are prefixed and suffixed with the /
directory separator, but the code doesn't make it obvious why it is this way. A better way would be like this:
if day == 'Sun':
subdir = 'weekly'
else:
subdir = 'daily'
dirname = os.path.join(backdir, user, subdir, '%s-files' % time.strftime('%Y-%m-%d'))
In this version, thanks to the os.path.join
,
now it is explicit that we're talking about a nested hierarchy.
Also, you don't need to worry anymore if all components have a /
prefix or suffix or not.
Just make all components not have such prefix or suffix and let os.path.join
take care of it.
Finally, os.path.join
is portable: it will work on any operating system.
Sure, you may not care about that now, but you never know.
Don't reassign the value of a loop variable:
for file in files: file = os.path.join(root, file)
Use a different name for it, for example:
for file in files:
path = os.path.join(root, file)
The reason is that this can be confusing in various ways. The same is true for function parameters: don't reassign them, use different names for local variables.
The if
condition here seems pointless:
for file in files: file = os.path.join(root, file) # ... if os.path.isfile(file): # ...
By virtue of the way os.walk
works, I think os.path.join(root, file)
is always a valid file.
Except in some extreme conditions where the file might get removed in the middle of walking.
If you are in such environment, then it's fine.
I'm not familiar with tarfile.open
, but instead of this:
out = tarfile.open(backupname, mode='w:bz2') # ... do work out.close()
you can probably do this, which will be better:
with tarfile.open(backupname, mode='w:bz2') as out:
# ... do work
If this indeed works, then you don't need to worry about closing the out
handle,
it will be properly closed automtically by Python.
In the original code, if an error happens after out =
, the out.close()
might never be reached.
The with open(...) as fh:
idiom is the recommended way to work with resources that need to be closed when you're done with them.
-
\$\begingroup\$
backdir
- yep, already changed;type
- I'm writing in VIM, so there no highlights of warnings like this :-) thanks, will change;os.path.join
- good point, doesn't thought about such solution. \$\endgroup\$setevoy– setevoy2014年10月13日 15:51:34 +00:00Commented Oct 13, 2014 at 15:51 -
\$\begingroup\$ with tarfile.open(backupname, mode='w:bz2') as out: AttributeError: 'TarFile' object has no attribute 'exit' // if change to with-open() \$\endgroup\$setevoy– setevoy2014年10月13日 19:02:48 +00:00Commented Oct 13, 2014 at 19:02
-
\$\begingroup\$ Ah, that's too bad... \$\endgroup\$janos– janos2014年10月13日 21:46:49 +00:00Commented Oct 13, 2014 at 21:46