I just found this great wget wrapper and I'd like to rewrite it as a python script using the subprocess module. However it turns out to be quite tricky giving me all sorts of errors.
download()
{
local url=1ドル
echo -n " "
wget --progress=dot $url 2>&1 | grep --line-buffered "%" | \
sed -u -e "s,\.,,g" | awk '{printf("\b\b\b\b%4s", 2ドル)}'
echo -ne "\b\b\b\b"
echo " DONE"
}
Then it can be called like this:
file="patch-2.6.37.gz"
echo -n "Downloading $file:"
download "http://www.kernel.org/pub/linux/kernel/v2.6/$file"
Any ideas?
Source: http://fitnr.com/showing-file-download-progress-using-wget.html
5 Answers 5
I think you're not far off. Mainly I'm wondering, why bother with running pipes into grep and sed and awk when you can do all that internally in Python?
#! /usr/bin/env python
import re
import subprocess
TARGET_FILE = "linux-2.6.0.tar.xz"
TARGET_LINK = "http://www.kernel.org/pub/linux/kernel/v2.6/%s" % TARGET_FILE
wgetExecutable = '/usr/bin/wget'
wgetParameters = ['--progress=dot', TARGET_LINK]
wgetPopen = subprocess.Popen([wgetExecutable] + wgetParameters,
stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
for line in iter(wgetPopen.stdout.readline, b''):
match = re.search(r'\d+%', line)
if match:
print '\b\b\b\b' + match.group(0),
wgetPopen.stdout.close()
wgetPopen.wait()
10 Comments
awk -W interactive in the bash script. I'll poke at this some more later and see if I need to do something special to force line-buffered output in subprocess.wgetPopen.stdout might be destroyed (I expect so, but I don't know). As well as with ordinary files, it is better to close them explicitly (with-statement is used for the files) without relying on garbage collection (that is complex and hard to reason about). if not obj says "if obj empty or zero" (the test for None should be written as if obj is None) without concerning with types e.g., in Python 3 pipe.readline() may return b'' or '' that are different types and if not line works for both. And It supports both Python 2/3 from the same source.If you are rewriting the script in Python; you could replace wget by urllib.urlretrieve() in this case:
#!/usr/bin/env python
import os
import posixpath
import sys
import urllib
import urlparse
def url2filename(url):
"""Return basename corresponding to url.
>>> url2filename('http://example.com/path/to/file?opt=1')
'file'
"""
urlpath = urlparse.urlsplit(url).path # pylint: disable=E1103
basename = posixpath.basename(urllib.unquote(urlpath))
if os.path.basename(basename) != basename:
raise ValueError # refuse 'dir%5Cbasename.ext' on Windows
return basename
def reporthook(blocknum, blocksize, totalsize):
"""Report download progress on stderr."""
readsofar = blocknum * blocksize
if totalsize > 0:
percent = readsofar * 1e2 / totalsize
s = "\r%5.1f%% %*d / %d" % (
percent, len(str(totalsize)), readsofar, totalsize)
sys.stderr.write(s)
if readsofar >= totalsize: # near the end
sys.stderr.write("\n")
else: # total size is unknown
sys.stderr.write("read %d\n" % (readsofar,))
url = sys.argv[1]
filename = sys.argv[2] if len(sys.argv) > 2 else url2filename(url)
urllib.urlretrieve(url, filename, reporthook)
Example:
$ python download-file.py http://example.com/path/to/file
It downloads the url to a file. If the file is not given then it uses basename from the url.
You could also run wget if you need it:
#!/usr/bin/env python
import sys
from subprocess import Popen, PIPE, STDOUT
def urlretrieve(url, filename=None, width=4):
destination = ["-O", filename] if filename is not None else []
p = Popen(["wget"] + destination + ["--progress=dot", url],
stdout=PIPE, stderr=STDOUT, bufsize=1) # line-buffered (out side)
for line in iter(p.stdout.readline, b''):
if b'%' in line: # grep "%"
line = line.replace(b'.', b'') # sed -u -e "s,\.,,g"
percents = line.split(None, 2)[1].decode() # awk 2ドル
sys.stderr.write("\b"*width + percents.rjust(width))
p.communicate() # close stdout, wait for child's exit
print("\b"*width + "DONE")
url = sys.argv[1]
filename = sys.argv[2] if len(sys.argv) > 2 else None
urlretrieve(url, filename)
I have not noticed any buffering issues with this code.
Comments
I've done something like this before. and i'd love to share my code with you:)
#!/usr/bin/python2.7
# encoding=utf-8
import sys
import os
import datetime
SHEBANG = "#!/bin/bash\n\n"
def get_cmd(editor='vim', initial_cmd=""):
from subprocess import call
from tempfile import NamedTemporaryFile
# Create the initial temporary file.
with NamedTemporaryFile(delete=False) as tf:
tfName = tf.name
tf.write(initial_cmd)
# Fire up the editor.
if call([editor, tfName], shell=False) != 0:
return None
# Editor died or was killed.
# Get the modified content.
fd = open(tfName)
res = fd.read()
fd.close()
os.remove(tfName)
return res
def main():
initial_cmd = "wget " + sys.argv[1]
cmd = get_cmd(editor='vim', initial_cmd=initial_cmd)
if len(sys.argv) > 1 and sys.argv[1] == 's':
#keep the download infomation.
t = datetime.datetime.now()
filename = "swget_%02d%02d%02d%02d%02d" %\
(t.month, t.day, t.hour, t.minute, t.second)
with open(filename, 'w') as f:
f.write(SHEBANG)
f.write(cmd)
f.close()
os.chmod(filename, 0777)
os.system(cmd)
main()
# run this script with the optional argument 's'
# copy the command to the editor, then save and quit. it will
# begin to download. if you have use the argument 's'.
# then this script will create another executable script, you
# can use that script to resume you interrupt download.( if server support)
so, basically, you just need to modify the initial_cmd's value, in your case, it's
wget --progress=dot $url 2>&1 | grep --line-buffered "%" | \
sed -u -e "s,\.,,g" | awk '{printf("\b\b\b\b%4s", 2ドル)}'
this script will first create a temp file, then put shell commands in it, and give it execute permissions. and finally run the temp file with commands in it.
2 Comments
call(filename) instead of os.system(cmd). To format datetime, you could use .strftime() method. with-statement closes files automatically that is the point of using it in the first place, no need to call f.close() by hand (unindent chmod in this case). If you want to make script executable by your user: os.chmod(filename, os.stat(filename).st_mode | stat.S_IEXEC) (or | 0111 for +x). To avoid leaking files, move code inside with Named..File() as tf: call tf.flush() before call([editor..) then tf.seek(0); res=tf.read()vim download.py
#!/usr/bin/env python
import subprocess
import os
sh_cmd = r"""
download()
{
local url=1ドル
echo -n " "
wget --progress=dot $url 2>&1 |
grep --line-buffered "%" |
sed -u -e "s,\.,,g" |
awk '{printf("\b\b\b\b%4s", 2ドル)}'
echo -ne "\b\b\b\b"
echo " DONE"
}
download "http://www.kernel.org/pub/linux/kernel/v2.6/$file"
"""
cmd = 'sh'
p = subprocess.Popen(cmd,
shell=True,
stdin=subprocess.PIPE,
env=os.environ
)
p.communicate(input=sh_cmd)
# or:
# p = subprocess.Popen(cmd,
# shell=True,
# stdin=subprocess.PIPE,
# env={'file':'xx'})
#
# p.communicate(input=sh_cmd)
# or:
# p = subprocess.Popen(cmd, shell=True,
# stdin=subprocess.PIPE,
# stdout=subprocess.PIPE,
# stderr=subprocess.PIPE,
# env=os.environ)
# stdout, stderr = p.communicate(input=sh_cmd)
then you can call like:
file="xxx" python dowload.py
3 Comments
sh as the command, and use shell=True? Why not run sh_cmd directly?sh to run it. In linux shell, we can use sh script.sh , and we can also use a PIPE or stdin to run some command, such as:cat some_file | sh or curl http://xxx.xx | sh and so on. For shell=Ture, From the docs, is says:The shell argument (which defaults to False) specifies whether to use the shell as the program to execute. If shell is True, it is recommended to pass args as a string rather than as a sequence.shell=True a shell is used to run the command you pass in. You quoted the documentation yourself there.In very simple words, considering you have script.sh file, you can execute it and print its return value, if any:
import subprocess
process = subprocess.Popen('/path/to/script.sh', shell=True, stdout=subprocess.PIPE)
process.wait()
print process.returncode
2 Comments
script.sh has execute permission(chmod +x script.sh) or Popen('sh /path/to/script.sh', shell=True ...)Explore related questions
See similar questions with these tags.
wgetExecutable = '/usr/bin/wget' grepExecutable = '/usr/grep' wgetParameters = ['--progress=dot', "link_to_file"] grepParameters = ['--line-buffered', "%"] wgetPopen = subprocess.Popen([wgetExecutable] + wgetParameters, stdout=subprocess.PIPE)grepPopen = subprocess.Popen([grepExecutable] + grepParameters, stdin=wgetPopen.stdout)however I get an error instdin=wgetPopen.stdoutOSError: [Errno 2] No such file or directoryshmodule (with that name) that can take care of the bridge between bash and python!