Is it fine to retry a web service call like this in Python? Are there any scenarios where the exception handling might not work the way we expect? I expect the loop to continue only if the call succeeds, otherwise retry, and if that still fails, propagate the error.
import sys
import time
import traceback
import urllib
import urllib2
import uuid

# logger is assumed to be a module-level logging.Logger configured elsewhere.

def getblob(html):
    num_of_retries = 2
    time_interval = 2
    while num_of_retries:
        try:
            request_id = str(uuid.uuid4())
            params = [('requestId', request_id)]
            url = 'service.com?'
            url = url + urllib.urlencode(params)
            request = urllib2.Request(url, data=html)
            request.add_header('Accept', 'application/pdf')
            request.add_header('Content-Type', 'text/html')
            handle = urllib2.urlopen(request)
            pdf_blob = handle.read()
            break
        except Exception:
            typ, val, tb = sys.exc_info()
            logger.error(traceback.format_exception(typ, val, tb))
            time.sleep(time_interval)
            num_of_retries -= 1
            # If there aren't any retries left - propagate
            if not num_of_retries:
                raise
    return pdf_blob
I wanted to review this because I recently bumped into an issue where I hit an Internal Server Error on the server side, and the retry gave me an HTTP 200 response with an incomplete file (a PDF without an EOF marker).

Could this snippet cause anything like that? It does work and I do get PDFs back. However, from time to time I hit an HTTP 500 (some unknown server issue) and then an HTTP 200 with an incomplete blob. The incomplete blob is what makes me wonder whether this loop can cause that problem. The HTTP 500 followed by an HTTP 200 with an incomplete blob is very much a pattern. Before I added this retry, I used to get HTTP 500s but never an incomplete blob.
- I could see a downvote - is this not a question for Code Review? Should this be asked elsewhere on the SO network? — Nishant, Dec 4, 2015 at 19:51
- I downvoted because this question shows heavily modified code compared to what you actually use. There have been two minor but crucial edits to make it work, and currently it doesn't actually return anything either. You should take care before posting that the code actually works in the presented form. — holroy, Dec 4, 2015 at 20:21
- Sorry about that; I will try to make sure I post the real code. I added the return. I couldn't paste the original code because it contained some proprietary links and also wasn't the cleanest code, hence I decided to reformat it for the purpose of SO, but unfortunately missed some crucial points. — Nishant, Dec 7, 2015 at 13:55
3 Answers
Instead of using `while` to count a specific number of times and manually decrementing the counter, you should use `for _ in range(num_of_retries)`. It works the same, except that you don't need to manually change `num_of_retries`, and this way `num_of_retries` won't be modified from its initial value, in case you need to access it again for any reason.

There is one small change to make this work. You need to use the `for ... else` syntax to `raise` if the loop is never broken out of. An `else` block after a `for` loop is executed if the loop never reaches a `break` statement. This fits your case perfectly: `break` is called on success, but if there is never a success the loop runs through all the iterations and reaches the `else` block, where you can `raise`:
for _ in range(num_of_retries):
    try:
        request_id = str(uuid.uuid4())
        params = [('requestId', request_id)]
        url = 'service.com?'
        url = url + urllib.urlencode(params)
        request = urllib2.Request(url, data=html)
        request.add_header('Accept', 'application/pdf')
        request.add_header('Content-Type', 'text/html')
        handle = urllib2.urlopen(request)
        pdf_blob = handle.read()
        break
    except Exception:
        typ, val, tb = sys.exc_info()
        logger.error(traceback.format_exception(typ, val, tb))
        time.sleep(time_interval)
else:
    raise
Without any explicit argument, `raise` will just re-raise the last exception that has been raised and handled. See this SO answer about it. Basically, when you `try ... except` an exception, its details are retained by Python and are used by `raise` in the absence of an explicit exception passed to it.
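To make the behaviour of the bare `raise` concrete, here is a small self-contained sketch (the `flaky_call` helper is made up purely to force failures; it is not part of your code):

import time

def flaky_call():
    # Hypothetical helper that always fails, to demonstrate the bare raise.
    raise IOError("simulated network failure")

num_of_retries = 2
time_interval = 2

for _ in range(num_of_retries):
    try:
        result = flaky_call()
        break
    except IOError:
        time.sleep(time_interval)
else:
    # The loop never reached the break, so every attempt failed. The bare
    # raise re-raises the IOError caught in the last iteration. Note that
    # this relies on Python 2 behaviour: in Python 3 the exception details
    # are cleared when the except block ends, so you would need to save the
    # exception and re-raise it explicitly.
    raise

Running this sleeps twice and then propagates the IOError from the final attempt.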
- Any idea on what would be the cleanest way to avoid `time.sleep(time_interval)` on the last retry? — 301_Moved_Permanently, Dec 4, 2015 at 16:14
- You don't need to be in the exception block to raise? Does that mean it will raise the last known exception? — Nishant, Dec 4, 2015 at 16:26
- @Nishant Correct, I added a brief explanation that links to a post on SO that explains it well. — SuperBiasedMan, Dec 4, 2015 at 16:47
- @MathiasEttinger Hm. I guess I'd just use `for i in range` so I could then check `if i == num_of_retries - 1` (sketched below). It wouldn't be entirely clear because of that `- 1`. Anything else I can think of involves adding more to the code than I feel should be necessary. — SuperBiasedMan, Dec 4, 2015 at 16:54
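A minimal sketch of what that last comment suggests (the `do_request` helper is hypothetical, standing in for the urlopen/read logic), so the sleep is skipped when no further attempt will follow:

import time
import urllib2

num_of_retries = 2
time_interval = 2

for i in range(num_of_retries):
    try:
        pdf_blob = do_request()  # hypothetical: builds the request and reads the response
        break
    except urllib2.URLError:
        # Only sleep if there is another attempt coming.
        if i < num_of_retries - 1:
            time.sleep(time_interval)
else:
    raise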
You should try to reduce the amount of code protected by the `try ... except` construct. To me, the only line that should be protected is `urllib2.urlopen(request)`; the rest belongs before the `try` or in an `else` clause.

You should also specialize your exception handling a bit more: trying again and again to reach an invalid URL is a bad idea, but `except Exception` will permit that, since it also catches the `ValueError` raised by `urllib2.urlopen` when the URL is invalid or blank. The only exceptions you should care about in your case are `urllib2.URLError` (and possibly its subclass `urllib2.HTTPError`). You will thus fail sooner if the cause of failure is not network-related.
The body of the loop would look like:
request_id = str(uuid.uuid4())
params = [('requestId', request_id)]
url = 'service.com?'
url = url + urllib.urlencode(params)
request = urllib2.Request(url, data=html)
request.add_header('Accept', 'application/pdf')
request.add_header('Content-Type', 'text/html')
try:
    handle = urllib2.urlopen(request)
except urllib2.URLError:
    typ, val, tb = sys.exc_info()
    logger.error(traceback.format_exception(typ, val, tb))
    num_of_retries -= 1
    # If there aren't any retries left - propagate
    if not num_of_retries:
        raise
    time.sleep(time_interval)
else:
    pdf_blob = handle.read()
    break
Note that I also moved the call to `time.sleep(time_interval)` to avoid waiting before re-raising the last exception.
- It's `request`, not `req` - sorry, I made a typo. Don't I need a new request object each time? Also, I need a new requestId each time in order to track it. — Nishant, Dec 4, 2015 at 16:21
The code would be clearer if you used a function decorator for the retry logic. One such decorator is retrying, but there are other similar libraries, and it wouldn't be hard to write one yourself.
@retry(stop_max_attempt_number=2, wait_fixed=2000)
def getblob(html):
    request_id = str(uuid.uuid4())
    ...
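If you would rather not add a dependency, a minimal hand-rolled decorator could look something like this; the parameter names `attempts` and `wait_seconds` are made up for the sketch, not part of any library:

import functools
import time

def retry(attempts, wait_seconds):
    # Re-run the wrapped function up to `attempts` times, sleeping between
    # failures, and re-raise the last exception if every attempt fails.
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(attempts):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == attempts - 1:
                        raise
                    time.sleep(wait_seconds)
        return wrapper
    return decorator

@retry(attempts=2, wait_seconds=2)
def getblob(html):
    ...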
- That is much cleaner. — Nishant, Dec 7, 2015 at 14:47