2
\$\begingroup\$

Is it fine to retry a web service call like this in Python? Would there be any scenarios the exception might not work the way we expect it? I expect it to continue only if it's fine, otherwise just retry, and if that doesn't work, just throw the error out.

def getblob(html):
 num_of_retries = 2 
 time_interval = 2 
 while num_of_retries:
 try: 
 request_id = str(uuid.uuid4())
 params = [('requestId', request_id)]
 url = 'service.com?' 
 url = url + urllib.urlencode(params)
 request = urllib2.Request(url, data=html)
 request.add_header('Accept', 'application/pdf')
 request.add_header('Content-Type', 'text/html')
 handle = urllib2.urlopen(request)
 pdf_blob = handle.read() 
 break
 except Exception:
 typ, val, tb = sys.exc_info()
 logger.error(traceback.format_exception(typ, val, tb))
 time.sleep(time_interval)
 num_of_retries -= 1
 # If there aren't any retries - propogate
 if not num_of_retries: 
 raise
 return pdb_blob

I wnated to review this because recently I bumped into an issue where I hit an Internal Server Exception on the server side and the retry gave me an HTTP 200 response with incomplete file (it was a PDF without EOF Marker).

Would this snippet cause anything like this? It does work and I get PDF back as well. However from time to time I hit an HTTP 500 (some unknown server issue) and then HTTP 200 with incomplete blob. The incomplete blob is what makes me investigate if this loop can cause that problem. This HTTP 500 HTTP 200 (Incomplete Blob) is very much a pattern. Before I added this retry, I used to get HTTP 500 but never an incomplete blob.

asked Dec 4, 2015 at 13:41
\$\endgroup\$
3
  • \$\begingroup\$ I could see a down vote, is this not a question for Code Review? Should this be asked elsewhere on the SO network? \$\endgroup\$ Commented Dec 4, 2015 at 19:51
  • \$\begingroup\$ I downvoted as this question is very modified code of what you actually use. There have been two minor, but crucial edits to make it work, and currently it doesn't actually return anything either. You should take care before posting, that the code is actually working in the presented form. \$\endgroup\$ Commented Dec 4, 2015 at 20:21
  • \$\begingroup\$ Sorry about that. Will try to make sure I post the real code. I added the return. I couldn't paste the code because it was having some properitary links and also it wasn't the cleanest code. Hence I decide to format it for the purpose of SO but unfortunately missed some crucial points. \$\endgroup\$ Commented Dec 7, 2015 at 13:55

3 Answers 3

5
\$\begingroup\$

Instead of using while to count a specific number of times and manually handle the number, you should use for _ in range(num_of_retries). It works the same except that you don't need to manually change num_of_retries. And this way num_of_retries wont be modified from the initial value, in case you need to access it again for any reason.

There is one small change to make this work. You need to use the for ... else syntax to raise if the loop is not broken. An else block after a for block will be executed if the for loop never reaches a break statement. This fits your case perfectly as break is called on success, but if there's never success then it will run through all the iterations and reach the else block where you can raise:

for _ in range(num_of_retries):
 try: 
 request_id = str(uuid.uuid4())
 params = [('requestId', request_id)]
 url = 'service.com' 
 url = url + urllib.urlencode(params)
 request = urllib2.Request(url, data=html)
 request.add_header('Accept', 'application/pdf')
 request.add_header('Content-Type', 'text/html')
 handle = urllib2.urlopen(request)
 pdf_blob = handle.read() 
 break
 except Exception:
 typ, val, tb = sys.exc_info()
 logger.error(traceback.format_exception(typ, val, tb))
 time.sleep(time_interval)
else:
 raise

Without any explicit argument, raise will just use the last exception that has been raised and handled. See the SO answer here about it. Basically when you try except an exception, its details are being retained by Python and are passed to raise in the absence of any explicit exception passed to raise.

answered Dec 4, 2015 at 15:43
\$\endgroup\$
4
  • \$\begingroup\$ Any idea on what would be the cleanest way to avoid time.sleep(time_interval) on the last retry? \$\endgroup\$ Commented Dec 4, 2015 at 16:14
  • \$\begingroup\$ You don't need to be in the exception block to raise? Does that mean it will raise the last known exception? \$\endgroup\$ Commented Dec 4, 2015 at 16:26
  • 1
    \$\begingroup\$ @Nishant Correct, I added a brief explanation that links to a post on SO that explains it well. \$\endgroup\$ Commented Dec 4, 2015 at 16:47
  • \$\begingroup\$ @MathiasEttinger Hm. I guess I'd just use for i in range so I could then call if i == num_of_retries - 1. It wouldn't be entirely clear because of that - 1. Anything else I can think of involves adding more to the code than I feel should be necessary. \$\endgroup\$ Commented Dec 4, 2015 at 16:54
4
\$\begingroup\$

You should try to reduce the amount of code protected by the try .. except construct. To me, the only line that should be protected as such is urllib2.urlopen(request), the rest being before the try or in an else clause.

You should also specialize a bit more your exception handling: trying again and again to reach an invalid url is a bad idea. But except Exception will permit that, since it catches ValueErrors raised by urllib2.urlopen when the url entered is invalid or blank. The only exceptions you should care about in your case are urllib2.URLError (and possibly urllib2.HTTPError its subclass). You will thus fail sooner if the cause of failure is not network-related.

The body of the loop would look like:

request_id = str(uuid.uuid4())
params = [('requestId', request_id)]
url = 'service.com' 
url = url + urllib.urlencode(params)
request = urllib2.Request(url, data=html)
request.add_header('Accept', 'application/pdf')
request.add_header('Content-Type', 'text/html')
try:
 handle = urllib2.urlopen(request)
except urllib2.URLError:
 typ, val, tb = sys.exc_info()
 logger.error(traceback.format_exception(typ, val, tb))
 num_of_retries -= 1
 # If there aren't any retries - propogate
 if not num_of_retries: 
 raise
 time.sleep(time_interval)
else:
 pdf_blob = handle.read()
 break

Note that I also moved the call to time.sleep(time_interval) to avoid waiting before re-raising the last exception.

answered Dec 4, 2015 at 16:12
\$\endgroup\$
1
  • \$\begingroup\$ Its request not req - sorry I made a typo. Don't I need a new request object each time? Also I need a new requestId each time to track it however. \$\endgroup\$ Commented Dec 4, 2015 at 16:21
4
\$\begingroup\$

The code would be clearer if you used a function decorator for the retry logic. One such decorator is retrying, but there are other similar libraries, and it wouldn't be hard to write one yourself.

@retry(stop_max_attempt_number=2, wait_fixed=2000)
def getblob(html):
 request_id = str(uuid.uuid4())
 ...
answered Dec 4, 2015 at 21:25
\$\endgroup\$
1
  • \$\begingroup\$ That is much more cleaner. \$\endgroup\$ Commented Dec 7, 2015 at 14:47

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.