I'm developing a Python library to consume the Zendesk search API.
Problem is, Zendesk limits search results to 1000 items. Thus in order to get all the results if they are greater than 1000 I need to call the endpoint several times with a recursive function storing the latest created datetime
for a ticket once the 1000 limit is reached.
My code at the moment works but it feels wrong. Could you help me to improve this recursive function? For instance I'd prefer not to declare an empty list at the constructor but instead inside the function.
Here is the code:
init.py
import requests
import base64
from zdesktools.lib.api import SearchApi, TicketsApi
class ZdeskTools(object):
def __init__(self, zendeskApiUrl=None,
username=None,
password=None):
session = self._begin_session(username, password)
config = {
'zendeskApiUrl': zendeskApiUrl,
'session': session
}
self.search = SearchApi(**config)
self.tickets = TicketsApi(**config)
# self.metrics = MetricsApi(**config)
def _begin_session(self, username, password):
session = requests.Session()
username_password = f'{username}/token:{password}'
encoded_bytes = base64.b64encode(username_password.encode('utf-8'))
encoded_string = str(encoded_bytes, 'utf-8')
headers = {
'Authorization': f'Basic {encoded_string}'
}
session.headers.update(headers)
return session
api.py
from urllib.parse import urlencode
import json
import pprint
class SearchApi(object):
def __init__(self, **config):
self.session = config.get('session')
self.url = config.get('zendeskApiUrl')+'/search.json?'
self.created_at = []
self.tickets = []
def get_ticket_numbers(self, search_params):
response = self.session.get(f'{self.url}{urlencode(search_params)}')
data = response.json()
if 'results' in data:
for result in data['results']:
self.created_at.append(result['created_at'])
self.tickets.append(result['id'])
if 'next_page' in data:
if data['next_page'] is not None:
self.url = data['next_page']
self.get_ticket_numbers(search_params)
else:
self.url = self.url[:self.url.index('?')+1]
search_params['query'] = search_params['query'] + \
f' created_at>={self.created_at[-1]}'
self.get_ticket_numbers(search_params)
return self.tickets
1 Answer 1
My usual advice for implementors of recursive algorithms Is this: figure out how to do it without recursion (although i dont know how to prove it Is Always better, i can prove it Is never worse. And i can prove it Is Always possible.)
This advice Is yet stronger in your case because there Is nothing recursive about pagination. Its you who made it recursive artificially.
Further, the api has reasons to provide the data with a page limit. They dont want to run to memory issues when serving those data to you.
Nor should you want to run into memory issues when processing the data. If Its not restricting you, you should also process the data page by page discarding the previous page before loading another one.
-
1\$\begingroup\$ I, on the other hand, disagree. A recursive algorithm allows you to express loops immutably, passing the state as a parameter, and if the language has halfway-decent tail-call optimisation then it'll all get optimised down to the same thing. Liberating yourself from the mutation of state often leads to simpler reasoning! \$\endgroup\$Patrick Stevens– Patrick Stevens2020年01月03日 12:40:06 +00:00Commented Jan 3, 2020 at 12:40
-
\$\begingroup\$ I take it back: Python does not have halfway-decent tail-call optimisation, and never will (stackoverflow.com/questions/13591970/…). Carry on. \$\endgroup\$Patrick Stevens– Patrick Stevens2020年01月03日 12:43:30 +00:00Commented Jan 3, 2020 at 12:43
-
\$\begingroup\$ @PatrickStevens "The optimizer turns the recurisve version into machine code that is equivalent to that of the non-recurisve version." only confirms that the recursive version was not optimal. You just got the benefit of the optimizer opmizing it automaticaly for you. If optimizer is available, they yes it is probably a matter of preference. If it is not and the algorithm needs to stack everything, then difference would probably be neglegable and it is again matter of preference. In all other cases it depends what you want. If you want performance, go non-recursive. \$\endgroup\$slepic– slepic2020年01月03日 13:01:25 +00:00Commented Jan 3, 2020 at 13:01
-
1\$\begingroup\$ It depends what you mean by "optimal". In most situations, I prefer readability and correctness to speed; computers are fast enough for nearly everything I ever want to do anyway. In the OP's example, network requests will massively dominate the running time no matter what kind of looping construct is used. \$\endgroup\$Patrick Stevens– Patrick Stevens2020年01月03日 13:03:16 +00:00Commented Jan 3, 2020 at 13:03
-
\$\begingroup\$ @PatrickStevens Well I usualy refer to performance when I say "optimal". But readability is also important. But I am just so used to non-recursivity that I actualy find it easier to read and understand. Anyway, in case of OP's code, I don't even see the reasoning for recursion from the logical point of view. You dont hold all the pages that you have read until you finish the book so you can turn those pages back one by one until you reach the first chanpter, the introduction and finally close the book... That's what it reminds me... \$\endgroup\$slepic– slepic2020年01月03日 13:08:48 +00:00Commented Jan 3, 2020 at 13:08
null
? \$\endgroup\$