Recursive function to get all Zendesk tickets

Question 1

I'm developing a Python library to consume the Zendesk search API. Problem is, Zendesk limits search results to 1000 items. Thus in order to get all the results if they are greater than 1000 I need to call the endpoint several times with a recursive function storing the latest created datetime for a ticket once the 1000 limit is reached.

My code at the moment works but it feels wrong. Could you help me to improve this recursive function? For instance I'd prefer not to declare an empty list at the constructor but instead inside the function.

Here is the code:

init.py

import requests
import base64
from zdesktools.lib.api import SearchApi, TicketsApi
class ZdeskTools(object):
 def __init__(self, zendeskApiUrl=None,
 username=None,
 password=None):
 session = self._begin_session(username, password)
 config = {
 'zendeskApiUrl': zendeskApiUrl,
 'session': session
 }
 self.search = SearchApi(**config)
 self.tickets = TicketsApi(**config)
 # self.metrics = MetricsApi(**config)
 def _begin_session(self, username, password):
 session = requests.Session()
 username_password = f'{username}/token:{password}'
 encoded_bytes = base64.b64encode(username_password.encode('utf-8'))
 encoded_string = str(encoded_bytes, 'utf-8')
 headers = {
 'Authorization': f'Basic {encoded_string}'
 }
 session.headers.update(headers)
 return session

api.py

from urllib.parse import urlencode
import json
import pprint
class SearchApi(object):
 def __init__(self, **config):
 self.session = config.get('session')
 self.url = config.get('zendeskApiUrl')+'/search.json?'
 self.created_at = []
 self.tickets = []
 def get_ticket_numbers(self, search_params):
 response = self.session.get(f'{self.url}{urlencode(search_params)}')
 data = response.json()
 if 'results' in data:
 for result in data['results']:
 self.created_at.append(result['created_at'])
 self.tickets.append(result['id'])
 if 'next_page' in data:
 if data['next_page'] is not None:
 self.url = data['next_page']
 self.get_ticket_numbers(search_params)
 else:
 self.url = self.url[:self.url.index('?')+1]
 search_params['query'] = search_params['query'] + \
 f' created_at>={self.created_at[-1]}'
 self.get_ticket_numbers(search_params)
 return self.tickets

Question 2

I'm a little unclear on the purpose of that last else-clause. Are you trying to find all the tickets that were created since the search originally began?

Question 3

This goes about the maximum pool size of a search which is limited to 1000 tickets paginated in 10 pages of 100 tickets each. However if the total results goes beyond 1000 I need to query the api again but this time including in the search parameter the latest ticket previously found creation date (that is the else clause)

Question 4

So on the 10th page, is the 'next_page' key actually missing? Or does it have a value of null?

Question 5

It´s missing =)

Question 6

And in which cases do you expect the 'results' key to be missing? I can see from the API documentation that it will be missing if there is an error, but perhaps you had some other case in mind.

Question 7

My usual advice for implementors of recursive algorithms Is this: figure out how to do it without recursion (although i dont know how to prove it Is Always better, i can prove it Is never worse. And i can prove it Is Always possible.)

This advice Is yet stronger in your case because there Is nothing recursive about pagination. Its you who made it recursive artificially.

Further, the api has reasons to provide the data with a page limit. They dont want to run to memory issues when serving those data to you.

Nor should you want to run into memory issues when processing the data. If Its not restricting you, you should also process the data page by page discarding the previous page before loading another one.

Question 8

I, on the other hand, disagree. A recursive algorithm allows you to express loops immutably, passing the state as a parameter, and if the language has halfway-decent tail-call optimisation then it'll all get optimised down to the same thing. Liberating yourself from the mutation of state often leads to simpler reasoning!

Question 9

I take it back: Python does not have halfway-decent tail-call optimisation, and never will (stackoverflow.com/questions/13591970/…). Carry on.

Question 10

@PatrickStevens "The optimizer turns the recurisve version into machine code that is equivalent to that of the non-recurisve version." only confirms that the recursive version was not optimal. You just got the benefit of the optimizer opmizing it automaticaly for you. If optimizer is available, they yes it is probably a matter of preference. If it is not and the algorithm needs to stack everything, then difference would probably be neglegable and it is again matter of preference. In all other cases it depends what you want. If you want performance, go non-recursive.

Question 11

It depends what you mean by "optimal". In most situations, I prefer readability and correctness to speed; computers are fast enough for nearly everything I ever want to do anyway. In the OP's example, network requests will massively dominate the running time no matter what kind of looping construct is used.

Question 12

@PatrickStevens Well I usualy refer to performance when I say "optimal". But readability is also important. But I am just so used to non-recursivity that I actualy find it easier to read and understand. Anyway, in case of OP's code, I don't even see the reasoning for recursion from the logical point of view. You dont hold all the pages that you have read until you finish the book so you can turn those pages back one by one until you reach the first chanpter, the introduction and finally close the book... That's what it reminds me...

slepic slepic 5,6272 gold badges9 silver badges27 bronze badges · Answer 1 · 2020-01-03 07:22:07Z

3

\$\begingroup\$

My usual advice for implementors of recursive algorithms Is this: figure out how to do it without recursion (although i dont know how to prove it Is Always better, i can prove it Is never worse. And i can prove it Is Always possible.)

This advice Is yet stronger in your case because there Is nothing recursive about pagination. Its you who made it recursive artificially.

Further, the api has reasons to provide the data with a page limit. They dont want to run to memory issues when serving those data to you.

Nor should you want to run into memory issues when processing the data. If Its not restricting you, you should also process the data page by page discarding the previous page before loading another one.

Share

edited Jan 3, 2020 at 8:19

answered Jan 3, 2020 at 7:22

slepic's user avatar

slepic slepic

5,6272 gold badges9 silver badges27 bronze badges

\$\endgroup\$

8

1

\$\begingroup\$ I, on the other hand, disagree. A recursive algorithm allows you to express loops immutably, passing the state as a parameter, and if the language has halfway-decent tail-call optimisation then it'll all get optimised down to the same thing. Liberating yourself from the mutation of state often leads to simpler reasoning! \$\endgroup\$

Patrick Stevens
– Patrick Stevens

2020年01月03日 12:40:06 +00:00
Commented Jan 3, 2020 at 12:40
\$\begingroup\$ I take it back: Python does not have halfway-decent tail-call optimisation, and never will (stackoverflow.com/questions/13591970/…). Carry on. \$\endgroup\$

Patrick Stevens
– Patrick Stevens

2020年01月03日 12:43:30 +00:00
Commented Jan 3, 2020 at 12:43
\$\begingroup\$ @PatrickStevens "The optimizer turns the recurisve version into machine code that is equivalent to that of the non-recurisve version." only confirms that the recursive version was not optimal. You just got the benefit of the optimizer opmizing it automaticaly for you. If optimizer is available, they yes it is probably a matter of preference. If it is not and the algorithm needs to stack everything, then difference would probably be neglegable and it is again matter of preference. In all other cases it depends what you want. If you want performance, go non-recursive. \$\endgroup\$

slepic
– slepic

2020年01月03日 13:01:25 +00:00
Commented Jan 3, 2020 at 13:01
1

\$\begingroup\$ It depends what you mean by "optimal". In most situations, I prefer readability and correctness to speed; computers are fast enough for nearly everything I ever want to do anyway. In the OP's example, network requests will massively dominate the running time no matter what kind of looping construct is used. \$\endgroup\$

Patrick Stevens
– Patrick Stevens

2020年01月03日 13:03:16 +00:00
Commented Jan 3, 2020 at 13:03
\$\begingroup\$ @PatrickStevens Well I usualy refer to performance when I say "optimal". But readability is also important. But I am just so used to non-recursivity that I actualy find it easier to read and understand. Anyway, in case of OP's code, I don't even see the reasoning for recursion from the logical point of view. You dont hold all the pages that you have read until you finish the book so you can turn those pages back one by one until you reach the first chanpter, the introduction and finally close the book... That's what it reminds me... \$\endgroup\$

slepic
– slepic

2020年01月03日 13:08:48 +00:00
Commented Jan 3, 2020 at 13:08

| Show 3 more comments

Stack Exchange Network

Recursive function to get all Zendesk tickets

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Recursive function to get all Zendesk tickets

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions