I have code that repeatedly calls a function returning the list of dates between two dates, formatted as
('0001-01-01 00:01:00'),('0001-01-02 00:01:00'), ...
The current solution is:
import numpy as np
import time
import datetime as dt
# note: %M parses minutes, not the month (that would be %m); this is why the times read 00:01:00
beg = dt.datetime.strptime('01 01 0001', '%d %M %Y')
end = dt.datetime.strptime('01 01 2001', '%d %M %Y')
def get_days(date1, date2):
    # number of days from date1 to date2, both ends included
    day_diff = (date2 - date1).days + 1
    days = [str(date1 + dt.timedelta(d)) for d in range(day_diff)]
    dates = "('" + "'),('".join(days) + "')"
    return dates
Is there a faster way to achieve this?
#timing
t0 = time.time()
dates_list = get_days(beg, end) #feed datetime
t1 = time.time()
total_create = t1-t0
print("list comprehension: ", total_create,'s')
1 Answer
With the list comprehension, you build the whole list of values first and then pass that list to join.
Instead of generating a list and then passing it, you can pass a generator expression: it looks like the list comprehension but produces its values on demand. With your old approach, if you had 10000 dates, they would all sit in a list at once; a generator yields them one at a time, so you should at least use less memory.
With a generator, you would directly do:
dates = "('" + "'),('".join(str(date1 + dt.timedelta(d)) for d in range(day_diff)) + "')"
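As a rough, self-contained sketch, here are the two variants side by side. The helper names get_days_list and get_days_gen are made up for this illustration, and the date range is shortened so it runs quickly. It is worth timing both on your own data: as far as I know, CPython's str.join collects the items of a generator into a sequence internally before joining, so the actual gain in speed or memory may turn out to be small.

```python
import datetime as dt
import timeit

def get_days_list(date1, date2):
    # list comprehension: build the full list, then join it
    day_diff = (date2 - date1).days + 1
    days = [str(date1 + dt.timedelta(d)) for d in range(day_diff)]
    return "('" + "'),('".join(days) + "')"

def get_days_gen(date1, date2):
    # generator expression: pass the values straight to join
    day_diff = (date2 - date1).days + 1
    return "('" + "'),('".join(
        str(date1 + dt.timedelta(d)) for d in range(day_diff)
    ) + "')"

# illustrative range (one year), not the range from the question
beg = dt.datetime(2000, 1, 1)
end = dt.datetime(2000, 12, 31)

# both variants must produce the same string
assert get_days_list(beg, end) == get_days_gen(beg, end)

for fn in (get_days_list, get_days_gen):
    seconds = timeit.timeit(lambda: fn(beg, end), number=200)
    print(f"{fn.__name__}: {seconds:.4f} s for 200 calls")
```

The printed timings depend on the machine and the Python version, so treat them as a comparison on your setup rather than a general result.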
On a side note, the parameter names date1 and date2 are not very explicit; it should be clear from the names which is the start date and which is the end date.
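For instance, a version with more descriptive names might look like the sketch below. The names start_date, end_date and day_count are just one possible choice, not something taken from the question.

```python
import datetime as dt

def get_days(start_date, end_date):
    # inclusive number of days from start_date to end_date
    day_count = (end_date - start_date).days + 1
    return "('" + "'),('".join(
        str(start_date + dt.timedelta(days=d)) for d in range(day_count)
    ) + "')"

# keyword arguments make the call site unambiguous as well
print(get_days(start_date=dt.datetime(2001, 1, 1),
               end_date=dt.datetime(2001, 1, 3)))
# prints ('2001-01-01 00:00:00'),('2001-01-02 00:00:00'),('2001-01-03 00:00:00')
```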