##Performance
Performance
General review
##General review
II don't see why you yield
a single value, rather than simply returning. Is there a need for Median()
to be iterable?
##Performance
##General review
I don't see why you yield
a single value, rather than simply returning. Is there a need for Median()
to be iterable?
Performance
General review
I don't see why you yield
a single value, rather than simply returning. Is there a need for Median()
to be iterable?
I got better results by using Python's built-in sorted()
. We don't need any of the extra functionality of SortedList
(maintaining the invariant through adds and deletes), and a single sort (mostly in native code) gives these results:
def median(values):
sortedlist = sorted(values)
length = len(sortedlist)
half = length // 2
# return not yield; see "General review"
if length % 2 == 0:
return (sortedlist[half] + sortedlist[half - 1]) // 2
else:
return sortedlist[half]
def main():
m, n = 1000, 15000000
start_time = time.time()
data = [random.randrange(m) for i in range(n)]
print("--- %s seconds to generate" % (time.time() - start_time))
# print("Random Data: ", data)
start_time = time.time()
result = sorted(data)
print("--- %s seconds to sort" % (time.time() - start_time))
start_time = time.time()
result = median(data)
print("Result: ", result)
print("--- %s seconds ---" % (time.time() - start_time))
--- 7.638948202133179 seconds to generate
--- 3.118924617767334 seconds to sort
Result: 500
--- 3.3397886753082275 seconds ---
Given that length
is an integer, if length % 2
is not zero0, it must be one1 - so the elif
could easily be else
.
Given that length
is an integer, if length % 2
is not zero, it must be one - so the elif
could easily be else
.
I got better results by using Python's built-in sorted()
. We don't need any of the extra functionality of SortedList
(maintaining the invariant through adds and deletes), and a single sort (mostly in native code) gives these results:
def median(values):
sortedlist = sorted(values)
length = len(sortedlist)
half = length // 2
# return not yield; see "General review"
if length % 2 == 0:
return (sortedlist[half] + sortedlist[half - 1]) // 2
else:
return sortedlist[half]
def main():
m, n = 1000, 15000000
start_time = time.time()
data = [random.randrange(m) for i in range(n)]
print("--- %s seconds to generate" % (time.time() - start_time))
# print("Random Data: ", data)
start_time = time.time()
result = sorted(data)
print("--- %s seconds to sort" % (time.time() - start_time))
start_time = time.time()
result = median(data)
print("Result: ", result)
print("--- %s seconds ---" % (time.time() - start_time))
--- 7.638948202133179 seconds to generate
--- 3.118924617767334 seconds to sort
Result: 500
--- 3.3397886753082275 seconds ---
Given that length
is an integer, if length % 2
is not 0, it must be 1 - so the elif
could easily be else
.
##Performance
I instrumented your code differently:
def main():
m, n = 1000, 15000000
start_time = time.time()
data = [random.randrange(m) for i in range(n)]
print("--- %s seconds to generate" % (time.time() - start_time))
# print("Random Data: ", data)
start_time = time.time()
result = SortedList(data)
print("--- %s seconds to sort" % (time.time() - start_time))
start_time = time.time()
result = list(Median(data))
print("Result: ", result)
print("--- %s seconds ---" % (time.time() - start_time))
For me, the results were:
--- 7.407598257064819 seconds to generate
--- 4.535749673843384 seconds to sort
Result: [500]
--- 5.01109504699707 seconds ---
This shows that the generation of the random input takes longer than finding the median, and that 90% of Median()
is spent sorting (probably most of the rest is caused by converting between lists and iterators). You're unlikely to get large gains through modifications to your own part of the code.
##General review
I don't see why you yield
a single value, rather than simply returning. Is there a need for Median()
to be iterable?
Given that length
is an integer, if length % 2
is not zero, it must be one - so the elif
could easily be else
.