Write a program to determine the mean, median, and standard deviation of a list of numbers.
Count the duplicate numbers in the list, listing any that occur more than once along with the number of times each occurs.
My implementation:
import math

def mean(input):
  return sum(input) / float(len(input))

def median(input):
  sorted_input = sorted(input)
  length = len(input)
  if length < 1:
    return None
  if length % 2 == 0:
    return (sorted_input[length / 2] + sorted_input[length / 2 - 1]) / float(2)
  else:
    return sorted_input[length / 2]

def duplicate_counts(input):
  distinct_input = set(input)
  for distinct in set(input):
    count = input.count(distinct)
    if (count > 1):
      print("{} occurs {} times".format(distinct, count))

def standard_deviation(input):
  return math.sqrt(mean([(x - mean(input)) ** 2 for x in input]))

def main():
  #sample run/test
  the_list = [3, 6, 7, 2, 8, 9, 2, 3, 7, 4, 5, 9, 2, 1, 6, 9, 6]
  print("The mean is: {}".format(mean(the_list)))
  print("The median is: {}".format(median(the_list)))
  print("The standard deviation is: {}".format(standard_deviation(the_list)))
  duplicate_counts(the_list)

if __name__ == "__main__":
  main()
Sample output:
The mean is: 5.23529411765
The median is: 6
The standard deviation is: 2.64640514805
2 occurs 3 times
3 occurs 2 times
6 occurs 3 times
7 occurs 2 times
9 occurs 3 times
Simple exercise I decided to do with Python for practice and to test/try all the excellent feedback I received on my previous two questions. There may be built in functions for these, and feel free to mention them, but for the sake of practice I implemented them this way.
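Since the question invites mention of built-ins, here is a minimal sketch of the same sample run using the standard library. It assumes Python 3.4 or later (where the statistics module is available), and it uses statistics.pstdev because the implementation above computes the population standard deviation rather than the sample one. The names data and duplicates are only illustrative.

```python
import statistics
from collections import Counter

# Same sample data as in main() above.
data = [3, 6, 7, 2, 8, 9, 2, 3, 7, 4, 5, 9, 2, 1, 6, 9, 6]

print("The mean is: {}".format(statistics.mean(data)))
print("The median is: {}".format(statistics.median(data)))
# pstdev is the population standard deviation (divides by n, not n - 1).
print("The standard deviation is: {}".format(statistics.pstdev(data)))

# Counter tallies every value in a single pass over the list.
duplicates = {value: count for value, count in Counter(data).items() if count > 1}
for value in sorted(duplicates):
    print("{} occurs {} times".format(value, duplicates[value]))
```

Counter walks the list once, whereas calling input.count() for every distinct value rescans the whole list each time.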
- Are you using Python 2 or 3? – SuperBiasedMan, Oct 28, 2015 at 11:34
2 Answers
The variable name input is confusing. After all, input is a Python built-in.
standard_deviation calls mean too many times (add a debug printout to mean to see). Better do
def standard_deviation(input):
    input_mean = mean(input)
    return math.sqrt(mean([(x - input_mean) ** 2 for x in input]))
Median can be calculated in \$O(n)\$ with a selection algorithm. Sorting a list in \$O(n \log n)\$ just for a median is overkill.
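One way to avoid the full sort is a selection algorithm such as quickselect, which repeatedly partitions the data around a pivot and only keeps the side that contains the wanted position. The sketch below is an illustration under assumptions, not the answerer's code: the names quickselect and median_by_selection are hypothetical, the pivot is chosen at random, and the running time is linear on average (quadratic in the worst case).

```python
import random


def quickselect(values, k):
    """Return the k-th smallest element (0-based) without fully sorting."""
    if not 0 <= k < len(values):
        raise ValueError("k is out of range")
    items = list(values)
    while True:
        pivot = random.choice(items)
        lower = [x for x in items if x < pivot]
        equal_count = sum(1 for x in items if x == pivot)
        if k < len(lower):
            # The wanted element is among the values below the pivot.
            items = lower
        elif k < len(lower) + equal_count:
            # The wanted position falls on a copy of the pivot.
            return pivot
        else:
            # Skip everything up to and including the pivot copies.
            k -= len(lower) + equal_count
            items = [x for x in items if x > pivot]


def median_by_selection(values):
    """Median via selection; raises ValueError on empty input."""
    n = len(values)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    if n % 2 == 1:
        return quickselect(values, n // 2)
    return (quickselect(values, n // 2 - 1) + quickselect(values, n // 2)) / 2.0
```

For a list as short as the sample one, sorted() will almost certainly be faster in practice; the partitioning approach only pays off on large inputs.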
- I may be missing something (I probably am) but, in your second tip, it seems to me that all you are doing is storing one of the mean calls in a variable. Was this a tip for increasing readability? – SirPython, Oct 27, 2015 at 0:28
- Just to clarify, for calculating the median, you're suggesting I create a better performing sorting algorithm? – Legato, Oct 27, 2015 at 5:01
- @Legato Not sorting, but equal partitioning. To find a median you don't need partitions to be sorted. – vnp, Oct 27, 2015 at 6:49
- @SirPython the storage into a variable seems mainly to avoid computing the mean for each element in the input list, thus saving on CPU. – 301_Moved_Permanently, Oct 27, 2015 at 9:31
- @MathiasEttinger Ah, that makes perfect sense! Thank you for clearing that up. – SirPython, Oct 27, 2015 at 22:39
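The point about repeated mean calls can be checked with a small experiment. The sketch below swaps the suggested debug printout for a call counter; the names call_count, stddev_naive, stddev_cached, and count_mean_calls are hypothetical and exist only for this illustration.

```python
import math

call_count = 0  # incremented every time mean() runs


def mean(values):
    global call_count
    call_count += 1
    return sum(values) / float(len(values))


def stddev_naive(values):
    # mean(values) is re-evaluated for every element of the list.
    return math.sqrt(mean([(x - mean(values)) ** 2 for x in values]))


def stddev_cached(values):
    # The mean of the input is computed once and reused.
    values_mean = mean(values)
    return math.sqrt(mean([(x - values_mean) ** 2 for x in values]))


def count_mean_calls(func, values):
    """Run func(values) and report how many times mean() was called."""
    global call_count
    call_count = 0
    func(values)
    return call_count


data = [3, 6, 7, 2, 8, 9, 2, 3, 7, 4, 5, 9, 2, 1, 6, 9, 6]
print(count_mean_calls(stddev_naive, data))   # len(data) + 1 calls
print(count_mean_calls(stddev_cached, data))  # 2 calls
```

Each mean call itself sums the whole list, so the naive version does quadratic work while the cached version stays linear.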
In mean, you're wrapping the result of len in a float. I'm guessing this is because of Python 2's odd division: when both operands are ints, / performs floor division and returns an int, not a float.
>>> 12/5
2
In Python 3.x this is no longer an issue, as / returns a proper float value. You can get the same behaviour in Python 2 by using from __future__ import division. from __future__ is a way of importing features from Python 3 that were added to Python 2 without overriding the old behaviour by default. With this import, division returns floats in Python 2 as well:
>>> from __future__ import division
>>> 12/5
2.4
I also notice that you later call float on the literal value 2. You could just write 2.0 and get the same effect, even without the import.
>>> 12/5.0
2.4
Should median really return None for input of length 0? Shouldn't it raise ValueError rather than silently return a non-result?
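A minimal sketch of what raising could look like, keeping the sort-based approach from the question. The function name median_or_raise and the error message are assumptions made for this example; // is used so the indexing works the same way in Python 2 and 3.

```python
def median_or_raise(values):
    """Sort-based median that rejects empty input instead of returning None."""
    if not values:
        raise ValueError("median_or_raise() requires at least one value")
    ordered = sorted(values)
    middle = len(ordered) // 2  # floor division in both Python 2 and 3
    if len(ordered) % 2 == 0:
        return (ordered[middle - 1] + ordered[middle]) / 2.0
    return ordered[middle]


# The caller now has to deal with the empty case explicitly.
try:
    median_or_raise([])
except ValueError as error:
    print("Cannot compute median: {}".format(error))
```

With None, the failure would only surface later, for instance when the result is passed to format or used in arithmetic.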
Also, you should indent by 4 spaces rather than 2. That's the commonly accepted Python style (PEP 8), and since indentation affects control flow, people could easily misread your code otherwise.