6
\$\begingroup\$

Write a program to determine the mean, median, and standard deviation of a list of numbers
Count the duplicate numbers in the code, listing any that occur more than once with the number of times it occurs.

My implementation:

import math
def mean(input):
 return sum(input) / float(len(input))
def median(input):
 sorted_input = sorted(input)
 length = len(input)
 if length < 1:
 return None
 if length % 2 == 0:
 return (sorted_input[length / 2] + sorted_input[length / 2 - 1]) / float(2)
 else:
 return sorted_input[length / 2]
def duplicate_counts(input):
 distinct_input = set(input)
 for distinct in set(input):
 count = input.count(distinct)
 if (count > 1):
 print("{} occurs {} times".format(distinct, count))
def standard_deviation(input):
 return math.sqrt(mean([(x - mean(input)) ** 2 for x in input]))
def main():
 #sample run/test
 the_list = [3, 6, 7, 2, 8, 9, 2, 3, 7, 4, 5, 9, 2, 1, 6, 9, 6]
 print("The mean is: {}".format(mean(the_list)))
 print("The median is: {}".format(median(the_list)))
 print("The standard deviation is: {}".format(standard_deviation(the_list)))
 duplicate_counts(the_list)
if __name__ == "__main__":
 main()

Sample output:

The mean is: 5.23529411765 
The median is: 6 
The standard deviation is: 2.64640514805 
2 occurs 3 times 
3 occurs 2 times 
6 occurs 3 times 
7 occurs 2 times 
9 occurs 3 times 

Simple exercise I decided to do with Python for practice and to test/try all the excellent feedback I received on my previous two questions. There may be built in functions for these, and feel free to mention them, but for the sake of practice I implemented them this way.

asked Oct 26, 2015 at 23:10
\$\endgroup\$
1
  • 1
    \$\begingroup\$ Are you using Python2 or 3? \$\endgroup\$ Commented Oct 28, 2015 at 11:34

2 Answers 2

5
\$\begingroup\$
  • A variable name input is confusing. After all, input is a Python built-in.

  • standard_deviation calls mean too many times (add a debug printout to mean to see). Better do

    def standard_deviation(input):
     input_mean = mean(input)
     return math.sqrt(mean([(x - input_mean) ** 2 for x in input]))
    
  • Median can be calculated in \$ O(\log n)\$. Sorting a list in \$O(n \log n)\$ just for a median is an overkill.

answered Oct 27, 2015 at 0:20
\$\endgroup\$
6
  • 1
    \$\begingroup\$ I may be missing something (I probably am) but, in your second tip, it seems to me that all you are doing is storing one of the mean calls in a variable. Was tip a tip for increasing readability? \$\endgroup\$ Commented Oct 27, 2015 at 0:28
  • \$\begingroup\$ Just to clarify, for calculating the median, you're suggesting I create a better performing sorting algorithm? \$\endgroup\$ Commented Oct 27, 2015 at 5:01
  • \$\begingroup\$ @Legato Not sorting, but equal partitioning. To find a median you don't need partitions to be sorted. \$\endgroup\$ Commented Oct 27, 2015 at 6:49
  • 1
    \$\begingroup\$ @SirPython the storage into a variable seems mainly to avoid computing the mean for each element in the input list, thus saving on CPU. \$\endgroup\$ Commented Oct 27, 2015 at 9:31
  • \$\begingroup\$ @MathiasEttinger Ah, that makes perfect sense! Thank you for clearing that up. \$\endgroup\$ Commented Oct 27, 2015 at 22:39
2
\$\begingroup\$

In mean, you're wrapping the result of len in a float. I'm guessing this is because of Python 2's odd division that will always return an int and not a float.

>>> 12/5
2

In Python 3.x this is no longer an issue as it will return proper float values, and you can get this in Python 2 as well, by using from __future__ import division. from __future__ is a way of importing new additions in Python 3 that were added to 2 but not in a way that overrides the old functionality. By importing this, division returns floats in Python 2 as well:

>>> from __future__ import division
>>> 12/5
2.4

Also I notice you use float later on the literal value 2. But you could just use 2.0 and get the same effect even if you don't do the import.

>>> 12/5.0
2.4

Should median really return None for input of length 0? Shouldn't that raise ValueError rather than silently return a non result?

Also you should indent by 4 spaces rather than 2. That's the commonly accepted Python style, and since it affects flow people could easily misread your code this way.

answered Oct 28, 2015 at 11:46
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.