10
\$\begingroup\$

This is a script I have written to calculate the population standard deviation. I feel that this can be simplified and also be made more pythonic.

from math import sqrt

def mean(lst):
    """calculates mean"""
    sum = 0
    for i in range(len(lst)):
        sum += lst[i]
    return (sum / len(lst))

def stddev(lst):
    """calculates standard deviation"""
    sum = 0
    mn = mean(lst)
    for i in range(len(lst)):
        sum += pow((lst[i]-mn),2)
    return sqrt(sum/len(lst)-1)

numbers = [120,112,131,211,312,90]
print stddev(numbers)
asked Feb 21, 2012 at 11:03
\$\endgroup\$

2 Answers

14
\$\begingroup\$

The easiest way to make mean() more pythonic is to use the sum() built-in function.

def mean(lst):
    return sum(lst) / len(lst)

Concerning your loops on lists, you don't need to use range(). This is enough:

for e in lst:
    sum += e

Other comments:

  • You don't need parentheses around the return value (check PEP 8 when in doubt).
  • Your docstrings are useless: it's obvious from the name that it calculates the mean. At least make them more informative ("returns the mean of lst").
  • Why do you use "-1" in the return for stddev? Is that a bug?
  • You are computing the standard deviation from the variance, so call the accumulator "variance", not sum! (The name sum also shadows the built-in sum() function.)
  • You should type pow(e-mn,2), not pow((e-mn),2). Redundant parentheses inside a function call could make the reader think they are looking at a tuple (e.g. pow((e,mn),2) is valid syntax).
  • You shouldn't use pow() anyway, ** is enough.

This would give:

def stddev(lst):
    """returns the standard deviation of lst"""
    variance = 0
    mn = mean(lst)
    for e in lst:
        variance += (e-mn)**2
    variance /= len(lst)
    return sqrt(variance)

It's still way too verbose! Since we're handling lists, why not use a list comprehension?

def stddev(lst):
    """returns the standard deviation of lst"""
    mn = mean(lst)
    variance = sum([(e-mn)**2 for e in lst]) / len(lst)
    return sqrt(variance)

This is not perfect. You could add tests using doctest. Obviously, you should not write these functions yourself except in a small project; consider using NumPy for anything bigger.
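As a rough sketch of the doctest suggestion, assuming Python 3 (so that / is true division) and reusing the function names from this answer; the sample inputs are made up for illustration and were picked so the expected results are exact:

```python
from math import sqrt


def mean(lst):
    """Return the arithmetic mean of lst.

    >>> mean([1, 2, 3, 4])
    2.5
    """
    return sum(lst) / len(lst)


def stddev(lst):
    """Return the population standard deviation of lst.

    >>> stddev([2, 4, 4, 4, 5, 5, 7, 9])
    2.0
    >>> stddev([5, 5, 5])
    0.0
    """
    mn = mean(lst)
    # Population variance: mean of the squared deviations (divide by n).
    variance = sum((e - mn) ** 2 for e in lst) / len(lst)
    return sqrt(variance)
```

If this were saved as, say, stats.py (a hypothetical file name), running python -m doctest -v stats.py would execute the examples embedded in the docstrings and report any mismatch.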

answered Feb 21, 2012 at 13:01
\$\endgroup\$
  • \$\begingroup\$ Thank you Cygal for your answer. I realize things like tests and validation need to be added, but I think you put me in the right direction. \$\endgroup\$ Commented Feb 22, 2012 at 8:23
  • \$\begingroup\$ @mad, I realize you're not able to comment due to your reputation, but if you see a problem in a post and want to fix it, you'll either have to be patient and wait until you have 50 reputation or go out, answer a question, and get five upvotes (or ask a good question and get 10). Please don't try to circumvent the system. Third-party edits should only edit the content of the post (as opposed to formatting, grammar, spelling, pasting in content from links etc.) with explicit approval from the poster. \$\endgroup\$ Commented Dec 30, 2015 at 20:09
  • \$\begingroup\$ Looks like you forgot to divide the variance by N before taking the sqrt in the last/least verbose example. \$\endgroup\$ Commented Sep 29, 2016 at 16:41
  • \$\begingroup\$ @CodyA.Ray Your Rev 2 corrected the result, but it was not the right fix. \$\endgroup\$ Commented Sep 29, 2016 at 19:19
  • \$\begingroup\$ @200_success can you elaborate? Yeah, variance is the wrong variable name there. I could've just divided in the "return" line. But the equation seems correct for non-sampled std dev: libweb.surrey.ac.uk/library/skills/Number%20Skills%20Leicester/… \$\endgroup\$ Commented Sep 29, 2016 at 22:09
6
\$\begingroup\$

You have some serious calculation errors...


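Assuming that this is Python 2, you also have bugs in the use of division: if both operands of / are integers, then Python 2 performs integer division. Possible remedies are:

  • Put from __future__ import division at the top of the script.
  • Cast one of the operands to a float, e.g. float(sum) / len(lst).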
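A minimal sketch of the float-conversion remedy on the question's numbers. The __future__ import appears only as a comment, because it has to be the first statement of a module; the helper name mean_cast is hypothetical and not from the original post.

```python
# Remedy 1 (Python 2): make this the first statement of the module.
# from __future__ import division


def mean_cast(lst):
    # Remedy 2: convert one operand to float so / cannot truncate.
    return float(sum(lst)) / len(lst)


numbers = [120, 112, 131, 211, 312, 90]

# // is floor division in both Python 2 and 3; for two ints it gives
# the same result that Python 2's plain / would.
print(sum(numbers) // len(numbers))  # 162
print(mean_cast(numbers))            # about 162.67
```

The truncated mean (162 instead of roughly 162.67) then feeds into every squared deviation, which is why the error matters for the standard deviation as well.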

(Assuming that this is Python 3, you can just use statistics.stdev(), or statistics.pstdev() for the population standard deviation.)


The formula for the sample standard deviation is

$$ s = \sqrt{\frac{\sum_{i=1}^{n}\ (x_i - \bar{x})^2}{n - 1}}$$
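For comparison, the population standard deviation that the question mentions divides by n rather than n - 1. Here \$\mu\$ denotes the population mean, a symbol introduced for this note rather than taken from the original post:

$$ \sigma = \sqrt{\frac{\sum_{i=1}^{n}\ (x_i - \mu)^2}{n}}$$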

In return sqrt(sum/len(lst)-1), you have an error with the precedence of operations. It should be

return sqrt(float(sum) / (len(lst) - 1))
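To see what the precedence fix changes on the question's data, here is a small sketch, assuming Python 3.4 or later so that the statistics module can serve as a cross-check. The names sum_sq, buggy, and fixed are illustrative only.

```python
from math import sqrt
import statistics

numbers = [120, 112, 131, 211, 312, 90]
n = len(numbers)
mn = sum(numbers) / n
sum_sq = sum((x - mn) ** 2 for x in numbers)  # sum of squared deviations

buggy = sqrt(sum_sq / n - 1)    # parsed as (sum_sq / n) - 1
fixed = sqrt(sum_sq / (n - 1))  # sample standard deviation

print(buggy)
print(fixed)
print(statistics.stdev(numbers))   # sample: should agree with fixed
print(statistics.pstdev(numbers))  # population: divides by n instead
```

The buggy expression subtracts 1 from the population variance instead of from the divisor, so it lands close to the population value and well below the sample value.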
answered Aug 24, 2014 at 16:56
\$\endgroup\$
  • \$\begingroup\$ Source for formula? \$\endgroup\$ Commented Mar 26, 2015 at 23:40
  • \$\begingroup\$ @Agostino It's basically common knowledge in statistics. \$\endgroup\$ Commented Mar 26, 2015 at 23:42
