7
\$\begingroup\$

For my CS class, the task was to take a given Person class with name and age attributes.

Then, with an existing list of Person Objects, create a function that returns a tuple of (mean, median, mode) on the given age attribute.

Simple enough, but I wondered whether my solution is really all that efficient. I basically built a new list of ages by iterating over the existing list of Person objects, and it feels like there should be a better way to do it...

Wondering if I'm missing something obvious.

from statistics import mean, median, mode


class Person:
    def __init__(self, name, age):
        """
        initializes a Person object's name and age
        """
        self._name = name
        self._age = age

    def get_age(self):
        """
        returns the private data member _age
        """
        return self._age


def basic_stats(person_list):
    """
    creates a list of ages from a list of Person objects, then calculates
    mean, median, and mode using the statistics module
    """
    age_list = [i.get_age() for i in person_list]
    return (mean(age_list), median(age_list), mode(age_list))
asked Jan 3, 2021 at 22:29
\$\endgroup\$
1
  • 1
    \$\begingroup\$ The iteration is unavoidable, and list comprehensions are pretty cheap. Overall, the get_age() method seems quite roundabout, but OOP is pretty poorly taught, so I suspect they're enforcing it and you just have to go with it \$\endgroup\$ Commented Jan 3, 2021 at 22:47

2 Answers

6
\$\begingroup\$

In terms of performance, [i.get_age() for i in person_list] is about as good as you'll get for building a list of ages. The only real way to avoid it is to write versions of mean, median, and mode that accept a key parameter (much like the built-in sorted function and list.sort do), so that each function reads the right attribute from the objects itself. I wouldn't worry about removing that list comprehension, though. Unless person_list is huge and this code has to run often and as fast as possible, that would be a premature optimization.
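To make the key-parameter idea concrete, here is a minimal sketch. The helpers mean_by, median_by, and mode_by are hypothetical names of my own; the statistics module has no key parameter, so these are thin wrappers that feed a generator to the real functions. Note that the objects are still iterated once per statistic, so this changes the interface more than the cost.

```python
from statistics import mean, median, mode


class Person:
    def __init__(self, name, age):
        self._name = name
        self._age = age

    def get_age(self):
        return self._age


# Hypothetical helpers (not part of the statistics module): each one
# applies `key` to every item lazily and hands the values to the
# corresponding statistics function, which accepts any iterable.
def mean_by(items, key):
    return mean(key(item) for item in items)


def median_by(items, key):
    return median(key(item) for item in items)


def mode_by(items, key):
    return mode(key(item) for item in items)


def basic_stats(person_list):
    # No named intermediate list, but still three passes over person_list.
    return (
        mean_by(person_list, Person.get_age),
        median_by(person_list, Person.get_age),
        mode_by(person_list, Person.get_age),
    )


people = [Person("Ann", 30), Person("Bob", 25), Person("Cy", 30), Person("Dee", 40)]
stats = basic_stats(people)
```

This calls get_age() three times per person instead of once, so for the original problem the plain list comprehension is likely the better trade.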

Another way to potentially speed it up would be to have a single function that iterates the list once, and calculates all three statistics in one go. Again though, I would consider that to be premature unless you had a very good reason to optimize that.
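A rough sketch of what such a single-pass function might look like is below. The name basic_stats_single_pass is made up for illustration, and I have not benchmarked it; the median still needs the values sorted, so the ages have to be collected anyway, and a hand-written Python loop may well be slower than the statistics functions.

```python
from collections import Counter


class Person:
    def __init__(self, name, age):
        self._name = name
        self._age = age

    def get_age(self):
        return self._age


def basic_stats_single_pass(person_list):
    """Hypothetical single-pass variant: one loop over the Person objects."""
    ages = []           # kept because the median needs sorted values
    total = 0           # running sum for the mean
    counts = Counter()  # occurrences of each age for the mode

    for person in person_list:
        age = person.get_age()
        ages.append(age)
        total += age
        counts[age] += 1

    if not ages:
        raise ValueError("person_list must not be empty")

    n = len(ages)
    ages.sort()
    mid = n // 2
    if n % 2:
        med = ages[mid]
    else:
        med = (ages[mid - 1] + ages[mid]) / 2

    # On ties, max() returns the first age that reached the top count,
    # following the Counter's insertion order.
    most_common_age = max(counts, key=counts.get)

    return (total / n, med, most_common_age)


people = [Person("Ann", 30), Person("Bob", 25), Person("Cy", 30), Person("Dee", 40)]
stats = basic_stats_single_pass(people)
```

The extra bookkeeping is exactly the readability cost mentioned above, which is why I would only reach for this after measuring.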

Your solution is fine until you run into a situation in which it isn't performing well. Favor readability and simplicity until performance becomes an actual concern (or you have good reason to believe that it will become an actual concern in the near future).

answered Jan 4, 2021 at 0:25
\$\endgroup\$
3
\$\begingroup\$

For efficiency's sake, without breaking the parameters of your assignment, the only other thing you can add is __slots__, which stops each Person instance from carrying its own __dict__ and so reduces per-object memory.
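As a minimal sketch, adding __slots__ to the class from the question could look like this (the attribute names are taken from the original code; everything else is unchanged):

```python
class Person:
    # Declaring the allowed attributes up front means instances are
    # created without a per-instance __dict__.
    __slots__ = ("_name", "_age")

    def __init__(self, name, age):
        self._name = name
        self._age = age

    def get_age(self):
        return self._age


p = Person("Ann", 30)
has_dict = hasattr(p, "__dict__")  # False for a slotted class like this one
```

The trade-off is that you can no longer attach arbitrary new attributes to a Person at runtime, which is usually fine for a small data-holding class like this.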

Beyond that, in the "real world", if this were a truly massive list, you would want to do away with the class representation and instead use a NumPy array, for which the calculation of summary statistics will be faster due to vectorization.

answered Jan 4, 2021 at 15:59
\$\endgroup\$
