Program to count vowels

Question 1

After reading a certain question on the Stack Overflow website, I tried to write a solution to the problem just for fun. I'm, however, left with the nagging feeling there is a beautiful one-liner that can be used instead.

The question's premises:

Create a function that will receive a string. The function will count each vowel in the string, and return a count of all the vowels, whether found or not, each in a tuple pair. Each tuple parir will be stored in a list.

Example:

>>> vowels_finder("This has some vowels")
[('a', 1), ('o', 2), ('u', 0), ('e', 2), ('i', 1)] # tuple pair of each vowel.
>>>

My attempt:

def vowels_finder(s):
 vowels = {'a':0, 'e':0, 'i':0, 'o':0, 'u':0}
 for el in s:
 if el in {'a', 'e', 'i', 'o', 'u'}:
 vowels[el]+=1
 vowels = [(key, pair) for key, pair in vowels.items()]
 return vowels

My code above is not commented, but I'm confident that its brevity will allow me to pass on this.

Questions:

Is there _any way, besides using a python library, that this can be condensed into a one-liner?
Would there be a way to not have to convert the vowels key's back into tuple pairs, and just have them be tuples in the beginning. eg: vowels = [('a', 0), ('e', 0), ('i', 0), ('o', 0), ('u', 0)] ?

_{language: Python 3.4}

Question 2

return vowels.items() instead of using that list comprehension.

Question 3

That's a fine line that you're asking us to walk between a "one-liner" and "not using a builtin".

Question 4

@200_success Oh wait, I just got what you meant. I've edited my question as that was a typo. I meant without using library function.

Question 5

It is good that you didn't write any comments in the code. The code, as it is, is crystal clear in what it does and therefore does not need any comments.

Question 6

With Python it can be tempting to want to write one-liners; but, short code does not necessarily make for better code, and I'd like to review your code in a way that I think would make it more maintainable, flexible and professional, rather than shorter. (maybe someone else will address the one-liner request)

Type hints

Since you are using Python 3.x you could take advantage of the new type hints. According to PEP 484:

This PEP aims to provide a standard syntax for type annotations, opening up Python code to easier static analysis and refactoring, potential runtime type checking, and (perhaps, in some contexts) code generation utilizing type information.

Of these goals, static analysis is the most important. This includes support for off-line type checkers such as mypy, as well as providing a standard notation that can be used by IDEs for code completion and refactoring.

Even if you don't use static code analysis at the moment, type hints still have the advantage of making the code easier to read and understand.

In your case:

def vowels_finder(s: str) -> list:
 # ...

Reusable functions

The first thought I had looking at your function is that some logic could be extracted for more general reuse. For instance, this function could come in handy for other things:

def is_vowel(ch: chr, include_y: bool=False) -> bool:
 if include_y:
 return ch in ('a', 'e', 'i', 'o', 'u', 'y') 
 else:
 return ch in ('a', 'e', 'i', 'o', 'u')

You will notice I also added support for optionally including "Y" as a vowel, which can be useful for certain contexts.

Note that I also used a tuple instead of set, since vowels won't change anyways and tuples are generally faster since they are immutable, and we don't need set operations in this case other than in membership, which tuples support as well.

Now we can simply do this in your vowels_finder function:

for el in s:
 if is_vowel(el):
 vowels[el]+=1

Main function improvements

I would expect a function named vowels_finder to do just that: look for a vowel, and return True if it finds one. Furthermore, I would expect a name like this to be an object/class "Thing", rather than a function which is usually named like "do something".

Let's call it count_individual_vowels instead. Also, now that we have a function for vowels with added functionality for "Y", we can very easily add this option to this function. Note that I have changed some of the variable names a bit to make them more clear:

def count_individual_vowels(input_str: str, include_y: bool = False) -> list:
 vowel_counts = {'a':0, 'e':0, 'i':0, 'o':0, 'u':0}
 if include_y:
 vowel_counts['y'] = 0
 for el in input_str:
 if is_vowel(el, include_y):
 vowel_counts[el] += 1
 return [(key, pair) for key, pair in vowel_counts.items()]

Bug / overlooked problem

After refactoring this I noticed a problem, see for illustration:

string1 = "My phrase has some vowels, pretty cool don't you think?"
print(count_individual_vowels(string1))
print(count_individual_vowels(string1, True))
string2 = string1.upper()
print('UPPER CASE')
print(count_individual_vowels(string2))
print(count_individual_vowels(string2, True))

Results:

[('u', 1), ('o', 6), ('i', 1), ('e', 4), ('a', 2)]
[('e', 4), ('a', 2), ('u', 1), ('y', 3), ('o', 6), ('i', 1)]
UPPER CASE
[('u', 0), ('o', 0), ('i', 0), ('e', 0), ('a', 0)]
[('e', 0), ('a', 0), ('u', 0), ('y', 0), ('o', 0), ('i', 0)]

This could of course cause problems, but thankfully the fix is extremely simple, just use .lower() method on strings in the functions

Here in the new helper function...

return ch.lower() in ('a', 'e', 'i', 'o', 'u')

And also when adding it into the dict, so that uppercase vowels get grouped into the lowercase count:

for el in input_str:
 if is_vowel(el, include_y):
 # here:
 vowel_counts[el.lower()] += 1

OrderedDict to keep vowels in order

As you've noticed, using a regular dict to store the counts results in the output vowels returns the values in arbitrary order. You could make it return always in the same order by using from collections import OrderedDict and just replacing this:

vowel_counts = {'a':0, 'e':0, 'i':0, 'o':0, 'u':0}
if include_y:
 vowel_counts['y'] = 0

By this a bit more verbose, but much nicer output:

vowel_counts = OrderedDict()
for vow in 'a', 'e','i', 'o', 'u':
 vowel_counts[vow] = 0
if include_y:
 vowel_counts['y'] = 0

See:

[('a', 2), ('e', 4), ('i', 1), ('o', 6), ('u', 1)]
[('a', 2), ('e', 4), ('i', 1), ('o', 6), ('u', 1), ('y', 3)]

Finally, here is a working demo on repl.it with all the above suggestions applied.

Question 7

Na, don't worry about my request. It's really is just an after thought. I know very well the "I can slap every single thing on one line!" ideology. It really was more just to see if it could be done. Regardless, You have given an excellent answer. Thanks!

Question 8

You can do your contains checks using in 'aeiou' rather than building a tuple for the occasion.

Question 9

This is a great answer, and I +1d, but I'd like to be picky about your if include_y == False: line. Direct comparisons to bools are almost never useful, and I don't think this one is an exception to that.

Question 10

@Graipher. Pure conjecture, haven't looked too deeply into the internals or actually benchmarked this, but a Python set is similar to a dict under the hood. In the worst case, in tuple() will be 6 chr comparisons, whilst in set will be a hash compute (over an interned chr, granted, so actually just a cache lookup) followed by a hash lookup. I wouldn't be overly surprised if these lookups actually took longer than the 6 comparisons. That said, if we're optimising that hard, we're solving the wrong problem.

Question 11

If you're going to include optional vowels, you should probably add w, just in case someone wants to run this code in a Welsh locale. ;)

Question 12

Whenever you want to count elements, you can use a collections.Counter to do so rather than building the dictionnary yourself. What's left in the end is extracting the count of letters you're interested in out of said Counter:

from collections import Counter
def vowels_finder(sentence, vowels='aeiou'):
 count = Counter(sentence)
 return [(letter, count[letter]) for letter in vowels]

Question 13

Counter is a dict, so it has .items(). Thus, you can return list(count.items()) instead of building it manually.

Question 14

@Daerdemandt Except that doing so you will get all the consonnants too, defeating the purpose of the function.

Question 15

Oh, bummer. I've confused it with other code that only counts vowels.

Question 16

MathiasEttinger's answer already went into the right direction of suggesting to use a Counter and making the vowels a function parameter with default value. Here are some more improvements:

vowels_finder is not a particularly good name for the function. Functions are typically named after the result they return, or the side effect they invoke. The function also doesn't only "find" vowels, its main purpose is to count them. I propose the name vowel_count.
Instead of counting all letters and then filtering out the vowels afterwards, filter the vowels first and count only these. This makes the function one line shorter and about ×ばつ faster (depending on how common vowels are in the input text).
Add a docstring explaining the purpose of the function, and giving you the opportunity to include short examples which are automatically testable using doctest.

def vowel_count(sentence, vowels='aeiou'):
 '''Return a Counter of all vowels contained in the sentence.
 >>> vowels = vowel_count('This has some vowels, ALSO IN UPPERCASE.')
 >>> vowels['a']
 3
 >>> vowels.most_common()
 [('e', 4), ('a', 3), ('o', 3), ('i', 2), ('u', 1)]
 '''
 return Counter(c for c in sentence.lower() if c in vowels)

(I also incorporated the conversion to lower-case, as suggested in Phrancis' answer.)

Returning the Counter object directly gives you the freedom of converting it to the "list of pairs" format, using the most_common method, or keeping the dictionary interface.

Question 17

Yeah the name I gave to my function sucked, but in my defense it was midnight! Seriously though, thanks for the great suggestions. About my naming though; I usually think I'll be to verbose if I choose a name that fully describes what my function returns/does. But in this case, the name you proposed does not seem overly wordy, so I'll use it. Cheers!

Question 18

It might be interesting to use in set(vowels) and check if it increases speed with large amount of vowels in sentence.

Question 19

Apart from the awesome answer @Phrancis just gave, I'll just address the part of the question where it asks for a one-liner. You can have something like this, if that's really necessary:

def find_vowels_1(data):
 return [(k, v) for k, v in {v: data.lower().count(v) for v in 'aeiou'}.items()]
print(find_vowels_1('dexter'))

Which will print:

[('o', 0), ('e', 2), ('i', 0), ('a', 0), ('u', 0)]

Bear with the fact that I don't recommend this solution as it won't bring anything good. I'd just follow everything it was stated in the above answer.

As others mentioned, is redundant to have a dict which will then be converted to a list. So you can build a list from the start. More, you can also make another function argument vowels in which you can pass the vowels specific to a different region:

def find_vowels_1(data, vowels='aeiou'):
 # return the number of occurrences of each vowel in a string
 return [(v, data.count(v)) for v in vowels]
print(find_vowels_1('dexter'))

Question 20

Why make a dict, to then make a dict_items, and then a list, why not make a list off the bat? For example [(v, data.count(v)) for v in 'aeiou']. Also the special y case can be achieved using a ternary, 'aeiouy' if include_y else 'aeiou'.

Question 21

I think the y special case is easier in this example. Just take 'aeiou' as a paramenter. This allows me to pass in 'aeiouyæøå', which is nice.

Phrancis Phrancis 20.5k6 gold badges69 silver badges155 bronze badges · Accepted Answer · 2016-10-13 05:20:16Z

With Python it can be tempting to want to write one-liners; but, short code does not necessarily make for better code, and I'd like to review your code in a way that I think would make it more maintainable, flexible and professional, rather than shorter. (maybe someone else will address the one-liner request)

Type hints

Since you are using Python 3.x you could take advantage of the new type hints. According to PEP 484:

This PEP aims to provide a standard syntax for type annotations, opening up Python code to easier static analysis and refactoring, potential runtime type checking, and (perhaps, in some contexts) code generation utilizing type information.

Of these goals, static analysis is the most important. This includes support for off-line type checkers such as mypy, as well as providing a standard notation that can be used by IDEs for code completion and refactoring.

Even if you don't use static code analysis at the moment, type hints still have the advantage of making the code easier to read and understand.

In your case:

def vowels_finder(s: str) -> list:
 # ...

Reusable functions

The first thought I had looking at your function is that some logic could be extracted for more general reuse. For instance, this function could come in handy for other things:

def is_vowel(ch: chr, include_y: bool=False) -> bool:
 if include_y:
 return ch in ('a', 'e', 'i', 'o', 'u', 'y') 
 else:
 return ch in ('a', 'e', 'i', 'o', 'u')

You will notice I also added support for optionally including "Y" as a vowel, which can be useful for certain contexts.

Note that I also used a tuple instead of set, since vowels won't change anyways and tuples are generally faster since they are immutable, and we don't need set operations in this case other than in membership, which tuples support as well.

Now we can simply do this in your vowels_finder function:

for el in s:
 if is_vowel(el):
 vowels[el]+=1

Main function improvements

I would expect a function named vowels_finder to do just that: look for a vowel, and return True if it finds one. Furthermore, I would expect a name like this to be an object/class "Thing", rather than a function which is usually named like "do something".

Let's call it count_individual_vowels instead. Also, now that we have a function for vowels with added functionality for "Y", we can very easily add this option to this function. Note that I have changed some of the variable names a bit to make them more clear:

def count_individual_vowels(input_str: str, include_y: bool = False) -> list:
 vowel_counts = {'a':0, 'e':0, 'i':0, 'o':0, 'u':0}
 if include_y:
 vowel_counts['y'] = 0
 for el in input_str:
 if is_vowel(el, include_y):
 vowel_counts[el] += 1
 return [(key, pair) for key, pair in vowel_counts.items()]

Bug / overlooked problem

After refactoring this I noticed a problem, see for illustration:

string1 = "My phrase has some vowels, pretty cool don't you think?"
print(count_individual_vowels(string1))
print(count_individual_vowels(string1, True))
string2 = string1.upper()
print('UPPER CASE')
print(count_individual_vowels(string2))
print(count_individual_vowels(string2, True))

Results:

[('u', 1), ('o', 6), ('i', 1), ('e', 4), ('a', 2)]
[('e', 4), ('a', 2), ('u', 1), ('y', 3), ('o', 6), ('i', 1)]
UPPER CASE
[('u', 0), ('o', 0), ('i', 0), ('e', 0), ('a', 0)]
[('e', 0), ('a', 0), ('u', 0), ('y', 0), ('o', 0), ('i', 0)]

This could of course cause problems, but thankfully the fix is extremely simple, just use .lower() method on strings in the functions

Here in the new helper function...

return ch.lower() in ('a', 'e', 'i', 'o', 'u')

And also when adding it into the dict, so that uppercase vowels get grouped into the lowercase count:

for el in input_str:
 if is_vowel(el, include_y):
 # here:
 vowel_counts[el.lower()] += 1

OrderedDict to keep vowels in order

As you've noticed, using a regular dict to store the counts results in the output vowels returns the values in arbitrary order. You could make it return always in the same order by using from collections import OrderedDict and just replacing this:

vowel_counts = {'a':0, 'e':0, 'i':0, 'o':0, 'u':0}
if include_y:
 vowel_counts['y'] = 0

By this a bit more verbose, but much nicer output:

vowel_counts = OrderedDict()
for vow in 'a', 'e','i', 'o', 'u':
 vowel_counts[vow] = 0
if include_y:
 vowel_counts['y'] = 0

See:

[('a', 2), ('e', 4), ('i', 1), ('o', 6), ('u', 1)]
[('a', 2), ('e', 4), ('i', 1), ('o', 6), ('u', 1), ('y', 3)]

Finally, here is a working demo on repl.it with all the above suggestions applied.

Na, don't worry about my request. It's really is just an after thought. I know very well the "I can slap every single thing on one line!" ideology. It really was more just to see if it could be done. Regardless, You have given an excellent answer. Thanks!
You can do your contains checks using in 'aeiou' rather than building a tuple for the occasion.
This is a great answer, and I +1d, but I'd like to be picky about your if include_y == False: line. Direct comparisons to bools are almost never useful, and I don't think this one is an exception to that.
@Graipher. Pure conjecture, haven't looked too deeply into the internals or actually benchmarked this, but a Python set is similar to a dict under the hood. In the worst case, in tuple() will be 6 chr comparisons, whilst in set will be a hash compute (over an interned chr, granted, so actually just a cache lookup) followed by a hash lookup. I wouldn't be overly surprised if these lookups actually took longer than the 6 comparisons. That said, if we're optimising that hard, we're solving the wrong problem.
If you're going to include optional vowels, you should probably add w, just in case someone wants to run this code in a Welsh locale. ;)

Stack Exchange Network

Program to count vowels

The question's premises:

My attempt:

Questions:

4 Answers 4

Type hints

Reusable functions

Main function improvements

Bug / overlooked problem

OrderedDict to keep vowels in order

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Program to count vowels

The question's premises:

My attempt:

Questions:

4 Answers 4

Type hints

Reusable functions

Main function improvements

Bug / overlooked problem

OrderedDict to keep vowels in order

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions