I'm a newbie. I'm at Chapter 4 of Automate the Boring Stuff with Python.
Comma Code
Say you have a list value like this:
spam = ['apples', 'bananas', 'tofu', 'cats']
Write a function that takes a list value as an argument and returns a string with all the items separated by a comma and a space, with and inserted before the last item. For example, passing the previous spam list to the function would return
'apples, bananas, tofu, and cats'
. But your function should be able to work with any list value passed to it.
Just want to know if there's anything I can do to clean the code up?
Trying to stick with what I've learned so far but any suggestions would be helpful.
def list_to_string(some_list):
new_string = ''
if not some_list:
return ''
elif len(some_list) == 1:
return str(some_list[0])
else:
for i in range(len(some_list) - 2):
new_string += str(some_list[i]) + ', '
new_string += str(some_list[len(some_list) - 2]) + ' and ' + str(some_list[len(some_list) - 1])
return new_string
5 Answers 5
Return values
def list_to_string(some_list):
new_string = ''
if not some_list:
return ''
...
Why are you returning ''
here? You could return new_string
instead, since you've initialized it.
In fact, the last statement in the function is return new_string
. Why not make that the only place you return from the function?
def list_to_string(some_list):
new_string = ''
if some_list:
if len(some_list) == 1:
new_string = str(some_list[0])
else:
for i in range(len(some_list) - 2):
new_string += str(some_list[i]) + ', '
new_string += str(some_list[len(some_list) - 2]) + ' and ' + str(some_list[len(some_list) - 1])
return new_string
Loop over values, not indices
Python is slow at indexing. Not terribly slow, but slow. In general, you want to avoid for i in range(len(container)):
type loops, especially when you only use i
in the expression container[i]
inside the loop.
So this code:
for i in range(len(some_list) - 2):
new_string += str(some_list[i]) + ', '
can become
for term in some_list[:-2]:
new_string += str(term) + ', '
Too many str calls
You are calling str(...)
way too many times. You need to call it once for each element of some_list
, which you are doing, but you've written the call to str(...)
four times, which is three times too many in my opinion. Better would be to convert all the terms to strings once, at the start.
some_list = [str(term) for term in some_list]
Now you can safely rely on the fact that all elements of the list are already strings.
Last & Second Last elements
some_list[len(some_list) - 1]
and some_list[len(some_list) - 2]
are clumsy ways of accessing the last and second last elements. Python allows negative indexing, which returns elements counting from the end of the list. some_list[-1]
is the last element and some_list[-2]
is the second last.
One element and last element
You've got two special cases. The one element case, which you just return directly, and the last element of a list of multiple items, which is handled differently.
Recognizing that a single element is the last element of a list of one element allows you to eliminate one special case.
def list_to_string(some_list):
new_string = ''
if some_list:
some_list = [str(term) for term in some_list]
if len(some_list) > 1:
for term in some_list[:-2]:
new_string += term + ', '
new_string += some_list[-2] + ' and '
new_string += some_list[-1]
return new_string
Type-hints, doc-strings, and testing oh my!
You might want to add some type-hints, """docstrings"""
and doctests to help your users understand how to use/call your function.
import doctest
def list_to_string(some_list: list) -> str:
"""
Convert a list of objects into a string of comma separated items.
The last two items are separated with ' and ' instead of a comma.
>>> list_to_string([])
''
>>> list_to_string(['apples'])
'apples'
>>> list_to_string(['apples', 'bananas'])
'apples and bananas'
>>> list_to_string(['apples', 'bananas', 'tofu', 'cats'])
'apples, bananas, tofu, and cats'
"""
new_string = ''
if some_list:
some_list = [str(term) for term in some_list]
if len(some_list) > 1:
for term in some_list[:-2]:
new_string += term + ', '
new_string += some_list[-2] + ' and '
new_string += some_list[-1]
return new_string
if __name__ == '__main__':
doctest.testmod()
The some_list: list
tells prospective callers that the function expects a list for the first (and only) argument. The -> str
tells prospective callers the function returns a string.
The """docstring"""
is help text which will be displayed if the user types help(list_to_string)
.
Finally, the lines starting with >>>
in the docstring are found by the doctest
module, and executed and compared to the following lines to verify the function operates as expected. Here we see:
**********************************************************************
File "/Users/aneufeld/Documents/Stack Exchange/Code Review/comma.py", line 17, in __main__.list_to_string
Failed example:
list_to_string(['apples', 'bananas', 'tofu', 'cats'])
Expected:
'apples, bananas, tofu, and cats'
Got:
'apples, bananas, tofu and cats'
**********************************************************************
1 items had failures:
1 of 4 in __main__.list_to_string
***Test Failed*** 1 failures.
... which tells us that something is amiss. Your problem text says your function should return one thing, but your code returns another!
-
2\$\begingroup\$ Looking at the spec, I think
list_to_string(['apples', 'bananas'])
should returnapples, and bananas
. Also, the firstfor
loop could be', '.join(some_list[:-2])
. \$\endgroup\$Ken Y-N– Ken Y-N2021年10月12日 07:39:02 +00:00Commented Oct 12, 2021 at 7:39 -
1\$\begingroup\$ Thanks for including how to run doctest. I never knew what framework all those nice test examples were for! \$\endgroup\$Zachary Vance– Zachary Vance2021年10月12日 20:27:54 +00:00Commented Oct 12, 2021 at 20:27
In addition to AJNeufeld's comprehensive answer there are a couple of additions that you could make to make your function more useful in the future.
Parameters
At the moment you have:
def list_to_string(some_list):
...
and you're always joining on a comma and using "and". But in the future you might need to use a semicolon and "or" instead. A good addition would be some default parameters that you can change without having to re-write the whole function:
def list_to_string(some_list, separator=", ", conjunction="and"):
...
Now, instead of the ", "
and " and "
in your code you can use the variables separator
and conjunction
to tweak the output.
Using builtin methods
This is skipping ahead (to chapter 6) a little bit, where the str.join()
method is introduced. Using this you can make your code a lot more succinct.
str.join()
can take a list of strings (they must be strings) and concatenate them all around a separator:
>>> my_list = ["a", "b", "c"]
>>> " - ".join(my_list)
'a - b - c'
Using this along side string formatting and list slicing we can create the right output. For example, joining the my_list
can become:
>>> separator = ", "
>>> conjunction = "and"
>>> "{} {} {}".format(separator.join(my_list[:-1]), conjunction, my_list[-1])
'a, b and c'
In the last example, each {}
in the string gets replaced with the corresponding positional argument to format
. In this instance I've used the str.format()
style; in python 3.6 and newer there is a new way of formatting strings, called f-strings. With the newer style you can incorporate the variable or expression that will be printed inside the braces ({}
):
>>> f"{separator.join(my_list[:-1])} {conjunction} {my_list[-1]}"
'a, b and c'
So bringing this all together, along with @AJNeufeld's suggestions, you get:
def list_to_string(
some_list: list, separator: str = ", ", conjunction: str = "and"
) -> str:
"""Join lists into friendly strings
Parameters
----------
some_list : list
A sequence to join nicely, items must have a string representation
separator : str, optional
Delimiter to use, default: ", "
conjunction : str, optional
Conjunction to use between final two strings, default: "and"
Returns
-------
str
Examples
--------
>>> list_to_string([])
''
>>> list_to_string(['apples'])
'apples'
>>> list_to_string(['apples', 'bananas'])
'apples and bananas'
>>> list_to_string(['apples', 'tofu', 'cats'], separator="; ", conjunction="or")
'apples; tofu or cats'
>>> list_to_string(['apples', 'bananas', 'tofu', 'cats'])
'apples, bananas, tofu and cats'
"""
# Make sure our list is strings
some_list = [str(s) for s in some_list]
if not some_list:
# Special case, no items
new_string = ""
elif len(some_list) == 1:
# Special case, only one item so we don't
# need the separator or conjunction
new_string = some_list[0]
else:
# All other cases, more than one item so we
# join using the separator and format with
# the conjunction and final list item
new_string = "{} {} {}".format(
separator.join(some_list[:-1]), conjunction, some_list[-1]
)
return new_string
if __name__ == "__main__":
import doctest
doctest.testmod(verbose=True)
For reference, here I have used NumPy style docstrings and a "string representation" means a python object that can be converted to a string, usually by calling str()
.
Like @AJNeufeld's answer I haven't added the serial comma, this would be a good thing to play around with to add yourself.
-
4\$\begingroup\$ Starting with Python 3.9 (and Python 3.10 is already released),
from typing import List
is deprecated. You can simply uselist
. Since the function is supposed to accept a list of any objects, not just strings, like["hello", 1, 2, 3]
for example, usinglist[str]
is not the right type-hint;list[object]
is, but that may be reduced to simplylist
. Otherwise, nice improvements: +1. \$\endgroup\$AJNeufeld– AJNeufeld2021年10月12日 16:48:24 +00:00Commented Oct 12, 2021 at 16:48 -
\$\begingroup\$ Thanks for catching that! I've updated my answer. I haven't kept up with the latest in typing, so I've got some reading to do. \$\endgroup\$Alex– Alex2021年10月13日 09:08:50 +00:00Commented Oct 13, 2021 at 9:08
-
2\$\begingroup\$ You might also want to suggest using [f-strings]{python.org/dev/peps/pep-0498} instead of string formatting. Personally I found them to be much more intuitive and readable, but not sure if that also applies to people new to python. \$\endgroup\$Ivo Merchiers– Ivo Merchiers2021年10月13日 11:29:47 +00:00Commented Oct 13, 2021 at 11:29
-
1\$\begingroup\$ @IvoMerchiers, good idea, I completely agree that they are more intuitive. I was going for a broadly working implementation, but I'll add a note about f-strings. \$\endgroup\$Alex– Alex2021年10月13日 12:26:18 +00:00Commented Oct 13, 2021 at 12:26
-
1\$\begingroup\$ @AJNeufeld — bear in mind that if you use Mypy
--strict
, it will flag a barelist
in a type hint as being an unparameterised generic. So, probably better to be explicit and write it aslist[object]
rather than simplylist
. \$\endgroup\$Alex Waygood– Alex Waygood2021年10月14日 14:23:45 +00:00Commented Oct 14, 2021 at 14:23
I don't know if this is a line of code you want to keep in your program, maybe your method was more simple and readable.
What I do here is create a list comprehension with all the values in the spam list but I add the item as "{spam[i]}," if is not the last value (i < len(spam) - 1) and as "and {spam[i]}" if it is the last value, then you can directly join the element of the new list with .join()
method.
def list_to_string(spam: list) -> str:
return " ".join([f"{spam[i]}," if i < len(spam) - 1 else f"and {spam[i]}" for i in range(len(spam))])
This is equal to this code:
def list_to_string(spam: list) -> str:
new_list = []
for i in range(len(spam)):
if i < len(spam) - 1:
new_list.append(f"{spam[i]},")
else:
new_list.append(f"and {spam[i]}")
return " ".join(new_list)
And equal to this code:
def list_to_string(spam: list) -> str:
new_list = []
for i in range(len(spam)):
new_list.append(f"{spam[i]}," if i < len(spam) - 1 else f"and {spam[i]}")
return " ".join(new_list)
Edit: note that if the list include only one item the resulting string will be "and item" like your initial request state:
with and inserted before the last item.
-
1\$\begingroup\$ Welcome to Code Review! You've suggested an alternative implementation, but you haven't explained why your implementation is better. Please edit your post to explain the advantage of your implementation. \$\endgroup\$Dan Oberlam– Dan Oberlam2021年10月13日 17:15:38 +00:00Commented Oct 13, 2021 at 17:15
So a bit late to the party, but let me just add my 2 cents. I agree with most of the comments above. Do add annotations, write good docstrings with doctesting und zu weiter. However, I feel this code could have been written even more pythonic.
Solution 1
My approach would be something like this
def comma_code(items: list[str], oxford_comma: bool = True) -> str:
"""Transforms ['a', 'b', 'c'] into 'a, b, and c' or 'a, b and c'"""
if not items:
return ""
*items_except_last, last_item = items
return (
", ".join(items_except_last)
# 'a and b' not 'a, and b' so we need 3 or more items for oxford comma
+ ("," if oxford_comma and len(items) >= 3 else "")
+ (" and " if items_except_last else "")
+ last_item
)
- Typing hints and a clear variable name.
string_2_list
is as generic of a name as they come. Avoid generic names and try to encompass the intent of your code through good function names and variables. Hence "comma_code" is a decent name, as this is the problem we are solving. - Early returns so we avoid the arrow nesting problem
- The * operator unpacks an argument list. It allows you to call a function with the list items as individual arguments.
- Notice that we also avoid temporary strings everything is simply returned.
- The functionality is slightly expanded by introducing the oxford comma. When coding we need to be careful not over-engineering our solutions.
In the cases where we've considered things over engineered, it's always been describing software that has been designed to be so generic that it loses sight of the main task that it was initially designed to perform, and has therefore become not only hard to use, but fundamentally unintelligent.
- We avoid indices at all in our code, as some would consider them unpythonic. We refer to the number of items only once, and add a comment explaining that the oxford comma only comes into play when we have
3
or more elements. - Again the code should have proper docstring and doctests, but the other commenters have already commented on this.. (And I am lazy).
Solution 2
If we wanted to write a more generic code, we could do it as follows.
def list_2_str_atbswp(items: list[str], conjuction, seperator, spaces=True) -> str:
if not items:
return ""
*items_except_last, last_item = items
return (
(seperator + (space := " " if spaces else "")).join(items_except_last)
+ (space + conjuction + space if items_except_last else "")
+ last_item
)
def comma_code(items: list[str], oxford_comma: bool = True) -> str:
return list_2_str_atbswp(
items,
seperator=",",
conjuction=", and" if oxford_comma and len(items) > 2 else "and",
)
Here we have separated the work of adding conjunctions and separators to its own function. Where "atbswp" is short for "Automate the Boring Stuff with Python". While I do like this code, I would not write it in production. Why? list_2_str_atbswp
is close to being a god object. If we use this function a lot and in many different situations it can make maintaining it difficult. If we change the input of list_2_str_atbswp
every function that uses this would also have to change.
Our original solution is more than good enough. The problem asked us to use comma in our code, so of course we are going to comma and not some other separator =P
You can use the .join() method in the python list class:
def list_to_string(some_list: list) -> str:
if not some_list:
return ''
if len(some_list) == 1:
return some_list[0]
return ", ".join(some_list)
-
1\$\begingroup\$ This does not add ` and ` between the second last and last list items, as required by the problem text. \$\endgroup\$AJNeufeld– AJNeufeld2021年10月12日 16:49:56 +00:00Commented Oct 12, 2021 at 16:49
-
3\$\begingroup\$ How does this insert
and
before the last item? I think the principle is sound, but the code needs a little more work. And this suggestion is part of Alex's answer, so there's little value writing it again here. \$\endgroup\$Toby Speight– Toby Speight2021年10月12日 16:50:05 +00:00Commented Oct 12, 2021 at 16:50