So, I just managed to finish a problem that was proposed to me, and I find it very difficult since I started learning looping with for
and using lists recently, I thought about sharing the problem and my solution with you guys so you could give me some advice on how I should make my code better or different ways to solve the problem, hopefully you guys can help me, thank you for your attention and sorry if I made some mistakes, I am new to programming.
Question
Write a function that receives a list with natural numbers, verify if such list has repeated numbers, if so remove them. The function should return a correspondent list with no repeated elements. The list should be returned in ascending order.
def remove_repeated_numbers(z):
x = 1
for i in z:
y = x
while y <= (len(z) - 1):
if i == z[y]:
del z[y]
y = (y - 1) #Once z[y] is deleted, should check same spot again
y = (y + 1)
x = (x + 1)
list_finished = sorted(z[:])
return list_finished
def main():
run = True
z = []
while run:
u = int(input("Entry value: "))
if u != 0:
z.append(u)
else:
run = False
print(remove_repeated_numbers(z))
main()
3 Answers 3
It's a good start. Depending on what class this is for, the person (teacher?) who proposed this question to you might care about time complexity of your solution. Your solution is going to have a number of problems there, since:
del z[y]
is linear-time ("O(n)")- That occurs within a loop over the list, within another loop over the list, making the solution worst-case cubic if I'm not mistaken.
The trivial and common solution is to scrap the whole routine and replace it with calls to two built-ins:
from typing import Iterable, List
def remove_repeated_numbers(numbers: Iterable[int]) -> List[int]:
return sorted(set(numbers))
I will leave it as an exercise to you to read the documentation and determine the complexity of this implementation.
This will construct a set from the numbers, which guarantees uniqueness but not order; and then constructs a sorted list from that. Interpreted literally, the question doesn't call for the existence of a main
but I don't know how picky the problem is.
Other things to note:
- Don't use single-letter variables
- Add type hints for greater clarity
Add some unit tests
Are we sure that the function works? Will it still work when we make changes (perhaps to improve its performance)?
Python comes with some unit-testing facilities to help us ensure that we don't break the function. One that's very easy to use is doctest
, which lets us put the tests right by the code itself.
We could start by testing an empty list:
def remove_repeated_numbers(z):
"""
Return a sorted copy of the input list, with duplicates removed.
>>> remove_repeated_numbers([])
[]
"""
⋮
⋮
We write each test as a line beginning with >>>
and an expression to evaluate, with the expected result on the following line.
We can run the tests with some easy boilerplate (which we don't need to understand right now):
if __name__ == '__main__':
import doctest
doctest.testmod()
To be sure that our tests work properly, let's add a test that we know will fail:
>>> remove_repeated_numbers([])
None
That gives us a useful diagnostic:
**********************************************************************
File "268315.py", line 7, in __main__.remove_repeated_numbers
Failed example:
remove_repeated_numbers([])
Expected:
None
Got:
[]
**********************************************************************
Let's remove that, and add some more tests:
>>> remove_repeated_numbers([0])
[0]
>>> remove_repeated_numbers([0, 0])
[0]
>>> remove_repeated_numbers([0, 1])
[0, 1]
>>> remove_repeated_numbers([1, 0])
[0, 1]
>>> remove_repeated_numbers([0, 0, 1, 1, 1])
[0, 1]
>>> remove_repeated_numbers([0, 1, 1, 1, 0])
[0, 1]
"""
That's good, and we now have more confidence in what we're doing.
We can now change the implementation to use a set
, as suggested in a different answer, but I'll propose a smaller change, and that is to sort the input first:
for i in sorted(z):
That initially seems like it's making more work (after all, the input could be much longer than the output), but it will reduce the later operations, because now all the duplicates are adjacent, so we don't need to examine the whole list to find them.
Now, instead of the expensive del z[y]
(which needs to shuffle all the subsequent elements in the list), we can copy to a new list to return to the user. Note that this changes the behaviour - I believe for the better. The code presented modifies the actual array passed to the function (though the modified array is only deduplicated, not sorted). It's generally less surprising to return a copy so that the original data is unchanged.
Just to demonstrate, try this code with the function as posted:
a = [ 2, 1, 1, 2]
print(remove_repeated_numbers(a))
print(a)
With this code, the function returns [1, 2]
and a
is changed to [2, 1]
.
So let's create our result array:
result = []
⋮
return result
Now our loop is very simple: for each element we see, if it's the same as the one we most recently added to result
, then ignore it, else add it to result
:
for i in sorted(z):
if i != result[-1]:
result.append(i)
That's not quite right, because result
starts off empty. We need to always include the first element, so we must change the test to also succeed when result
is empty - i.e. not result
:
for i in sorted(z):
if not result or i != result[-1]:
result.append(i)
I think that's easier to understand than code that del
etes items from the input list.
Full modified code
For reference, here's the full program from this approach:
def remove_repeated_elements(input_list):
"""
Return a sorted copy of the input list, with duplicates removed.
>>> remove_repeated_elements([])
[]
>>> remove_repeated_elements([0])
[0]
>>> remove_repeated_elements([0, 0])
[0]
>>> remove_repeated_elements([0, 1])
[0, 1]
>>> remove_repeated_elements([1, 0])
[0, 1]
>>> remove_repeated_elements([0, 0, 1, 1, 1])
[0, 1]
>>> remove_repeated_elements([0, 1, 1, 1, 0])
[0, 1]
>>> remove_repeated_elements(['review', 'code', 'code', 'review'])
['code', 'review']
"""
result = []
for element in sorted(input_list):
if not result or result[-1] < element:
result.append(element)
return result
if __name__ == '__main__':
import doctest
doctest.testmod()
Note that I changed the name slightly, as this will work with any type that can be ordered (such as strings - see the last test), not just numbers. The answer that recommends using set
also works with more than just numbers (but requires a type that can be hashed, as well as ordered).
I also changed the test to use <
, so it will work for any (admittedly theoretical, and highly improbable) type that can be sorted but not compared for inequality.
-
\$\begingroup\$ I really like this answer, and I suspect indeed the eliminating the duplicates in a sorted list is the core of the exercise -- otherwise it could just as well require the elements to be returned in the order they were in the initial list. I do find
not result
slightly difficult to read, and would favorlen(result) == 0
instead as I find it more "immediate" in meaning; isnot result
considered more idiomatic? \$\endgroup\$Matthieu M.– Matthieu M.2021年09月24日 09:40:01 +00:00Commented Sep 24, 2021 at 9:40 -
1\$\begingroup\$ There's some debate on what's preferred - some Python people prefer the length test and others the truth test. My opinion on that isn't very strong. \$\endgroup\$Toby Speight– Toby Speight2021年09月24日 10:36:08 +00:00Commented Sep 24, 2021 at 10:36
-
2\$\begingroup\$ Since
sorted
only require comparison via__lt__
, you could writeresult[-1] < element
for the rare cases__ne__
is not defined. Also, what's the rationale of usingresult += [element]
rather thanresult.append(element)
? \$\endgroup\$301_Moved_Permanently– 301_Moved_Permanently2021年09月24日 14:20:01 +00:00Commented Sep 24, 2021 at 14:20 -
\$\begingroup\$ @301: Both good points - I'll modify accordingly. \$\endgroup\$Toby Speight– Toby Speight2021年09月24日 14:27:44 +00:00Commented Sep 24, 2021 at 14:27
The first thing to know about Python is that most basic operations on lists have helper built-ins. Your code could use set()
as the holding structure which will ensure uniqueness for you without any work.
With that out of the way, I think your assignment is more about showing your work so let's take that view on it:
- Your
run
variable is not needed. You can usebreak
keyword to get out of thewhile
loop. - You should try to do the early escape first in conditions to avoid nesting. In your case, you can say "if input is 0, break" then everything below assumes that value is not 0 so you don't need an
else
. - Double loop is overkill.
if <numer> not in <array>:
then add the number to array. It's slow but easy. You can do better with a set or a dictionary but that's excessive work for what you are doing. - If you're working in Python avoid index-based iteration. You can/should use iterables:
for value in entered_data:
. - Finally, avoid short names on variables. They create unmaintainable code and keep mental overhead high.
sorted(set(l))
? This is extremely basic,set
is a built-in data type that is ordered without duplicates with fast membership checking that is exactly what is designed to do this sort of thing, and then you just need to usesorted
function to return a sorted (shallow) copy of the list. Type casting is by far the most efficient and readable way to do this. \$\endgroup\$sorted(set(l))
. Didn't work. GotNameError: name 'l' is not defined
. \$\endgroup\$NameError: name 'z' is not defined
. \$\endgroup\$