I have a list of values and I need to, given any arbitrary starting index, find the closest value which is non zero, if the value at the starting index is zero...
Here is what I have:
def getNearestNonZero(start_index):
mylist = [4,2,6,7,3,0,0,9,4,2,5,8,1,7]
val = mylist[start_index]
if val == 0:
loop_length = 0
after = mylist[start_index+1:]
before = mylist[:start_index]
before = before[::-1]
print(before, after)
if len(before) >= len(after):
loop_length = len(before)
else:
loop_length = len(after)
for i in range(loop_length):
if i < len(before):
before_val = before[i]
if i < len(after):
after_val = after[i]
if before_val > 0:
return before_val
if after_val > 0:
return after_val
return val
result = getNearestNonZero(6)
print(result)
result = getNearestNonZero(5)
print(result)
[0, 3, 7, 6, 2, 4] [9, 4, 2, 5, 8, 1, 7]
9
[3, 7, 6, 2, 4] [0, 9, 4, 2, 5, 8, 1, 7]
3
What I do, is I first check to see if the value at the start_index
is> 0. If it is, great, return it. If however, the value is zero, we need to find the closest non-zero, with a preference for before, rather than after...
To do this, I split mylist
into two separate lists, before
and after
. If my start index is 6, before
will now look like: [4,2,6,7,3,0]
and after will look like: [9,4,2,5,8,1,7]
.
Since I need the closest value to the start_index
, I reverse my before
list: before = before[::-1]
I then get the length of the longest of the two (before
and after
).
I then loop and check the value at each index of the two lists. The first one to have a value> 0 is returned and my work is done.
However, this feels very clunky and as if it can be done in a cleaner way.
Does anyone have any recommendations? What is the faster/cleaner/pythonic way for finding the nearest non-zero value in a list, given a starting index?
3 Answers 3
I really like your idea of splitting the list -- it's very functional! Let's look at optimizing that approach, though.
def nearest_non_zero(lst, idx):
# note that I parameterized the list. Should make it easier to test.
# though I'll use [4,2,6,7,3,0,0,9,4,2,5,8,1,7] as in your code for
# all examples throughout
if lst[idx] > 0:
return lst[idx]
# I find it easier to track early-exit routes through code than
# to rule everything out. Let's short-circuit if possible!
before, after = reversed(lst[:idx]), lst[idx+1:]
# using `reversed` here lets us avoid creating a new list, which
# is otherwise necessary if we did `(lst[:idx])[::-1]`. Imagine
# using that on a list with length of ten million. Yikes!
So far I've really only cleaned up a little bit of code, nothing big, but here is where I'll make a decent departure from your original code. Your code does this:
if len(before) >= len(after):
loop_length = len(before)
else:
loop_length = len(after)
This is unnecessary, and it forces you to write in the guards inside your for
loop to make sure you're not indexing a value outside of the list length. That's too much work. There's a couple better ways, the primary one being itertools.zip_longest
.
The builtin zip
pulls together two (or more!) lists into one list of tuples. For instance
>>> a = [1, 2, 3]
>>> b = ['a', 'b', 'c']
>>> zip(a, b)
[(1, 'a'), (2, 'b'), (3, 'c')]
>>> zip(b, a)
[('a', 1), ('b', 2), ('c', 3)]
However if you have lists of uneven size, it will only go so far as the shortest list
>>> a = [1, 2]
>>> b = ['a', 'b', 'c']
>>> zip(a, b)
[(1, 'a'), (2, 'b')] # where's 'c'?!?!
That's not what we want here. We want to zip together every value from both the before
and after
lists and compare them. That's what itertools.zip_longest
does. But of course it needs another value: what should I use to pair with a value that doesn't exist? That's the parameter fillvalue
, which by default is None
>>> a = [1, 2]
>>> b = ['a', 'b', 'c']
>>> itertools.zip_longest(a, b)
[(1, 'a'), (2, 'b'), (None, 'c')] # there's 'c', phew!
>>> a = [1, 2]
>>> b = ['a', 'b', 'c']
>>> itertools.zip_longest(b, a, fillvalue="foobar")
[('a', 1), ('b', 2), ('c', 'foobar')]
That's exactly what we want! In this case we want our fillvalue
to explicitly be a failing case (since if the number doesn't really exist we obviously don't want to use it), and our failing case is zero.
After that we compare to zero our "before" value, then our "after" value (to implement that preference for before rather than after), and return either if it's appropriate to do so.
import itertools
for b_val, a_val in itertools.zip_longest(before, after, fillvalue=0):
if b_val > 0:
return b_val
elif a_val > 0:
return a_val
else:
# what do you do if you have a list that's all zeroes?
This keeps us from having to deal with any indexing at all, which is common idiomatically in Python. If you have to do for i in range(...): some_list[i]
, then you're probably doing something wrong!
Our final code, then is:
from itertools import zip_longest
def nearest_non_zero(lst, idx):
if lst[idx] > 0:
return lst[idx]
before, after = lst[:idx], lst[idx+1:]
for b_val, a_val in zip_longest(reversed(before), after, fillvalue=0):
# N.B. I applied `reversed` inside `zip_longest` here. This
# ensures that `before` and `after` are the same type, and that
# `before + [lst[idx]] + after == lst`.
if b_val > 0:
return b_val
if a_val > 0:
return a_val
else:
return 0 # all zeroes in this list
-
1\$\begingroup\$ And just for fun, here's an implementation in haskell \$\endgroup\$Adam Smith– Adam Smith2017年08月04日 23:46:19 +00:00Commented Aug 4, 2017 at 23:46
-
\$\begingroup\$ Thank you very much. I had thought to reverse
before
and combinebefore
andafter
into pairs just likezip
does, but I had no idea how to go about it! Thanks for the great info! P.S You have an extra bracket afterreversed(before), after
\$\endgroup\$pookie– pookie2017年08月05日 18:36:47 +00:00Commented Aug 5, 2017 at 18:36
Do you actually need to print the before
and after
, or is that just for debugging? To print the lists, you could write:
print(list(reversed(mylist[:start_index])), mylist[start_index+1:])
I wouldn't bother chopping the lists at all, and I'd rather work with indexes instead.
Most of your code is inside an if val == 0:
block. I would invert that condition and reduce the indentation of most of your code. (I also question why that special case exists. The task is poorly specified, and does not define what the function should do if mylist[start_index]
is nonzero.)
By PEP 8 conventions, the function should be named get_nearest_nonzero
.
mylist = [4,2,6,7,3,0,0,9,4,2,5,8,1,7]
def get_nearest_nonzero(start_index):
if mylist[start_index] != 0:
return mylist[start_index]
i = j = start_index
while i >= 0 or j < len(mylist):
if i >= 0 and mylist[i] != 0:
return mylist[i]
if j < len(mylist) and mylist[j] != 0:
return mylist[j]
i, j = i - 1, j + 1
return None # List is all zeroes
While you're current code works and is functional and works (nice job for that!), I believe your original logic is a bit over complicated. So I propose a different approach.
First, let's clearly define our problem and the requirements.
Given a list and an initial starting index, return the closest non-zero element in the list based off of the position of the initial start index.
However, your task is a bit unspecified, so I'm assuming two things. Here's my outline for the logic:
- There will never be a case where the given list is empty.
- If the list is all zeros, we'll return
None
.
Now, that we've defined the problem, here's the outline for the logic of my function.
- Test if the element at the given start index is not zero. If it is, our work is done and we return the element. If not, continue.
- Filter out all non-zero indexes from the given list.
- If the list that we've filtered from the original list (the list that contains indexes with non-zero elements) is empty, we know we have a list of zeros, so return
None
. - Otherwise, find the index closet to the given start index. We'll do this by comparing the absolute value of subtracting the current index from the start index. We'll use
abs()
to ensure we get only positive values. - Return the element at the closet index in the given list.
Without further ado, here is the implementation:
def get_nearest_non_zero(seq, start_index):
"""
Given a list and an initial starting index, return the closest non-zero
element in the list based off of the position of the initial start index.
If the list is only filled with zeros, None will be returned.
"""
# Test if the element at the given start index is not zero.
# If it is, our work is done and we return the element. If not, continue.
if seq[start_index] != 0:
return seq[start_index]
# Filter out all non-zero indexes from the given list.
non_zero_indexes = [i for i, el in enumerate(seq) if el != 0]
# If the list that we've filtered from the original list
# (the list that contains indexes with non-zero elements)
# is empty, we know we have a list of zeros, so return `None`.
if not non_zero_indexes:
return None
# We'll do this by comparing the value of subtracting the start index
# from the current index.
# We'll also use `abs()` to ensure we get only positive values.
# Otherwise, find the index closet to the given start index.
nearest_index = min(non_zero_indexes, key=lambda i: abs(i - start_index))
# Return the element at the closet index in the given list.
return seq[nearest_index]
Here's a demo:
>>> mylist = [4,2,6,7,3,0,0,9,4,2,5,8,1,7]
>>> get_nearest_non_zero(mylist, 6)
9
>>> get_nearest_non_zero(mylist, 5)
3
>>> mylist = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
>>> get_nearest_non_zero(mylist, 6) # None was returned
>>>
-
\$\begingroup\$ Thank you very much for taking the time to answer! P.S I think your code is returning the value at the
start_index
, whether the value is zero or not :) \$\endgroup\$pookie– pookie2017年08月05日 18:41:36 +00:00Commented Aug 5, 2017 at 18:41 -
\$\begingroup\$ @pookie Thanks. Yes, I did have some typos in my original code. Sorry 'bout that, guess I wasn't paying attention :-) It should work now if you want to check it out again. \$\endgroup\$Chris– Chris2017年08月05日 21:28:37 +00:00Commented Aug 5, 2017 at 21:28
-
1\$\begingroup\$ Note that this solution is fully O(n): it will analyze the whole list even if there is a zero nearby. \$\endgroup\$200_success– 200_success2017年08月05日 23:56:16 +00:00Commented Aug 5, 2017 at 23:56
-
\$\begingroup\$ @200_success Yes, I realized that. While I believe this solution is pretty "pythoninc" and clean, the OP does lose points in time complexity. \$\endgroup\$Chris– Chris2017年08月06日 00:03:21 +00:00Commented Aug 6, 2017 at 0:03