Find all indices where value of array changes

Question 1

I try to solve the following problem using Python:
Given a 1-dimensional numeric array arr, find all indices where the value of arr changes. Return for each pair (a, b) (where a != b) an iterator of the indices i where arr[i] == a and arr[i+1] == b.

Example:
If the input array was [1,1,1,1,1,1,5,5,5,5,5,5,5,5,5,5,5,1,1,1,1,1,1,4,4,4,4,4,4,4,4,4,1,1,1,1,1,1], then the function should return the following pairs and iterators:

(1, 5): [6],
(5, 1): [17],
(1, 4): [23],
(4, 1): [32]

If the input is [1,1,1,1,4,4,4,4,1,1,4,4,4,4,4,4,3,3,3,1,4,4,4], then the function should return:

(1, 4): [4, 10, 20],
(4, 1): [8],
(4, 3): [16],
(3, 1): [19]

In my current implementation, for each pair a list is returned, but it may be any iterator. All lists are contained in a dictionary (a defaultdict, to be precise), but they may be returned in any style.

This is my current implementation:

from collections import defaultdict
def find_changepoints(arr):
 changepoints = defaultdict(list)
 current_value = arr[0]
 for (index, value) in enumerate(arr):
 if value != current_value:
 changepoints[ (current_value, value) ].append(index)
 current_value = value
 
 return changepoints

Question 2

Minor feedback about your existing solution: add PEP484 type hints; enumerate(arr[1:]) instead of over the whole array; no need to enclose your tuple index in parens; add unit tests.

If this is homework, a programming challenge or whatever and exists in a vacuum, you've basically met the brief. If this is for a specific application and you can replace the dictionary representation with something more efficient, then that will benefit you. For example, the following implementation has a dictionary wrapper that mimics yours around a vectorised function. If you have an application that is able to use the vectorised function directly, it will be faster for large arrays.

from collections import defaultdict
from numbers import Real
from typing import Sequence
import numpy as np
def changed_where_numpy(in_seq: np.ndarray) -> tuple[
 np.ndarray, # old values
 np.ndarray, # new values
 np.ndarray, # new indices
]:
 old_idx, = np.diff(in_seq).nonzero()
 new_idx = old_idx + 1
 return in_seq[old_idx], in_seq[new_idx], new_idx
def changed_where(in_seq: Sequence[Real]) -> dict[
 tuple[Real, Real], # old and new value
 list[int], # new value indices
]:
 old_vals, new_vals, new_idx = changed_where_numpy(np.array(in_seq))
 result = defaultdict(list)
 for old_val, new_val, new_idx in zip(old_vals, new_vals, new_idx):
 result[old_val, new_val].append(new_idx)
 return result
def test():
 in_seq = (
 1, 1, 1, 1, 1, 1,
 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
 1, 1, 1, 1, 1, 1,
 4, 4, 4, 4, 4, 4, 4, 4, 4,
 1, 1, 1, 1, 1, 1,
 )
 expected = {
 (1, 5): [6],
 (5, 1): [17],
 (1, 4): [23],
 (4, 1): [32],
 }
 actual = changed_where(in_seq)
 assert expected == actual
 in_seq = (
 1, 1, 1, 1,
 4, 4, 4, 4,
 1, 1,
 4, 4, 4, 4, 4, 4,
 3, 3, 3,
 1,
 4, 4, 4,
 )
 expected = {
 (1, 4): [4, 10, 20],
 (4, 1): [8],
 (4, 3): [16],
 (3, 1): [19]
 }
 actual = changed_where(in_seq)
 assert expected == actual
if __name__ == '__main__':
 test()

Question 3

Return for each pair (a, b) (where a != b) an iterator of the indices i where arr[i] == a and arr[i+1] == b.

You failed the task, since your function is not returning an iterator of indices. Also, if you read your task carefully, your indices are off by 1.

from itertools import islice
from typing import Iterator
def changes(lst: list) -> Iterator[int]:
 for index, (current, next) in enumerate(zip(lst, islice(lst, 1, None))):
 if current != next:
 yield index

Question 4

Thank you for pointing out my misconception. I mixed up the words iterable and iterator when I formulated the task. I will not edit my initial question as to not make your answer invalid.

Question 5

I see. Given your question, I assumed that the task was given to you by a third party as cited by you.

Question 6

This solution seems to return a single iterator of all changes in the whole input list, whilst the task is (if I get it correctly) to return a separate iterator for each distinct pair of values at the constant runs' boundaries.

Reinderien Reinderien 71k5 gold badges76 silver badges256 bronze badges · Answer 1 · 2022-03-21 18:09:32Z

Minor feedback about your existing solution: add PEP484 type hints; enumerate(arr[1:]) instead of over the whole array; no need to enclose your tuple index in parens; add unit tests.

If this is homework, a programming challenge or whatever and exists in a vacuum, you've basically met the brief. If this is for a specific application and you can replace the dictionary representation with something more efficient, then that will benefit you. For example, the following implementation has a dictionary wrapper that mimics yours around a vectorised function. If you have an application that is able to use the vectorised function directly, it will be faster for large arrays.

from collections import defaultdict
from numbers import Real
from typing import Sequence
import numpy as np
def changed_where_numpy(in_seq: np.ndarray) -> tuple[
 np.ndarray, # old values
 np.ndarray, # new values
 np.ndarray, # new indices
]:
 old_idx, = np.diff(in_seq).nonzero()
 new_idx = old_idx + 1
 return in_seq[old_idx], in_seq[new_idx], new_idx
def changed_where(in_seq: Sequence[Real]) -> dict[
 tuple[Real, Real], # old and new value
 list[int], # new value indices
]:
 old_vals, new_vals, new_idx = changed_where_numpy(np.array(in_seq))
 result = defaultdict(list)
 for old_val, new_val, new_idx in zip(old_vals, new_vals, new_idx):
 result[old_val, new_val].append(new_idx)
 return result
def test():
 in_seq = (
 1, 1, 1, 1, 1, 1,
 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
 1, 1, 1, 1, 1, 1,
 4, 4, 4, 4, 4, 4, 4, 4, 4,
 1, 1, 1, 1, 1, 1,
 )
 expected = {
 (1, 5): [6],
 (5, 1): [17],
 (1, 4): [23],
 (4, 1): [32],
 }
 actual = changed_where(in_seq)
 assert expected == actual
 in_seq = (
 1, 1, 1, 1,
 4, 4, 4, 4,
 1, 1,
 4, 4, 4, 4, 4, 4,
 3, 3, 3,
 1,
 4, 4, 4,
 )
 expected = {
 (1, 4): [4, 10, 20],
 (4, 1): [8],
 (4, 3): [16],
 (3, 1): [19]
 }
 actual = changed_where(in_seq)
 assert expected == actual
if __name__ == '__main__':
 test()

Richard Neumann Richard Neumann 6,42122 silver badges41 bronze badges · Answer 2 · 2022-03-21 21:04:38Z

1

\$\begingroup\$

Return for each pair (a, b) (where a != b) an iterator of the indices i where arr[i] == a and arr[i+1] == b.

You failed the task, since your function is not returning an iterator of indices. Also, if you read your task carefully, your indices are off by 1.

from itertools import islice
from typing import Iterator
def changes(lst: list) -> Iterator[int]:
 for index, (current, next) in enumerate(zip(lst, islice(lst, 1, None))):
 if current != next:
 yield index

Share

answered Mar 21, 2022 at 21:04

Richard Neumann's user avatar

Richard Neumann Richard Neumann

6,42122 silver badges41 bronze badges

\$\endgroup\$

3

\$\begingroup\$ Thank you for pointing out my misconception. I mixed up the words iterable and iterator when I formulated the task. I will not edit my initial question as to not make your answer invalid. \$\endgroup\$

NerdOnTour
– NerdOnTour

2022年03月22日 08:20:05 +00:00
Commented Mar 22, 2022 at 8:20
\$\begingroup\$ I see. Given your question, I assumed that the task was given to you by a third party as cited by you. \$\endgroup\$

Richard Neumann
– Richard Neumann

2022年03月22日 13:09:59 +00:00
Commented Mar 22, 2022 at 13:09
\$\begingroup\$ This solution seems to return a single iterator of all changes in the whole input list, whilst the task is (if I get it correctly) to return a separate iterator for each distinct pair of values at the constant runs' boundaries. \$\endgroup\$

CiaPan
– CiaPan

2023年12月12日 07:55:01 +00:00
Commented Dec 12, 2023 at 7:55

Add a comment |

Stack Exchange Network

Find all indices where value of array changes

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Find all indices where value of array changes

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions