Array segmenting algorithm

Question 1

I have a sorted array of integers [x, ..., n, ..., y] so that
x> 0
n - x = random positive integer

The array might look something like this:
[1, 2, 3, 15, 16, 17, 120, 121, 122, 123, 124, 125, 126, 127, 210, 211, 212, 213, 214, 215]

I want to divide this array into segments so that,
1) The difference between last and first element of a segment < 125
2) The number of segments is equal to k (predefined positive integer)
3) Sum of differences between first and last element of all segments is minimal

The end result should be a list of segments with their start and end values.
The minimum size of a segment is 1.

Suppose k = 2, then the given example array could be divided into two such segments:
Option 1) [(1, 125), (126, 215)]
Option 2) [(1, 17), (120, 215)]
Both satisfy requirements 1 and 2.

Sum of differences between first and last element of all segments:
Option 1) (125-1) + (215 - 126) = 213
Option 2) (17 - 1) + (215 - 120) = 111
Therefore option 2 is the preferred solution.

Python is the used language.

Any suggestions on how to approach this problem are welcome.

Question 2

This is CS forum - "Python is the used language" - implementation language is not relevant here. Algorithms of course are.

Question 3

From a quick look at your problem, it seems that it can be solved in $O(nk)$ time using dynamic programming.

Let $[a_1, \dots, a_n]$ be your input array and define $\delta(i,j)$ as the minimum sum of the differences between the first and last element of each segment in an optimal subdivision of $[a_1, \dots, a_i]$ into at most $j$ segments. If no feasible subdivision exists then let $\delta(i,j) = +\infty$.

Then:

$\delta(0,j)=0$ for any $j \ge 0$;
$\delta(i,0)=+\infty$ for $i>0$; and
$\delta(i,j) = \min_{\substack{h = 1, \dots, i \\ a_i - a_h < 125}} \big\{ \delta(h-1, j-1) + a_i - a_h \big\}$, for $i,j>0$ (where the minimum over an empty set is $+\infty$).

The value of the optimal solution is $\delta(n, k)$ and, since computing each $\delta(i,j)$ requires $O(1)$ time, the overall time required is $O(nk)$. The optimal solution can be reconstructed using standard techniques (e.g., by retracing the optimal choices backwards, or by storing -for each $\delta(i,j)$- the optimal value of $h$ in the minimum).

There isn't much different between the above description an the pseudocode of the algorithm:

Let A[1],...,A[n] be the elements of the input array.
Let delta be a (n+1)x(k+1) matrix of integers, whose entries are indexed from 0 and initialized with -1 (representing +infinity);
for j=0,...,k:
 delta[0][j]=0;
for i=1,...,n:
 for j=1,...,k:
 for h=i,i-1,...,1:
 if A[i] - A[h]>=125: break;
 if delta[h-1][j-1]!=-1 and (delta[i][j]==-1 or delta[h-1][j-1] + A[i] - A[h] < delta[i][j]):
 delta[i][j] = delta[h-1][j-1] + A[i] - A[h];
return delta[n][k];

Question 4

Thank you for this response, but since I am not that familiar with implementing functions of this notation in code, can you provide me some guidelines, pseudocode or references as to how to bring this solution to code.

Question 5

I added a possible pseudocode. There isn't much difference from the previous description.

Steven Steven 29.7k2 gold badges29 silver badges49 bronze badges · Answer 1 · 2020-03-11 14:55:05Z

From a quick look at your problem, it seems that it can be solved in $O(nk)$ time using dynamic programming.

Let $[a_1, \dots, a_n]$ be your input array and define $\delta(i,j)$ as the minimum sum of the differences between the first and last element of each segment in an optimal subdivision of $[a_1, \dots, a_i]$ into at most $j$ segments. If no feasible subdivision exists then let $\delta(i,j) = +\infty$.

Then:

$\delta(0,j)=0$ for any $j \ge 0$;
$\delta(i,0)=+\infty$ for $i>0$; and
$\delta(i,j) = \min_{\substack{h = 1, \dots, i \\ a_i - a_h < 125}} \big\{ \delta(h-1, j-1) + a_i - a_h \big\}$, for $i,j>0$ (where the minimum over an empty set is $+\infty$).

The value of the optimal solution is $\delta(n, k)$ and, since computing each $\delta(i,j)$ requires $O(1)$ time, the overall time required is $O(nk)$. The optimal solution can be reconstructed using standard techniques (e.g., by retracing the optimal choices backwards, or by storing -for each $\delta(i,j)$- the optimal value of $h$ in the minimum).

There isn't much different between the above description an the pseudocode of the algorithm:

Let A[1],...,A[n] be the elements of the input array.
Let delta be a (n+1)x(k+1) matrix of integers, whose entries are indexed from 0 and initialized with -1 (representing +infinity);
for j=0,...,k:
 delta[0][j]=0;
for i=1,...,n:
 for j=1,...,k:
 for h=i,i-1,...,1:
 if A[i] - A[h]>=125: break;
 if delta[h-1][j-1]!=-1 and (delta[i][j]==-1 or delta[h-1][j-1] + A[i] - A[h] < delta[i][j]):
 delta[i][j] = delta[h-1][j-1] + A[i] - A[h];
return delta[n][k];

Thank you for this response, but since I am not that familiar with implementing functions of this notation in code, can you provide me some guidelines, pseudocode or references as to how to bring this solution to code.
I added a possible pseudocode. There isn't much difference from the previous description.

Stack Exchange Network

Array segmenting algorithm

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Array segmenting algorithm

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions