1
$\begingroup$

I have a sorted array of integers [x, ..., n, ..., y] so that
x> 0
n - x = random positive integer

The array might look something like this:
[1, 2, 3, 15, 16, 17, 120, 121, 122, 123, 124, 125, 126, 127, 210, 211, 212, 213, 214, 215]

I want to divide this array into segments so that,
1) The difference between last and first element of a segment < 125
2) The number of segments is equal to k (predefined positive integer)
3) Sum of differences between first and last element of all segments is minimal

The end result should be a list of segments with their start and end values.
The minimum size of a segment is 1.

Suppose k = 2, then the given example array could be divided into two such segments:
Option 1) [(1, 125), (126, 215)]
Option 2) [(1, 17), (120, 215)]
Both satisfy requirements 1 and 2.

Sum of differences between first and last element of all segments:
Option 1) (125-1) + (215 - 126) = 213
Option 2) (17 - 1) + (215 - 120) = 111
Therefore option 2 is the preferred solution.

Python is the used language.

Any suggestions on how to approach this problem are welcome.

asked Mar 11, 2020 at 12:18
$\endgroup$
1
  • $\begingroup$ This is CS forum - "Python is the used language" - implementation language is not relevant here. Algorithms of course are. $\endgroup$ Commented Mar 11, 2020 at 14:16

1 Answer 1

2
$\begingroup$

From a quick look at your problem, it seems that it can be solved in $O(nk)$ time using dynamic programming.

Let $[a_1, \dots, a_n]$ be your input array and define $\delta(i,j)$ as the minimum sum of the differences between the first and last element of each segment in an optimal subdivision of $[a_1, \dots, a_i]$ into at most $j$ segments. If no feasible subdivision exists then let $\delta(i,j) = +\infty$.

Then:

  • $\delta(0,j)=0$ for any $j \ge 0$;
  • $\delta(i,0)=+\infty$ for $i>0$; and
  • $\delta(i,j) = \min_{\substack{h = 1, \dots, i \\ a_i - a_h < 125}} \big\{ \delta(h-1, j-1) + a_i - a_h \big\}$, for $i,j>0$ (where the minimum over an empty set is $+\infty$).

The value of the optimal solution is $\delta(n, k)$ and, since computing each $\delta(i,j)$ requires $O(1)$ time, the overall time required is $O(nk)$. The optimal solution can be reconstructed using standard techniques (e.g., by retracing the optimal choices backwards, or by storing -for each $\delta(i,j)$- the optimal value of $h$ in the minimum).

There isn't much different between the above description an the pseudocode of the algorithm:

Let A[1],...,A[n] be the elements of the input array.
Let delta be a (n+1)x(k+1) matrix of integers, whose entries are indexed from 0 and initialized with -1 (representing +infinity);
for j=0,...,k:
 delta[0][j]=0;
for i=1,...,n:
 for j=1,...,k:
 for h=i,i-1,...,1:
 if A[i] - A[h]>=125: break;
 if delta[h-1][j-1]!=-1 and (delta[i][j]==-1 or delta[h-1][j-1] + A[i] - A[h] < delta[i][j]):
 delta[i][j] = delta[h-1][j-1] + A[i] - A[h];
return delta[n][k];
answered Mar 11, 2020 at 14:55
$\endgroup$
2
  • $\begingroup$ Thank you for this response, but since I am not that familiar with implementing functions of this notation in code, can you provide me some guidelines, pseudocode or references as to how to bring this solution to code. $\endgroup$ Commented Mar 14, 2020 at 9:02
  • $\begingroup$ I added a possible pseudocode. There isn't much difference from the previous description. $\endgroup$ Commented Mar 14, 2020 at 12:27

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.