3
\$\begingroup\$

I have written the following code for diving a list 'seq' into 'k' segments such that the maximum sum of each segment is minimized. I have used dynamic programming to solve this problem. But my class "Segmentation" seems a little overwhelmed with too many redundant functions. Can anybody give me a few suggestions on how can I improve this code? I am even using separate functions for returning boolean values and calculating the average. Also using raise SystemExit because I do not want to add new library only for this line. Please provide your valuable feedback on the below code.

"""
Calculate the optimal k-segmentation of a sequence using dynamic programming
Author: Jahid Chowdhury Choton ([email protected])
The sub-problems for dynamic programming are given below:
Base case: Opt(i,j,M) = cost(S[:i]-M) if j==1
Recursion: Opt(i,j,M) = min( max( Opt(l,j-1, M), cost(S[l:i])-M) for l=1,2,...,i)
"""
class Segmentation:
 def __init__(self, seq, k):
 self.seq = seq
 self.k = k
 self.n = len(seq)
 self.M = self.avg() # Parameter M is taken to be the average of the sequence
 self.opt_val, self.opt_segments = self.opt() # Calculate optimal values and segmentation
 def avg(self): # To calculate average of sequence seq
 return sum(self.seq) / self.n
 def is_odd(self, val): # To return a boolean value 0 or 1
 return val % 2
 def initialize_cost_and_dp(self): # To initialize cost array, DP array and traceback array
 cost_sum = [0] * (self.n + 1) # Initialize cost array with zeros
 dp = [[-1.0 for _ in range(self.n+1)] for _ in range(2)] # Tabulation array for DP
 tb = [[[] for _ in range(self.n+1)] for _ in range(2)] # Traceback array for optimal segmentation
 for i in range(0, self.n): # O(n)
 # Add the elements of the sequence seq into cost array
 cost_sum[i + 1] = self.seq[i] + cost_sum[i]
 # Initialize DP array with infinity values
 dp[0][i+1] = float('inf')
 dp[1][i+1] = float('inf')
 # print(cost_sum, dp, tb)
 return cost_sum, dp, tb
 def generate_segmentation(self, traceback): # To generate k-segmentation of sequence seq using traceback array
 # Separate the traceback array into two parts starts and ends
 starts = traceback[self.is_odd(self.k)][self.n]
 ends = starts[1:] + [self.n]
 # print(starts, ends)
 k_segments = [self.seq[s:e] for s, e in zip(starts, ends)] # Concatenate the two parts for optimal segments
 return k_segments
 def opt(self): # To calculate the optimal value and segmentation using dp
 if self.n < self.k:
 print("Not enough cells for k-segmentation, now exiting the program...")
 raise SystemExit # Program will terminate if k is less than the segmented cells
 costSum, dp, tb = self.initialize_cost_and_dp() # Get the initial values of cost array, DP array and TB array
 for i in range(1, self.k + 1): # O(k)
 for j in range(1, self.n + 1): # O(n)
 for l in range(i - 1, j): # O(n-k)
 curr_val = max(dp[self.is_odd(i - 1)][l],
 abs((costSum[j] - costSum[l]) - self.M)) # Check the current max value
 prev_val = dp[self.is_odd(i)][j] # Check the previous value
 if curr_val < prev_val: # Check the min value and update it in DP and traceback array
 dp[self.is_odd(i)][j] = curr_val # Update the min value in dp array
 tb[i % 2][j] = tb[self.is_odd(i - 1)][l] + [l] # Update the breakpoint in traceback array
 optimal_value = dp[self.is_odd(self.k)][self.n] # Calculate the optimal value
 optimal_segments = self.generate_segmentation(tb) # Generate the optimal segments
 return optimal_value, optimal_segments
# Driver Code
def main():
 # S = [3, 7, 7, 33]
 # S = [3, 4, 2, 2, 6, 7, 43, 100000, 4, 3, 11]
 # S = [3, 4, 2, 2, 6, 7, 43, 50, 43, 3, 11]
 # S = [3, 4, 2, 2, 6, 7, 13, 10, 43, 50, 5]
 # S = [1,1,1,5,5,5]
 S = [3,8,7,1,11]
 k = 3
 # Call the class Segmentation
 k_segmentation = Segmentation(S, k)
 # Get the optimal value and segmentation
 optimal_values = k_segmentation.opt_val
 optimal_segmentation = k_segmentation.opt_segments
 # Print the optimal value and segmentation
 print("Optimal value =", optimal_values)
 print("Optimal k-segmentation = ", optimal_segmentation)
if __name__ == '__main__':
 main()
200_success
145k22 gold badges190 silver badges478 bronze badges
asked Sep 3, 2022 at 19:13
\$\endgroup\$

1 Answer 1

2
\$\begingroup\$

One could approach this from the goal:

What is the business of a Segmentation instance, what are its business methods? You tagged object-oriented: What is the value added by having a Segmentation(sequence, k) instance over an opt(sequence, k) function?

In your main(), you don't (syntactically) call any method of k_segmentation.
A first improvement would be to take the call to opt() out of the constructor, rendering "the opt_... attributes" obsolete.

While I can "see" the segmentation part opt() returns, what is that optimal_value?
Document both in a docstring for opt()!

To get some advantage from a Segmentation instance, have it keep information useful for more than one access.
If you construct it with just the sequence and pass the number of parts as a parameter to opt(k), you can look for information independent of k - M is, cost_sum looks a candidate.


  • The methods are commented for what they do. Not bad.
    Docstrings would be even better, starting with help(Segmentation.opt).
    With opt(), a description of how it works would add value.

  • Naming
    is_odd() is named for how it works - I'd prefer current(i) and previous(i).
    (There is one instance of i % 2 not substituted.)
    costSum isn't snake_case (nor is M).
    • internal methods like initialize_cost_and_dp maybe should start with an underscore.
    • Method/function names with an and are a code smell.

From Python 3.10, generate_segmentation() could use itertools.pairwise.


I think all that indexing with [self.is_odd(i)] for current (i - 1 for previous) reduces readability - if you want to keep the mechanism, you can use descriptors -
I'd likely go for named attributes & an explicit swap at the end of the outer loop.

answered Sep 4, 2022 at 12:07
\$\endgroup\$
1
  • \$\begingroup\$ Thanks a lot for your useful comments. It really helped a lot! \$\endgroup\$ Commented Sep 6, 2022 at 18:27

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.