I wrote the following code for a Dynamic Programming problem but the online judge gives me a 'Time Limit Exceeded' error. How can I optimize this code to make it more efficient, especially the part of initializing the DP array to 0.0 (I am using memset()
right now)?
Problem:
- There are N values in an array. Two players take turn by turn to play the game.
- Player1, with 50% probability can pick the left most value or the right most value. 3.If there is only one value left in the array, the player, having no more choice, will just pick up that value.
- Question is to calculate the expected sum of values that First player can obtain.
- \$T <= 500\$ and \$N <= 2000\$
#include<iostream>
#include<iomanip>
#include<cstdio>
#include<cstring>
using namespace std;
#define max 2000
int main(){
int T;
double **dp = new double*[max];
for(int i=0; i<max; ++i)
dp[i] = new double[max];
cin >> T;
while(T--){
int N;
cin >> N;
int V[N];
for(int i=0; i<N; ++i)
cin >> V[i] ;
for(int i=0; i<N; ++i)
memset(dp[i], 0.0, sizeof(dp[0][0])*N);
for(int i=0; i<N; ++i)
dp[i][i] = V[i];
for(int j=1; j<N; ++j)
for(int i=j-1; i>=0; --i){
dp[j][i] = 0.5 * ( dp[i][i] + dp[j][j] + dp[j-1][i+1] );
if( (i+2)<N )
dp[j][i] += (0.250 * dp[j][i+2]);
if( (j-2)>=0 )
dp[j][i] += (0.250 * dp[j-2][i]);
}
cout << setprecision (5) << dp[N-1][0] << endl;
}
for(int i=0; i<max; ++i)
delete[] dp[i];
delete dp;
return 0;
}
-
\$\begingroup\$ Is there a public url that you could point us to for information on the problem? \$\endgroup\$Glenn Rogers– Glenn Rogers2012年10月03日 12:15:15 +00:00Commented Oct 3, 2012 at 12:15
-
\$\begingroup\$ Glenn: I've updated the Question. \$\endgroup\$srbh.kmr– srbh.kmr2012年10月03日 13:39:51 +00:00Commented Oct 3, 2012 at 13:39
-
\$\begingroup\$ What is the time limit? Do you have any idea how much you're failing it by? That should help indicate if another approach is required. \$\endgroup\$Glenn Rogers– Glenn Rogers2012年10月03日 18:04:48 +00:00Commented Oct 3, 2012 at 18:04
-
\$\begingroup\$ sadly, I don't get to know the time when my code gets interuppted but for this question they give 4sec on their servers for K<=500 and N<=2000. I too guess theres' nothing much left to optimize so I've to come up with something better than O(N^2). like- O(NlogN) or O(N). Thanks for all your help guys. \$\endgroup\$srbh.kmr– srbh.kmr2012年10月03日 18:34:48 +00:00Commented Oct 3, 2012 at 18:34
2 Answers 2
It looks like the nested loops of your algorithm are using only three rows of the dp
matrix:
dp[j]
dp[j-1]
dp[j-2]
You can use this observation to transform your solution from one requiring O(N^2)
space to one requiring O(N)
space by changing
double **dp = new double*[max];
to
double dp[3][max]; // Now your matrix has a chance of fitting on the stack
and adding %3
to all operations that touch the first index of the dp
matrix.
This may or may not eliminate the timeout, but it will make your program better by reducing the amount of resources that it needs to operate.
-
\$\begingroup\$ Thanks for suggesting a memory efficient approach. I've updated my code but it still goes over the allowed Time Limit! \$\endgroup\$srbh.kmr– srbh.kmr2012年10月03日 17:09:22 +00:00Commented Oct 3, 2012 at 17:09
-
\$\begingroup\$ @srbh.kmr Does it produce the correct output? Can you switch from
double
tofloat
? \$\endgroup\$Sergey Kalinichenko– Sergey Kalinichenko2012年10月03日 17:11:06 +00:00Commented Oct 3, 2012 at 17:11 -
\$\begingroup\$ I think so. Yes, it never gives me a 'Wrong answer' just runs out of time when submitted. and yes, I can change double to float but what difference would it make specifically? i wonder. \$\endgroup\$srbh.kmr– srbh.kmr2012年10月03日 17:15:02 +00:00Commented Oct 3, 2012 at 17:15
-
\$\begingroup\$ @srbh.kmr Changing
double
s tofloat
s will reduce the amount of data needed for computation in half, improving the chance of hitting the cache more often. \$\endgroup\$Sergey Kalinichenko– Sergey Kalinichenko2012年10月03日 17:26:50 +00:00Commented Oct 3, 2012 at 17:26 -
\$\begingroup\$ Oh. Right. I was under the wrong impression that both double and float are 4Bytes. \$\endgroup\$srbh.kmr– srbh.kmr2012年10月03日 17:33:34 +00:00Commented Oct 3, 2012 at 17:33
There are some things that I can think of:
- Is it quicker to deal with one blob of size max*max, and replacing dp[i][j] with dp[i+j*N]?
- Are you sure that your values are doubles, or could they be 32, 16, or even 8-bit integers? - less space to alloc/memset. Similarly, is your 2000 limit a specified one or one "large enough"?
- I'm not convinced that you actually need memset at all (You'll need to guard the
dp[j-1][i+1]
term which reaches below the diagonal for the first cells)! Because you're always refering to values to the 'bottom left' of the current cell, you never access values generated by the previous run... - Also, because you "shouldn't" access values below the diagonal, all the dp[i] lines are actually (N - i) in length, so less to alloc/zero. You will though need to make other changes to the indices in the loops (and especially guard the
dp[j-1][i+1]
term).
And although it won't actually save any time, you can 'squash' a couple of loops together.
for(int i=0; i<N; ++i) {
int V;
cin >> V;
memset( dp[i], 0, sizeof(dp[0][0])*N );
dp[i][i] = V;
}
And don't forget that the final delete should be delete[] dp;
!
-
\$\begingroup\$ Thanks for your review. Points noted but yes, result has to be a double. and T<=500 and N<=2000 is specified. and I need to do memset in order to zero out the dp[][] array for every new testcases. I know I'm allocating almost twice as many elements as I need to, but please take a look at my updated code. It still gives me 'Time Limit Exceeded' error! \$\endgroup\$srbh.kmr– srbh.kmr2012年10月03日 17:12:25 +00:00Commented Oct 3, 2012 at 17:12
-
\$\begingroup\$ Actually, yes I can see I don't need that memset at all. Thanks, \$\endgroup\$srbh.kmr– srbh.kmr2012年10月03日 17:30:39 +00:00Commented Oct 3, 2012 at 17:30
Explore related questions
See similar questions with these tags.