10
\$\begingroup\$

I have written a script which is supposed to call the function at the bottom for each unique combination of values in the parameter vectors. That function is a simulation which I intend to run over all the unique combinations of those parameters. The simulation is run on a cluster, and the sim_ind for each of these jobs is pulled from the environment.

Currently, my solution is to build a matrix with each row corresponding to a unique set of parameters. Then the sim_ind value (passed from the environment) is used to determine which unique set of parameters will be used in that iteration by indexing the matrix.

The following is the code that I am using, which works:

% getting sim_ind from env
AI = getenv('PBS_ARRAYID');
sim_ind = str2num(AI);
time_slots = 40000;
% parameter vectors
up_down_ratio_arr = [1];
traffic_load_arr = [ 0.01, 0.05, 0.1, 0.5, 1, 2, 5, 10, 100, Inf ];
blocked_modes_ind_arr = [1 2 3];
traffic_model_arr = [1 2];
% total number of unique combinations
uniq = length(up_down_ratio_arr) * length(traffic_load_arr) * length(blocked_modes_ind_arr) * length(traffic_model_arr);
parameters = zeros(uniq, 4);
% propagating the parameter list matrix
counter = 0;
for i = up_down_ratio_arr
 for j = traffic_load_arr
 for k = blocked_modes_ind_arr
 for l = traffic_model_arr
 counter = counter + 1;
 parameters(counter, 1) = i;
 parameters(counter, 2) = j;
 parameters(counter, 3) = k;
 parameters(counter, 4) = l;
 end
 end
 end
end
% wrapping the sim_ind
ind = mod(sim_ind, uniq);
% mod maps all multiples of the largest index to zero
if ind == 0
 ind = uniq;
end
% retrieving a parameter set
up_down_ratio = parameters(ind,1);
traffic_load = parameters(ind,2);
blocked_modes_ind = parameters(ind,3);
traffic_model = parameters(ind,4);
% call the routine
MainCode_Single_Backahul_Cell_Sch(time_slots, up_down_ratio, traffic_load, blocked_modes_ind, traffic_model, ceil(sim_ind/uniq) );

Is there a better way?

I have a version in which I use mod to map each value of the sim_ind parameter to a value that exists in the range of each vector. The problem with this solution was that not all the combination were unique. For example, in the following code traffic_load_arr is of size 10, and traffic_model_arr is of size 2. Therefore, whenever the 8th element of traffic_load_arr is selected; i.e. mod(sim_ind, 10) == 8, then the 2nd element of traffic_model_arr must also be selected. As a results, the parameter combination of traffic_load == 10 and traffic_model == 2 will never be formed.

I would appreciate for all code to be cross compatible with MATLAB and Octave.

asked Aug 4, 2016 at 22:15
\$\endgroup\$

2 Answers 2

8
\$\begingroup\$

For a full factorial experiment, you may want to avoid constructing the whole matrix just to get the factor levels for a single experiment, especially if you have a large number of factors and levels and you are using a language where loops are not very efficient. Here is how you can do it using mod.

In a single factor experiment, you run the experiment over the range of values of the first factor. This is easy to do using mod: use mod(sim_ind, nlevels_factor1), or

mod(sim_ind-1, nlevels_factor1) + 1

if you want to start at one (since MATLAB indexes arrays from 1).

To extend the experiment to a second factor, you want this new factor to have its first level value for the first nlevels_factor1 runs, then its second level value for the second nlevels_factor1 runs, etc... all the way up till the nlevels_factor1*nlevels_factor2th run.

You'll use mod again, but now you want the first argument to mod to be 1 for the first nlevels_factor1 runs, 2 for the next nlevels_factor1 runs, etc. So instead of passing the sim_index (or sim_index-1) to mod, you'll pass ceil(sim_index/nlevels_factor1), which has value 1 for the first nlevels_factor1 runs, 2 for the next nlevels_factor1 runs, etc. Your factor level for the second factor will be

mod(ceil(sim_index/nlevels_factor1) - 1, nlevels_factor2) + 1

if you want to start from 1.

You can continue this procedure to extend your experiment to any number of factors. Just make the divisor inside the ceil equal to the product of the number of levels of all of the previous factors.

Here's what it might look like for your example:

% getting sim_ind from env
AI = getenv('PBS_ARRAYID');
sim_ind = str2num(AI);
time_slots = 40000;
% parameter vectors
up_down_ratio_arr = [1];
traffic_load_arr = [ 0.01, 0.05, 0.1, 0.5, 1, 2, 5, 10, 100, Inf ];
blocked_modes_ind_arr = [1 2 3];
traffic_model_arr = [1 2];
size_up_down_ratio = length(up_down_ratio_arr);
size_traffic_load = length(traffic_load_arr);
size_blocked_modes = length(blocked_modes_ind_arr);
size_traffic_model = length(traffic_model_arr);
% total number of unique combinations
uniq = size_up_down_ratio*size_traffic_load*size_blocked_modes*size_traffic_model;
% repeat each value once: 1,2,3,4,5,6,7,8,9,10,1,2,3,4...
i_up_down_ratio = mod(sim_ind - 1,size_up_down_ratio) + 1;
% repeat each value size_so_far times:
% 1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3...
size_so_far = size_up_down_ratio;
i_traffic_load = mod(ceil(sim_ind/size_so_far) - 1, size_traffic_load) + 1;
size_so_far = size_so_far*size_traffic_load;
i_blocked_modes = mod(ceil(sim_ind/size_so_far) - 1, size_blocked_modes) + 1;
size_so_far = size_so_far*size_blocked_modes;
i_traffic_model = mod(ceil(sim_ind/size_so_far) - 1, size_traffic_model) + 1;
up_down_ratio = up_down_ratio_arr(i_up_down_ratio);
traffic_load = traffic_load_arr(i_traffic_load);
blocked_modes_ind = blocked_modes_ind_arr(i_blocked_modes);
traffic_model = traffic_model_arr(i_traffic_model);
% call the routine
MainCode_Single_Backahul_Cell_Sch(time_slots, up_down_ratio, traffic_load, blocked_modes_ind, traffic_model, ceil(sim_ind/uniq) );
answered Aug 5, 2016 at 7:15
\$\endgroup\$
1
  • 1
    \$\begingroup\$ Very nice! =) +1 from me! \$\endgroup\$ Commented Aug 5, 2016 at 7:27
7
\$\begingroup\$

NOTE: The code in this answer is rather compact, in order to avoid a lot of scrolling. You might want to put in some additional spaces and line breaks.

You know how to do this! Your code looks very nice and clean!

But, this hurts my MATLAB eyes:

for i = up_down_ratio_arr
 for j = traffic_load_arr
 for k = blocked_modes_ind_arr
 for l = traffic_model_arr
 counter = counter + 1;
 parameters(counter, 1) = i;
 parameters(counter, 2) = j;
 parameters(counter, 3) = k;
 parameters(counter, 4) = l;
 end
 end
 end
end

It's the prime example of something that should be done without loops... One thing that was hurting even more was that I apparently couldn't make a vectorized approach that was faster. The reason why it appeared to be slow was because I tried timing it with tic/toc.

Tips: Don't use tic/toc when benchmarking, use the superior timeit! When benchmarking with timeit I got results that was much in favor of vectorization. (Phew)

The vectorized approach (creds to Luis Mendo) in a function, so that I can time it:

function parameters = vectorized_approach(up_down_ratio_arr, traffic_load_arr, blocked_modes_ind_arr, traffic_model_arr)
vectors = {up_down_ratio_arr, traffic_load_arr, blocked_modes_ind_arr, traffic_model_arr};
n = numel(vectors); %// number of vectors
parameters = cell(1,n); %// pre-define to generate comma-separated list
[parameters{end:-1:1}] = ndgrid(vectors{end:-1:1}); %// the reverse order in these two
%// comma-separated lists is needed to produce the rows of the result matrix in
%// lexicographical order
parameters = cat(n+1, parameters{:}); %// concat the n n-dim arrays along dimension n+1
parameters = reshape(parameters,[],n);
end

Your loop approach is placed in the function loop_approach (I won't repeat it here).

up_down_ratio_arr = [1]; 
traffic_load_arr = [ 0.01, 0.05, 0.1, 0.5, 1, 2, 5, 10, 100, Inf ];
blocked_modes_ind_arr = [1 2 3];
traffic_model_arr = [1 2];
f = @() loop_approach(up_down_ratio_arr, traffic_load_arr, blocked_modes_ind_arr, traffic_model_arr);
g = @() vectorized_approach(up_down_ratio_arr, traffic_load_arr, blocked_modes_ind_arr, traffic_model_arr);
isequal(f(),g()) 
ans = 
 1 
timeit(f)/timeit(g)
ans =
 1.6072

So, for your input variables, the vectorized approach is about 60% faster than the loops. Let's make the up_down_ratio_arr a bit longer:

up_down_ratio_arr = [1:5];
f = @() loop_approach(up_down_ratio_arr, traffic_load_arr, blocked_modes_ind_arr, traffic_model_arr);
g = @() vectorized_approach(up_down_ratio_arr, traffic_load_arr, blocked_modes_ind_arr, traffic_model_arr);
isequal(f(),g())
ans =
 1
timeit(f)/timeit(g)
ans =
 7.2192 

So, the vectorized approach is quite a bit faster for the data you posted in the question, and a lot faster for larger data sets.


Let's continue:

In general, numel is better than length, at least for long vectors. I recommend you get used to using it every time!

This, could be simplified:

if ind == 0
 ind = uniq;
end

This does the same job, and avoids the if: ind(ind == 0) = uniq;.

I don't think I would have done ceil(sim_ind/uniq) in the function call. I suggest you calculate this and give it a descriptive name (probably the same as in the function you're calling). Then use that variable as input argument.

answered Aug 5, 2016 at 7:15
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.