\$\begingroup\$

I am developing a framework that lets users specify a machine learning model via a YAML file, with the parameters nested so that the configuration files are easy for humans to read.

I would like to give users the option of supplying a list of values to try for a parameter, instead of a single value.

I then have to generate all valid combinations of the parameters for which the user has given multiple values.

To distinguish parameters whose value really is a list from parameters that offer multiple values to try, I have chosen to require that the names of the latter begin with 'multi_' (though if you have a different take I would be interested to hear it!).

So, for example, a user could write:

config = {
 'train_config': {'param1': 1, 'param2': [1,2,3], 'multi_param3':[2,3,4]}, 
 'model_config': {'cnn_layers': [{'units':3},{'units':4}], 'multi_param4': [[1,2], [3,4]]}
}

This indicates that 6 configuration files must be generated (3 values for 'param3' times 2 values for 'param4'), covering every combination of the two.

I have written a generator function to do this:

# nested_to_record is a private pandas helper; newer pandas releases
# expose it from pandas.io.json._normalize instead.
from pandas.io.json.normalize import nested_to_record
import itertools
import operator
from functools import reduce
from collections.abc import MutableMapping  # moved out of collections in Python 3.10
from contextlib import suppress

def generate_multi_conf(config):
    flat = nested_to_record(config)
    flat = { tuple(key.split('.')): value for key, value in flat.items()}
    multi_config_flat = { key[:-1] + (key[-1][6:],) : value for key, value in flat.items() if key[-1][:5]=='multi'}
    if len(multi_config_flat) == 0: return # if there are no multi params this generator is empty
    keys, values = zip(*multi_config_flat.items())
    # delete the multi_params
    # taken from https://stackoverflow.com/a/49723101/4841832
    def delete_keys_from_dict(dictionary, keys):
        for key in keys:
            with suppress(KeyError):
                del dictionary[key]
        for value in dictionary.values():
            if isinstance(value, MutableMapping):
                delete_keys_from_dict(value, keys)
    to_delete = ['multi_' + key[-1] for key, _ in multi_config_flat.items()]
    delete_keys_from_dict(config, to_delete)
    for values in itertools.product(*values):
        experiment = dict(zip(keys, values))
        for setting, value in experiment.items():
            reduce(operator.getitem, setting[:-1], config)[setting[-1]] = value
        yield config

Iterating over this with the example above gives:

{'train_config': {'param1': 1, 'param2': [1, 2, 3], 'param3': 2}, 'model_config': {'cnn_layers': [{'units': 3}, {'units': 4}], 'param4': [1, 2]}}
{'train_config': {'param1': 1, 'param2': [1, 2, 3], 'param3': 2}, 'model_config': {'cnn_layers': [{'units': 3}, {'units': 4}], 'param4': [3, 4]}}
{'train_config': {'param1': 1, 'param2': [1, 2, 3], 'param3': 3}, 'model_config': {'cnn_layers': [{'units': 3}, {'units': 4}], 'param4': [1, 2]}}
{'train_config': {'param1': 1, 'param2': [1, 2, 3], 'param3': 3}, 'model_config': {'cnn_layers': [{'units': 3}, {'units': 4}], 'param4': [3, 4]}}
{'train_config': {'param1': 1, 'param2': [1, 2, 3], 'param3': 4}, 'model_config': {'cnn_layers': [{'units': 3}, {'units': 4}], 'param4': [1, 2]}}
{'train_config': {'param1': 1, 'param2': [1, 2, 3], 'param3': 4}, 'model_config': {'cnn_layers': [{'units': 3}, {'units': 4}], 'param4': [3, 4]}}

This is the expected result.

Any feedback on how to make this code more readable would be very much appreciated!

alecxe
asked Dec 24, 2018 at 16:26
\$\endgroup\$

1 Answer

\$\begingroup\$

For non-trivial comprehensions (this one is a dict comprehension) such as

multi_config_flat = { key[:-1] + (key[-1][6:],) : value for key, value in flat.items() if key[-1][:5]=='multi'}

you should split it across multiple lines, i.e.

multi_config_flat = {key[:-1] + (key[-1][6:],): value
                     for key, value in flat.items()
                     if key[-1][:5]=='multi'}

This:

key[-1][:5]=='multi'

should be

key[-1].startswith('multi')
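As a small illustration of the prefix check, the sketch below pairs startswith with a named prefix constant, so the slice length is derived from the prefix instead of being hard-coded as 6. The names MULTI_PREFIX and strip_multi_prefix are hypothetical and do not appear in the original code. Note that this version tests for 'multi_' including the underscore, which is slightly stricter than the 'multi' test above.

```python
MULTI_PREFIX = 'multi_'  # hypothetical constant, not in the original code


def strip_multi_prefix(name):
    """Return name without the prefix, or None if it is not a multi key."""
    if name.startswith(MULTI_PREFIX):
        # Slice by the prefix length rather than a magic number.
        return name[len(MULTI_PREFIX):]
    return None


print(strip_multi_prefix('multi_param3'))  # param3
print(strip_multi_prefix('param2'))        # None
```

Keeping the prefix in one place means the filter and the renaming cannot drift apart if the marker is ever changed.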

This:

if len(multi_config_flat) == 0: return

is equivalent (more or less) to

if not multi_config_flat:
    return

The latter also catches the case of multi_config_flat being None, but that won't be possible in this context.
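To make the difference concrete, here is a minimal sketch of how the truthiness test treats an empty dict, None, and a populated dict. The function name describe is only an illustration and is not part of the reviewed code.

```python
def describe(mapping):
    # An empty dict and None are both falsy, so "not mapping" covers both.
    if not mapping:
        return 'empty'
    return 'non-empty'


print(describe({}))        # empty
print(describe(None))      # empty
print(describe({'a': 1}))  # non-empty

# By contrast, len(None) raises TypeError rather than returning 0.
```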

This:

for key, _ in multi_config_flat.items():

is not necessary; simply iterate over keys:

for key in multi_config_flat:

This is fairly opaque:

reduce(operator.getitem, setting[:-1], config)[setting[-1]] = value

You should probably assign the output of reduce to a meaningfully named variable so that the code is clearer.
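One possible way to do that is sketched below: the reduce call walks down to the dictionary that holds the final key, and that dictionary gets its own name before the assignment. The helper name set_nested and the variable names are assumptions made for illustration; only reduce and operator.getitem come from the original code.

```python
import operator
from functools import reduce


def set_nested(config, path, value):
    """Set config[path[0]]...[path[-1]] = value, where path is a tuple of keys."""
    *parent_path, last_key = path
    # Walk down to the dict that directly holds the final key.
    parent = reduce(operator.getitem, parent_path, config)
    parent[last_key] = value


config = {'train_config': {'param1': 1}, 'model_config': {}}
set_nested(config, ('train_config', 'param3'), 2)
set_nested(config, ('model_config', 'param4'), [1, 2])
print(config)
# {'train_config': {'param1': 1, 'param3': 2}, 'model_config': {'param4': [1, 2]}}
```

Inside the generator, the inner loop body would then read set_nested(config, setting, value), which states the intent without requiring the reader to decode the reduce expression.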

Stephen Rauch
answered Dec 24, 2018 at 20:08
\$\endgroup\$
  • for key in multi_config_flat.keys() can simply be for key in multi_config_flat, as the default iterator is keys(). Commented Dec 24, 2018 at 21:30
