I noticed I tend to use many if-conditions in my code, and I know there is a way to simplify this with one-liners.
The function is iterating through configuration dictionaries so that I can log parameters in MLflow not as a long non-readable string:
but as key-value pairs:
!pip install mlflow
!python -m pip install -e detectron2
import mlflow
import detectron2
from detectron2.config import get_cfg, CfgNode
cfg = get_cfg()
for k, v in cfg.items():
# (1) check if params are being logged as a long hard-to-read string
if type(v) == detectron2.config.config.CfgNode:
# (1) if yes, iterate through it to log separately
for kk, vv in v.items():
# (2) check again if params are being logged as a long hard-to-read string
if type(vv) == detectron2.config.config.CfgNode:
# (2) if yes, iterate through it to log separately
for kkk, vvv in vv.items():
# (3) check again if params are being logged as a long hard-to-read string
if type(vvv) == detectron2.config.config.CfgNode:
# (3) if yes, iterate through it to log separately
for kkkk, vvvv in vvv.items():
# (3) generate param_name
param_name = k + '.' + kk + '.' + kkk + '.' + kkkk
print(param_name, ':', vvvv)
# (3) log if exists
if vvvv:
mlflow.log_param(param_name, vvvv)
# (3) if not a dict
else:
# (3) generate param_name
param_name = k + '.' + kk + '.' + kkk
print(param_name, ':', vvv)
# (3) log if variable exists
if vvv:
mlflow.log_param(param_name, vvv)
# (2) if not a dict
else:
# (2) generate param_name
param_name = k + '.' + kk
print(param_name, ':', vv)
# (2) log if exists
if vv:
mlflow.log_param(param_name, vv)
# (1) if not a dict
else:
print(k, ':', v)
# (1) log if exists
if v:
mlflow.log_param(k, v)
I would gladly provide you with more details if necessary. Thanks in advance for all the tips and suggestions!
-
1\$\begingroup\$ Code Review needs code that can be run by reviewers, can you give an example input and output so it's executable by reviewers? \$\endgroup\$lukstru– lukstru2022年07月15日 10:29:53 +00:00Commented Jul 15, 2022 at 10:29
-
\$\begingroup\$ @lukstru added input, output is in the photo \$\endgroup\$Kami– Kami2022年07月15日 11:17:45 +00:00Commented Jul 15, 2022 at 11:17
-
\$\begingroup\$ Also could you give your variables some names that have some meaning too please or use _ if they're not needed. It looks like a function could be useful, but it's tricky to read with all those single or double letters eg use: node or log_item or error_log_item_list. I'm just making up some names but I'm sure you get the drift. Or _ \$\endgroup\$Lozminda– Lozminda2022年07月15日 12:32:30 +00:00Commented Jul 15, 2022 at 12:32
-
\$\begingroup\$ @Lozminda thanks for the input. in my case, k = key and v = value, I thought such notation was standard while iterating through a dictionary. I admit that kkkk and vvvv are not as obvious, but also had no idea what naming would be better. giving a name like k_dict_num_3 seems redundant to me. but I would love to hear any name suggestions, that make the code more readable in my case \$\endgroup\$Kami– Kami2022年07月15日 12:49:48 +00:00Commented Jul 15, 2022 at 12:49
-
\$\begingroup\$ @Kami You know what you're looking at having worked on it for a day/week/month. The folks at code review are giving their time for nothing, so anything you can do to make our lives a bit easier, the better. After looking through everything you've posted (for 15-20 mins) I now kinda see what you're trying to achieve. My question is: Can what you call all your varaibles help reduce that time ? Maybe the answer is no ? Maybe yes. Re conventions they're always debated. Will explore more deeply if I have time later. :-) \$\endgroup\$Lozminda– Lozminda2022年07月15日 13:09:52 +00:00Commented Jul 15, 2022 at 13:09
2 Answers 2
It looks like you are basically trying to "flatten" a nested dict. A recursive generator works nicely. For each key,value pair in the dict, make a recursive call if the value is a dict, otherwise yield the key,value pair (the base case).
Note: you shouldn't use type(cfg) == some_class
. Use isinstance()
instead.
from collections.abc import Mapping
def flatten_cfg(cfg, keys=''):
for key, value in cfg.items():
new_keys = f"{keys}.{key}" if keys else key
if isinstance(value, Mapping):
yield from flatten_cfg(value, new_keys)
else:
yield (new_keys, value)
Use it like so:
cfg = {
"ANCHOR":{
"GENERATOR":{
"ANGLES":[[-90, 0, 90]],
"ASPECT_RATIOS":[[0,5, 1.0, 2.0]],
"NAME":"DefaultAnchorGenerator",
"SIZES":[[32], [64], [128], [256], [512]]
}
},
"AUG":{
"FLIP":True,
"MAX_SIZE":4000,
"MIN_SIZES":(400,500,600,700,800,900,1000,1100,1200)
},
"BACKBONE":{
"FREEZE_AT":2,
"NAME":"build_resnet_fpn_backbone"
}
}
flat_cfg = list(flatten_cfg(cfg))
width = max(len(name) for name,_ in flat_cfg)
for name, value in flat_cfg:
print(f"{name:{width}s} {value}")
Output:
ANCHOR.GENERATOR.ANGLES [[-90, 0, 90]]
ANCHOR.GENERATOR.ASPECT_RATIOS [[0, 5, 1.0, 2.0]]
ANCHOR.GENERATOR.NAME DefaultAnchorGenerator
ANCHOR.GENERATOR.SIZES [[32], [64], [128], [256], [512]]
AUG.FLIP True
AUG.MAX_SIZE 4000
AUG.MIN_SIZES (400, 500, 600, 700, 800, 900, 1000, 1100, 1200)
BACKBONE.FREEZE_AT 2
BACKBONE.NAME build_resnet_fpn_backbone
-
\$\begingroup\$ just tested your code, does the same as mine, but much cleaner! thank you so much! I knew I could extract the functionality of the same if-else-conditions, but wasn't sure how. I've never used yield, but now I know how. and also great isinstance() tip! \$\endgroup\$Kami– Kami2022年07月18日 08:55:50 +00:00Commented Jul 18, 2022 at 8:55
I noticed I tend to use many if-conditions in my code
Conditionals are not necessarily bad, and code can contain lots of them. They are however hard to reason about when they're nested. However, you have made your work to fix them easier for yourself by adding convenient comments:
# (1) if yes, iterate through it to log separately
These comments describe what the block below does, and provide good indicators for code that can be extracted, and usually provide a good name as well. If you use Jetbrains products like IntelliJ, the IDE can do most of the work for you: mark the block to be extracted -> right click -> Refactor -> Extract. If not, refactoring.guru/extract-method has a neat example and explanation.
You also have quite a few imbalanced conditionals in the style of
if condition:
doSomethingRegular()
else:
edgeCase()
This is a prime example of code that can be refactored into a guard statement:
if not condition:
return edgeCase()
return doSomethingRegular()
return
can of course be replaced with continue or break, whatever is needed. Usually, you want to extract the inner blocks though and then it ends up as a return statement anyways. For more info, https://refactoring.guru/replace-nested-conditional-with-guard-clauses is a good resource.
My recent post about making code more readable is also a good resource if you want to go a bit more in depth.