I'm working on a fairly complicated scientific project.
I decided to use a configuration file for the model description. However, parsing all the strings returned by ConfigParser
and converting every option to the required object was tedious, so I wanted to avoid that long, boring code.
I wrote a very simple parser that converts all options into Python objects by default. With some additional name resolving, I've got a simple, elegant and useful solution.
import os,sys,ConfigParser, types
import logging
from ConfigParser import ConfigParser
import numpy as np
import scipy as sp

def nameresolv(item,nspace):
    copir = item.split("@")
    if len(copir) < 2:
        res = item
    else:
        res = ''
        for pre,var in map(None,copir[::2],copir[1::2]):
            if pre != None: res += pre
            if var == None: continue
            var = var.split(":")
            if len(var) != 2: return None
            if not var[0] in nspace : return None
            if not var[1] in nspace[var[0]] : return None
            res += 'nspace["%s"]["%s"]'%tuple(var)
    copir = res.split("$")
    if len(copir) < 2: return res
    res = ''
    for pre,var in map(None,copir[::2],copir[1::2]):
        if pre != None: res += pre
        if var == None: continue
        var = var.split(":")
        if len(var) != 2: return None
        if not var[0] in nspace : return None
        if not var[1] in nspace[var[0]] : return None
        if type(nspace[var[0]][var[1]]) is types.LambdaType and nspace[var[0]][var[1]].__name__ == '<lambda>':
            res += 'nspace["%s"]["%s"]'%tuple(var)
        else:
            res += str(nspace[var[0]][var[1]])
    return unicode(res)

def confreader(filename,nspace = {}):
    """
    Reads file with configurations and returns dictionary with sections
    and options. All options will be turned into python objects (DON'T
    FORGET PUT ALL STRING OPTIONS WITHIN QUOTES).
    You can use @SECTION:OPTION@ notation to refer to existed python object,
    or $SECTION:OPTION$ to convert object back into a string and insert a string.
    Returns option dictionary or {}.
    Empty dictionary indicates error with file opening, reading or parsing.
    If confreader couldn't turn option into some python object, this
    options is skipped and Warning message will put in logger.
    """
    if filename == None:
        return nspace
    if not os.access(filename,os.R_OK):
        return nspace
    config = ConfigParser()
    config.optionxform=str
    try:
        config.read( filename )
    except :
        return nspace
    for section in config.sections():
        if not section in nspace: nspace[section]={}
        for option in config.options(section):
            if option in nspace[section]:
                logging.error("Name conflict option \'%s\' exists in section [\'%s\']"%(option,section) )
                return {}
            xitem = unicode( config.get(section,option) )
            item = nameresolv(xitem,nspace)
            if item == None:
                logging.error("Problem with resolving option in [\'%s\']\'%s\'=\'%s\'"%(section,option,item) )
                return {}
            try:
                exec "nspace[\""+section+"\"][\""+option+"\"]="+item
            except :
                logging.warning("Problem with reading configuration from the %s"%filename)
                logging.warning("Cannot read section: \'%s\', option: \'%s\'"%(section,option) )
                logging.warning(" %s"%item)
                logging.warning("!!!! SKIPPED IT !!!!")
                pass
    return nspace

An example:
#FILE: examples.cfg
[LINKS]
x = ["a","b"]
y = [ @LINKS:x@, "c"]
z = ["x"]+@LINKS:x@+["c","d"]
[COMPUTATIONS]
x = 5
y = 7
x+y = @COMPUTATIONS:x@+@COMPUTATIONS:y@
x*y = @COMPUTATIONS:x@*@COMPUTATIONS:y@
lst = range(@COMPUTATIONS:x*y@)
filter = @COMPUTATIONS:lst@[@COMPUTATIONS:x@:@COMPUTATIONS:y@]
[FUNCTIONS]
fun = lambda x: x**2+32
operation = @FUNCTIONS:fun@(12)
fun(x+y) = @FUNCTIONS:fun@(@COMPUTATIONS:x+y@)
[LINKS_AND_STRINGS]
x = "I"
y = "Python"
exmp1 = @LINKS_AND_STRINGS:x@+' love '+@LINKS_AND_STRINGS:y@
#same but with string resolving. Please note that the string is inside ' '
exmp2 = '$LINKS_AND_STRINGS:x$ love $LINKS_AND_STRINGS:y$'
#BUT this will create a problem
#exmp3 = '@LINKS_AND_STRINGS:x@ love @LINKS_AND_STRINGS:y@'
#You can resolve variable into string
exmp4 = 'In my $LINKS_AND_STRINGS:y$ + conf, $LINKS_AND_STRINGS:x$ can resolve x+y = $COMPUTATIONS:x+y$ inline'
#END examples.cfg
And the result of parsing:
>>> from pyconf import pyconf
>>> cfg = pyconf("examples.cfg")
>>> for name in cfg: print name
...
COMPUTATIONS
FUNCTIONS
LINKS
LINKS_AND_STRINGS
>>>
>>> for name in cfg["LINKS"]: print name,"=",cfg["LINKS"][name]
...
y = [['a', 'b'], 'c']
x = ['a', 'b']
z = ['x', 'a', 'b', 'c', 'd']
>>>
>>> for name in cfg["COMPUTATIONS"]: print name,"=",cfg["COMPUTATIONS"][name]
...
filter = [5, 6]
lst = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34]
x+y = 12
y = 7
x = 5
x*y = 35
>>>
>>> for name in cfg["FUNCTIONS"]: print name,"=",cfg["FUNCTIONS"][name]
...
fun = <function <lambda> at 0xa5841ec>
operation = 176
fun(x+y) = 176
>>>
>>> for name in cfg["LINKS_AND_STRINGS"]: print name,"=",cfg["LINKS_AND_STRINGS"][name]
...
y = Python
x = I
exmp4 = In my Python + conf, I can resolve x+y = 12 inline
exmp1 = I love Python
exmp2 = I love Python
>>>
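To make the difference between the two notations concrete, here is a minimal sketch of the substitution step on its own. It is not the nameresolv function above: it uses regular expressions instead of split, skips the special handling of lambdas, is written for Python 3, and the name resolve_refs and the sample namespace are made up for illustration.

```python
import re

# Hypothetical stand-in for the namespace that confreader builds up.
nspace = {"COMPUTATIONS": {"x": 5, "y": 7}}


def resolve_refs(text, nspace):
    """Turn @S:O@ into a lookup expression and $S:O$ into the value's text."""
    def as_lookup(match):
        return 'nspace["%s"]["%s"]' % (match.group(1), match.group(2))

    def as_text(match):
        return str(nspace[match.group(1)][match.group(2)])

    text = re.sub(r"@([^@:]+):([^@:]+)@", as_lookup, text)
    return re.sub(r"\$([^$:]+):([^$:]+)\$", as_text, text)


expr = resolve_refs("@COMPUTATIONS:x@+@COMPUTATIONS:y@", nspace)
print(expr)        # nspace["COMPUTATIONS"]["x"]+nspace["COMPUTATIONS"]["y"]
print(eval(expr))  # 12

# The $...$ form pastes the value's text straight into the option string.
print(resolve_refs("'x is $COMPUTATIONS:x$'", nspace))  # 'x is 5'
```

So an @...@ reference stays a live Python object until the option is evaluated, while a $...$ reference is already plain text by then.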
I'd like to ask Python gurus for their opinions and any suggestions for improving the code. I'd appreciate any ideas on making it better. And yes, feel free to use it if you'd like.
- It might be helpful to provide an example of how you actually intend to use this programme for model description. There might be easier ways than using a config file. – Stuart, Jun 8, 2014 at 22:23
- Thank you for the comment, @Stuart. I have used a lot of different approaches: XML, SQL (SQLite), plain Python, NEURON [neuron.yale.edu/neuron/] structures and so on. When I started my new project, I found that a config file seemed well suited to my model description. However, it was hard to describe some computation procedures in a config file (such as parameter gradients in a population of neurons). So in this solution, Python functionality appears in the config file only when we need it. And yes, I also call the lambda functions from code. – rth, Jun 9, 2014 at 3:14
1 Answer
Firstly, some general style comments:
- Take a look at PEP8, the official Python style guide. This gives some really nice pointers on how to style your code.
- Avoid multiple statements per line. You do this a lot. Putting multiple statements on a single line makes the code more difficult to follow. So instead of this:
      if pre != None: res += pre
  use this style:
      if pre is not None:
          res += pre
- Use better variable names. When thinking of names, it's better to err on the side of being too verbose than being too terse. Some shortened names are fine due to a global understanding of what they mean (e.g. str, char, etc.). Applying this idea, result is better than res.
- Whitespace is your friend. Place single blank lines to separate logical sections of code. Also, PEP8 gives a nice overview of where space in statements is appropriate and where it's not. A (somewhat extreme) example:
      # This is bad whitespace
      foo= list ( [2,3,4, 5 ])

      # This is 'correct' whitespace
      foo = list([2, 3, 4, 5])
- Technically your indentation is fine. However, conventional Python uses a 4-space indentation level to mark deeper blocks of code.
- In Python, the naming convention for functions and variables is to use underscores_in_names. I don't see any camelCase or PascalCase in your code (which is good). However, your function names confreader and nameresolv should be formatted like so: conf_reader and name_resolv.
- Since we're already on the subject of function names, typically you want to start function names with a verb:
      # This is better than...
      def resolve_name():
          pass

      # this.
      def name_resolve():
          pass
- Each import of an individual module should get its own line:
      # No
      import os, sys, logging

      # Yes
      import os
      import sys
      import logging

      # This is also fine
      from mymodule import foo, bar, baz
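On the None check in the first example above: the reason to write is not None rather than != None is that == and != can be overloaded by a class, while an identity test cannot. A small sketch, using a made-up class purely for illustration:

```python
class AlwaysEqual(object):
    """Hypothetical class whose comparison operators ignore the other operand."""

    def __eq__(self, other):
        return True

    def __ne__(self, other):
        return False


value = AlwaysEqual()

# The overloaded operators run, so the object claims to be equal to None.
print(value == None)      # True
print(value != None)      # False

# Identity cannot be overloaded, so this always gives the right answer.
print(value is None)      # False
print(value is not None)  # True
```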
Now onto some improvements:
In your nameresolv function, you have a for loop where you use map with a None function. What this does according to the docs is:
    If function is None, the identity function is assumed;
So your code works. However, in Python 3.X this actually errors. Essentially what you want to do is get pairs of consecutive elements in an iterable. To do this, we can use the zip function:
    for pre, var in zip(copir[::2], copir[1::2]):
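Before swapping map(None, ...) for zip, it is worth checking how each handles slices of unequal length, because splitting on @ normally leaves one more piece of text than there are references. The sketch below assumes Python 3, where itertools.zip_longest plays the role that map(None, ...) had in Python 2; the sample option value is made up.

```python
from itertools import zip_longest

# A made-up option value with text after the last reference.
copir = 'range(@COMPUTATIONS:x@)'.split('@')
# copir == ['range(', 'COMPUTATIONS:x', ')']

text_pieces = copir[::2]   # ['range(', ')']
references = copir[1::2]   # ['COMPUTATIONS:x']

# zip stops at the shorter list, so the trailing ')' never shows up.
print(list(zip(text_pieces, references)))
# [('range(', 'COMPUTATIONS:x')]

# zip_longest pads the shorter list, like map(None, ...) did in Python 2.
print(list(zip_longest(text_pieces, references, fillvalue='')))
# [('range(', 'COMPUTATIONS:x'), (')', '')]
```

With a plain zip the resolved string would lose everything after the last reference, so the second slice needs padding one way or another.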
Because we now use the zip function, we can remove the None check and simply append pre each time. This is because pre will either be a necessary string or it will be ''. Either way it is fine to append to res. One caveat: zip stops at the shorter slice, so the second slice has to be padded (for example with a trailing '') or the text after the last reference is lost.
Inside the for loop described above, you can combine several of your if statements:
    if len(var) != 2: return None
    if not var[0] in nspace : return None
    if not var[1] in nspace[var[0]] : return None
into:
    if len(var) != 2 or not var[0] in nspace or not var[1] in nspace[var[0]]:
        return None
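The combined condition is safe because or short-circuits: Python evaluates the tests left to right and stops at the first one that is true, so the lookups on the right only run once the earlier checks have passed. A minimal sketch, with a made-up namespace and a hypothetical helper name:

```python
nspace = {'LINKS': {'x': ['a', 'b']}}


def is_bad_reference(var):
    """Return True when var is not a valid [section, option] pair."""
    return (len(var) != 2
            or var[0] not in nspace
            or var[1] not in nspace[var[0]])


# Only one element: len(var) != 2 is True, so var[1] is never evaluated
# and no IndexError is raised.
print(is_bad_reference(['LINKS']))         # True

# Unknown section: nspace[var[0]] is never evaluated, so no KeyError.
print(is_bad_reference(['MISSING', 'x']))  # True

print(is_bad_reference(['LINKS', 'x']))    # False
```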
When creating strings, str.format is generally preferred over the % formatting notation. So your formatting statement would now look like this:
    res += 'nspace["{}"]["{}"]'.format(*var)
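As a quick check that the two notations build the same string here, and to show the named-field form that str.format also allows, here is a small sketch with a made-up var:

```python
var = ['COMPUTATIONS', 'x']

old_style = 'nspace["%s"]["%s"]' % tuple(var)
new_style = 'nspace["{}"]["{}"]'.format(*var)
# Named fields make it clearer which part goes where.
named = 'nspace["{section}"]["{option}"]'.format(section=var[0], option=var[1])

print(old_style)                        # nspace["COMPUTATIONS"]["x"]
print(old_style == new_style == named)  # True
```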
You could (and should) pull your for-loop into its own function. You essentially do the same loop twice. Seeing duplicated code is a sure-fire indicator that a function can be created:
    def build_result(split, namespace, check_lambda=False):
        result = ''
        # Pad the references with '' so zip keeps the text after the last one.
        for prefix, var in zip(split[::2], split[1::2] + ['']):
            result += prefix
            if not var:
                continue

            try:
                # This will error if len(var.split(':')) != 2
                section, option = var.split(':')
            except ValueError:
                return None

            if section not in namespace or option not in namespace[section]:
                return None

            if not check_lambda:
                result += 'namespace["{}"]["{}"]'.format(section, option)
                continue

            value = namespace[section][option]
            if isinstance(value, types.LambdaType) and value.__name__ == '<lambda>':
                result += 'namespace["{}"]["{}"]'.format(section, option)
            else:
                result += str(value)

        return result
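The lambda test in build_result checks __name__ as well as the type. A short sketch of why the type check alone is not enough (the two sample functions are made up):

```python
import types


def is_lambda(value):
    return isinstance(value, types.LambdaType) and value.__name__ == '<lambda>'


def named_function(x):
    return x ** 2 + 32


anonymous = lambda x: x ** 2 + 32

# LambdaType is just another name for FunctionType.
print(types.LambdaType is types.FunctionType)        # True
print(isinstance(named_function, types.LambdaType))  # True

# Only the __name__ check tells the two apart.
print(is_lambda(named_function))  # False
print(is_lambda(anonymous))       # True
```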
With my suggestions, your code now looks like this:
import os
import sys
import types
import logging
from configparser import ConfigParser  # the module is named ConfigParser on Python 2


def is_lambda(value):
    return isinstance(value, types.LambdaType) and value.__name__ == '<lambda>'


def build_result(split, namespace, check_lambda=False):
    result = ''
    # Pad the references with '' so zip keeps the text after the last one.
    for prefix, var in zip(split[::2], split[1::2] + ['']):
        result += prefix
        if not var:
            continue

        try:
            # This will error if len(var.split(':')) != 2
            section, option = var.split(':')
        except ValueError:
            return None

        if section not in namespace or option not in namespace[section]:
            return None

        if not check_lambda:
            result += 'namespace["{}"]["{}"]'.format(section, option)
            continue

        value = namespace[section][option]
        if is_lambda(value):
            result += 'namespace["{}"]["{}"]'.format(section, option)
        else:
            result += str(value)

    return result


def resolve_name(item, namespace):
    # Check "'@' not in" to save from having to call `split`
    if '@' not in item:
        result = item
    else:
        result = build_result(item.split('@'), namespace)

    # Check for error before next build_result
    if result is None:
        return result

    # Same reasoning as above.
    if '$' not in result:
        return result

    # Returns None on error, so the caller can detect it.
    return build_result(result.split('$'), namespace, check_lambda=True)


def parse_config(filename, namespace=None):
    # Create the default here; a {} default would be shared between calls.
    if namespace is None:
        namespace = {}

    if not filename or not os.access(filename, os.R_OK):
        return namespace

    config = ConfigParser()
    config.optionxform = str

    # This should not error. If a file could not be read, it will skip it.
    # Worst case, config will have no sections.
    config.read(filename)

    for section in config.sections():
        if section not in namespace:
            namespace[section] = {}

        for option in config.options(section):
            if option in namespace[section]:
                logging.error("Name conflict option '{}' exists in section ['{}']".format(option, section))
                return {}

            item = resolve_name(config.get(section, option), namespace)
            if not item:
                logging.error("Problem with resolving option in ['{}']'{}' = '{}'".format(section, option, item))
                return {}

            try:
                # The parenthesised form works on both Python 2 and Python 3.
                exec("namespace['{}']['{}'] = {}".format(section, option, item))
            except Exception:
                logging.warning("Problem with reading configuration from the {}".format(filename))
                logging.warning("Cannot read section: '{}', option: '{}'".format(section, option))
                logging.warning("\t\t{}".format(item))
                logging.warning("!!!! SKIPPED IT !!!!")

    return namespace

- if pre is not None is preferable to if pre != None. – 200_success, Jun 9, 2014 at 18:44
- To @DarinDouglass: I got an error message on the line 'if len(var) != 2 or not var[0] in nspace or not var[1] in nspace[var[0]]: return None'. It seems we cannot concatenate conditions with or in this case. For example, if v=["a"], the condition 'if len(v) == 2 or not "b" in v[2]: print v' raises an error: 'Traceback (most recent call last): File "<stdin>", line 1, in <module> IndexError: list index out of range'. – rth, Jun 12, 2014 at 2:00
- Is your code throwing an IndexError? The only place I see in build_result that could throw that error is the for loop statement (which is not the line you mentioned). Also, I guard against this error with my section, option = var.split() line. This assures that var has two parts to it (i.e. var[0] and var[1] are valid indices). – BeetDemGuise, Jun 12, 2014 at 12:02
- To @DarinDouglass: I've found a lot of bugs in your code. Before, I only used your ideas to improve my own code, but today I copy-pasted it. So I've edited the code: there was an extra colon after lambda in line 7, and the parentheses of the logging.warning calls in lines 61-63 were not closed. I also added a check for result being None; otherwise it tries to split None and ends up with an exception. – rth, Jun 13, 2014 at 15:18
- @user29689 Thanks for the extra pair of eyes. I wasn't able to test the code before my review went up. – BeetDemGuise, Jun 13, 2014 at 15:25