The question/s
I'm working on a small scientific Python package. Many of the public methods it is going to offer will have to deal with dimensional input. A wavelength, for example, which could be available to the user in microns or as the equivalent vacuum photon frequency or who knows how.
I could simply state what units a method expects in its documentation and leave it at that, however, from a user perspective, I find having to think about unit conversion very annoying and I know from experience that it is a great source of errors. For this reason, I would like to use a ready-made unit conversion package (probably astropy.units
due to my background) to take care of converting arguments to public methods to some unit system internal to the package so that private (or hidden, since we're talking about Python) methods don't have to "think" about units.
- Are there arguments against this approach apart from more development work compared to simply stating expected units in the documentation?
- What design choices are there for implementing this?
An example
Say I have the following (freely invented) hidden function:
def _foo(l):
"""
Photon foobar length.
Parameters
----------
l : float
Wavelength [micron].
Returns
-------
: float
Foobar length [micron].
"""
return l/_bar(l)
Where _bar
does something non-trivial that might even involve calling compiled C extensions.
I would like the public counterpart foo
of _foo
to behave as follows:
>>> import astropy.units as u
>>> foo(1*u.m)
<Quantity 0.5 m>
>>> foo(1*u.micron)
<Quantity 0.1 micron>
>>> foo(1)
0.1
My research so far
People must have dealt with similar problems before, but apparently I don't know what terminology to use for searching.
The best approach I can think of is using a decorator to do the work:
foo = unit_decorator(_foo,
arg_units=['wavelength'],
kwarg_units={},
out_units=['length']
)
This requires typing out what each argument represents, but I guess there's no way around that. The more important concern I have with this approach is that it's not extremely obvious that a change in _foo
's signature probably requires changing the above code as well. With duck typing that might produce subtle bugs.
Feel free to suggest tags to use for this question, I wasn't really sure which ones apply.
2 Answers 2
Well, you wrote
to take care of converting arguments to public methods to some unit system internal to the package so that private (or hidden, since we're talking about Python) methods don't have to "think" about units.
so for yourself, you already see the major drawback of the "arbitrary unit" approach: it is way less tedious any less error prone to make a decision for one unit system and use that consistently throughout your whole code base.
But the same argument applies to the users of your package: they will probably experience the same observation! So they may be much happier if you offer them just one system.
Of course, if those users have to import many different packages, and all of these packages are using different conventions, they would benefit from having a standard way for doing those conversion. Ultimately, you will have to ask them, what they prefer and how they intend to use your package. They may ask you
to provide one unit system, maybe consistent with something they are already used to
to provide one version of the package with a global switch (for example, between metric and non-metric system)
to provide only some functions in two or three different variants, each one accepting a different scale of numbers
or exactly what you suggested: a package where every function can deal with arbitrary units.
But beware, chances are high that in case you add the fullest possible flexibility to your package, almost noone uses that.
-
Thank you, I guess there's not much of a point in implementing something fancy no one will use. Except for me feeling good about myself :-D. If it turns out that people would want this feature, do you have any suggestions on the implementation itself?user35915– user359152019年06月08日 04:18:01 +00:00Commented Jun 8, 2019 at 4:18
Make the parameter strongly typed, and make the type responsible for performing the conversions.
An example of this pattern is the C# TimeSpan
class. Instances can be obtained from various static methods with self-documenting names like FromMinutes
. And through properties like TotalMinutes
, the value of an instance can be expressed in various units.
Following this pattern, your methods could accept a Wavelength
type that can be constructed and expressed in various units, with all instances having the same internal representation.
astropy.units
, if you're talking about that, you would get 50 kilogram-meters and if you multiplied that by cm and divided by s**2 you could convert it to J. For my package the idea would be to give users the possibility to input e.g. an energy in any unit (J, eV, erg) and to haveastropy
convert it to some default energy unit. Then you can internally just use thevalue
(float or numpy array) instead of theQuantity
inastropy
linguo.