
This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: float(0.0) singleton
Type: resource usage
Stage:
Components: Interpreter Core
Versions: Python 3.0, Python 2.6
process
Status: closed
Resolution: rejected
Dependencies:
Superseder: Intern certain integral floats for memory savings and performance
View: 14381
Assigned To: tim.peters
Nosy List: christian.heimes, georg.brandl, ldeller, rhettinger, terry.reedy, tim.peters
Priority: normal
Keywords: patch

Created on 2008-10-03 03:25 by ldeller, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description
python_zero_float.patch ldeller, 2008-10-03 03:25 patch for svn trunk / py2.6 / py2.5
Messages (7)
msg74224 - Author: lplatypus (ldeller) Date: 2008-10-03 03:25
Here is a patch to make PyFloat_FromDouble(0.0) always return the same 
float instance. This is similar to the existing optimization in 
PyInt_FromLong(x) for small x.
My own motivation is that the patch reduces memory by several megabytes 
for a particular in-house data processing script, but I think that it 
should be generally useful assuming that zero is a very common float 
value, and at worst almost neutral when this assumption is wrong. The 
minimal performance impact of the test for zero should be easily 
recovered by reduced memory allocation calls. I am happy to look into 
benchmarking if you require empirical performance data.
msg74228 - Author: Georg Brandl (georg.brandl) (Python committer) Date: 2008-10-03 07:46
Will it correctly distinguish between +0.0 and -0.0?
msg74243 - Author: lplatypus (ldeller) Date: 2008-10-03 12:00
No, it won't distinguish between +0.0 and -0.0 in its present form,
because these two have the same value according to the C equality
operator. This should be easy to adjust, e.g. we could exclude -0.0 by
changing the comparison
 if (fval == 0.0)
into 
 static double positive_zero = 0.0;
 ...
 if (!memcmp(&fval, &positive_zero, sizeof(double)))
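The difference between the two comparisons can be seen from Python as well. The sketch below assumes IEEE 754 doubles and uses the standard `struct` module to compare byte patterns, which is roughly what the memcmp call does in C; `same_bits` is a hypothetical helper name.

```python
import struct


def same_bits(a, b):
    """Compare two floats by their 8-byte representation, like memcmp."""
    return struct.pack("<d", a) == struct.pack("<d", b)


print(0.0 == -0.0)           # True: the equality operator ignores the sign
print(same_bits(0.0, -0.0))  # False: the sign bit differs
print(same_bits(0.0, 0.0))   # True
```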
msg74244 - Author: STINNER Victor (vstinner) (Python committer) Date: 2008-10-03 12:16
Maybe we need more hardcoded floats. I mean a "cache" of common
floats. Example of pseudocode:
    cache = {}

    def cache_float(value):
        # Only a few very common values are worth caching.
        return abs(value) in (0.0, 1.0, 2.0)

    def create_float(value):
        try:
            return cache[value]
        except KeyError:
            obj = float(value)
            if cache_float(value):
                cache[value] = obj
            return obj
Since some (most?) programs don't use float, the cache is created on
demand and not at startup.
Since the goal is speed, only a benchmark can answer my question
(is Python faster using such a cache?) ;-) Instead of cache_float(), an
RCU cache might be used.
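One detail a dictionary-based cache has to handle: 0.0 and -0.0 compare equal and hash alike, so a cache keyed on the value alone would return +0.0 for a -0.0 lookup. The following is a minimal sketch of an on-demand cache that keeps the two apart by adding the sign to the key. The names `_cache`, `_CACHEABLE` and this version of `create_float` are illustrative assumptions, not code from the patch.

```python
import math

_cache = {}                      # filled lazily, nothing is created at startup
_CACHEABLE = (0.0, 1.0, 2.0)     # magnitudes considered worth sharing


def create_float(value):
    """Return a shared float object for a few common values."""
    value = float(value)
    if abs(value) not in _CACHEABLE:
        return value
    # The sign is part of the key because 0.0 == -0.0 would collide otherwise.
    key = (value, math.copysign(1.0, value))
    return _cache.setdefault(key, value)


a = create_float(0)
b = create_float(0.0)
print(a is b)                                   # True: same cached object
print(math.copysign(1.0, create_float(-0.0)))   # -1.0: the sign survives
```

Whether the extra dictionary lookup pays for itself is exactly the question a benchmark would have to answer.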
msg74245 - Author: Christian Heimes (christian.heimes) (Python committer) Date: 2008-10-03 12:19
Please use copysign(1.0, fval) == 1.0 instead of your memcmp trick. It's
the canonical way to check for negative zero. copysign() is always
available because we have our own implementation if the platform doesn't
provide one. We might also want to special case 1.0 and -1.0.
I have to check with Guido and Barry if we can get the optimization into
2.6.1 and 3.0.1. It may have to wait until 2.7 and 3.0.
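The copysign check has a direct Python counterpart in `math.copysign`. As a small sketch (the helper name `is_positive_zero` is made up for illustration), the sign test is combined with the zero test so that only +0.0 qualifies:

```python
import math


def is_positive_zero(x):
    """True only for +0.0; copysign transfers the sign bit of x onto 1.0."""
    return x == 0.0 and math.copysign(1.0, x) == 1.0


print(is_positive_zero(0.0))    # True
print(is_positive_zero(-0.0))   # False
print(is_positive_zero(1.0))    # False
```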
msg74261 - Author: Raymond Hettinger (rhettinger) (Python committer) Date: 2008-10-03 17:02
I question whether this should be done at all. Making the creation of a
float even slightly slower is bad. This is on the critical path for all
floating point intensive computations. If someone really cares about
the memory savings, it is not hard to take a single instance of float
and use it everywhere: ZERO = 0.0; arr = [ZERO if x == 0.0 else x for x in
arr]. That technique also works for 1.0 and -1.0 and pi and other
values that may commonly occur in a particular app. Also, the technique
is portable to implementations other than CPython. I don't mind this
sort of optimization for immutable containers but feel that floats are
too granular. Special cases aren't special enough to break the rules. 
If the OP is insistent, then at least this should be discussed with the
numeric community who will have a better insight into whether the
speed/space trade-off makes sense in other applications beyond the OP's
original case.
Tim, any insights?
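The application-level technique described above can be written out as a short, self-contained sketch. `share_zeros` is a hypothetical helper name; the list comprehension is the one from the message.

```python
ZERO = 0.0  # the single float object the application reuses


def share_zeros(values):
    """Return a new list in which every zero is the one ZERO object."""
    return [ZERO if x == 0.0 else x for x in values]


raw = [float(s) for s in ("0.0", "1.5", "0.0", "0.0")]
shared = share_zeros(raw)

print(shared == raw)                           # True: values are unchanged
print(all(x is ZERO for x in shared if x == 0.0))  # True: zeros are shared
```

Because `x == 0.0` is also true for -0.0, this version replaces negative zeros with +0.0; an application that cares about the sign would need a stricter test.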
msg83884 - Author: Terry J. Reedy (terry.reedy) (Python committer) Date: 2009-03-20 23:14
I have 3 comments for future readers who might want to reopen.
1) This would have little effect on calculation with numpy.
2) According to sys.getrefcount, when '>>>' appears, 3.0.1 has 1200
duplicate references to 0 and 1 alone, and about 2000 to all of them. 
So small int caching really needs to be done by the interpreter. Are
there *any* duplicate internal references to 0.0 that would help justify
this proposal?
3) It is (or certainly was) standard in certain Fortran circles to NAME
constants as Raymond suggested. One reason given was to ease conversion
between single and double precision. In Python, named constants in
functions would ease conversion between, for instance, float and decimal.
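As an illustration of point 3, here is a minimal sketch of a function whose constants are named and built from a chosen numeric type, so switching from float to decimal means changing one argument. The function `unit_clamp` and its `number` parameter are invented for this example.

```python
from decimal import Decimal


def unit_clamp(x, number=float):
    """Clamp x to the range [0, 1] using constants of the given type."""
    ZERO = number(0)   # named constants instead of literal 0.0 and 1.0
    ONE = number(1)
    return max(ZERO, min(ONE, x))


print(unit_clamp(1.5))                        # 1.0
print(unit_clamp(Decimal("-0.25"), Decimal))  # 0
```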
History
Date                 User              Action  Args
2022-04-11 14:56:39  admin             set     github: 48274
2012-03-22 11:55:09  kristjan.jonsson  set     superseder: Intern certain integral floats for memory savings and performance
2009-03-20 23:15:00  terry.reedy       set     nosy: + terry.reedy; messages: + msg83884
2009-03-20 01:07:50  rhettinger        set     status: open -> closed; resolution: rejected
2009-03-20 01:06:25  vstinner          set     nosy: - vstinner
2008-10-03 17:02:19  rhettinger        set     assignee: christian.heimes -> tim.peters; messages: + msg74261; nosy: + rhettinger, tim.peters
2008-10-03 12:19:45  christian.heimes  set     priority: normal; assignee: christian.heimes; versions: + Python 3.0; messages: + msg74245; nosy: + christian.heimes
2008-10-03 12:16:19  vstinner          set     nosy: + vstinner; messages: + msg74244
2008-10-03 12:00:00  ldeller           set     messages: + msg74243
2008-10-03 07:46:27  georg.brandl      set     nosy: + georg.brandl; messages: + msg74228
2008-10-03 03:25:06  ldeller           create
