homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: bsddb: test01_basic_replication fails sometimes
Type: behavior Stage: resolved
Components: Extension Modules, Tests Versions: Python 2.7
process
Status: closed Resolution: out of date
Dependencies: Superseder:
Assigned To: jcea Nosy List: BreamoreBoy, amaury.forgeotdarc, barry, benjamin.peterson, flox, gregory.p.smith, jcea, mhammond, ocean-city, r.david.murray, trent
Priority: normal Keywords: buildbot, patch

Created on 2008年09月18日 01:18 by benjamin.peterson, last changed 2022年04月11日 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
fix_timeout.patch ocean-city, 2009年01月27日 14:29 review
bsddb3-skip-random-failure.patch r.david.murray, 2010年01月29日 20:46 review
Messages (23)
msg73348 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2008年09月18日 01:18
This happens on the Windows buildbot everything once and a while [1]:
======================================================================
FAIL: test01_basic_replication
(bsddb.test.test_replication.DBReplicationManager)
----------------------------------------------------------------------
Traceback (most recent call last):
 File
"S:\buildbots\python\trunk.nelson-windows\build\lib\bsddb\test\test_replication.py",
line 122, in test01_basic_replication
 self.assertTrue(time.time()<timeout)
AssertionError
----------------------------------------------------------------------
http://python.org/dev/buildbot/stable/x86%20W2k8%20trunk/builds/159/step-test/0 
msg73351 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2008年09月18日 01:44
We're going to disable this test for 2.6rc2. Please jcea and
gregory.p.smith, take a look at it for 2.6 final. I've made it a
deferred blocker for that release.
msg73352 - (view) Author: Mark Hammond (mhammond) * (Python committer) Date: 2008年09月18日 01:51
I instrumented the code a little. The error is happening because
self.client_startupdone never gets set to True. This is supposed to be
set in the client_startupdone() method. It expects an event type of
db.DB_EVENT_REP_STARTUPDONE, but we see exactly one call to this
function with a value of 2, which is apparently DB_EVENT_REP_CLIENT. 
After this call no further calls are made to the method and after 10
seconds the test gives up.
msg73357 - (view) Author: Mark Hammond (mhammond) * (Python committer) Date: 2008年09月18日 02:50
As discussed with Barry on #python-dev, I committed r66498 which skips
the failing assertion on Windows and replaces it with some noise to
stderr. Note that only that one assertion fails - the rest of the test
passes on Windows.
Also, Brett Cannon on #python-dev experimented and could see the
callback function being called a number of times - with 2 first as
Windows does, but then a number of other times and once with the
expected value. I'm afraid I've no insight into why these aren't
delivered on Windows.
msg73370 - (view) Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2008年09月18日 11:22
Adding Nelson to the nosy list, since I'm unable to debug any Windows
issue by myself.
I'm very interested in the report saying that MS Windows Berkeley DB
correctly sends the db.DB_EVENT_REP_STARTUPDONE event in his setup. That
could indicate some issue in the buildbot.
Barry, I don't consider this a "release blocker" for 2.6.0. This is a
very advanced feature (database replication). This must be solved, but
2.6 doesn't need to be delayed for it.
msg73371 - (view) Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2008年09月18日 11:25
¿Somebody can try this issue in a Windows box?. It could be a Berkeley
DB bug in that platform, but I would like to verify this before
informing Oracle.
Thanks.
msg73386 - (view) Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2008年09月18日 14:43
I've tried this issue on VC6 + db4.7.25.0 + win2k, and I could reproduce
error.
I added following patch for investigation.
Index: Lib/bsddb/test/test_replication.py
===================================================================
--- Lib/bsddb/test/test_replication.py (revision 66483)
+++ Lib/bsddb/test/test_replication.py (working copy)
@@ -40,6 +40,7 @@
 self.confirmed_master=True
 def client_startupdone(a,b,c) :
+ print "DBReplicationManager:", (a, b, c)
 if b==db.DB_EVENT_REP_STARTUPDONE :
 self.client_startupdone=True
@@ -201,6 +202,7 @@
 self.confirmed_master = True
 def client_startupdone(a,b,c) :
+ print "DBBaseReplication:", (a, b, c)
 if b == db.DB_EVENT_REP_STARTUPDONE :
 self.client_startupdone = True
And this is result.
DBReplicationManager: (<DBEnv object at 0x00A80148>, 2, None)
DBReplicationManager: (<DBEnv object at 0x00A80148>, 5, 0)
DBBaseReplication: (<DBEnv object at 0x00B35CD0>, 2, None)
DBBaseReplication: (<DBEnv object at 0x00B35CD0>, 5, 13)
DBBaseReplication: (<DBEnv object at 0x00B35CD0>, 7, None)
DBBaseReplication: (<DBEnv object at 0x00B49658>, 2, None)
DBBaseReplication: (<DBEnv object at 0x00B49658>, 5, 13)
DBBaseReplication: (<DBEnv object at 0x00B49730>, 2, None)
DBBaseReplication: (<DBEnv object at 0x00B49730>, 5, 13)
msg73415 - (view) Author: Mark Hammond (mhammond) * (Python committer) Date: 2008年09月19日 02:15
We are seeing one more error almost identical to the one I fixed (even
the method name is the same), but its at line 315 - this is the last
error in the Windows buildbot! Please let me know if you would like me
to make a similar "fix" to this line, or if you think you will be able
to back out my hack or skip the test some other way.
msg73421 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008年09月19日 05:56
just make a similar "fix" for now.
msg73452 - (view) Author: Mark Hammond (mhammond) * (Python committer) Date: 2008年09月20日 01:18
Actually, I've decided to leave it alone. The buildbots most recent
failure was:
test test_bsddb3 failed -- Traceback (most recent call last):
 File
"S:\buildbots\python\trunk.nelson-windows\build\lib\bsddb\test\test_replication.py",
line 315, in test01_basic_replication
 self.assertTrue(time.time()<timeout)
AssertionError
But I can't repro that - instead, I now see:
Traceback (most recent call last):
 File "o:\src\python-svn\lib\bsddb\test\test_lock.py", line 127, in
test03_lock_timeout
 self.assertTrue((end_time-start_time) >= 0.1)
AssertionError
So I go back to the windows buildbots and they are all green!! So any
remaining issues appear truly transient and don't seem to be restricted
to a single place. I can't think of an obvious improvement to make here.
msg73785 - (view) Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2008年09月25日 12:53
Oracle guys are studying this issue. I will keep you informed.
This issue is not a release blocker, in any case. See rational in msg73370.
msg73912 - (view) Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2008年09月27日 01:37
I need some MS Windows user able to replicate this issue locally (not in
the buildbot). Oracle people need to do some test and I would like to
avoid to mess with builtbots.
Please, help!.
msg73920 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008年09月27日 08:53
I reproduce the test failure very consistently, on both win2k and winXP, 
and may be of some help to test stuff.
msg80652 - (view) Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2009年01月27日 14:29
I've investigated more. I think this is timeout problem. According to 
following site,
http://www.oracle.com/technology/documentation/berkeley-
db/db/api_c/rep_timeout.html
timeout should be specified in microseconds. This means
self.dbenvMaster.rep_set_timeout(db.DB_REP_CONNECTION_RETRY,100123)
only waits 10msec. This is too small comparing to default timeout. 
Probably 10sec is appropriate here.
After I applied an attached patch "fix_timeout.py", I ran test several 
times and found no error.
# I changed time.time()+10 to time.time()+30 because my machine is too 
slow, so 10sec was not enough. :-[
msg94932 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2009年11月05日 14:49
I changed the timeout to +30, and the bsddb3 test has been passing on
the windows buildbot. That buildbot is a virtualbox instance running on
a P4, so it makes sense that it just took a long time for the (second)
test to pass. While testing I noticed that on my gentoo buildbot the
ReplicationManager test would time out but the base test would pass...so
I wonder if the test that is currently being skipped would pass if you
gave that windows buildbot even longer to think about it. But since
bsddb3 isn't in py3k, I'm not inclined to take the time to find out.
msg95247 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2009年11月14日 18:22
Failures still occur occasionally even with the timeout set to 60. So
I've turned the check that is skipped on Windows from an assertion into
a warning only on all other platforms, since bsddb support isn't
actively maintained and is gone in py3k. Fix applied to trunk in r76265
and 2.6 in r76267.
msg97735 - (view) Author: Florent Xicluna (flox) * (Python committer) Date: 2010年01月13日 19:45
Still occurs, on 2010年01月13日.
http://www.python.org/dev/buildbot/all/builders/x86%20Ubuntu%20trunk/builds/264 
msg98521 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010年01月29日 18:35
Hmm. That's not actually the same timeout. What that means is that if the startup timeout triggers, then a subsequent test (one that is looking for the results of a client transaction) fails, which contradicts the earlier observation that even if the startup timeout fails then everything else still passes.
msg98537 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010年01月29日 20:46
Since this isn't likely to get fixed and it is causing buildbot instability, I propose the attached patch to just skip the remainder of the test method if this second timeout happens.
msg100234 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2010年03月01日 01:55
r.david.murray - sounds like a good idea.
msg100266 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010年03月01日 20:09
Applied skip patch to trunk in r78558, and 2.6 in 78560.
msg214525 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2014年03月23日 00:10
Is any follow up needed to this or can it be closed as bsddb has been removed from Python 3?
msg214529 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014年03月23日 00:52
Yeah, let's just close it. Nobody is likely to care enough to try to fix it properly.
History
Date User Action Args
2022年04月11日 14:56:39adminsetgithub: 48142
2014年03月23日 00:52:52r.david.murraysetstatus: languishing -> closed
resolution: out of date
messages: + msg214529

stage: needs patch -> resolved
2014年03月23日 00:10:57BreamoreBoysetnosy: + BreamoreBoy

messages: + msg214525
versions: - Python 2.6
2010年03月01日 20:09:33r.david.murraysetstatus: open -> languishing

messages: + msg100266
2010年03月01日 01:55:27gregory.p.smithsetmessages: + msg100234
2010年01月29日 20:47:00r.david.murraysetfiles: + bsddb3-skip-random-failure.patch
keywords: + patch
messages: + msg98537
2010年01月29日 18:38:33r.david.murraysetkeywords: + buildbot, - patch
2010年01月29日 18:35:40r.david.murraysetmessages: + msg98521
stage: patch review -> needs patch
2010年01月13日 19:45:53floxsetnosy: + flox
messages: + msg97735
2009年11月14日 18:22:42r.david.murraysettitle: bsddb: test01_basic_replication fails on Windows sometimes -> bsddb: test01_basic_replication fails sometimes
nosy: mhammond, barry, gregory.p.smith, jcea, amaury.forgeotdarc, ocean-city, benjamin.peterson, trent, r.david.murray
messages: + msg95247

components: + Tests
type: behavior
stage: patch review
2009年11月06日 02:47:35r.david.murraysetkeywords: - buildbot
2009年11月05日 14:49:12r.david.murraysetnosy: + r.david.murray

messages: + msg94932
versions: + Python 2.7
2009年10月30日 19:47:16r.david.murraysetkeywords: + buildbot
2009年01月27日 14:30:01ocean-citysetfiles: + fix_timeout.patch
keywords: + patch
messages: + msg80652
2008年09月27日 08:53:20amaury.forgeotdarcsetnosy: + amaury.forgeotdarc
messages: + msg73920
2008年09月27日 01:37:59jceasetmessages: + msg73912
2008年09月26日 23:37:33benjamin.petersonsetpriority: release blocker -> normal
2008年09月25日 12:53:10jceasetmessages: + msg73785
2008年09月20日 01:18:06mhammondsetmessages: + msg73452
2008年09月19日 05:56:44gregory.p.smithsetmessages: + msg73421
2008年09月19日 02:15:25mhammondsetmessages: + msg73415
2008年09月18日 14:43:06ocean-citysetnosy: + ocean-city
messages: + msg73386
2008年09月18日 11:25:50jceasetmessages: + msg73371
2008年09月18日 11:22:42jceasetnosy: + trent
messages: + msg73370
2008年09月18日 05:44:17barrysetpriority: deferred blocker -> release blocker
2008年09月18日 02:50:17mhammondsetmessages: + msg73357
2008年09月18日 01:51:41mhammondsetnosy: + mhammond
messages: + msg73352
2008年09月18日 01:44:45barrysetpriority: deferred blocker
nosy: + barry
messages: + msg73351
2008年09月18日 01:43:36barrysetnosy: + gregory.p.smith
2008年09月18日 01:18:07benjamin.petersoncreate

AltStyle によって変換されたページ (->オリジナル) /