Oh wait, one more (not sure if these are still applicable, but maybe worth a look):
This won't be directly applicable because it has LUA_TRIGGERHOOK code in it (related to a sampling profiler hook we added but never shared), but the principle of the bug may still apply and the description + patch should make it reasonably easy to confirm and apply a corresponding fix.
DT
commit 024bcdb6f71d01534b72fd7ea19ebf26b5a95914
Author: Dan Tull <dtull@adobe.com>
Date: Thu Sep 27 10:48:48 2012 -0800
(@852507)
Fix a very rare but nasty bug in breakpoints in the Lua VM code:
1. Set a single breakpoint (it helps to pick a common entry point from native code).
2. Pause the app so that you stop on the same opcode via the trigger hook.
3. While paused via trigger, clear the breakpoint.
4. Resume the debuggee.
Result: SEGFAULT
Solution: don't fetch the instruction until after the hook executes so you don't end up with a dangling halt.
Normally, I'd delimit the patch based so the original version is preserved
It would be overly messy for no behavioral difference in this case, though.
We never saw this because pause didn't used to be as fast as it is now, so there was only a tiny probability of stopping due to a pause request on a line that also had a breakpoint (and clearing that breakpoint).
[git-p4: depot-paths = "//lua/main/third_party/lua_tp/lua_5_1/": change = 852508]
diff --git a/src/lvm.c b/src/lvm.c
index e970cd9..c7f02b7 100644
--- a/src/lvm.c
+++ b/src/lvm.c
@@ -417,8 +417,6 @@ void luaV_execute (lua_State *L, int nexeccalls) {
k = cl->p->k;
/* main loop of interpreter */
for (;;) {
- Instruction i = *pc++;
- StkId ra;
#ifdef LUA_TRIGGERHOOK
if ((L->hookmask & (LUA_MASKLINE | LUA_MASKCOUNT | LUA_MASKTRIGGER)) &&
(--L->hookcount == 0 || L->hookmask & LUA_MASKLINE || G(L)->triggers != 0)) {
@@ -426,13 +424,15 @@ void luaV_execute (lua_State *L, int nexeccalls) {
if ((L->hookmask & (LUA_MASKLINE | LUA_MASKCOUNT)) &&
(--L->hookcount == 0 || L->hookmask & LUA_MASKLINE)) {
#endif
- traceexec(L, pc);
+ traceexec(L, pc + 1);
if (L->status == LUA_YIELD) { /* did hook yield? */
- L->savedpc = pc - 1;
+ L->savedpc = pc;
return;
}
base = L->base;
}
+ Instruction i = *pc++;
+ StkId ra;
/* warning!! several calls may realloc the stack and invalidate `ra' */
#ifdef LUA_HALT_OP
resume:
> ... implementing OP_HALT changes the overhead from running with a debugger from 'severe' to 'unnoticeable'.
Sweet! That's exactly why I wrote it. Glad it was helpful (if rather ancient at this point). The difference between slowing down every opcode to check if there's a breakpoint and breakpoints that only have overhead if they get hit is like night and day.
> I would submit to Roberto et al. that adding something OP_HALT-ish to the official Lua sources might not be a bad idea...
One other fun experiment I did with OP_HALT is to use it to pepper a whole codebase with self-clearing breakpoints for nearly zero overhead instruction level code coverage (code coverage with the line hook is punishingly slow in some of our apps whereas I accidentally left my experimental breakpoint based code coverage on for a couple of weeks because I didn't notice the speed penalty). It can also be used for dtrace-like instrumentation points.
At one point, I did apply the OP_HALT patches to 5.2 and 5.3 internally, but we have never really found a compelling reason to actually move from our 5.1.x derivative, so I never published those patches (it was fairly straightforward but not automatic to port the patch from 5.1 as I recall) since they never got enough use to be all that confident in their correctness.