Hi Alex et al,
I ran into this behavior again when writing a random /set, insert,
remove/ routine to test a table counter, implemented as a core patch.
The counter worked all right, but my double-checking Lua code kept
failing. In other words: I found it exceedingly difficult to predict
the effect of insert(t,n,v) on the number of elements in the table --
that is, the actual number of elements you can traverse, with nils
and holes (however you call or regard them, separately or as one)
/not/ counted.
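To be precise about what I mean by "number of elements": what a full
traversal actually visits, ignoring #t entirely. A minimal sketch
(count is my own name, not a library function):

```lua
-- Count what pairs() actually visits, independent of #t.
local function count(t)
  local n = 0
  for _ in pairs(t) do
    n = n + 1
  end
  return n
end

local t = {}
t[2] = 'x'        -- { 2:x }: a table with a hole at 1
print(count(t))   -- 1, while #t need not agree
```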
Until I got the impression that insert(t,n,v) does in fact become
unpredictable at some point. Without knowledge of the past, it seems
impossible to predict whether insert(t,n,v) will actually insert a
value and grow the table, or instead replace a value in the table,
deleting what was at the specified key position. The value of t[n] is
no predictor, and taking #t is no help either.
That might be according to spec, but my point is not whether it works
according to spec; I wonder whether it makes any sense. And more to
the point, I still wonder: is there any gain in allowing
insert(t,n,nil)? <-- nil
On an abstract level: should a function that is meant to be used only
on a data type A be allowed to turn it into a type B, as a mere side
effect, and thus render any subsequent use of itself on the same data
undefined? It is not a type-conversion function, after all.
Of course the reason for the trouble is that you lose strictness;
that is what inserting nil does, and you would know that. But would
you know that insert() might thereby start to replace values now and
then -- not always, but sometimes? Could that ever be intended
behavior? As if you had the options to
- use insert(t,n,nil) once and then never use it again,
OR
- live with the fact that insert(t,n,v) may from then on sometimes
  replace values.
Even if I am getting it all wrong, it invites a mean sort of error:
with nil allowed as the third argument, you can never be sure at what
point your program starts to replace values in the table instead of
pushing them up, for any n ~= #t.
Trivial case, no nil: insert(t,n,v) pushes up as expected:
t = {}
table.insert(t,1,'a')
-- { 1:a }
table.insert(t,2,'x')
-- { 1:a 2:x }
table.insert(t,2,'y')
-- { 1:a 2:y 3:x }
Replace effect: one nil, then re-use of an index: insert(t,n,v)
replaces a value:
t = {}
table.insert(t,1,nil)  <-- insert nil into empty table
-- { empty }
table.insert(t,2,'x')  <-- that's your mistake: you should not use 2 here
-- { 2:x }
table.insert(t,2,'y')  <-- using 2 again
-- { 2:y }             <-- table value at key 2 has been replaced
The nil need not come first, and two consecutive nils can make the
replace effect happen sooner:
t = {}
table.insert(t,1,"x")
-- { 1:x }
table.insert(t,1,nil)  <-- insert nil into filled table, makes a hole
-- { 2:x }
table.insert(t,1,nil)  <-- kind of aggravates the hole, so the
                           following happens
-- { 3:x }
table.insert(t,3,"y")  <-- yes, 3; it's a used key after all
-- { 3:y }
table.insert(t,3,"z")  <-- also replaces
-- { 3:z }
Here is the same again, with #t added for inspection. The key to
avoiding the trouble is to monitor #t. But since #t tends to drop
below the range that might be interesting for you -- in these samples
it drops to 0 -- doesn't that make insert(t,n,v) pretty much useless?
In real cases it mostly won't reach 0, but it still almost makes
insert(t,n,v) useless, because it restricts your choices for n,
possibly as harshly as down to 1. What use would that leave over
insert(t,v) == insert(t,#t+1,v)?
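The discipline that monitoring #t forces on you can be written out as
a wrapper; a sketch, with checked_insert being my own hypothetical
name (I believe newer Lua versions enforce a similar bound inside
table.insert itself, which rather supports the point):

```lua
-- Refuse any position outside 1 .. #t+1, the only range in which
-- insert's push-up behavior is predictable.
local function checked_insert(t, n, v)
  if n < 1 or n > #t + 1 then
    error("position " .. n .. " out of bounds (#t is " .. #t .. ")", 2)
  end
  table.insert(t, n, v)
end
```

With holes in the table, #t can collapse to 0, and this wrapper then
rejects nearly every n -- which is exactly the "down to 1" restriction
above.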
Trivial case, sane #t:
t = {}
table.insert(t,1,'a')
-- { 1:a } #t=1
table.insert(t,2,'x')
-- { 1:a 2:x } #t=2
table.insert(t,2,'y')
-- { 1:a 2:y 3:x } #t=3
Replace effect: #t drops to 0 because of the holes, so insert starts
to replace sometimes. Let's define i as the number of inserts you
have made:
t = {}
table.insert(t,1,nil)  <-- insert nil into empty table
-- { empty } #t=0
table.insert(t,2,'x')  <-- 2 is <= i, but not <= #t+1 as probably expected
-- { 2:x } #t=0
table.insert(t,2,'y')  <-- using 2 again; still <= i, still not <= #t+1
-- { 2:y } #t=0        <-- table value at key 2 has been replaced
The main thing is #t going to 0, not the holes themselves:
t = {}
table.insert(t,1,"x")
-- { 1:x } #t=1
table.insert(t,1,nil)  <-- hole
-- { 2:x } #t=2
table.insert(t,1,nil)  <-- #t goes to 0
-- { 3:x } #t=0
table.insert(t,3,"y")  <-- replace with an index NOT used before (but > #t)
-- { 3:y } #t=0
table.insert(t,3,"z")  <-- replace with an index used before
-- { 3:z } #t=0
The table can be 'healed', and the replace behavior for the same
index then goes away. That can be nice, and it can make the error
almost impossible to track down if it happens by accident:
-- continued from above:
table.insert(t,1,"o")
-- { 1:o 3:z } #t=1
table.insert(t,2,"p")
-- { 1:o 2:p 3:z } #t=3      <-- table 'healed'
table.insert(t,3,"u")        <-- same index again, no replace effect
                                 this time
-- { 1:o 2:p 3:u 4:z } #t=4  <-- everything dandy again from here on in
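A quick way to check whether a table has 'healed' is to compare what
a traversal visits against #t. A sketch, under the assumption that t
only ever gets positive integer keys (is_sequence is my own name):

```lua
-- A table with only positive-integer keys has 'healed' exactly when
-- a full traversal visits #t elements, i.e. there are no holes below
-- the border that # reports.
local function is_sequence(t)
  local n = 0
  for _ in pairs(t) do
    n = n + 1
  end
  return n == #t
end
```

For { 1:o 2:p 3:u 4:z } this returns true; for a holey table such as
{ 2:x } it returns false, whichever border # happens to pick.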
Making only one hole merely delays the replace effect:
t = {}
table.insert(t,1,"x")
-- { 1:x } #t=1
table.insert(t,1,nil)  <-- hole
-- { 2:x } #t=2
table.insert(t,3,"y")  <-- no replace
-- { 2:x 3:y } #t=0
table.insert(t,3,"z")  <-- same index, replace
-- { 2:x 3:z } #t=0
table.insert(t,1,"o")  <-- healed
-- { 1:o 2:x 3:z } #t=3
table.insert(t,3,"u")  <-- same index, no replace
-- { 1:o 2:x 3:u 4:z } #t=4
I am simply proposing to throw an error for insert(t,n,nil), because
insert(t,n,nil) is useless and dangerous. The above is the part about
the perceived danger; I think it is really bad. In the face of it, I
cannot see any use case that would justify allowing the nil.
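Until something like that lands in the core, the proposal can at
least be sketched on the Lua side (strict_insert is a hypothetical
name; only the three-argument form is handled):

```lua
-- An insert that raises an error on nil, so a table can never
-- acquire holes through inserting in the first place.
local function strict_insert(t, n, v)
  if v == nil then
    error("attempt to insert nil at position " .. n, 2)
  end
  table.insert(t, n, v)
end
```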
Best,
Henning