Thursday, October 24, 2013
Origin of metaclasses in Python
There was some speculation on python-ideas today on whether Python's metaclass design came from Ruby. It did not. And as long as we are speculating about the origins of language features, I feel the need to set the record straight.
I was not inspired by Ruby at that point (or ever :-). Ruby was in fact inspired by Python. Matz once told me that his inspiration was 20% Python, 80% Perl, and that Larry Wall is his hero.
I wrote about metaclasses in Python in 1998: http://www.python.org/doc/essays/metaclasses/.
New-style classes were just the second or third iteration of the idea.
I was inspired to implement new-style classes by a very
specific book, "Putting Metaclasses to Work" by Ira Forman and Scott
Danforth (http://www.amazon.com/Putting-Metaclasses-Work-Ira-Forman/dp/0201433052).
But even Python's original design (in 1990, published in
1991) had the notion that 'type' was itself an object. The type pointer
in any object has always been a pointer to a special object, whose
"data" was a bunch of C function pointers implementing the behavior of
other objects, similar to a C++ vtable. The type of a type was always a
special type object, which you could call a meta-type, to be recognized
because it was its own type.
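The "its own type" property can still be observed from Python code today. The following is a minimal sketch, assuming a modern Python 3 interpreter, where the built-in `type` plays the role of the meta-type; it says nothing about how the 1990 C implementation was laid out. The helper name `type_chain` is made up for this illustration.

```python
def type_chain(obj):
    """Follow type() links from obj until reaching an object that is its own type."""
    chain = [type(obj)]
    # Stop at the meta-type, which is recognized because it is its own type.
    while type(chain[-1]) is not chain[-1]:
        chain.append(type(chain[-1]))
    return chain

chain = type_chain(42)
print(chain)  # [<class 'int'>, <class 'type'>]
```

Starting from any ordinary object, the chain ends at `type`, because `type(type) is type`.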
I was only vaguely aware of Smalltalk at the time; I
remember being surprised by its use of metaclasses (which is quite different from that in Python or Ruby!) when I read about
them much later. Smalltalk's bytecode was a bigger influence on Python's
bytecode though. I'd read about it in a book by Adele Goldberg and
others, I believe "Smalltalk-80: The Language and its Implementation" (http://www.amazon.com/Smalltalk-80-The-Language-its-Implementation/dp/0201113716).
Why Python uses 0-based indexing
I was asked on Twitter why Python uses 0-based indexing, with a link to a new (fascinating) post on the subject (http://exple.tive.org/blarg/2013/10/22/citation-needed/).
I recall thinking about it a lot; ABC, one of Python's predecessors,
used 1-based indexing, while C, the other big influence, used 0-based.
My first few programming languages (Algol, Fortran, Pascal) used 1-based
or variable-based indexing. I think that one of the issues that helped me decide
was slice notation.
Let's first look at use cases. Probably the most common use cases for slicing are "get the first n items" and "get the next n items starting at i" (the first is a special case of that for i == the first index). It would be nice if both of these could be expressed without awkward +1 or -1 compensations.
Using 0-based indexing, half-open intervals, and suitable defaults (as Python ended up having), they are beautiful: a[:n] and a[i:i+n]; the former is short for a[0:n].
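As a quick illustration of those two use cases, here is a small sketch; the list contents and the values of `n` and `i` are arbitrary choices for the example.

```python
a = list("abcdefgh")
n, i = 3, 2

first_n = a[:n]      # the first n items; the same as a[0:n]
next_n = a[i:i + n]  # the n items starting at index i

print(first_n)  # ['a', 'b', 'c']
print(next_n)   # ['c', 'd', 'e']
```

Neither expression needs a +1 or -1 adjustment, and the length of a[i:i+n] is simply n.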
Using 1-based indexing, if you want a[:n] to mean the first n elements, you either have to use closed intervals or you can use a slice notation that uses start and length as the slice parameters. Using half-open intervals just isn't very elegant when combined with 1-based indexing. Using closed intervals, you'd have to write a[i:i+n-1] for the n items starting at i. So perhaps using the slice length would be more elegant with 1-based indexing? Then you could write a[i:n]. And this is in fact what ABC did -- it used a different notation so you could write a@i|n. (See http://homepages.cwi.nl/~steven/abc/qr.html#EXPRESSIONS.)
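To compare the two 1-based conventions side by side, here is a rough sketch that emulates them on top of Python's own slicing. Both helpers, `closed_1based` and `start_length_1based`, are hypothetical names invented for this example; they are not ABC's implementation, only an approximation of the behavior described above.

```python
def closed_1based(a, lo, hi):
    """Items lo..hi inclusive, counting from 1 (closed interval)."""
    return a[lo - 1:hi]

def start_length_1based(a, i, n):
    """The n items starting at position i, counting from 1 (start and length)."""
    return a[i - 1:i - 1 + n]

a = "abcdefgh"
i, n = 3, 3

# Closed intervals force the caller to write the i+n-1 compensation.
print(closed_1based(a, i, i + n - 1))  # cde
# Start and length avoids it, in the spirit of a@i|n.
print(start_length_1based(a, i, n))    # cde
```

The start-and-length form reads cleanly for this use case, which is the point made above; the -1 adjustments have moved into the helpers only because Python itself is 0-based.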
But how does the index:length convention work out for other use cases? TBH this is where my memory gets fuzzy, but I think I was swayed by the elegance of half-open intervals. In particular, the invariant that when two slices are adjacent, the first slice's end index is the second slice's start index, is just too beautiful to ignore. For example, suppose you split a string into three parts at indices i and j -- the parts would be a[:i], a[i:j], and a[j:].
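The adjacency invariant is easy to check with a concrete string. This is a small sketch; the string and the split points `i` and `j` are arbitrary values picked for the example.

```python
a = "hello world!"
i, j = 5, 6

head, middle, tail = a[:i], a[i:j], a[j:]

print(repr(head), repr(middle), repr(tail))  # 'hello' ' ' 'world!'

# Each slice ends exactly where the next one starts, so nothing is
# lost or duplicated when the parts are joined back together.
assert head + middle + tail == a
# The length of a half-open slice is just the difference of its indices.
assert len(middle) == j - i
```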
So that's why Python uses 0-based indexing.