In python 1, a namespace is implemented as a packaging of a dictionary, which a namespace, NS, `intrinsically' knows as NS.__dict__. This defines a raw protocol for name-lookup in the namespace: NS.name is NS.__dict__['name'], if the dictionary has that key. To this protocol are added various layers of further lookup, such as a class falling back on its bases or an instance falling back on its class. These are used successively, by getattr(), until one of them yields an answer: each one that fails raises AttributeError, which getattr() catches as its cue to try the next fall-back.
Finally, if none of the assorted strategies works, getattr() uses the __getattr__ protocol: provided the given 'name' isn't '__getattr__' and getattr(NS, '__getattr__') succeeds, getattr(NS, 'name'), aka NS.name, will try NS.__getattr__('name'). This is its last shot: it doesn't try to catch errors from it. When NS is an instance of NS.__class__ and this last has a __getattr__ method, that method, as seen on the class, is a function taking two arguments; as looked up on NS, on the other hand, it has NS `bound in' as the first argument and takes one (further) argument, as desired. I intend to address, later, the issue of how a class should do fall-back name lookup, given that its __getattr__ has the wrong signature.
Now, of course, NS.__class__'s .__getattr__ may well end by calling the .__getattr__ of some or all of its bases: so these calls can be considered as a continuation of the chain of lookup protocols, of which __getattr__ is the last known to getattr(), albeit now implemented in python rather than `under the bonnet'. One of my aims is to suggest ways that python 2 might be able to make the transition earlier than this: to place more of the infrastructure in (or controlled by) python code.
So I look at the whole getattr() process as a sequence of attempts, catching AttributeError on each and raising it if none succeeds. These split into two sequences: a head, starting with __dict__ but potentially gaining several layers of further lookup; and a tail, driven by __getattr__.
Thus a namespace is really a packaging of a chain of `lookups': each is a callable taking a key and either raising AttributeError or returning a value. Equally, a namespace is a packaging of one lookup: witness
chain = []
def getat(key, c=chain): # single lookup built out of possibly many
    for link in c:
        try: return link(key)
        except AttributeError: pass
    raise AttributeError, key
dict = { '__lookups__': chain } # if you want chain accessible, else {}
def setit(key, val, d=dict): d[key] = val
def delit(key, d=dict):
    try: del d[key]
    except KeyError: raise AttributeError, key
dict.update({ '__setattr__': setit,
              '__delattr__': delit })
def getit(key, d=dict):
    if key == '__dict__': return d
    # doing that by an entry in dict would be unkind to the garbage collector !
    try: return d[key]
    except KeyError: raise AttributeError, key
chain.append(getit)
Adding further entries to the chain will enrich getat(). I envisage creating all namespaces by an operation which performs this much (i.e. a dictionary-based lookup with attribute modifiers): this is the common ground shared by classes and instances. I'll now build those two cases on top of this.
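For anyone wanting to try this, here is a minimal sketch of the same dictionary-based lookup in present-day python 3 (an assumption: the original is python 1). The factory make_lookup, the name store (used in place of dict, which is now a built-in) and the extra link shout are mine, introduced only for illustration; closures replace the default-argument trick.

```python
def make_lookup():
    """Build one lookup from a chain whose first link reads a dictionary."""
    chain = []
    store = {'__lookups__': chain}

    def getat(key):
        # Single lookup built out of possibly many.
        for link in chain:
            try:
                return link(key)
            except AttributeError:
                pass
        raise AttributeError(key)

    def setit(key, val):
        store[key] = val

    def delit(key):
        try:
            del store[key]
        except KeyError:
            raise AttributeError(key) from None

    def getit(key):
        if key == '__dict__':
            return store
        try:
            return store[key]
        except KeyError:
            raise AttributeError(key) from None

    store.update({'__setattr__': setit, '__delattr__': delit})
    chain.append(getit)
    return getat


getat = make_lookup()
getat('__setattr__')('colour', 'red')   # store a value via the lookup itself
colour = getat('colour')


def shout(key):
    # Hypothetical extra link: answers any all-upper-case key.
    if key.isupper():
        return key.lower()
    raise AttributeError(key)


getat('__lookups__').append(shout)      # enrich getat() with a further link
loud = getat('HELLO')
```

The key 'HELLO' is not in the dictionary, so getit raises AttributeError and getat() moves on to shout, which answers.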
Perform, or emulate, the above during the early stages of creating an object. Having done so, suppose we are given a class: go on to
def currie(func, first):
    # not implementable in this form in python 1; see def.html
    def meth(*args, **what, f=func, s=first):
        return apply(f, (s,) + args, what)
    return meth
def inherit(key, c=klass, o=object): # klass: the given class (`class' itself is a reserved word)
    if key == '__class__': return c
    try: val = getattr(c, key)
    except AttributeError: pass
    else:
        if callable(val):
            return currie(val, o)
        return val
    raise AttributeError, (key, o)
chain.append(inherit)
although I'd very much like to separate out the inheritance into two separate mechanisms: the one involving currie()ing I'll call inheritance, the other I'll call borrowing. A class borrows attributes off its bases: instances inherit methods off their parents but borrow non-callable attributes. The separation involves deciding, when defining a class, which parts of its namespace are to be borrowed `as they are' and which parts are to be inherited - that is, the instance's value is constructed out of the instance and the class' value. This, however, is an implementation detail ;^>
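In python 3 (again an assumption, the text being about python 1) nested scopes make currie() straightforward, so the inherit link can be sketched directly. The factory make_inherit, the class Greeter and the use of a plain string as the stand-in object are all illustrative, not part of the scheme above.

```python
def currie(func, first):
    # A closure captures func and first; no default-argument trick needed.
    def meth(*args, **what):
        return func(first, *args, **what)
    return meth


def make_inherit(klass, obj):
    def inherit(key):
        if key == '__class__':
            return klass
        try:
            val = getattr(klass, key)
        except AttributeError:
            raise AttributeError(key, obj) from None
        if callable(val):
            return currie(val, obj)   # inherited: bound to the object
        return val                    # borrowed: passed through as it is
    return inherit


class Greeter:
    """Hypothetical class with one method and one plain attribute."""
    greeting = 'hello'

    def greet(self, whom):
        return '%s greets %s' % (self, whom)


inherit = make_inherit(Greeter, 'ptang')   # a string stands in for the object
bound = inherit('greet')
```

Calling bound('eki') hands 'ptang' to greet as its first argument, while inherit('greeting') returns the class's value untouched.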
Alternatively, consider an object which has been initialised as above down to chain.append(getit): we can give this object bases (i.e. sources of borrowed namespace) by
def borrow(key, b=bases):
    if key == '__bases__': return b
    for b in b: # rebinds b to each base in turn
        try: return getattr(b, key)
        except AttributeError: pass
    raise AttributeError, key
chain.append(borrow)
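A python 3 sketch of the same borrowing link follows; the factory make_borrow and the two example bases are hypothetical, and types.SimpleNamespace is used merely as a convenient bag of attributes.

```python
from types import SimpleNamespace


def make_borrow(bases):
    def borrow(key):
        if key == '__bases__':
            return bases
        for base in bases:
            try:
                return getattr(base, key)   # first base to answer wins
            except AttributeError:
                pass
        raise AttributeError(key)
    return borrow


left = SimpleNamespace(colour='red')
right = SimpleNamespace(colour='blue', shape='round')
borrow = make_borrow((left, right))

colour = borrow('colour')   # both bases have it; the earlier one answers
shape = borrow('shape')     # only the later base has it
```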
Finally, here's how we could implement the __getattr__ protocol, for use in any or all of the above cases, as the final entry in our chain:
def final(key, o=object):
    if key != '__getattr__':
        try: return o.__getattr__(key)
        except AttributeError: pass
    raise AttributeError, (key, o)
chain.append(final)
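The same final link, sketched in python 3 under the assumption that the object's __getattr__ is reachable as an ordinary attribute (which is what the chain above would arrange). The factory make_final and the two example objects are illustrative only.

```python
from types import SimpleNamespace


def make_final(o):
    def final(key):
        if key != '__getattr__':
            try:
                return o.__getattr__(key)
            except AttributeError:
                # Either o has no __getattr__ or it declined this key.
                pass
        raise AttributeError(key, o)
    return final


with_hook = SimpleNamespace()
with_hook.__getattr__ = lambda key: 'made up ' + key   # stored as plain data
without_hook = SimpleNamespace()

answer = make_final(with_hook)('anything')
```

Asking make_final(without_hook) for any key, or either object for '__getattr__' itself, raises AttributeError.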
I'm interested in unifying classes and instances (well, all namespaces, but let's keep to these for now). That implies a namespace using both borrowing (from some bases) and inheritance (from some class). In what order should we do them ? Imagine one of K's bases is of the same class as K: and from this class K and the given base each can inherit some method. We want the method bound to K itself, not to its base, which means K has to have inherited the method itself before trying to fall back on its bases. [Well, we can get away with the other order provided borrow: (a) tests early, as for __bases__, for a key, say __rawgetattr__, returning borrow itself; (b) for each base, tries base.__dict__[key] first (duplicating the work of getit, above); and (c) replaces getattr(b, key) with b.__rawgetattr__(key). Occam's razor persuaded me not to.]
Consequently, inheritance happens before borrowing in attribute lookup.
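The argument about K and its base can be checked with a sketch, in python 3, in which each namespace is represented by its bare lookup function (so a method's first argument is a lookup, and self('name') plays the part of self.name). The helper build, its order parameter and the one-method class shared are assumptions made for the demonstration.

```python
def build(klass, bases, order, **attrs):
    """Return a lookup; klass maps names to values, bases are lookups."""
    chain = []

    def getat(key):
        for link in chain:
            try:
                return link(key)
            except AttributeError:
                pass
        raise AttributeError(key)

    def getit(key):
        try:
            return attrs[key]
        except KeyError:
            raise AttributeError(key) from None

    def inherit(key):
        try:
            val = klass[key]
        except KeyError:
            raise AttributeError(key) from None
        if callable(val):
            # currie: bind this namespace in as the first argument
            return lambda *args: val(getat, *args)
        return val

    def borrow(key):
        for base in bases:
            try:
                return base(key)
            except AttributeError:
                pass
        raise AttributeError(key)

    links = {'inherit': inherit, 'borrow': borrow}
    chain.append(getit)
    chain.extend(links[name] for name in order)
    return getat


shared = {'who': lambda self: self('name')}   # hypothetical class: one method

base = build(shared, (), ('inherit', 'borrow'), name='base')
k_good = build(shared, (base,), ('inherit', 'borrow'), name='K')
k_bad = build(shared, (base,), ('borrow', 'inherit'), name='K')
```

With inheritance first, k_good('who')() reports 'K'; with borrowing first, k_bad('who')() finds the method already bound to the base and reports 'base', which is the outcome the ordering is chosen to avoid.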
Now, the __getattr__ protocol needs to be tried only when all other lookups fail: so must be done after all __bases__. In particular, this means that it will be tried on each base before being tried on the derived object; which may seem odd, but is surely what borrowing attributes must imply.
Now suppose an instance-with-bases, eki, manages to get itself used as a class and instantiated. The instance, ptang, may inherit methods from eki, which may have borrowed them from its bases or may, in turn, have inherited them from its own class. In the last case, double inheriting, we've currie()d up a function to receive first argument eki, second ptang and subsequently whatever others are passed to the resulting bound method of ptang when it eventually gets called.
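Double inheriting is easy to picture with functools.partial standing in for currie() (a substitution of mine; the two behave alike for this purpose). The function method and the strings naming eki and ptang are placeholders.

```python
from functools import partial


def method(first, second, *rest):
    # Report the arguments in the order they arrive.
    return (first, second) + rest


via_eki = partial(method, 'eki')        # eki inherits from its own class
via_ptang = partial(via_eki, 'ptang')   # ptang then inherits from eki
result = via_ptang('extra')             # → ('eki', 'ptang', 'extra')
```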
Now, python defines various operations on things by having special names, like __str__ and __add__, which get used to implement the built-in `interfaces' of the python language, e.g. '%s' % thing, this + that. To be serious about unifying classes and their instances, we now have to deal with the ability to, for instance, call str() both on a class and on its instance, getting potentially very different results and having to struggle to avoid problems induced by the currie()ing that'll be done somewhere in the process. For the relatively simple case of a class, we'd need its __str__ to be something like:
def __str__(self=None):
    if self is None:
        pass # we were asked for the class as a string
    else:
        pass # we were asked for the instance as a string
Once we can do a class-which-is-an-instance, its class can be one likewise, and so on to chains of arbitrary (finite) depth; the deeper it goes, the worse this will get.
The way out is to split the class object into two pieces, so instances don't inherit their methods directly from the class, but from some `sub-namespace' of the class: inherit() replaces c with c.__instance__, say, in its getattr(c, key). The existing python class statement's body is then used to build the namespace for the class's .__instance__ attribute, which will have to have, as bases, the .__instance__ attributes of all bases of the class (that have one). A further split could divide __instance__ into one portion, .__method__, whose contents simply get curried (no check for whether they are callable) when deployed by instances, and a second which is simply used as a base by each instance of the class, without currie()ing. To ensure that existing code can still see these attributes apparently on the class itself, the class could borrow from its own .__instance__ (or components thereof, if split): however, the methods would all have the wrong signature if it borrowed them, so I wouldn't want to. The class itself may (from elsewhere, somehow - however it managed to become an instance of something else) have a namespace containing other stuff beside its .__method__, in particular things like __getattr__(key) and __add__(towhom) if it supports them.
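A rough python 3 sketch of the split, under heavy assumptions: types.SimpleNamespace stands in for both the class and its instance, functools.partial stands in for currie(), and the attribute describe plays the part that __str__ would. Only the idea that inherit() consults klass.__instance__ rather than klass comes from the text.

```python
from functools import partial
from types import SimpleNamespace


def describe_class():
    return 'the class itself'


def describe_instance(self):
    return 'an instance named ' + self.name


klass = SimpleNamespace(
    describe=describe_class,   # the class's own attribute, no self expected
    __instance__=SimpleNamespace(describe=describe_instance),
)


def make_inherit(klass, obj):
    def inherit(key):
        val = getattr(klass.__instance__, key)   # not getattr(klass, key)
        return partial(val, obj) if callable(val) else val
    return inherit


obj = SimpleNamespace(name='ptang')
inherit = make_inherit(klass, obj)
from_class = klass.describe()
from_instance = inherit('describe')()
```

Each describe has exactly the signature its caller expects, so neither needs the self=None test.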
The __call__ of a class then has the job of building an object with (possibly borrow(), ...) getit() and inherit() functionality, as above, which suffice to let this new object have an __init__ method, which the class's __call__ invokes, with the arguments it was given, before returning the object. It'll need recourse to a `raw primitive' of some kind to build a namespace from a lookup function such as getat(): but it can do everything else (including getat()) in python.
So a namespace is just a packaging of a lookup - a function which takes one argument. All it remains for getattr(obj, key) to do is unpack the packaging, extract obj's hidden function, say f, and return f(key), allowing any exceptions this produces to pass out for someone else to handle. Everything else has been packed up inside the lookup.
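Python 3 offers no such raw primitive, but a class with a __getattr__ method gives a fair imitation, sketched here. The class name Namespace, its hidden attribute _lookup and the example lookup are my inventions; note also that python's built-in lookups still run first, so the packaging is thicker than the text proposes.

```python
class Namespace:
    """Hypothetical packaging of a single lookup function."""

    def __init__(self, lookup):
        self._lookup = lookup   # the hidden function, f

    def __getattr__(self, key):
        # Errors raised by the lookup pass out for someone else to handle.
        return self._lookup(key)


def lookup(key):
    if key == 'answer':
        return 42
    raise AttributeError(key)


obj = Namespace(lookup)
value = getattr(obj, 'answer')   # unpacks obj and returns lookup('answer')
```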
'appen, as I argue elsewhere, import is a lookup too. Given all that, all that remains is to identify a sensible syntax for object creation and implement it.
Written by Eddy.