An interesting question

So, yesterday I was asked an interesting question:

Int: Are you familiar with Python's getattr?
Me: Um, yes?
Interviewer clarifies what it's all about
Int: So I now want you to implement __getattr__ in such a way that when a method is called with the prefix print_, it'll print its name before calling it.

Before that, we had a discussion about decorators, where I created a decorator called printme:

def printme(func):
    def decorated(*args, **kwargs):
        print func.func_name #well, it was "print func", but now I know better :)
        return func(*args,**kwargs)
    return decorated
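For reference, here is a minimal sketch of the same decorator in use. It is written in Python 3 syntax (print as a function, func.__name__ instead of func.func_name), the add function is a made-up example, and functools.wraps is an addition of mine that was not part of the interview answer:

```python
import functools


def printme(func):
    """Print the wrapped function's name, then call it."""
    @functools.wraps(func)  # keep func's __name__ and docstring on the wrapper
    def decorated(*args, **kwargs):
        print(func.__name__)
        return func(*args, **kwargs)
    return decorated


@printme
def add(a, b):  # hypothetical function, only here to have something to decorate
    return a + b


result = add(2, 3)  # prints the name "add" first, then returns the sum
```

Without functools.wraps the decorated function would report its name as "decorated", which is easy to trip over when decorators are stacked.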

So after a bit of confusion (writing code like that on a piece of paper with two pairs of eyes looking at you can be tricky), I ended up with something that may have looked like this:

class Foo(object):
    def __getattr__(self, name):
        if name.startswith('print_'):
            f = getattr(self, name[6:]) #actually, I did super(Foo, self) but that won't work.
            return printme(f)
        return getattr(self, name)

    def hi(self):
        print "Hello!"

Can you spot the error?

.
.
.

Well, we didn't have access to a console, so we didn't. If you're lazy and didn't paste the above into a console, here are the results:

>>> f = Foo()
>>> f.hi()
Hello!
>>> 
>>> f.print_hi()
hi
Hello!

So far, so good. But:

>>> f.blah()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 6, in __getattr__
  File "<stdin>", line 6, in __getattr__
  .
  .
  .
  File "<stdin>", line 6, in __getattr__
  File "<stdin>", line 6, in __getattr__
  File "<stdin>", line 6, in __getattr__
RuntimeError: maximum recursion depth exceeded

Seeing that this was a topic I wasn't sure about, I came home and read up on it, which resulted in this blog post. So, after this rather long introduction, here is a small tutorial on getattr.

getattr, __getattr__ and __getattribute__

So, what is this all about? Python, as you may know, is a strongly and dynamically typed language. We don't care about the strong part here, but the dynamic part is very interesting. Dynamic languages don't check the nature of the constructs you use at compile time; they do all that work at run time. So, for example, when you call method "bar" on instance "foo" (foo.bar()), the compiler won't check whether there is a method named bar in the class tree of foo.
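A small sketch of what "checked at run time" means in practice. It uses Python 3 syntax, and the names call_bar, outcome and later are made up for the illustration:

```python
class Foo(object):
    def hi(self):
        return "Hello!"


def call_bar(obj):
    # Nothing verifies that obj has a bar method until this line actually runs.
    return obj.bar()


foo = Foo()

try:
    call_bar(foo)
    outcome = "called"
except AttributeError:
    # The lookup of "bar" failed at run time, not at definition time.
    outcome = "AttributeError"

# Attributes can also appear at run time: add bar to the class afterwards
# and the very same call succeeds on the existing instance.
Foo.bar = lambda self: "bar added at run time"
later = call_bar(foo)
```

Defining call_bar never complains about the missing method; only the call does, and only for as long as the method is actually missing.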

So, how does it work? There has to be support for looking up methods at runtime. This is what Python uses getattr, __getattr__ and __getattribute__ for. (Objective-C has similar support, though I haven't got that far yet. I don't know about other languages.)

In essence, consider the following snippet:

>>> class Foo(object):
...     def hi(self):
...             print 'Hello there!'
... 
>>> globals()
{'__builtins__': <module '__builtin__' (built-in)>, '__name__': '__main__', 'Foo': <class '__main__.Foo'>, '__doc__': None}

What is interesting here is the entry Foo in the globals dictionary. Let's keep that and move on.

>>> f = Foo()
>>> globals()
{'__builtins__': <module '__builtin__' (built-in)>, '__name__': '__main__', 'Foo': <class '__main__.Foo'>, '__doc__': None, 'f': <__main__.Foo object at 0x60ad0>}

Again, f is mapped to a Foo instance, in the globals dictionary.

>>> dir(f)
['__class__', '__delattr__', '__dict__', '__doc__', '__getattribute__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__str__', '__weakref__', 'hi']

What's this? Our instance has an attribute named hi. How can we access this?

>>> f.hi
<bound method Foo.hi of <__main__.Foo object at 0x60ad0>>
>>> f.hi()
Hello there!

Nice, but boring. This isn't using any features of dynamic typing. Let's see what getattr can do.

getattr

>>> getattr(f, 'hi')
<bound method Foo.hi of <__main__.Foo object at 0x60ad0>>

That looks familiar, but now we're using a string to specify what method we want.

>>> getattr(f, 'hi')()
Hello there!

Nice, right? But this isn't getting us any closer to the answer of the above question, so let's proceed. What we want to do is somehow intercept the built-in attribute lookup (methods are, after all, attributes) and do some work of our own.
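Before moving on, here is a rough sketch of why looking up a method by string is handy: the name can come from user input or a config file. The Greeter class and the dispatch helper are hypothetical; the optional third argument to getattr (a default returned when the lookup fails) is part of the built-in.

```python
class Greeter(object):  # hypothetical class for the example
    def hi(self):
        return "Hello there!"

    def bye(self):
        return "See you!"


def dispatch(obj, command):
    """Call the method named by command, if obj has one."""
    # The third argument is returned instead of raising AttributeError.
    method = getattr(obj, command, None)
    if method is None:
        return "unknown command: " + command
    return method()


greeting = dispatch(Greeter(), "hi")
unknown = dispatch(Greeter(), "blah")
```

Here greeting ends up as the return value of hi, while unknown is the fallback message, because Greeter has no blah attribute.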

__getattribute__

This is where it gets a bit complicated. You may have noticed a method called __getattribute__ in the results of dir, above. So at first glance you'd want to implement this, but wait! Let's check what the reference has to say:

__getattribute__( self, name)
Called unconditionally to implement attribute accesses for instances of the class. If the class also defines __getattr__(), the latter will not be called unless __getattribute__() either calls it explicitly or raises an AttributeError. This method should return the (computed) attribute value or raise an AttributeError exception. In order to avoid infinite recursion in this method, its implementation should always call the base class method with the same name to access any attributes it needs, for example, "object.__getattribute__(self, name)".

That is, if you implement this method, you are overriding the built-in attribute access mechanism! So tread carefully, and notice that you can easily trigger infinite recursion. Also, this method is only available with new-style classes.

__getattr__

Back to where we started. Let's see what the reference is saying about __getattr__:

__getattr__( self, name)
Called when an attribute lookup has not found the attribute in the usual places (i.e. it is not an instance attribute nor is it found in the class tree for self). name is the attribute name. This method should return the (computed) attribute value or raise an AttributeError exception.

Note that if the attribute is found through the normal mechanism, __getattr__() is not called. (This is an intentional asymmetry between __getattr__() and __setattr__().) This is done both for efficiency reasons and because otherwise __setattr__() would have no way to access other attributes of the instance. Note that at least for instance variables, you can fake total control by not inserting any values in the instance attribute dictionary (but instead inserting them in another object). See the __getattribute__() method below for a way to actually get total control in new-style classes.

So __getattr__ will be called only when an attribute is not found. This gives us a hint as to why we got infinite recursion in the first example. Let's revisit that:

def __getattr__(self, name):
    if name.startswith('print_'):
        f = getattr(self, name[6:]) #actually, I did super(Foo, self) but that won't work.
        return printme(f)
    return getattr(self, name)

This will be called if there is an attribute lookup that fails. Per the given specification, we expect that someone will try to call print_whatever. However, if the name doesn't start with print_, I tried to fall back to the normal behaviour, which is to raise an AttributeError, by calling getattr again. That lookup fails as well, so Python calls __getattr__ once more, which calls getattr, and so on until the recursion limit is hit. I should've either raised AttributeError manually, or used the default implementation, like this:

return object.__getattribute__(self, name)

Of course, this works only for new-style classes, but if you're implementing this for something new, why go old-style?
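Putting the pieces together, this is roughly what I think the answer should have looked like. Take it as a sketch in Python 3 syntax rather than the one true solution: it takes the "raise AttributeError manually" route, and hi returns its greeting instead of printing it so that the result is easy to check.

```python
def printme(func):
    def decorated(*args, **kwargs):
        print(func.__name__)
        return func(*args, **kwargs)
    return decorated


class Foo(object):
    def __getattr__(self, name):
        prefix = "print_"
        if name.startswith(prefix):
            # If the stripped name is missing as well, this getattr re-enters
            # __getattr__ once and ends in the AttributeError below.
            return printme(getattr(self, name[len(prefix):]))
        # Not one of ours: fail the way a normal lookup would.
        raise AttributeError(name)

    def hi(self):
        return "Hello!"


f = Foo()
plain = f.hi()           # found normally, __getattr__ never runs
wrapped = f.print_hi()   # prints the method name, then calls hi
has_blah = hasattr(f, "blah")              # AttributeError, no recursion
has_print_blah = hasattr(f, "print_blah")  # same, after one extra hop
```

Both hasattr checks come back False, which means f.blah() and f.print_blah() now fail with a plain AttributeError instead of blowing the stack.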

Final Thoughts

This probably came out more boring than I thought, though it was a good clarification for me. A technique like this has plenty of uses. The first that comes to mind is a Rails-like Active Record class that is empty but responds to messages that correspond to table columns. Another is a proxy object that routes certain methods over to the real thing, perhaps with caching of results and so on.
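As an illustration of the proxy idea, here is a rough sketch of a caching proxy built on __getattr__. Everything in it (CachingProxy, SlowService, square) is hypothetical, and it assumes methods are called with hashable positional arguments only:

```python
class CachingProxy(object):  # hypothetical proxy class
    def __init__(self, target):
        # Both names live in the instance dictionary, so once __init__ has
        # run, looking them up never falls through to __getattr__.
        self._target = target
        self._cache = {}

    def __getattr__(self, name):
        attr = getattr(self._target, name)  # AttributeError if target lacks it
        if not callable(attr):
            return attr

        def cached(*args):
            key = (name, args)
            if key not in self._cache:
                self._cache[key] = attr(*args)
            return self._cache[key]
        return cached


class SlowService(object):  # hypothetical stand-in for "the real thing"
    def __init__(self):
        self.calls = 0

    def square(self, n):
        self.calls += 1
        return n * n


service = SlowService()
proxy = CachingProxy(service)
first = proxy.square(4)
second = proxy.square(4)  # served from the cache, service is not called again
```

After the two calls service.calls is still 1. A real proxy would also need to think about keyword arguments, cache invalidation and objects that are copied or unpickled before __init__ has set _target.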

December 10, 2007, 6:26 a.m.