I have read numerous articles about dynamic languages and static typing, the most recent being Steve Yegge's Dynamic Languages Strike Back, where he argues that most of the time there is enough type information in a program written in a dynamic language (he uses JavaScript) to do all kinds of cool refactorings.
I'm building a small proof-of-concept to do something like that in Python, for Python. Here are my thoughts... UPDATE: Read about PySmell, a tool I wrote to provide static auto-completion for Python projects.
Motivation
My motivation for such a tool is the many nasty and silly mistakes I make when coding in IronPython at Resolver Systems, mistakes that could easily be avoided. We have various safeguards in place to catch bugs, the most important being test-first development. We also have a pylint check phase that has to pass before we check in, and some of us have pyflakes hooked in to run before we run a test.
The problem is that only pyflakes is fast enough to run without being a distraction. In fact, it's so fast that Emacs has a mode to run it continuously in the background and highlight any errors reported (I wish I could do the same with Vim!). On the other hand, pylint is very slow, as it has to execute a Python module to make sure everything is covered. Running a test is not slow per se, but IronPython takes something like 4-5 seconds to start, which makes it a bit annoying.
So how does pyflakes manage to be instantaneous? It uses the built-in compiler module to generate, and then analyze, the AST of your code. It can't catch everything, but surely catching something is better than catching nothing!
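To make the idea concrete, here is a minimal sketch of one pyflakes-style check: reporting imports that are never used. It is not pyflakes' actual code. It uses the standard-library ast module instead of compiler (which only exists in Python 2), and the function name unused_imports is made up for this illustration.

```python
import ast

def unused_imports(source):
    """Return (line, name) pairs for names imported but never read in `source`."""
    tree = ast.parse(source)
    imported = {}   # bound name -> line number of the import statement
    used = set()    # every bare name that is read somewhere in the module
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                if alias.name == "*":
                    continue  # star imports bind unknown names; skip them
                # "import os.path" binds "os"; "import x as y" binds "y"
                bound = alias.asname or alias.name.split(".")[0]
                imported[bound] = node.lineno
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
            used.add(node.id)
    return sorted((line, name) for name, line in imported.items()
                  if name not in used)

sample = "import os\nimport sys\n\nprint(sys.argv)\n"
print(unused_imports(sample))
```

Nothing is imported or executed here; the source is only parsed, which is why this kind of check can run on every keystroke.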
So my idea is, how far can we stretch the approach taken by pyflakes?
The compiler module
Unfortunately, the documentation for the compiler module is not very good. Thankfully, some kind souls have noticed that as well, and have written up better documentation, found here. I wonder why it's not accepted yet.
Basically, what the compiler module does is parse a Python source file (or string) and give you back the abstract syntax tree for it. You can then use compiler.walk, with your own visitor, to do as you please. (Interestingly, while pyflakes uses compiler, it does its own walking.)
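As a rough illustration of the parse-then-visit pattern, the sketch below lists every class and function definition in a piece of source. It is written against the ast module and its NodeVisitor class rather than compiler.walk, since compiler is not available in Python 3; the DefinitionLister class and the sample source are invented for the example.

```python
import ast

class DefinitionLister(ast.NodeVisitor):
    """Record every class and function definition with its line number."""

    def __init__(self):
        self.found = []

    def visit_ClassDef(self, node):
        self.found.append(("class", node.name, node.lineno))
        self.generic_visit(node)  # keep descending into the class body

    def visit_FunctionDef(self, node):
        self.found.append(("def", node.name, node.lineno))
        self.generic_visit(node)  # nested functions are visited too

source = """\
class Greeter(object):
    def greet(self):
        pass

def main():
    Greeter().greet()
"""

lister = DefinitionLister()
lister.visit(ast.parse(source))
for kind, name, lineno in lister.found:
    print(kind, name, lineno)
```

The visitor only needs a visit_<NodeType> method for each kind of node it cares about; everything else falls through to the default traversal.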
Auto-complete
The question I had was: "Can we get enough information from the structure of Python code to generate autocompletion lists?". It seems that in many cases the answer is yes, we can. As a matter of fact, professional IDEs like Wing IDE do it already. I will attribute the fact that other IDEs (that I know of) don't do it to "lack of effort", as Yegge himself says (no offence!). (I know that Vim's Python omnicompletion tries, but it never worked for me in an acceptable manner.)
So how would it work? We can easily gather from an AST most of the properties and methods a class defines. We can then look for clues in the usage of a name to try to figure out its type, and therefore its autocompletion suggestions.
Example: Imagine having the following class:
    class Dog(object):
        def bark(self):
            pass
        def play_dead(self):
            pass
        def jump(self):
            pass
then, somewhere else in the code, perhaps in another module:
    def interact(animal):
        animal.jump()
        animal.
where the dot . represents the autocompletion request. It is clear that animal has a very good chance of being an instance of Dog. We can then present Dog's properties and methods, complete with calltips. Even if we had a Kangaroo class that defined jump, the completions could include both classes, with some kind of indicator. Better than mindlessly listing every possible keyword, as Vim does now!
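The Dog and Kangaroo case can be sketched roughly as follows, again with the ast module. This is only an illustration under a few assumptions: the helper names (class_members, attributes_used, candidates) are made up, only methods defined directly in the class body are collected, and the incomplete "animal." line is assumed to have been stripped before parsing, since it is not valid syntax.

```python
import ast

def class_members(tree):
    """Map each class name to the set of method names defined in its body."""
    members = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            members[node.name] = {
                item.name for item in node.body
                if isinstance(item, ast.FunctionDef)
            }
    return members

def attributes_used(tree, func_name, var_name):
    """Collect attribute names accessed on `var_name` inside `func_name`."""
    used = set()
    for func in ast.walk(tree):
        if isinstance(func, ast.FunctionDef) and func.name == func_name:
            for node in ast.walk(func):
                if (isinstance(node, ast.Attribute)
                        and isinstance(node.value, ast.Name)
                        and node.value.id == var_name):
                    used.add(node.attr)
    return used

def candidates(tree, func_name, var_name):
    """Return {class: sorted members} for classes defining every attribute used."""
    used = attributes_used(tree, func_name, var_name)
    return {
        name: sorted(attrs)
        for name, attrs in class_members(tree).items()
        if used and used <= attrs
    }

source = """\
class Dog(object):
    def bark(self): pass
    def play_dead(self): pass
    def jump(self): pass

class Kangaroo(object):
    def jump(self): pass
    def box(self): pass

class Cat(object):
    def meow(self): pass

def interact(animal):
    animal.jump()
"""

result = candidates(ast.parse(source), "interact", "animal")
for name in sorted(result):
    print(name, result[name])
```

Here both Dog and Kangaroo survive as candidates because both define jump, while Cat is ruled out. A real tool would also want to follow inheritance and attributes assigned in __init__, which this sketch ignores.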
In fact, in some cases, we can be 100% sure that a name is an instance of a specific class: if we encounter a constructor call within the same scope, or an isinstance check. The other, harder case is to do data-flow analysis and examine the arguments in calls to interact, to add more information.
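A possible sketch of the two "certain" cases is below. The certain_types helper is hypothetical, and it is deliberately naive: it ignores control flow, so an isinstance check is treated as holding for the whole function rather than just the branch it guards, and a later reassignment simply overwrites the earlier guess.

```python
import ast

def certain_types(func, class_names):
    """Map local names to a class when a constructor call or isinstance check names it."""
    known = {}
    for node in ast.walk(func):
        # Case 1: name = ClassName(...)
        if (isinstance(node, ast.Assign)
                and len(node.targets) == 1
                and isinstance(node.targets[0], ast.Name)
                and isinstance(node.value, ast.Call)
                and isinstance(node.value.func, ast.Name)
                and node.value.func.id in class_names):
            known[node.targets[0].id] = node.value.func.id
        # Case 2: isinstance(name, ClassName)
        elif (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == "isinstance"
                and len(node.args) == 2
                and isinstance(node.args[0], ast.Name)
                and isinstance(node.args[1], ast.Name)
                and node.args[1].id in class_names):
            known[node.args[0].id] = node.args[1].id
    return known

source = """\
class Dog(object):
    def jump(self): pass

def play(animal):
    pet = Dog()
    count = len("abc")
    if isinstance(animal, Dog):
        animal.jump()
"""

tree = ast.parse(source)
classes = {n.name for n in ast.walk(tree) if isinstance(n, ast.ClassDef)}
play = next(n for n in ast.walk(tree)
            if isinstance(n, ast.FunctionDef) and n.name == "play")
types = certain_types(play, classes)
print(types)
```

Restricting matches to names known to be classes keeps ordinary calls such as len(...) from being mistaken for constructors.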
The quote
As I'm writing this, an extension to the familiar duck typing phrase comes to mind:
If it walks like a duck and quacks like a duck, I'd call it a duck, and this is the only duck that I know of, so there!
Conclusion
Sure, it will be very hard to be absolutely certain. When someone defines __getattr__ (or, even more evil, __getattribute__; see an earlier post about the differences), all bets are off. The question is, do we care? I personally don't. I currently manage with no help other than keyword completion, so any kind of help will make my life easier. Expect to hear more soon!
Comments
Comment by Eugene , 1 year, 6 months ago :
For Vim you can use ropevim (http://rope.sourceforge.net/ropevim.html). It gives rather good completion, along with some refactoring functionality.
Comment by web design company , 1 year, 6 months ago :
The compiler module is kind of obsolete. The _ast (2.5) / ast (2.6) modules give access to the most recent AST (directly from the internal Python compiler): http://docs.python.org/lib/ast.html
Comment by Paul Boddie , 1 year, 6 months ago :
My understanding was that pylint used astng, which in turn does inspection of the compiler module ASTs. The reason why it isn't likely to be instantaneous is because it probably does a certain amount of whole program analysis.
Meanwhile, on the subject of compiler.ast vs. _ast, the documentation for the compiler module seems to be somewhat better, and it's unfortunate that backwards compatibility has been pushed to one side again, given the talk about removing the compiler module from the standard library.
Comment by Michael Foord , 1 year, 6 months ago :
Paul is right - PyLint does static analysis (which is how we are able to use it on the Resolver One codebase), it just does massively more than PyFlakes which is why it is slower.
By the way, your Dog inherits from self - a mistake I make a lot (maybe static analysis could help...).
You should look at Rope - a refactoring library that does static analysis (and I think some dynamic analysis as well):
http://rope.sourceforge.net
You should also look at the AST libraries (by logilab) behind PyLint - they're pretty complete!
Maciek (PyPy guy) thinks that the best way of getting autocomplete is through dynamic analysis as well as static analysis - using type information from when you run your tests.
Comment by Orestis Markou , 1 year, 6 months ago :
@web design company:
That's good to know. The problem of course is, when you're using Python 2.4 you have to use the compiler module, and any project that wants to support both has to write an abstraction layer on top, which is kind of annoying.
@Paul:
pylint indeed checks a lot of stuff, including code style, which I personally think is overkill.
Even if the compiler module is removed, you can still generate ASTs from the builtin compile function, right? Is it decided it's going to be removed?
Comment by Orestis Markou , 1 year, 6 months ago :
No! Bad Dog! (fixed, thanks!)
I tried integrating rope with Vim once, but its requirement of Python 2.5 put me off.
ASTNG looks interesting, but my aim here is to be small and simple! It's all about the 80/20 rule (also, I want to hack my way through, as a learning experience :).
Not sure if running some heavily Mock'ed tests would actually contribute something useful, but it's definitely a very good idea (esp. for the PyPy guys; I suspect they have to gather runtime information anyway for their JIT).
Comment by Michael Foord , 1 year, 6 months ago :
The built in compile function generates code objects not ASTs.
Is your goal something you can create to use with VIM?
Comment by Orestis Markou , 1 year, 6 months ago :
Hm, I see. Surely the ability to generate an AST from source will be there for >2.5, right? Otherwise it seems a very bad idea to take away something that useful.
Vim integration is indeed a goal, probably the most important one right now. But there's nothing there that stops other tools from taking advantage of a bunch of information about Python code.
Comment by Kevin Teague , 1 year, 6 months ago :
It would be interesting to explore using Python 3's function annotations w/ interfaces to assist in autocomplete, something like:
    class IDog(zope.interface.Interface):
        def bark():
            "bark"
        def play_dead():
            "roll over and play dead"
        def jump():
            "jump"
    def interact(animal: IDog):
        animal.jump()
        animal.
Comment by Nathan , 1 year, 6 months ago :
Komodo Edit is another IDE (a professional product that's now free and open) that can supply a similar depth of autocomplete functionality for Python editing. It's enough of a feature for me to use it for almost all my python editing, though I do wish that similar functionality was just an emacs mode away.