python: August 2008 Archives
You can store pretty much anything as an attribute of an elementtree Element instance, not just strings.
A recent comment got me experimenting with this. Tom mentioned "Attributes can only be text", which is abso-smurfly correct for XML. But remember: ElementTree is not XML. If you don't need to serialize to XML, you can hang whatever you like on Elements.
Extending my power plant elementtree example a bit, I create a simple class to represent an employee, then hang instances of it on elements in a nested structure:
from elementtree.ElementTree import Element
from elementtree.ElementTree import SubElement

class Emp(object):
    """A simple Employee
    """
    def __init__(self, name, title):
        self.name = name
        self.title = title

# make some employee instances
burns = Emp("Monty", "CEO")
smithers = Emp("Waylon", "Flunkie")
karl = Emp("Carl", "Engineer")
lenny = Emp("Lenny", "Engineer")
homer = Emp("Homer", "Safety Inspector")

# hang them on a tree
ceo = Element('e', emp=burns)
smithers_elem = SubElement(ceo, 'e', emp=smithers)
SubElement(smithers_elem, 'e', emp=karl)
SubElement(smithers_elem, 'e', emp=lenny)
SubElement(smithers_elem, 'e', emp=homer)
It's gross because I create element labels that will never be used ("e"), and I can't ever serialize the structure to XML (bummer), but if you need nested data in a hurry, it's hard to beat.
Plus, I'm pretty sure that someone smarter than I could make this scenario cleaner by doing things like extending tostring to correctly serialize the Emp instances.
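One possible route, sketched here without touching tostring itself, is to copy the tree into a second tree whose attributes are plain strings and serialize that copy. This is only a sketch under a few assumptions: it uses the standard library's xml.etree.ElementTree rather than the standalone elementtree package, and to_xml_tree is a hypothetical helper name, not part of either library.

```python
import xml.etree.ElementTree as ET


class Emp(object):
    """A simple Employee"""
    def __init__(self, name, title):
        self.name = name
        self.title = title


def to_xml_tree(elem):
    """Hypothetical helper: return a copy of elem in which each 'emp'
    object is flattened into string attributes that XML can hold."""
    emp = elem.get("emp")
    copy = ET.Element("emp", name=emp.name, title=emp.title)
    for child in elem:
        copy.append(to_xml_tree(child))
    return copy


# a two-level tree with Emp instances hung on the elements
ceo = ET.Element("e", emp=Emp("Monty", "CEO"))
ET.SubElement(ceo, "e", emp=Emp("Waylon", "Flunkie"))

# the flattened copy only contains strings, so it serializes normally
xml_text = ET.tostring(to_xml_tree(ceo), encoding="unicode")
print(xml_text)
```

The original tree keeps its Emp instances; only the throwaway copy is string-only.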
In a recent post, Fredrik touched on something that burned me BAD when I first started slinging Python: using mutables as default parameters.
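For anyone who hasn't been bitten yet, here is a minimal sketch of the default-parameter version of the trap. The function names are made up for illustration.

```python
def add_error_shared(msg, errors=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits `errors` appends to the same list.
    errors.append(msg)
    return errors


def add_error_fresh(msg, errors=None):
    # Using None as a sentinel gives each call its own new list.
    if errors is None:
        errors = []
    errors.append(msg)
    return errors


first = add_error_shared("a")
second = add_error_shared("b")
print(second)           # ['a', 'b']
print(first is second)  # True

print(add_error_fresh("a"))  # ['a']
print(add_error_fresh("b"))  # ['b']
```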
My particular run-in was a cousin of what Fredrik describes: using a mutable as a default class attribute. I had a class like this:
class MyPage(object):
    errors = []

    def __init__(self):
        # set me up
        pass

    def run(self):
        try:
            self.build_page()
        except Exception:
            self.errors.append("Oops. Something went wrong")
My problem was that I was setting a mutable as a default attribute value. This meant that every instance of MyPage shared the same error list, analogous to a static class variable in Java. It wasn't long before everything had errors, since the list just kept growing. This happened in a production environment. Sub-awesome indeed. What I SHOULD have done is this:
class MyPage(object):
    errors = None

    def __init__(self):
        # set me up
        self.errors = []

    def run(self):
        try:
            self.build_page()
        except Exception:
            self.errors.append("Oops. Something went wrong")
The __init__ gives each new object its own fresh list. No sharing between instances. No pain.
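A stripped-down way to watch the sharing happen, using two hypothetical classes (SharedPage and OwnPage are made-up names standing in for the two MyPage versions):

```python
class SharedPage(object):
    errors = []            # one list, owned by the class


class OwnPage(object):
    def __init__(self):
        self.errors = []   # a new list for every instance


a, b = SharedPage(), SharedPage()
a.errors.append("Oops")
print(b.errors)            # ['Oops']  (b never had an error of its own)

c, d = OwnPage(), OwnPage()
c.errors.append("Oops")
print(d.errors)            # []
```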
I made a similar transition to Python, and it was nice to see a lot of my own discoveries in his list.
Brian manages to give a good summary without calling Java nasty names or badmouthing its devotees:
I've been a Java programmer for a long time, and I bear no ill will toward a language I happily used for many years. I'm just trying to capture why I find Python to be more fun.
[link] Why is Python More Fun Than Java?
When you find yourself cursing Unicode, and you will, remember that you bookmarked this post that goes into great detail on how to use Python to handle Unicode.
Trust me. In six months you're going to hit Unicode, and you'll remember "Hey! I remember seeing something about this somewhere...if only I could remember..."
In Python, most people's instinct is to model tree-like structures with nested dictionaries. An organization chart from a nuclear power plant is a simple example:

The old me would have rushed to model this like so:
org = {'name': 'Monty',
       'title': 'CEO',
       'boss_of': [
           {'name': 'Waylon',
            'title': 'Flunkie',
            'boss_of': [
                {'name': 'Carl',
                 'title': 'Engineer',
                 'boss_of': None},
                {'name': 'Lenny',
                 'title': 'Engineer',
                 'boss_of': None},
                {'name': 'Homer',
                 'title': 'Safety Inspector',
                 'boss_of': None}
            ]
           }]
      }
This is annoying to type and even harder to read. Keeping track of quotes, brackets and commas is frustrating. Imagine the pain of a larger or more complex tree.
Fear not. elementtree makes things super-easy when dealing with nested data.
"But wait", you say. "You're not doing XML!"
Say it with me: elementtree is not XML. Yes, it's good at serializing to and de-serializing from XML, but it's very useful even if you never read or generate a lick of markup.
Here's the same org chart using Element. Much shorter, less error-prone, and easier on the eyeballs:
from elementtree.ElementTree import Element
from elementtree.ElementTree import SubElement

burns = Element('emp', name="Monty", title="CEO")
smithers = SubElement(burns, 'emp', name="Waylon", title="Flunkie")
karl = SubElement(smithers, 'emp', name="Carl", title="Engineer")
lenny = SubElement(smithers, 'emp', name="Lenny", title="Engineer")
homer = SubElement(smithers, 'emp', name="Homer", title="Safety Inspector")
I didn't create a whole class to help me model the nested data; that would have been overkill. I just used Elements to string together related data. Now it's easy to navigate the tree in idiot-proof fashion with methods like getchildren and getiterator.
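Here is a rough sketch of what that navigation can look like. It assumes the standard library's xml.etree.ElementTree rather than the standalone elementtree package; there, recent Python versions no longer have getchildren and getiterator, so the sketch iterates over an element directly for its children and uses iter() for the whole subtree. The show function is a made-up helper.

```python
import xml.etree.ElementTree as ET

# rebuild the org chart
burns = ET.Element('emp', name="Monty", title="CEO")
smithers = ET.SubElement(burns, 'emp', name="Waylon", title="Flunkie")
for name in ("Carl", "Lenny"):
    ET.SubElement(smithers, 'emp', name=name, title="Engineer")
ET.SubElement(smithers, 'emp', name="Homer", title="Safety Inspector")

# direct children only: iterate over the element itself
direct_reports = [e.get("name") for e in smithers]
print(direct_reports)   # ['Carl', 'Lenny', 'Homer']

# the element plus everything below it, in document order
everyone = [e.get("name") for e in burns.iter("emp")]
print(everyone)         # ['Monty', 'Waylon', 'Carl', 'Lenny', 'Homer']


def show(elem, depth=0):
    """Hypothetical helper: print the chart, indented by reporting level."""
    print("  " * depth + "%s (%s)" % (elem.get("name"), elem.get("title")))
    for child in elem:
        show(child, depth + 1)


show(burns)
```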
Nested data pops up quite often, and it's nice to have a good tool for dealing with it gracefully.


