Tuesday, August 25, 2009

 

Potential problems with mutable default arguments in Python

I just spent about half an hour tracking down a mysterious problem in one of my web servers implemented in Python, where the same page occasionally showed obviously duplicated information in some table cells, while loading correctly on other requests.

The bug was, of course, a rather stupid one. I was using my own implementation of a container and wanted to initialize it with a regular list, like this:

class Container :
    def __init__ (self, initial_arr) :
        self.content = initial_arr
.........................................
a = Container( ["one","two","three"] )

Of course, this was already problematic enough, since I was initializing my container with a reference to the original list and not a copy. But as long as the consumer of this class is aware of this behavior, it is not incorrect per se.
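
To make the aliasing concrete, here is a minimal sketch (written for Python 3; the names source and box are made up for illustration) showing that the container and the caller end up sharing one list:

```python
class Container:
    def __init__(self, initial_arr):
        # Stores the caller's list itself, not a copy.
        self.content = initial_arr

source = ["one", "two", "three"]   # hypothetical caller-owned list
box = Container(source)

source.append("four")              # the caller mutates its own list...
print(box.content)                 # ...and the container sees the change
print(box.content is source)       # True: both names refer to one object
```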

The real problem emerged when, naturally, I wanted to let the container default to empty, and did it like this:

class Container :
    def __init__ (self, initial_arr=[]) :
        self.content = initial_arr

So that I could write

a = Container()

With this "improvement", it is no longer just poor style but a clear bug. Python evaluates a default argument only once, when the function is defined, so every new instance created without an argument is initialized with a reference to the same list. If one instance modifies its content, the default argument is modified too, in effect carrying the changes over to the next, unrelated instance.

No wonder I was seeing duplicated cells returned by the server…
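
The sharing can be observed directly. The following is a small Python 3 sketch, not part of the original server code, that looks at the default list stored on the function object (the names shared_default, first and second are illustrative):

```python
class Container:
    def __init__(self, initial_arr=[]):
        self.content = initial_arr

    def append(self, elm):
        self.content.append(elm)

# The default list is stored on the function object itself.
shared_default = Container.__init__.__defaults__[0]

first = Container()
first.append("four")
second = Container()               # a supposedly fresh, empty container

print(second.content)              # ['four']
print(first.content is second.content is shared_default)  # True
```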

Here is the complete script which illustrates the problem:

#! /usr/bin/env python

class Container :
    def __init__ (self, initial_arr=[]) :
        self.content = initial_arr
    def __repr__(self) :
        return repr(self.content)
    def append(self,elm) :
        self.content.append(elm)

def foo (use_default) :
    if use_default :
        a = Container ()
    else :
        a = Container ( ["one","two","three"] )
    a.append ( "four" )
    print("a = %r" % a)

print("\nUsing explicit argument")
foo (use_default=False)
foo (use_default=False)

print("\nUsing default argument")
foo (use_default=True)
foo (use_default=True)

This script generates the following output:

Using explicit argument
a = ['one', 'two', 'three', 'four']
a = ['one', 'two', 'three', 'four']

Using default argument
a = ['four']
a = ['four', 'four']

The ideal solution would have been to force the default argument (if mutable) to be read-only. Unfortunately, Python has no general way to make an arbitrary object, such as a list, read-only.

Lacking that, the safest approach is to never use mutable values as default arguments. For example, the class above could have been written like this:

class Container :
    def __init__ (self, initial_arr=None) :
        if initial_arr is None : initial_arr = []
        self.content = initial_arr
.........................................

With this simple update, the class behaves "as intended": every call without an argument gets its own new list. (Whether sharing a list passed in explicitly is good behavior to begin with is another question entirely.)
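
On that other question, one possible variant is sketched below. It is an assumption for illustration, not the code I actually used: it takes an immutable empty tuple as the default and always copies the argument. The name CopyingContainer is invented for this example, and the sketch targets Python 3.

```python
class CopyingContainer:
    def __init__(self, initial_arr=()):
        # A tuple default cannot be mutated, and list() always builds
        # a new list, so no two instances share storage.
        self.content = list(initial_arr)

    def append(self, elm):
        self.content.append(elm)

source = ["one", "two", "three"]
a = CopyingContainer(source)
b = CopyingContainer()
c = CopyingContainer()

a.append("four")
b.append("four")

print(source)      # ['one', 'two', 'three'] (caller's list untouched)
print(a.content)   # ['one', 'two', 'three', 'four']
print(c.content)   # []
```

The price is one extra copy per construction, which may or may not matter depending on how large the initial lists are.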

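Update (14-Sep-09). One more manifestation of the same problem is initializing class members to mutable values outside of the __init__() method. Such a member is a class attribute: it is created once and shared by all instances, unlike an attribute assigned to self inside __init__(). For example, this code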

class Foo :
    m_x = []              # DANGEROUS!!!
    def __init__ (self) :
        self.m_y = []     # This is much better!
    def append(self,obj) :
        self.m_x.append (obj)
        self.m_y.append (obj)
    def __repr__ (self) :
        return "Foo(m_x=%r,m_y=%r)" % (self.m_x,self.m_y)

v = Foo ()
v.append ("A")
w = Foo ()
w.append ("B")
print("v = %r, w = %r" % (v, w))

generates the following output:

v = Foo(m_x=['A', 'B'],m_y=['A']), w = Foo(m_x=['A', 'B'],m_y=['B'])
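
The reason is that m_x lives on the class object, and attribute lookup on an instance falls back to the class when the instance has no attribute of that name. Here is a minimal Python 3 sketch of that lookup (v and w mirror the example above; the "private" value is just a placeholder):

```python
class Foo:
    m_x = []                 # one list, stored on the class

    def __init__(self):
        self.m_y = []        # a new list per instance

v = Foo()
w = Foo()

print(v.m_x is w.m_x is Foo.m_x)   # True: all three are the same list
print(v.m_y is w.m_y)              # False: separate lists
print("m_x" in vars(v))            # False: the instance has no m_x of its own

# Rebinding (as opposed to mutating) creates an instance attribute
# that hides the class attribute for this instance only.
v.m_x = ["private"]
print("m_x" in vars(v))            # True
print(Foo.m_x)                     # [] (class attribute unchanged)
```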
