boost.png (6897 bytes)

Boost.Python Pickle Support

Pickle is a Python module for object serialization, also known as persistence, marshalling, or flattening.

It is often necessary to save and restore the contents of an object to a file. One approach to this problem is to write a pair of functions that read and write data from a file in a special format. A powerful alternative approach is to use Python's pickle module. Exploiting Python's ability for introspection, the pickle module recursively converts nearly arbitrary Python objects into a stream of bytes that can be written to a file.

The Boost Python Library supports the pickle module through the interface as described in detail in the Python Library Reference for pickle. This interface involves the special methods __getinitargs__, __getstate__ and __setstate__ as described in the following. Note that Boost.Python is also fully compatible with Python's cPickle module.


The Boost.Python Pickle Interface

At the user level, the Boost.Python pickle interface involves three special methods:
__getinitargs__
When an instance of a Boost.Python extension class is pickled, the pickler tests if the instance has a __getinitargs__ method. This method must return a Python tuple (it is most convenient to use a boost::python::tuple). When the instance is restored by the unpickler, the contents of this tuple are used as the arguments for the class constructor.

If __getinitargs__ is not defined, pickle.load will call the constructor (__init__) without arguments; i.e., the object must be default-constructible.

__getstate__
When an instance of a Boost.Python extension class is pickled, the pickler tests if the instance has a __getstate__ method. This method should return a Python object representing the state of the instance.
__setstate__
When an instance of a Boost.Python extension class is restored by the unpickler (pickle.load), it is first constructed using the result of __getinitargs__ as arguments (see above). Subsequently the unpickler tests if the new instance has a __setstate__ method. If so, this method is called with the result of __getstate__ (a Python object) as the argument.
The three special methods described above may be .def()'ed individually by the user. However, Boost.Python provides an easy to use high-level interface via the boost::python::pickle_suite class that also enforces consistency: __getstate__ and __setstate__ must be defined as pairs. Use of this interface is demonstrated by the following examples.

Examples

There are three files in boost/libs/python/test that show how to provide pickle support.

pickle1.cpp

The C++ class in this example can be fully restored by passing the appropriate argument to the constructor. Therefore it is sufficient to define the pickle interface method __getinitargs__. This is done in the following way:

pickle2.cpp

The C++ class in this example contains member data that cannot be restored by any of the constructors. Therefore it is necessary to provide the __getstate__/__setstate__ pair of pickle interface methods:

For simplicity, the __dict__ is not included in the result of __getstate__. This is not generally recommended, but a valid approach if it is anticipated that the object's __dict__ will always be empty. Note that the safety guard described below will catch the cases where this assumption is violated.


pickle3.cpp

This example is similar to pickle2.cpp. However, the object's __dict__ is included in the result of __getstate__. This requires a little more code but is unavoidable if the object's __dict__ is not always empty.

Pitfall and Safety Guard

The pickle protocol described above has an important pitfall that the end user of a Boost.Python extension module might not be aware of:

__getstate__ is defined and the instance's __dict__ is not empty.

The author of a Boost.Python extension class might provide a __getstate__ method without considering the possibilities that:

To alert the user to this highly unobvious problem, a safety guard is provided. If __getstate__ is defined and the instance's __dict__ is not empty, Boost.Python tests if the class has an attribute __getstate_manages_dict__. An exception is raised if this attribute is not defined:

    RuntimeError: Incomplete pickle support (__getstate_manages_dict__ not set)
To resolve this problem, it should first be established that the __getstate__ and __setstate__ methods manage the instances's __dict__ correctly. Note that this can be done either at the C++ or the Python level. Finally, the safety guard should intentionally be overridden. E.g. in C++ (from pickle3.cpp):
  struct world_pickle_suite : boost::python::pickle_suite
  {
    // ...

    static bool getstate_manages_dict() { return true; }
  };
Alternatively in Python:
    import your_bpl_module
    class your_class(your_bpl_module.your_class):
      __getstate_manages_dict__ = 1
      def __getstate__(self):
        # your code here
      def __setstate__(self, state):
        # your code here

Practical Advice


Light-weight alternative: pickle support implemented in Python

pickle4.cpp

The pickle4.cpp example demonstrates an alternative technique for implementing pickle support. First we direct Boost.Python via the class_::enable_pickling() member function to define only the basic attributes required for pickling:
  class_<world>("world", args<const std::string&>())
      // ...
      .enable_pickling()
      // ...
This enables the standard Python pickle interface as described in the Python documentation. By "injecting" a __getinitargs__ method into the definition of the wrapped class we make all instances pickleable:
  # import the wrapped world class
  from pickle4_ext import world

  # definition of __getinitargs__
  def world_getinitargs(self):
    return (self.get_country(),)

  # now inject __getinitargs__ (Python is a dynamic language!)
  world.__getinitargs__ = world_getinitargs
See also the tutorial section on injecting additional methods from Python.
© Copyright Ralf W. Grosse-Kunstleve 2001-2004. Distributed under the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)

Updated: Feb 2004.