Object Serialization

1. Serialization Overview

Ambrai Smalltalk objects can be serialized using ObjectSerializingStream and deserialized using ObjectDeserializingStream. Both classes use the regular streams protocol to write objects as a sequence of bytes to a byte stream and read sequences of bytes back into objects.

2. Serializing Objects

To serialize objects, use an ObjectSerializingStream on any kind of byte stream. For example, to serialize the string 'Hello world!' to a byte array, do the following:

| targetStream |
targetStream := ObjectSerializingStream on: (
    ByteStream on: ByteArray new writeStream).
targetStream nextPut: 'Hello world!'.

To serialize the same string to a file, the serializing stream has to be initialized with a file stream. This file stream must be explicitly closed after the string is written:

| file targetStream |
file := Directory desktop createFileIfDoesNotExist: 'test.dat'.
targetStream := ObjectSerializingStream on: (
    ByteStream on: (
        file openForWriting beBinary truncate)).
targetStream nextPut: 'Hello world!'.
targetStream close.

You can also use the ObjectSerializingStream class >> toFile: helper method that takes a file object argument. In addition to creating the byte and file streams, this method also writes an identifying header and version number at the beginning of the file and sets the file creator to Ambrai. Ambrai Smalltalk recognizes and can open files with the extension .stobjects that were created this way.
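
As a minimal sketch, the file example above could be written with the helper method as follows. It assumes #toFile: is called the same way as in the versioning examples in section 4.9; the file name test.stobjects is only illustrative.

| targetStream |
"toFile: creates the file and byte streams and writes the header."
targetStream := ObjectSerializingStream toFile: (
    Directory desktop createFileIfDoesNotExist: 'test.stobjects').
targetStream nextPut: 'Hello world!'.
targetStream close.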

3. Deserializing Objects

To deserialize objects, use an ObjectDeserializingStream on any kind of byte stream. For example, to deserialize the string 'Hello world!' from a byte array, do the following:

| targetStream sourceStream |

"Writing"
targetStream := ObjectSerializingStream on: (
    ByteStream on: ByteArray new writeStream).
targetStream nextPut: 'Hello world!'.

"Reading"
sourceStream := ObjectDeserializingStream on: (
    ByteStream on: targetStream byteStream contents readStream).
sourceStream next.

To deserialize the same string from a file, the deserializing stream has to be initialized with a file stream. As before, this file stream must be explicitly closed, this time after the string is read:

| file sourceStream |
file := Directory desktop fileNamed: 'test.dat'.
sourceStream := ObjectDeserializingStream on: (
    ByteStream on: (
        file openForReading beBinary)).
sourceStream next.
sourceStream close.

You can also use the ObjectDeserializingStream class >> fromFile: helper method that takes a file object argument. In addition to creating the byte and file streams, this method also reads and verifies the identifying header and version number at the beginning of the file.
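
A minimal sketch of reading with the helper method, assuming the file was written with #toFile: and that #fromFile: is called the same way as in the handler example in section 4.10.1. The file name test.stobjects is only illustrative.

| sourceStream string |
"fromFile: creates the streams and checks the header and version number."
sourceStream := ObjectDeserializingStream fromFile: (
    Directory desktop fileNamed: 'test.stobjects').
string := sourceStream next.
sourceStream close.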

4. Customizing Serialization

You can change how an object serializes itself by overriding the #serializeOn:version: instance method, and the #deserializeFrom:version: class method.

4.1. When to Customize Serialization

The Object >> serializeOn:version: and Object class >> deserializeFrom:version: methods will serialize any object by default. These methods preserve the state of all instance variables by serializing and deserializing them in turn. These methods are adequate for most objects, but there are some objects that require special handling.

You should override the #serializeOn:version: and #deserializeFrom:version: methods under the following circumstances:

  • Your object relies on the identity hash of other objects.

    The identity hash of objects is not preserved. The identity hash of the object that was serialized will be different from the identity hash of the recreated object. Any objects that rely on the identity hash of other objects, such as Sets and Dictionaries, must be rehashed after deserialization.

  • Your object is a singleton or flyweight.

    Singletons and flyweights should not return new instances when deserialized. Instead, they should return the appropriate existing instances of the class.

    For singletons, the #serializeOn:version: method does not need to store anything, and the #deserializeFrom:version: method should return the existing sole instance of the class.

    For flyweights, the #serializeOn:version: method should store only what is needed to identify a single flyweight instance, and the #deserializeFrom:version: method should return the appropriate flyweight instance of the class.

  • Your object contains native resources.

    Objects that contain native resources, such as file handles, Objective-C objects, and library handles, either need special handling or should be forbidden from being serialized.

    In some cases, there may be a way to serialize the native resource and recreate it when deserializing. For example, an open file handle could be serialized by storing the path to the file and the access type (read, write, read-write). The file handle could be recreated when deserialized by opening the same path with the same access type.

    In other cases it may be difficult or impossible to serialize a native resource. You may not have access to the entire internal representation of the resource. An example of such a resource is a window handle. It may be best to disallow serialization of such objects. An alternative to serializing the low level resource would be to serialize a higher level object that wraps the resource instead, and that is able to recreate the resource.

  • Your object is weak or finalizable.

    Some objects have special state, such as having weak references or being finalizable, in addition to their instance variables. By default, this state is not saved during serialization. If you would like to preserve this state, you must write your own serializing and deserializing methods for the class.
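
The singleton case described above could be sketched as follows. MySingleton and its class-side accessor #default are hypothetical names used only for illustration. The sketch also assumes that #registerObject: answers its argument, as the String example in section 4.7 suggests.

MySingleton >> serializeOn: anObjectSerializingStream version: anInteger
    "Write only the header; no instance state needs to be stored."

    anObjectSerializingStream writeObjectHeader: self version: anInteger.
    anObjectSerializingStream registerObject: self

MySingleton class >> deserializeFrom: anObjectDeserializingStream version: anInteger
    "Answer the existing sole instance instead of creating a new one.
    #default is a hypothetical accessor for that instance."

    ^anObjectDeserializingStream registerObject: self default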

4.2. Forbidding Serialization

To forbid serialization, override the #serializeOn:version: method to throw an exception.

serializeOn: anObjectSerializingStream version: anInteger

    ^self shouldNotSerialize

4.3. Writing #serializeOn:version: Methods

The #serializeOn:version: method should write the state of the receiver as a sequence of bytes to an ObjectSerializingStream argument. The #deserializeFrom:version: method should read back the same sequence of bytes from an ObjectDeserializingStream to reconstruct and return an equivalent object.

The #serializeOn:version: method for any class must follow the template:

  • The first step is to write a header for the object. The header identifies the class of the object being serialized.
  • The second step is to write any information that is needed before the object can be instantiated during deserialization. Usually, this step is empty for a fixed-size object. For variable-sized objects, you normally write the size.

    Note that none of the objects written here can reference the receiver, either directly or indirectly. Keep the information written at this step to a minimum.

  • The third step is to register the object with the stream.
  • The fourth step is to write the contents of the object.

serializeOn: anObjectSerializingStream version: anInteger

    "step 1: write a header for the object that will indicate its class"
    anObjectSerializingStream writeObjectHeader: self version: anInteger.

    "step 2: write any information that is required before the
    object can be instantiated during deserialization"
    anObjectSerializingStream nextPut: self size.
    anObjectSerializingStream nextPut: ...

    "step 3: register the object to assign it an index"
    anObjectSerializingStream registerObject: self.

    "step 4: write the contents of the object"
    anObjectSerializingStream nextPut: ...

4.4. Writing #deserializeFrom:version: Methods

The #deserializeFrom:version: method must mirror the #serializeOn:version: method. Everything that is written to the stream in the #serializeOn:version: method must be read back in the same order in the #deserializeFrom:version: method.

The #deserializeFrom:version: method for any class must follow the template:

  • The first step is to read what was written in step 2 of the #serializeOn:version: method.
  • The second step is to instantiate a new instance of the object.
  • The third step is to register the newly instantiated object with the stream.
  • The fourth step is to read the contents of the object and return it.

deserializeFrom: anObjectDeserializingStream version: anInteger
    | size instance |

    "step 1: read information that is required before the object can be instantiated"
    size := anObjectDeserializingStream next.

    "step 2: instantiate a new object"
    instance := self new: size.

    "step 3: register the object to assign it an index"
    anObjectDeserializingStream registerObject: instance.

    "step 4: read the contents of the object"
    instance variable: anObjectDeserializingStream next.

    "return the reconstructed object"
    ^instance

4.5. Flattening and Unflattening Objects

In the #serializeOn:version: method you have the flexibility of representing your object in terms of other objects, and/or in textual format (character strings), and/or in binary format (bytes).

Usually, you will write other objects directly to the serializing stream using the #nextPut: method. However, you can also obtain a character stream to write characters and strings using the #nextPut: and #nextPutAll: methods. Binary data can be written to the byte stream as a sequence of bytes, as primitive data types such as signed/unsigned integers and floats of various sizes, or as structures. The following snippet of code illustrates several different ways you can flatten the state of your object.

anObjectSerializingStream nextPut: anyObject.
anObjectSerializingStream writeString: 'string of any length'.
anObjectSerializingStream characterStream nextPutAll: 'abcd'.
anObjectSerializingStream characterStream nextPutAll: 'textual representation.'.
anObjectSerializingStream byteStream nextPutAll: #(1 2 3 253 254 255).
anObjectSerializingStream byteStream nextUInt16Put: 12345.
anObjectSerializingStream byteStream nextFloat32Put: 1.5.

Note that if you are writing integers, you should consider writing them as objects instead of fixed-size binary integers. Integers written as objects are compacted to reduce their size, and their range of values is unrestricted.
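
A brief sketch of the two choices, using only the calls shown above. The value 100000 is an illustrative example of an integer that does not fit in an unsigned 16-bit field.

"Written as an object: compacted, with no range restriction."
anObjectSerializingStream nextPut: 100000.

"Written as a fixed-size binary integer: the value must fit in 16 bits."
anObjectSerializingStream byteStream nextUInt16Put: 12345.

"The matching reads in #deserializeFrom:version:, in the same order."
anObjectDeserializingStream next.
anObjectDeserializingStream byteStream nextUInt16.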

In the #deserializeFrom:version: method you can read the object in object format, in textual format (character strings), and in binary format (bytes). The following snippet of code shows how to unflatten the state of your object as it was flattened in the previous code snippet.

anObjectDeserializingStream next.
anObjectDeserializingStream readString.
anObjectDeserializingStream characterStream next: 4.
anObjectDeserializingStream characterStream upTo: $..
anObjectDeserializingStream byteStream next: 6.
anObjectDeserializingStream byteStream nextUInt16.
anObjectDeserializingStream byteStream nextFloat32.

4.6. Replacing Objects

To replace objects, simply override the #serializeOn:version: instance method and invoke the #nextPut:insteadOf: method on the serializing stream with the object you want to serialize instead of the receiver.

MyClass >> serializeOn: anObjectSerializingStream version: anInteger
    "Replace the receiver with anotherObject."

    | anotherObject |
    anotherObject := ....
    anObjectSerializingStream nextPut: anotherObject insteadOf: self

4.7. Changing Classes

To serialize an object as if it were an instance of another class, simply override the #serializeOn:version: instance method and invoke the #writeObjectHeader:class:version: method with the new class as argument.

Take the String class, for example. The String class itself is an abstract class that has three concrete subclasses: String8, String16 and String24. The three subclasses are somewhat private and are subject to change. Therefore, we would like to serialize all three subclasses in a neutral representation, as if they were instances of the String class.

String >> serializeOn: anObjectSerializingStream version: anInteger

    anObjectSerializingStream writeObjectHeader: self class: String version: anInteger.
    anObjectSerializingStream writeString: self.
    anObjectSerializingStream registerObject: self

The String class will deserialize the objects and return instances of the appropriate subclass.

String class >> deserializeFrom: anObjectDeserializingStream version: anInteger

    ^anObjectDeserializingStream registerObject: anObjectDeserializingStream readString

4.8. Renaming Classes

To serialize an object with a different class name, simply override the #serializeOn:version: instance method and invoke the #writeObjectHeader:class:className:version: method with the new class name as argument.

For example, to serialize instances of MyClass under the previous class name MYClass:

MyClass >> serializeOn: anObjectSerializingStream version: anInteger

    anObjectSerializingStream writeObjectHeader: self class: MyClass className: #MYClass version: anInteger.
    ...

4.9. Serializing and Deserializing Different Class Versions

Classes evolve over time and this affects their serialized representation. This becomes a problem when one version of a class serializes instances of itself and a different version of the same class attempts to deserialize them.

To deal with this issue, the serializer stores a version number for each class that is serialized. Both the #serializeOn:version: and #deserializeFrom:version: methods take a version integer as their second argument. A third method #serializedVersionAtDate: maps a given date to a particular version of a class.

The following example illustrates how different versions of a simple color class deal with serialization. The original RgbColor class that had three instance variables named red, green, and blue evolved to have a fourth instance variable added for the alpha value.

RgbColor >> serializedVersionAtDate: aDateAndTime

    aDateAndTime >= (DateAndTime year: 2006 month: 5 day: 14 timeZone: nil) ifTrue: [^1].
    ^0

RgbColor >> serializeOn: anObjectSerializingStream version: anInteger

    anInteger = 0 ifTrue: [^self serializeV0On: anObjectSerializingStream].
    ^super serializeOn: anObjectSerializingStream version: anInteger

RgbColor >> serializeV0On: anObjectSerializingStream

    anObjectSerializingStream writeObjectHeader: self version: 0.
    anObjectSerializingStream registerObject: self.
    anObjectSerializingStream nextPut: self red.
    anObjectSerializingStream nextPut: self green.
    anObjectSerializingStream nextPut: self blue.
    "note that we omit the alpha value"

RgbColor class >> deserializeFrom: anObjectDeserializingStream version: anInteger

    anInteger = 0 ifTrue: [^self deserializeV0From: anObjectDeserializingStream].
    ^super deserializeFrom: anObjectDeserializingStream version: anInteger

RgbColor class >> deserializeV0From: anObjectDeserializingStream

    | instance |
    instance := self new.
    anObjectDeserializingStream registerObject: instance.
    instance red: anObjectDeserializingStream next.
    instance green: anObjectDeserializingStream next.
    instance blue: anObjectDeserializingStream next.
    instance alpha: self maximumColorValue.
    ^instance

To serialize a color object using the current version of the RgbColor class:

| color targetStream |
color := RgbColor red: 16rFFFF green: 16r0000 blue: 16r0000 alpha: 16rFFFF.
targetStream := ObjectSerializingStream toFile: (
    Directory desktop createFileIfDoesNotExist: 'test.stobjects').
targetStream nextPut: color.
targetStream close.

To serialize the color object using the previous version of the RgbColor class, simply configure the serializing stream with a version date earlier than the cutoff date of the new version:

| color targetStream |
color := RgbColor red: 16rFFFF green: 16r0000 blue: 16r0000 alpha: 16rFFFF.
targetStream := ObjectSerializingStream toFile: (
    Directory desktop createFileIfDoesNotExist: 'test.stobjects').
targetStream versionDate: (DateAndTime year: 2006 month: 1 day: 1 timeZone: nil).
targetStream nextPut: color.
targetStream close.

Deserialization is handled automatically: the object is converted from the serialized version to the current version if necessary.
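
As a sketch, reading the color back looks the same regardless of which version was written. It assumes the file test.stobjects was produced by one of the two examples above.

| sourceStream color |
sourceStream := ObjectDeserializingStream fromFile: (
    Directory desktop fileNamed: 'test.stobjects').
"The stored version number selects the matching deserialization path."
color := sourceStream next.
sourceStream close.

With the #deserializeV0From: method shown earlier, a color stored as version 0 would come back with its alpha set to RgbColor maximumColorValue.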

4.10. Advanced Customizations

If you must serialize objects differently than the default way, you have several options to configure the serializing and deserializing streams.

4.10.1. Configuring with Handlers

Rather than implementing custom serialization and deserialization methods, you can configure the streams with handler blocks. These can be used to:

  • change the serialized version of a particular class
  • rename a class when deserialized
  • completely change how an object is serialized

Here is an example that removes the alpha channel from RgbColor objects:

| targetStream sourceStream |
targetStream := ObjectSerializingStream toFile: (
    Directory desktop createFileIfDoesNotExist: 'test.stobjects').
targetStream addHandlerForClass: RgbColor block: [:class :version :object :stream |
    stream writeObjectHeader: object version: 0.
    stream registerObject: object.
    stream nextPut: object red.
    stream nextPut: object green.
    stream nextPut: object blue].
targetStream nextPut: RgbColor black.
targetStream close.

sourceStream := ObjectDeserializingStream fromFile: (
    Directory desktop fileNamed: 'test.stobjects').
sourceStream addHandlerForClassNamed: #RgbColor block: [:className :version :stream |
    RgbColor
        red: stream next
        green: stream next
        blue: stream next].
sourceStream next.
sourceStream close.

Note that the handlers are not inherited by subclasses so you must explicitly add handlers to each subclass, if needed.
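
Renaming a class on deserialization might be sketched as follows. This is an illustration under assumptions, not a confirmed recipe: it reuses the MYClass and MyClass names from section 4.8, and it assumes the handler block may forward to the class-side #deserializeFrom:version: method with the version it receives.

| sourceStream |
sourceStream := ObjectDeserializingStream fromFile: (
    Directory desktop fileNamed: 'test.stobjects').
"Objects stored under the old name #MYClass are handed to MyClass."
sourceStream addHandlerForClassNamed: #MYClass block: [:className :version :stream |
    MyClass deserializeFrom: stream version: version].
sourceStream next.
sourceStream close.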

4.10.2. Configuring with Selectors

A serializing stream can be configured with two selectors: a serialize selector and a version selector. The serialize selector is the selector that is sent to an object to serialize itself on a serializing stream in a given version. By default, this is the #serializeOn:version: method selector. The version selector is the selector sent to an object to query the serialized version of a class on the specified date. By default, this is the #serializedVersionAtDate: method selector.

A deserializing stream can be configured with a deserialize selector. The deserialize selector is the selector sent to a class to deserialize an object from a deserializing stream in a given version. By default, this is the #deserializeFrom:version: method selector.

Last modified May 23, 2006.