Java - Serialization

// 3 requirements to implement a serializable class:
1.  The class must implement the java.io.Serializable interface.
2.  Every non-static, non-transient field in the class must be serializable.
    If a field is not serializable, it must be marked transient.
3.  Objects are written with ObjectOutputStream.writeObject and read back
    with ObjectInputStream.readObject.

// The writeReplace method lets a class nominate a different object (such as
// a lightweight proxy) to be written to the stream in its place, and the
// readResolve method lets the deserialized object hand back a replacement
// before it is returned to the caller.

// The ObjectInputValidation interface is used to validate the result of
// deserialization. Our class has to implement this interface, provide a
// concrete implementation of the validateObject method, and register itself
// by calling ObjectInputStream.registerValidation from its readObject method.
// If something looks wrong, validateObject throws an InvalidObjectException.

// If we need to encrypt and authenticate the serialized data, we can seal and
// sign the entire object (rather than individual fields). The simplest thing
// is to put it in a javax.crypto.SealedObject and/or
// java.security.SignedObject wrapper. Both are serializable, so wrapping your
// object in SealedObject creates a sort of "gift box" around the original
// object. You need a symmetric key to do the encryption, and the key must be
// managed independently. Likewise, you can use SignedObject for data
// verification; it signs with a private key and verifies with the matching
// public key, and that key pair must also be managed independently.

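public class Employee implements java.io.Serializable {
   // Explicit version id, so recompiling does not change stream compatibility
   private static final long serialVersionUID = 1L;

   public String name;
   public String address;
   public transient int SSN;   // transient: never written to the stream
   public int number;

   public void mailCheck() {
      System.out.println("Mailing a check to " + name + " " + address);
   }
}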

// Serializing an Object
import java.io.*;
public class SerializeDemo {

   public static void main(String [] args) {
      Employee e = new Employee();
      e.name = "Reyan Ali";
      e.address = "Phokka Kuan, Ambehta Peer";
      e.SSN = 11122333;
      e.number = 101;

      // try-with-resources closes both streams even if writeObject throws
      try (FileOutputStream fileOut = new FileOutputStream("/tmp/employee.ser");
           ObjectOutputStream out = new ObjectOutputStream(fileOut)) {
         out.writeObject(e);
         System.out.println("Serialized data is saved in /tmp/employee.ser");
      } catch (IOException i) {
         i.printStackTrace();
      }
   }
}

// Deserializing an Object
import java.io.*;
public class DeserializeDemo {

   public static void main(String [] args) {
      Employee e = null;
      // try-with-resources closes both streams even if readObject throws
      try (FileInputStream fileIn = new FileInputStream("/tmp/employee.ser");
           ObjectInputStream in = new ObjectInputStream(fileIn)) {
         e = (Employee) in.readObject();
      } catch (IOException i) {
         i.printStackTrace();
         return;
      } catch (ClassNotFoundException c) {
         System.out.println("Employee class not found");
         c.printStackTrace();
         return;
      }

      System.out.println("Deserialized Employee...");
      System.out.println("Name: " + e.name);
      System.out.println("Address: " + e.address);
      System.out.println("SSN: " + e.SSN);   // prints 0: SSN is transient, so it was not saved
      System.out.println("Number: " + e.number);
   }
}

What are the 3 requirements to implement a serializable class?

  1. The class must implement the java.io.Serializable interface.
  2. Every non-static, non-transient field in the class must be serializable. If a field is not serializable, it must be marked transient.
  3. Objects are written with ObjectOutputStream.writeObject and read back with ObjectInputStream.readObject.

How does serialization work for an IS-A relationship?

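If a class implements Serializable, all of its subclasses are serializable as well, because a subclass inherits the interfaces of its parent. The reverse does not hold: when a serializable class extends a non-serializable parent, the parent's fields are not written to the stream, and the parent must have an accessible no-argument constructor, which is run to initialize those fields during deserialization.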
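
As a minimal sketch of both directions of the IS-A rule, the hypothetical classes below (Animal, Pet, Puppy and InheritanceDemo are invented for illustration) round-trip a subclass instance through an in-memory stream. Puppy is serializable only because Pet is, while the field declared in the non-serializable Animal is reset by Animal's no-argument constructor.

```java
import java.io.*;

// Hypothetical classes, for illustration only.
class Animal {                              // NOT Serializable
    String species = "unknown";             // re-initialized on deserialization
    Animal() { }                            // accessible no-arg constructor is required
    Animal(String species) { this.species = species; }
}

class Pet extends Animal implements Serializable {
    private static final long serialVersionUID = 1L;
    String name;
    Pet(String species, String name) { super(species); this.name = name; }
}

class Puppy extends Pet {                   // serializable by inheritance
    private static final long serialVersionUID = 1L;
    int ageInWeeks;
    Puppy(String name, int ageInWeeks) { super("dog", name); this.ageInWeeks = ageInWeeks; }
}

class InheritanceDemo {
    // Serializes the object to a byte array and reads it straight back.
    static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Puppy copy = (Puppy) roundTrip(new Puppy("Rex", 12));
        // name and ageInWeeks survive; species comes from Animal() instead of the stream
        System.out.println(copy.name + " " + copy.ageInWeeks + " " + copy.species);
        // prints: Rex 12 unknown
    }
}
```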

How does serialization work for a HAS-A relationship?

If a class holds a reference to an object of another class, that object must be serializable too, otherwise serialization fails: a NotSerializableException is thrown at runtime. A reference that cannot or should not be serialized has to be marked transient.
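
The following sketch shows the failure and the transient workaround. Engine, Car, SafeCar and HasADemo are hypothetical names made up for this example; the helper simply reports which class, if any, stopped the serialization.

```java
import java.io.*;

class Engine {                               // hypothetical; NOT Serializable
    int horsepower = 150;
}

class Car implements Serializable {
    private static final long serialVersionUID = 1L;
    String model = "Roadster";
    Engine engine = new Engine();            // non-serializable reference: writeObject fails
}

class SafeCar implements Serializable {
    private static final long serialVersionUID = 1L;
    String model = "Roadster";
    transient Engine engine = new Engine();  // skipped by serialization
}

class HasADemo {
    // Returns null on success, or the exception message (which names the
    // class that could not be serialized) on failure.
    static String trySerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return null;
        } catch (NotSerializableException e) {
            return e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Car failed on: " + trySerialize(new Car()));
        System.out.println("SafeCar failed on: " + trySerialize(new SafeCar()));
    }
}
```

After deserialization the transient engine field of SafeCar is null, so the class would have to rebuild it itself if it is needed.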

How does serialization handle static members?

Static data members are not serialized, because a static field belongs to the class rather than to any one object.
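
One way to see this, sketched with a hypothetical Counter class: change the static field between writing and reading. The deserialized object gets its instance field back, but the static field keeps whatever value the class currently has.

```java
import java.io.*;

class Counter implements Serializable {      // hypothetical class for illustration
    private static final long serialVersionUID = 1L;
    static int instances = 0;                // class-level state, never written to the stream
    int id;
    Counter(int id) { this.id = id; }
}

class StaticDemo {
    static byte[] toBytes(Object o) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Object fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Counter.instances = 5;
        byte[] data = toBytes(new Counter(7));
        Counter.instances = 99;              // changed after the object was serialized
        Counter copy = (Counter) fromBytes(data);
        System.out.println(copy.id + " " + Counter.instances);
        // prints: 7 99
    }
}
```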

How does serialization work for arrays and collections?

In the case of an array or a collection, every element must be serializable. If any element is not serializable, serialization fails with a NotSerializableException.
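
A small sketch of that rule, using a hypothetical CollectionDemo helper: a list of strings and numbers serializes, while the same list (or an array) containing a plain java.lang.Object does not.

```java
import java.io.*;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CollectionDemo {
    // True if the whole object graph could be written, false if some
    // element was not serializable.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<Object> good = new ArrayList<>(Arrays.<Object>asList("a", 1, 2.5));
        List<Object> bad = new ArrayList<>(good);
        bad.add(new Object());                      // java.lang.Object is not Serializable
        Object[] badArray = { "a", new Object() };

        System.out.println(canSerialize(good));     // true
        System.out.println(canSerialize(bad));      // false
        System.out.println(canSerialize(badArray)); // false
    }
}
```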

What is the difference between the Serializable interface and the Externalizable interface?

By implementing java.io.Serializable, you get "automatic" serialization capability for objects of your class. There is no need to implement any other logic; it'll just work. The Java runtime uses reflection to figure out how to marshal and unmarshal your objects. In earlier versions of Java, reflection was very slow, so serializing large object graphs (e.g. in client-server RMI applications) was a bit of a performance problem. To handle this situation, the java.io.Externalizable interface was provided, which is like java.io.Serializable but with custom-written mechanisms to perform the marshalling and unmarshalling (you need to implement the readExternal and writeExternal methods on your class, and the class needs a public no-argument constructor). This gives you the means to get around the reflection performance bottleneck.

In recent versions of Java (1.3 onwards, certainly) the performance of reflection is vastly better than it used to be, so this is much less of a problem. I suspect you'd be hard-pressed to get a meaningful benefit from Externalizable with a modern JVM. Also, the built-in Java serialization mechanism isn't the only one: you can get third-party replacements, such as JBoss Serialization, which is described as considerably quicker and as a drop-in replacement for the default.

A big downside of Externalizable is that you have to maintain this logic yourself - if you add, remove or change a field in your class, you have to change your writeExternal/readExternal methods to account for it.
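
To make the maintenance cost concrete, here is a minimal sketch of an Externalizable class. Account and ExternalizableDemo are hypothetical names; the point is that every field has to be written and read by hand, in the same order, and that a public no-argument constructor is required.

```java
import java.io.*;

class Account implements Externalizable {    // hypothetical class for illustration
    private String owner;
    private long balanceCents;

    public Account() { }                     // required: called before readExternal

    Account(String owner, long balanceCents) {
        this.owner = owner;
        this.balanceCents = balanceCents;
    }

    String getOwner() { return owner; }
    long getBalanceCents() { return balanceCents; }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeUTF(owner);                 // adding a field means editing this method...
        out.writeLong(balanceCents);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
        owner = in.readUTF();                // ...and this one, in the same order
        balanceCents = in.readLong();
    }
}

class ExternalizableDemo {
    static Account roundTrip(Account a) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(a);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Account) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Account copy = roundTrip(new Account("Ada", 12345L));
        System.out.println(copy.getOwner() + " " + copy.getBalanceCents());
    }
}
```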

Why do we need serializable objects?

Our objects should be serializable if we need to store them in a file on disk, or if we need to transfer them over the network, for example as part of a remote procedure call.

Is JSON an acceptable alternative for serialization?

Perhaps JSON can be an acceptable alternative to Java serialization, but it takes some work: we have to map our objects to and from JSON ourselves (for example by extending the JSONObject class), or come up with another way to serialize them, instead of relying on readObject and writeObject.

Is using the database an acceptable alternative for serialization?

Perhaps using the database can be an acceptable alternative for serialization. If needed, we can query the database, convert each row into a JSONObject and send the JSONObject across the network. Inside the database, each field would be stored in a separate column. Perhaps we use a JSON database such as MongoDB.

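What kinds of class changes can Java Object Serialization handle?

As long as the serialVersionUID stays the same, serialization copes with the following changes to a class between writing and reading a stream: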

  1. Adding new fields to a class
  2. Changing the fields from static to nonstatic
  3. Changing the fields from transient to nontransient

Going the other way (from nonstatic to static or nontransient to transient) or deleting fields requires additional massaging, depending on the degree of backward compatibility you require.

Is Java Serialization secure?

Serialization is not secure by default. The binary format is fully documented and entirely reversible. In fact, just dumping the contents of the serialized stream to the console is enough to figure out what the class looks like and what it contains. When making remote method calls via RMI, for example, any private fields in the objects being sent across the wire appear in the socket stream as almost plain text, which clearly violates even the simplest security concerns.

Fortunately, serialization gives us the ability to "hook" the process and secure (or obscure) the field data before serialization and restore it after deserialization. We can do this by providing a writeObject method and a matching readObject method on a Serializable object. Alternatively, when transferring sensitive data over the network, we should use a secure channel such as SSL/TLS or a secured VPN.

If we are concerned with the security of the serialized data, another option may be to avoid sending the object at all and use web services instead: the data stays in the database, which should be secured, and only an ID is passed around once the user is authenticated.

A lady never reveals her age and a gentleman never tells. We can obscure this data by rotating the bits once to the left before serialization, and then rotating them back after deserialization. To "hook" the serialization process, we'll implement a writeObject method on Person; and to "hook" the deserialization process, we'll implement a readObject method on the same class. It's important to get the details right on both of these: if the access modifier, parameters, or name are at all different from what's shown in the listing below, the hooks are silently ignored, default serialization is used instead, and our Person's age will be visible to anyone who looks.

public class Person
    implements java.io.Serializable
{
    public Person(String fn, String ln, int a)
    {
        this.firstName = fn; this.lastName = ln; this.age = a;
    }

    public String getFirstName() { return firstName; }
    public String getLastName() { return lastName; }
    public int getAge() { return age; }
    public Person getSpouse() { return spouse; }

    public void setFirstName(String value) { firstName = value; }
    public void setLastName(String value) { lastName = value; }
    public void setAge(int value) { age = value; }
    public void setSpouse(Person value) { spouse = value; }

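    private void writeObject(java.io.ObjectOutputStream stream)
        throws java.io.IOException
    {
        // "Encrypt"/obscure the sensitive data: rotate the bits once to the left.
        // A rotation (unlike a plain shift) loses no bits, so it is reversible.
        int realAge = age;
        age = Integer.rotateLeft(age, 1);
        try {
            stream.defaultWriteObject();
        } finally {
            age = realAge;   // leave the in-memory object unchanged
        }
    }

    private void readObject(java.io.ObjectInputStream stream)
        throws java.io.IOException, ClassNotFoundException
    {
        stream.defaultReadObject();

        // "Decrypt"/de-obscure the sensitive data: rotate the bits back
        age = Integer.rotateRight(age, 1);
    }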

    public String toString()
    {
        return "[Person: firstName=" + firstName + 
            " lastName=" + lastName +
            " age=" + age +
            " spouse=" + (spouse!=null ? spouse.getFirstName() : "[null]") +
            "]";
    }      

    private String firstName;
    private String lastName;
    private int age;
    private Person spouse;
}

How can we sign and seal the serialized data?

If we need to encrypt and authenticate the serialized data, we can seal and sign the entire object (rather than individual fields). The simplest thing is to put it in a javax.crypto.SealedObject and/or java.security.SignedObject wrapper. Both are serializable, so wrapping your object in SealedObject creates a sort of "gift box" around the original object. You need a symmetric key to do the encryption, and the key must be managed independently. Likewise, you can use SignedObject for data verification; it signs with a private key and verifies with the matching public key, and that key pair must also be managed independently.

Together, these two objects let you seal and sign serialized data without having to stress about the details of digital signature verification or encryption.
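
A minimal sketch of both wrappers follows. The class name SealAndSignDemo, the choice of AES in CBC mode and of SHA256withRSA, and the keys generated on the spot are all assumptions made for the example; in a real system the keys would come from wherever they are managed.

```java
import java.io.IOException;
import java.io.Serializable;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;
import java.security.SignedObject;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SealedObject;
import javax.crypto.SecretKey;

class SealAndSignDemo {
    // Encrypts the payload inside a SealedObject, then recovers it with the same key.
    static Object sealAndUnseal(Serializable payload) {
        try {
            SecretKey key = KeyGenerator.getInstance("AES").generateKey();
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, key);        // a random IV is generated here
            SealedObject sealed = new SealedObject(payload, cipher);
            // The SealedObject is itself Serializable and could be written to a stream.
            return sealed.getObject(key);                 // decrypts and deserializes
        } catch (GeneralSecurityException | IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Signs the payload with a private key and verifies it with the public key.
    static boolean signAndVerify(Serializable payload) {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            KeyPair pair = generator.generateKeyPair();
            SignedObject signed = new SignedObject(
                payload, pair.getPrivate(), Signature.getInstance("SHA256withRSA"));
            return signed.verify(pair.getPublic(), Signature.getInstance("SHA256withRSA"));
        } catch (GeneralSecurityException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sealAndUnseal("secret"));
        System.out.println(signAndVerify("payload"));
    }
}
```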

What is the purpose of the writeReplace and readResolve methods in the serialization process?

From time to time, a class contains a core element of data from which the rest of the class's fields can be derived or retrieved. In those cases, serializing the entirety of the object is unnecessary. You could mark the fields transient, but the class would still have to explicitly produce code to check whether a field was initialized every time a method accessed it.

Given the principal concern is serialization, it's better to nominate a flyweight or proxy to go into the stream instead. Providing a writeReplace method on the original Person allows a different kind of object to be serialized in its place; similarly, if a readResolve method is found during deserialization, it is called to supply a replacement object back to the caller.

Together, the writeReplace and readResolve methods enable a Person class to pack a PersonProxy with all of its data (or some core subset of it), put it into a stream and then unwind the packing later when it is deserialized.

class PersonProxy
    implements java.io.Serializable
{
    public PersonProxy(Person orig)
    {
        data = orig.getFirstName() + "," + orig.getLastName() + "," + orig.getAge();
        if (orig.getSpouse() != null)
        {
            Person spouse = orig.getSpouse();
            data = data + "," + spouse.getFirstName() + "," + spouse.getLastName() + ","  
              + spouse.getAge();
        }
    }

    public String data;
    private Object readResolve()
        throws java.io.ObjectStreamException
    {
        String[] pieces = data.split(",");
        Person result = new Person(pieces[0], pieces[1], Integer.parseInt(pieces[2]));
        if (pieces.length > 3)
        {
            result.setSpouse(new Person(pieces[3], pieces[4], Integer.parseInt
              (pieces[5])));
            result.getSpouse().setSpouse(result);
        }
        return result;
    }
}

public class Person
    implements java.io.Serializable
{
    public Person(String fn, String ln, int a)
    {
        this.firstName = fn; this.lastName = ln; this.age = a;
    }

    public String getFirstName() { return firstName; }
    public String getLastName() { return lastName; }
    public int getAge() { return age; }
    public Person getSpouse() { return spouse; }

    private Object writeReplace()
        throws java.io.ObjectStreamException
    {
        return new PersonProxy(this);
    }

    public void setFirstName(String value) { firstName = value; }
    public void setLastName(String value) { lastName = value; }
    public void setAge(int value) { age = value; }
    public void setSpouse(Person value) { spouse = value; }   

    public String toString()
    {
        // Guard against a null spouse to avoid a NullPointerException
        return "[Person: firstName=" + firstName + 
            " lastName=" + lastName +
            " age=" + age +
            " spouse=" + (spouse!=null ? spouse.getFirstName() : "[null]") +
            "]";
    }    

    private String firstName;
    private String lastName;
    private int age;
    private Person spouse;
}

Note that the PersonProxy has to track all of Person's data. Often this means the proxy will need to be an inner class of Person to have access to private fields. The Proxy will also sometimes need to track down other object references and serialize them manually, such as Person's spouse. This trick is one of the few that isn't required to be read/write balanced. For instance, a version of a class that's been refactored into a different type could provide a readResolve method to silently transition a serialized object over to a new type. Similarly, it could employ the writeReplace method to take old classes and serialize them into new versions.
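
The inner-class variant mentioned above might look like the following sketch. Money, its nested SerializedForm class and ProxyDemo are hypothetical names; because the proxy is nested, it can read the private fields directly and no getters are needed.

```java
import java.io.*;

final class Money implements Serializable {           // hypothetical class for illustration
    private static final long serialVersionUID = 1L;
    private final String currency;
    private final long cents;

    Money(String currency, long cents) {
        this.currency = currency;
        this.cents = cents;
    }

    String getCurrency() { return currency; }
    long getCents() { return cents; }

    // Called by ObjectOutputStream: the proxy goes into the stream instead of this object.
    private Object writeReplace() {
        return new SerializedForm(this);
    }

    // Nested proxy: it has access to Money's private fields.
    private static final class SerializedForm implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String currency;
        private final long cents;

        SerializedForm(Money m) {
            this.currency = m.currency;
            this.cents = m.cents;
        }

        // Called by ObjectInputStream: the caller receives a real Money again.
        private Object readResolve() {
            return new Money(currency, cents);
        }
    }
}

class ProxyDemo {
    static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Money copy = (Money) roundTrip(new Money("EUR", 1999));
        System.out.println(copy.getCurrency() + " " + copy.getCents());
    }
}
```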

What is the purpose of the ObjectInputValidation interface?

It would be nice to assume that the data in the serialized stream is always the same data that was written to the stream originally. But, as a former President of the United States once pointed out, it's a safer policy to "trust, but verify."

In the case of serialized objects, this means validating the fields to ensure that they hold legitimate values after deserialization, "just in case." We can do this by implementing the ObjectInputValidation interface and its validateObject() method, and by registering the object with ObjectInputStream.registerValidation from inside its readObject method. If something looks amiss when validateObject() is called, we throw an InvalidObjectException.
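
As a sketch of how the pieces fit together, the hypothetical Guest class below registers itself for validation and rejects implausible ages. Its constructor deliberately performs no checks, so the demo can write a bad object and watch deserialization refuse it; the age limits are arbitrary values chosen for the example.

```java
import java.io.*;

class Guest implements Serializable, ObjectInputValidation {   // hypothetical class
    private static final long serialVersionUID = 1L;
    String name;
    int age;

    Guest(String name, int age) {            // no checks here, on purpose, for the demo
        this.name = name;
        this.age = age;
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.registerValidation(this, 0);      // only legal while readObject is running
        in.defaultReadObject();
    }

    // Called once the whole object graph has been read.
    @Override
    public void validateObject() throws InvalidObjectException {
        if (name == null || age < 0 || age > 150) {
            throw new InvalidObjectException("implausible Guest: " + name + ", " + age);
        }
    }
}

class ValidationDemo {
    // True if the object deserializes cleanly, false if validation rejected it.
    static boolean survivesRoundTrip(Guest g) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(g);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                in.readObject();
                return true;
            }
        } catch (InvalidObjectException e) {
            return false;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(survivesRoundTrip(new Guest("Ann", 30)));   // true
        System.out.println(survivesRoundTrip(new Guest("Bob", -5)));   // false
    }
}
```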
