Tomcat Cluster Session Replication

What are the different ways to implement session replication?

  • Persistence (storing session information in a database)
  • Using IP multi-casting
  • Without using IP multi-casting
  • Using Session Manager

What is the role of the Session Manager?

The Session Manager creates and manages sessions on behalf of the application. According to the servlet specification, when request.getSession() is called, the container (Tomcat) is responsible for creating the session and for deleting it when it expires.

What are the four types of Session Manager?

  • Standard Manager
  • Persistent Manager
  • Delta Manager
  • Backup Manager

What is the role of the Standard (Session) Manager?

This is the default manager used by Tomcat. To customize the standard manager, add a <Manager> element to the context.xml file:

<Manager className="org.apache.catalina.session.StandardManager"/>

What is the role of Persistent (Session) Manager?

This manager stores session information in a persistent location at certain intervals. It supports two types of store: File and JDBC.

This session manager does not write sessions out in real time. It saves changes to the store only at certain intervals. If the instance crashes before the next write, any in-memory session data that has not yet been persisted is lost.

What is the role of Delta (Session) Manager?

It replicates the session to all other instances, so this manager is usually used to implement a clustered environment, but it is not a good fit for large clusters.

What is the role of Backup (Session) Manager?

This manager is also used to implement a clustered environment. It works like the DeltaManager, but it replicates each session to exactly one other instance (the backup instance). To use the BackupManager, it looks like we do not have to change anything besides switching the manager class from DeltaManager to BackupManager. The BackupManager automatically picks one node to act as the backup and replicates session information to that node. For example, suppose tomcat1 is the primary node for a session and tomcat3 holds its backup. If tomcat1 goes down, the load balancer directs the request to tomcat2, which asks the other nodes in the cluster which of them already has a copy of this session. Tomcat3 has a copy, so it informs tomcat2 and replicates the session to tomcat2. Now tomcat2 acts as the primary: it responds to the user and handles all subsequent requests (sticky session), while tomcat3 remains the backup node for this session.

The DeltaManager replicates all changed session data to all nodes in the cluster. The BackupManager backs up session data to a specific "backup" node. For large clusters, the BackupManager is the option to go with. For small clusters, it is common to just use the default DeltaManager.

How can we enable multicast?

In a Linux environment, most kernels are capable of processing multicast traffic, but we need to add a route entry:

sudo route add -net 224.0.0.0 netmask 240.0.0.0 dev eth0

Multicast addresses belong to the class D address range (224.0.0.0 to 239.255.255.255). The route entry tells the kernel that traffic addressed to this range goes out through the eth0 interface.
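To check whether an address chosen for cluster membership falls in the class D range, a small Java sketch can help. It uses the standard java.net.InetAddress class; the class name MulticastCheck and the sample addresses are only illustrative.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

final class MulticastCheck {
    // Returns true when the literal IP address lies in the multicast (class D) range.
    static boolean isMulticast(String literalIp) {
        try {
            // For a literal IP address, getByName only parses the text; no DNS lookup is made.
            return InetAddress.getByName(literalIp).isMulticastAddress();
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isMulticast("228.0.0.4"));    // true
        System.out.println(isMulticast("192.168.1.10")); // false
    }
}
```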

What are the disadvantages of using the all-to-all in-memory session replication?

With all-to-all in-memory session replication, every worker in our cluster replicates its sessions to every other worker. This is not always the most efficient method of session replication in high-load environments.

Tomcat provides in-memory session replication through a combination of serializable session attributes, "sticky sessions", which are provided by the load balancer, and specialized components configured in Tomcat's XML configuration files.

What are the requirements for using Tomcat's built-in session replication?

In order to use Tomcat's built-in session replication, any session attribute or class that needs to be available in the event of a failover must implement java.io.Serializable. This interface allows the JVM to convert session objects into a byte stream (not bytecode) that can be copied to other instances. JavaBeans are conventionally expected to be serializable, but you should still make sure that all your session attributes properly implement the interface.
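One way to sanity-check a session attribute before relying on replication is to serialize it and read it back. The sketch below does this with plain JDK streams. CartItem and SerializationCheck are hypothetical names used only for illustration, and the round trip is a rough stand-in for copying to another node, not what Tomcat does internally.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical session attribute; the class and field names are illustrative only.
final class CartItem implements Serializable {
    private static final long serialVersionUID = 1L;
    final String sku;
    final int quantity;

    CartItem(String sku, int quantity) {
        this.sku = sku;
        this.quantity = quantity;
    }
}

final class SerializationCheck {
    // Serializes the value to a byte array and deserializes it again.
    static Object roundTrip(Object value) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    // Returns false when the value (or something it references) cannot be serialized.
    static boolean isReplicable(Object value) {
        try {
            roundTrip(value);
            return true;
        } catch (IOException | ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isReplicable(new CartItem("ABC-1", 2))); // true
        System.out.println(isReplicable(new Object()));             // false
    }
}
```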

If we want to use all-to-all session replication, which session manager do we need to use?

DeltaManager

What is the purpose of the channelSendOptions attribute?

The channelSendOptions is an attribute of the Cluster element:

<Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster" channelSendOptions="6">

The channelSendOptions attribute controls how messages are sent between cluster nodes, for example whether they are sent synchronously or asynchronously.
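The value is a set of bit flags, so it can be decoded with bitwise tests. The sketch below is a standalone illustration: the constants are copied by hand from what I understand the org.apache.catalina.tribes.Channel send options to be, so treat them as assumptions and verify them against the documentation for your Tomcat version. SendOptions is a hypothetical helper class, not part of Tomcat.

```java
import java.util.ArrayList;
import java.util.List;

final class SendOptions {
    // Assumed flag values, copied here so the sketch runs without Tomcat on the classpath.
    static final int USE_ACK = 0x0002;
    static final int SYNCHRONIZED_ACK = 0x0004;
    static final int ASYNCHRONOUS = 0x0008;

    // Lists the names of the flags that are set in the given channelSendOptions value.
    static List<String> decode(int options) {
        List<String> names = new ArrayList<>();
        if ((options & USE_ACK) != 0) names.add("USE_ACK");
        if ((options & SYNCHRONIZED_ACK) != 0) names.add("SYNCHRONIZED_ACK");
        if ((options & ASYNCHRONOUS) != 0) names.add("ASYNCHRONOUS");
        return names;
    }

    public static void main(String[] args) {
        System.out.println(decode(6)); // [USE_ACK, SYNCHRONIZED_ACK]
        System.out.println(decode(8)); // [ASYNCHRONOUS]
    }
}
```

Under these assumed values, channelSendOptions="6" asks for acknowledged, synchronous delivery, while "8" asks for asynchronous delivery.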

Do the DeltaManager and the BackupManager use IP multi-casting?

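Yes, by default. Both managers rely on the cluster's membership service, which discovers nodes through multicast heartbeats; the session data itself is sent over TCP.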

Why do people generally not implement session replication or session sharing using a database with Tomcat?

The PersistentManager does not synchronize session information in real time. It only synchronizes session information at certain intervals, which means that a machine can go offline before some session information has been persisted to the database.

What is the disadvantage of using the PersistentManager?

The PersistentManager does not synchronize session information in real time. It only synchronizes session information at certain intervals, which means that a machine can go offline before some session information has been persisted to the database.

How can we implement a persistent manager using a file store?

To set up a persistent session manager, comment out the <Cluster> element in each instance, which disables the in-memory replication mechanism. Then add a context.xml file to each instance with the following:

<Context>
  <Manager className="org.apache.catalina.session.PersistentManager">
     <Store className="org.apache.catalina.session.FileStore"
            directory="c:\cluster\shareddir"
     />
  </Manager>
</Context>

How can we implement a persistent manager using a database?

The only difference between a file-based store and a JDBC-based store is the <Store> element:

<Store className="org.apache.catalina.session.JDBCStore"
       connectionURL="jdbc:mysql://localhost/datadisk?user=tomcat&amp;password=tomcat"
       driverName="com.mysql.jdbc.Driver"
       sessionIdCol="session_id"
       sessionValidCol="valid_session"
       sessionMaxInactiveCol="max_inactive"
       sessionLastAccessCol="last_access"
       sessionTable="tomcat_sessions"
       sessionAppCol="app_context"
       sessionDataCol="session_data"
/>

The table in your database should be:

create table tomcat_sessions (
  session_id varchar(100) not null primary key,
  valid_session char(1) not null,
  max_inactive int not null,
  last_access bigint not null,
  app_context varchar(255),
  session_data mediumblob,
  KEY kapp_context(app_context)
);

1. Update the <Cluster> element under the <Engine> element in the conf/server.xml file:

<Engine name="<meaningful_unique_name>" defaultHost="localhost">      
     <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
              channelSendOptions="8">
          <Manager className="org.apache.catalina.ha.session.DeltaManager"
                   expireSessionsOnShutdown="false"
                   notifyListenersOnReplication="true"/>
          <Channel className="org.apache.catalina.tribes.group.GroupChannel">
               <Membership className="org.apache.catalina.tribes.membership.McastService"
                           address="228.0.0.4"
                           port="45564"
                           frequency="500"
                           dropTime="3000"/>
               <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
                         address="auto"
                         port="4000"
                         autoBind="100"
                         selectorTimeout="5000"
                         maxThreads="6"/>
               <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
                   <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
               </Sender>
               <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
               <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor"/>
          </Channel>
          <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
                 filter=".*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.css;.*\.txt;"/>
          <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
     </Cluster>
     ...
</Engine>

2. Mark your web application as distributable.

The Tomcat distribution comes with the "examples" web application. Open the web.xml file of the "examples" web app in the webapps folder and mark the application as distributable by adding the <distributable/> element at the end of the file (just before the closing </web-app> element).

3. Add the session JSP file

This JSP file prints the contents of the session and also increments a counter stored in the session.

4. Access /examples/jsp/session.jsp

Try refreshing the page a few times; you should see the counter being updated.

How can we check whether our Tomcat instances are up, running, and communicating with each other using multicast?

ping -t 1 -c 2 228.0.0.4

64 bytes from <server_1_ip> ...
64 bytes from <server_2_ip> ...

See http://blogs.agilefaqs.com/2009/11/09/setting-up-tomcat-cluster-for-session-replication/

What is the difference between the DeltaManager and the BackupManager?

The DeltaManager replicates all changed session data to all nodes in the cluster. The BackupManager replicates session data to a specific backup node. For large clusters, use the BackupManager. For small clusters, use the DeltaManager.

How can we mark a web application as distributable in its web.xml file?

Open your web application's web.xml file and add:

<distributable/>

to the end of the file, just before the closing </web-app> tag. Besides this, every object that your application stores in the session must be serializable, that is, its class must implement java.io.Serializable.

Note on the default settings:

Any session data created on a server will be duplicated to all other servers in the cluster. If your application creates session data for a user and you have a heterogeneous cluster, the session data will still be replicated to the other nodes. A heterogeneous configuration is one in which not every application is deployed on every node. Therefore, if application A stores session data for a user, and application A is running on server A but not on server B, the session data will be replicated to server B even though there is no use for it there.

Default Multicast Setup: The cluster is discovered and maintained via multicast heartbeats. The server will be set up with a default multicast IP address of 228.0.0.4 and a multicast port of 45564. This means, any other nodes that are using the same multicast address and port will see this cluster / node.

Can we enable session replication per engine or per host?

Yes. Simply add:

<Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"/>

to your <Engine> element or your <Host> element to enable clustering. This configuration enables all-to-all session replication, using the DeltaManager to replicate session information. This works well for small clusters, but for large clusters we should use the BackupManager. The DeltaManager replicates session information to all nodes, even nodes that do not have the application deployed. The BackupManager replicates session information to only one backup node, and only to nodes that have the application deployed. The BackupManager is not as well tested as the DeltaManager.

What are the important default values?

  1. Multicast address: 228.0.0.4
  2. Multicast port: 45564 (the port and the address determine cluster membership)
  3. The TCP port listening for replication messages is the first available server socket in range 4000-4100
  4. Two listeners are configured: ClusterSessionListener and JvmRouteSessionIDBinderListener
  5. Two interceptors are configured: TcpFailureDetector and MessageDispatch15Interceptor
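The "first available server socket in range 4000-4100" behaviour in item 3 can be pictured as a simple port scan. The following is a minimal sketch of that idea using only java.net.ServerSocket; it is not Tomcat's receiver code, and PortScan and firstFreePort are hypothetical names.

```java
import java.io.IOException;
import java.net.ServerSocket;

final class PortScan {
    // Tries start, start + 1, ... for the given number of attempts and returns
    // the first port that can be bound, or -1 if none of them is available.
    static int firstFreePort(int start, int attempts) {
        for (int port = start; port < start + attempts; port++) {
            try (ServerSocket socket = new ServerSocket(port)) {
                return socket.getLocalPort();
            } catch (IOException e) {
                // The port is in use or cannot be bound; try the next one.
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstFreePort(4000, 100));
    }
}
```

Unlike a real receiver, this sketch closes the socket immediately after the probe, so the result only says that the port was free at that moment.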

Here is the default cluster configuration:

<Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster" channelSendOptions="8">
    <Manager 
        className="org.apache.catalina.ha.session.DeltaManager"
        expireSessionsOnShutdown="false"
        notifyListenersOnReplication="true"
    />
    <Channel
        className="org.apache.catalina.tribes.group.GroupChannel">
            <Membership
                className="org.apache.catalina.tribes.membership.McastService"
                address="228.0.0.4"
                port="45564"
                frequency="500"
                dropTime="3000"
            />
            <Receiver
                className="org.apache.catalina.tribes.transport.nio.NioReceiver"
                address="auto"
                port="4000"
                autoBind="100"
                selectorTimeout="5000"
                maxThreads="6"
            />
            <Sender
                className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
                <Transport
                    className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
            </Sender>
            <Interceptor
                className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
            <Interceptor
                className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor"/>
    </Channel>
    <Valve className="org.apache.catalina.ha.tcp.ReplicationValve" filter=".*\.gif;.*\.js;.*\.jpg;.*\.htm;.*\.html;.*\.css;.*\.txt" />
    <Valve className="org.apache.catalina.ha.session.JvmRouteBinderValve" />
    <Deployer
        className="org.apache.catalina.ha.deploy.FarmWarDeployer"
        tempDir="/tmp/war-temp/"
        deployDir="/tmp/war-deploy/"
        watchDir="/tmp/war-listen/"
        watchEnabled="false"
    />
    <ClusterListener
        className="org.apache.catalina.ha.session.JvmRouteSessionIDBinderListener"
    />
    <ClusterListener
        className="org.apache.catalina.ha.session.ClusterSessionListener"
    />
</Cluster>

To run session replication in your Tomcat 6.0 container, the following steps must be completed:

  1. All your session attributes must implement java.io.Serializable
  2. Make sure that your web.xml file has the <distributable/> element just before the closing </web-app>
  3. Uncomment the Cluster element in your server.xml file
  4. If you have defined custom cluster valves, make sure you have the ReplicationValve defined as well under the Cluster element in server.xml
  5. If your Tomcat instances are running on the same machine, make sure the Receiver's port attribute (tcpListenPort in older versions) is unique for each instance. In most cases, Tomcat is smart enough to resolve this on its own by autodetecting available ports in the range 4000-4100
  6. If you are using mod_jk, make sure that the jvmRoute attribute is set on your Engine tag (<Engine name="Catalina" jvmRoute="node01">) and that the jvmRoute attribute value matches your worker name in workers.properties
  7. Make sure that all nodes have the same time and sync with NTP service
  8. Make sure that your load balancer is configured for sticky session mode.

To enable session replication in Tomcat, three different paths can be followed to achieve the exact same thing:

  1. Using session persistence, and saving the session to a shared file system (PersistentManager + FileStore)
  2. Using session persistence, and saving the session to a shared database (PersistentManager + JDBCStore)
  3. Using in-memory-replication via SimpleTcpCluster that ships with Tomcat 6 (lib/catalina-tribes.jar + lib/catalina-ha.jar)

Currently you can use the domain worker attribute (mod_jk > 1.2.8) to build cluster partitions with the potential of having a more scalable cluster solution with the DeltaManager (you'll need to configure the domain interceptor for this). In order to keep the network traffic down in an all-to-all environment, you can split your cluster into smaller groups. This can be achieved by using different multicast addresses for the different groups.

Session replication is only the beginning of clustering. Another popular concept used to implement clusters is farming: you deploy your apps to only one server, and the cluster distributes the deployments across the entire cluster. These are all capabilities provided by the FarmWarDeployer.

Membership is established using multicast heartbeats. Hence, if you wish to subdivide your clusters, you can do this by changing the multicast IP address or port in the <Membership> element. The heartbeat contains the IP address of the Tomcat node and the TCP port that Tomcat listens on for replication traffic. All data communication happens over TCP.

The ReplicationValve is used to find out when the request has been completed and initiate the replication, if any. Data is only replicated if the session has changed (by calling setAttribute or removeAttribute on the session).
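A practical consequence is that changing an object already stored in the session, without calling setAttribute again, may not be noticed as a change. The toy model below illustrates the idea of change tracking. It is not Tomcat's session implementation, and TrackedSession with its methods is a hypothetical class written only for this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: illustrates change tracking, not Tomcat's actual session class.
final class TrackedSession {
    private final Map<String, Object> attributes = new HashMap<>();
    private boolean dirty;

    void setAttribute(String name, Object value) {
        attributes.put(name, value);
        dirty = true; // the session now counts as changed
    }

    void removeAttribute(String name) {
        if (attributes.remove(name) != null) {
            dirty = true;
        }
    }

    Object getAttribute(String name) {
        return attributes.get(name); // reading never marks the session as changed
    }

    boolean isDirty() {
        return dirty;
    }

    // Called once the pending changes have been sent to the other nodes.
    void markReplicated() {
        dirty = false;
    }

    public static void main(String[] args) {
        TrackedSession session = new TrackedSession();
        session.setAttribute("note", new StringBuilder("a"));
        session.markReplicated();

        ((StringBuilder) session.getAttribute("note")).append("b");
        System.out.println(session.isDirty()); // false: the in-place change went unnoticed

        session.setAttribute("note", session.getAttribute("note"));
        System.out.println(session.isDirty()); // true: setAttribute marks the change
    }
}
```

In this model, the safe habit is to call setAttribute again after modifying a stored object, so that the change is picked up.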

One of the most important performance considerations is synchronous versus asynchronous replication. In synchronous replication mode, the request does not return until the replicated session has been sent over the wire and reinstantiated on all the other cluster nodes. The mode is configured using the channelSendOptions flag, which is an integer value. The default value for the SimpleTcpCluster / DeltaManager combination is 8, which is asynchronous. With asynchronous replication, the request returns before the data has been replicated. Asynchronous replication yields shorter request times, while synchronous replication guarantees that the session has been replicated before the request returns.

If you are using mod_jk and not using sticky sessions (or sticky sessions do not work for whatever reason), or you are simply failing over, the session id needs to be modified, because it contains the worker id of the previous Tomcat instance (as defined by the jvmRoute attribute in the Engine element). To solve this, we use the JvmRouteBinderValve.

The JvmRouteBinderValve rewrites the session id to ensure that, after a failover, the next request remains sticky (and does not fall back to a random node because the previous worker is no longer available). The valve rewrites the JSESSIONID value in the cookie of the same name. Without this valve in place, it is harder to ensure stickiness in case of a failure.
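With jvmRoute set, the session id carries the route as a suffix after a dot, for example A1B2C3.node01. The rewrite can be sketched as swapping that suffix for the route of the node now serving the request. This is a minimal illustration of the string manipulation only, not the valve's actual code; SessionRoute, rebind, and the sample ids are hypothetical.

```java
final class SessionRoute {
    // Replaces the jvmRoute suffix of a session id, or appends one if the id has none.
    static String rebind(String sessionId, String newRoute) {
        int dot = sessionId.indexOf('.');
        String base = (dot >= 0) ? sessionId.substring(0, dot) : sessionId;
        return base + "." + newRoute;
    }

    public static void main(String[] args) {
        System.out.println(rebind("A1B2C3.node01", "node02")); // A1B2C3.node02
    }
}
```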

By default, if no valves are configured, the JvmRouteBinderValve is added. The cluster message listener called JvmRouteSessionIDBinderListener is also defined by default and is used to actually rewrite the session id on the other nodes in the cluster once a failover has occurred. If you have added your own valves or cluster listeners in server.xml, the defaults no longer apply, so make sure that you add all the appropriate valves and listeners as defined by the default configuration.

The channelSendOptions value is the flag attached to each message sent by the SimpleTcpCluster class or by any object that invokes the SimpleTcpCluster.send method. The DeltaManager sends information using the SimpleTcpCluster.send method, while the BackupManager sends directly through the channel.

Membership is done using multicasting. Tribes also supports static memberships using the StaticMembershipInterceptor if you want to extend your membership to points beyond multicasting.

The membership component broadcasts its own TCP address and port to the other nodes so that communication between nodes can take place over TCP. The address being broadcast is the one defined by the address attribute of the Receiver element.

Tribes supports a pool of senders, so that messages can be sent in parallel, and with the NIO sender you can also send messages concurrently. Concurrently means one message to multiple receivers at the same time, and parallel means multiple messages to multiple receivers at the same time.

TcpFailureDetector verifies crashed members through TCP. If multicast packets get dropped, this interceptor protects against false positives.

MessageDispatch15Interceptor dispatches messages to a thread (thread pool) so that they are sent asynchronously.

ThroughputInterceptor prints out simple stats about message traffic.

The cluster uses valves to track requests to web applications. We have mentioned the ReplicationValve and the JvmRouteBinderValve above. The <Cluster> element itself is not part of the pipeline in Tomcat; instead, the cluster adds the valves to its parent container. If the <Cluster> element is configured in the <Engine> element, the valves are added to the engine, and so on.

SimpleTcpCluster is itself a sender and receiver of the Channel object, and components can register themselves as listeners to the SimpleTcpCluster. The ClusterSessionListener listens for DeltaManager replication messages and applies the deltas to the manager, which in turn applies them to the session.

What are the disadvantages of the default all-to-all session replication?

With all-to-all session replication, any session data created on a server will be duplicated to all other servers in the cluster.

What is the default multi-cast IP address?

The server will be set up with a default multi-cast IP address of 228.0.0.4 and a default multi-cast port of 45564. This means, any other nodes that are using the same multi-cast address and port will see this cluster / node.

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License