Thursday, 26 August 2010

Tomcat Cluster Configuration without multicast

When I tried to set this up, I could not find much on the web or in the Tomcat documentation about how to configure it.

This post should fix that. Apologies for the unpolished nature, but I wanted to record the configuration I got working as soon as possible and am pressed for time at the moment.


The basic configuration for the cluster on the first server is shown below.

<Cluster channelSendOptions="8" channelStartOptions="3" className="org.apache.catalina.ha.tcp.SimpleTcpCluster">
  <Manager className="org.apache.catalina.ha.session.DeltaManager" expireSessionsOnShutdown="false" domainReplication="true" notifyListenersOnReplication="true" />
  <Channel className="org.apache.catalina.tribes.group.GroupChannel">
    <Membership className="org.apache.catalina.tribes.membership.McastService" bind="127.0.0.1" domain="test-domain"/>
    <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
      <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender" />
    </Sender>
    <Receiver address="127.0.0.1" className="org.apache.catalina.tribes.transport.nio.NioReceiver" maxThreads="6" port="4000" selectorTimeout="5000" />
    <Interceptor className="com.dm.tomcat.interceptor.DisableMulticastInterceptor" />
    <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector" />
    <Interceptor className="org.apache.catalina.tribes.group.interceptors.StaticMembershipInterceptor">
      <Member className="org.apache.catalina.tribes.membership.StaticMember" port="4001" host="127.0.0.1" domain="test-domain" uniqueId="{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}" />
    </Interceptor>
    <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor" />
  </Channel>
  <Valve className="org.apache.catalina.ha.tcp.ReplicationValve" filter=".*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.htm;.*\.html;.*\.css;.*\.txt;" />
  <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener" />
</Cluster>

The other servers have a similar configuration, but with different Member elements and different ports.
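As a rough illustration of what changes on a second server, the fragment below shows only the two elements that would differ on a hypothetical "Server B". It assumes Server B is the member that Server A lists on port 4001, and it reuses the Server A Member values given in the notes further down; everything else is assumed to stay the same as in the first server's configuration.

```xml
<!-- Sketch for a hypothetical Server B; only the elements that differ from Server A are shown. -->

<!-- Server B listens on the port that Server A's Member element points at. -->
<Receiver address="127.0.0.1" className="org.apache.catalina.tribes.transport.nio.NioReceiver" maxThreads="6" port="4001" selectorTimeout="5000" />

<!-- Server B lists Server A as its static member, using Server A's Receiver port. -->
<Interceptor className="org.apache.catalina.tribes.group.interceptors.StaticMembershipInterceptor">
  <Member className="org.apache.catalina.tribes.membership.StaticMember" port="4000" host="127.0.0.1" domain="test-domain" uniqueId="{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1}" />
</Interceptor>
```

Because both instances in this example run on 127.0.0.1, any other ports they open (such as the HTTP connector ports) would also have to differ.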

Notes
Setting channelStartOptions to 3 is supposed to disable multicast according to the Tomcat docs, because it starts only the sender and receiver and not the membership service:

<Cluster channelSendOptions="8" channelStartOptions="3" className="org.apache.catalina.ha.tcp.SimpleTcpCluster">
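To see why the value 3 leaves out membership, it can help to decode the option as a bit mask. The sketch below is self-contained: the constants are copied by hand to mirror the flag values on org.apache.catalina.tribes.Channel, and the class name StartOptions and its describe method are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative helper, not part of Tomcat.
class StartOptions {
    // Values copied to mirror the flags on org.apache.catalina.tribes.Channel.
    static final int SND_RX_SEQ = 1; // start the receiver
    static final int SND_TX_SEQ = 2; // start the sender
    static final int MBR_RX_SEQ = 4; // listen for membership broadcasts
    static final int MBR_TX_SEQ = 8; // send membership broadcasts

    // Returns the names of the flags that are set in the given option value.
    static String describe(int options) {
        List<String> parts = new ArrayList<>();
        if ((options & SND_RX_SEQ) != 0) parts.add("SND_RX_SEQ");
        if ((options & SND_TX_SEQ) != 0) parts.add("SND_TX_SEQ");
        if ((options & MBR_RX_SEQ) != 0) parts.add("MBR_RX_SEQ");
        if ((options & MBR_TX_SEQ) != 0) parts.add("MBR_TX_SEQ");
        return String.join("|", parts);
    }

    public static void main(String[] args) {
        System.out.println(describe(3));  // sender and receiver only
        System.out.println(describe(15)); // all four services
    }
}
```

With these values, 3 covers only the two transport flags, while 15 would also start both membership flags.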


The Receiver port on each server must match the port in the Member element that describes that server in every other server's configuration. For example, if Server A has a Receiver port of 4000, then wherever Server A is defined as a Member on the other servers, port 4000 must be used:

Server A:
<Receiver address="127.0.0.1" className="org.apache.catalina.tribes.transport.nio.NioReceiver" maxThreads="6" port="4000" selectorTimeout="5000" />


Server A Member defined on all other cluster servers:
<Member className="org.apache.catalina.tribes.membership.StaticMember" port="4000" host="127.0.0.1" domain="test-domain" uniqueId="{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1}" />

The uniqueId must be different for each cluster member. It must be specified as shown, as a byte array of exactly 16 bytes.
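Typing sixteen comma-separated bytes by hand is easy to get wrong, so a small generator can help. The sketch below is only an illustration: the class UniqueIds and the method uniqueIdFor are hypothetical names, and the scheme of storing a node index in the last byte is an assumption chosen to reproduce the two values used in this post.

```java
// Illustrative helper, not part of Tomcat.
class UniqueIds {
    // Builds a 16-byte uniqueId string such as {0,0,...,0,1} for a small node index.
    static String uniqueIdFor(int nodeIndex) {
        if (nodeIndex < 0 || nodeIndex > Byte.MAX_VALUE) {
            throw new IllegalArgumentException("nodeIndex must be between 0 and 127");
        }
        byte[] id = new byte[16];   // exactly 16 bytes, all zero initially
        id[15] = (byte) nodeIndex;  // assumption: keep the index in the last byte
        StringBuilder sb = new StringBuilder("{");
        for (int i = 0; i < id.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(id[i]);       // appended as a decimal number
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        System.out.println(uniqueIdFor(0));
        System.out.println(uniqueIdFor(1));
    }
}
```

Index 0 and index 1 give the same two uniqueId strings as the Member elements above.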


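The DisableMulticastInterceptor prevents multicast messages from being sent. This class is not included in the Tomcat distribution; it needs to be implemented as below:

package com.dm.tomcat.interceptor;

import org.apache.catalina.tribes.Channel;
import org.apache.catalina.tribes.ChannelException;
import org.apache.catalina.tribes.group.ChannelInterceptorBase;

public class DisableMulticastInterceptor extends ChannelInterceptorBase {

    @Override
    public void start(int svc) throws ChannelException {
        // Clear the membership-broadcast flag so no multicast messages are sent.
        svc = (svc & (~Channel.MBR_TX_SEQ));
        super.start(svc);
    }
}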
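The effect of that mask can be checked in isolation. The sketch below is a standalone illustration that does not depend on Tomcat: the class MulticastMask and its method are hypothetical, and the constant 8 is copied by hand to mirror Channel.MBR_TX_SEQ.

```java
// Illustrative helper, not part of Tomcat.
class MulticastMask {
    // Value copied to mirror org.apache.catalina.tribes.Channel.MBR_TX_SEQ.
    static final int MBR_TX_SEQ = 8;

    // Same bit operation as the interceptor: clear only the membership-broadcast bit.
    static int withoutMembershipBroadcast(int svc) {
        return svc & ~MBR_TX_SEQ;
    }

    public static void main(String[] args) {
        System.out.println(withoutMembershipBroadcast(15)); // all services minus broadcast
        System.out.println(withoutMembershipBroadcast(3));  // unchanged, bit was not set
    }
}
```

A start value of 15 becomes 7, while 3 stays 3, so the interceptor only changes anything when the channel is started with the membership-broadcast bit set.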

Deploy to Tomcat with no Downtime

Java web applications hosted in Tomcat can have a considerable startup time, particularly when libraries such as Hibernate are used. Unless you have a large-scale web site with multiple web servers, deploying a new version of the application means restarting the Tomcat server, and that means downtime for the application.

It is, however, possible to eliminate this downtime by making use of Tomcat's clustering capabilities.

The idea is that we define two Tomcat instances in a cluster, fronted by the Apache web server. In normal operation only one of these Tomcat instances is running. When we are ready to deploy a new release to the server, we deploy the war file to the Tomcat instance that is not currently running. We start up this instance, which then joins the cluster and has all the session information from the existing instance replicated to it. We can now test our application by connecting directly to the Tomcat HTTP connector ports.
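The direct test against the new instance's connector port can be automated with a simple smoke check. The following is a minimal sketch rather than the check we actually run: the class SmokeCheck, the method isUp, and the URL in the usage note are all hypothetical, and it treats only an HTTP 200 response as success.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Illustrative helper, not part of Tomcat or Apache.
class SmokeCheck {
    // Returns true only if a GET on the given URL answers with HTTP 200 within the timeout.
    static boolean isUp(String url, int timeoutMillis) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setConnectTimeout(timeoutMillis);
            conn.setReadTimeout(timeoutMillis);
            conn.setRequestMethod("GET");
            try {
                return conn.getResponseCode() == HttpURLConnection.HTTP_OK;
            } finally {
                conn.disconnect();
            }
        } catch (IOException | IllegalArgumentException | ClassCastException e) {
            // Unreachable host, timeout, malformed URL, or non-HTTP URL: treat as "not up".
            return false;
        }
    }
}
```

A deployment script could call something like SmokeCheck.isUp("http://127.0.0.1:8081/myapp/", 2000), where the port and path are placeholders for the new instance's connector, and switch Apache over only once it returns true.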

Once we are happy that the application has deployed successfully, we switch the Apache configuration to send all requests to the updated instance. As this instance has the session information replicated to it, users do not experience any downtime. Once the original instance has finished processing all of its requests, it can be shut down.

Should there turn out to be a problem with the release we have just performed, it is a simple matter to start up the old instance and change the Apache configuration to switch back to it.

With some clever scripting it is possible to manage deployments of different web applications within the cluster using the same techniques.

We can now perform deployments to our live servers without fear of downtime or unexpected errors. This can have a profound effect. We no longer try to roll a number of features up into a single release; instead we can release more frequently, increasing the rate at which we deliver useful features to our clients.

I will post another article on the cluster configuration we have used, detailing how to disable multicast (which does not work on Amazon EC2).