Thursday, 26 August 2010

Tomcat Cluster Configuration without multicast

When trying to set this up, I could not find much on the web or in the Tomcat documentation on how to get this configured.

This post should fix that. Apologies for the unpolished nature, but I wanted to record the configuration I got working as soon as possible and am pressed for time at the moment.


The basic configuration for the cluster on the first server is shown below.

<Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
         channelSendOptions="8"
         channelStartOptions="3">
  <Manager className="org.apache.catalina.ha.session.DeltaManager"
           expireSessionsOnShutdown="false"
           domainReplication="true"
           notifyListenersOnReplication="true"/>
  <Channel className="org.apache.catalina.tribes.group.GroupChannel">
    <Membership className="org.apache.catalina.tribes.membership.McastService"
                bind="127.0.0.1"
                domain="test-domain"/>
    <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
      <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
    </Sender>
    <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
              address="127.0.0.1"
              port="4000"
              maxThreads="6"
              selectorTimeout="5000"/>
    <Interceptor className="com.dm.tomcat.interceptor.DisableMulticastInterceptor"/>
    <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
    <Interceptor className="org.apache.catalina.tribes.group.interceptors.StaticMembershipInterceptor">
      <Member className="org.apache.catalina.tribes.membership.StaticMember"
              host="127.0.0.1"
              port="4001"
              domain="test-domain"
              uniqueId="{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}"/>
    </Interceptor>
    <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor"/>
  </Channel>
  <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
         filter=".*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.htm;.*\.html;.*\.css;.*\.txt;"/>
  <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
</Cluster>

The other servers in the cluster use a similar configuration, with the Receiver port and the list of Member elements adjusted for each node.
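As a sketch, assuming a second node on the same host (a hypothetical "Server B" listening on port 4001), the parts that change would look like this — Server B listens on its own port and registers Server A's port 4000 as a static member:

```xml
<!-- Server B: listen on 4001 -->
<Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
          address="127.0.0.1"
          port="4001"
          maxThreads="6"
          selectorTimeout="5000"/>

<!-- Server B: register Server A (port 4000) as a static member -->
<Interceptor className="org.apache.catalina.tribes.group.interceptors.StaticMembershipInterceptor">
  <Member className="org.apache.catalina.tribes.membership.StaticMember"
          host="127.0.0.1"
          port="4000"
          domain="test-domain"
          uniqueId="{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1}"/>
</Interceptor>
```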

Notes
Setting channelStartOptions to 3 is supposed to disable multicast, according to the Tomcat docs:

<Cluster channelSendOptions="8" channelStartOptions="3" className="org.apache.catalina.ha.tcp.SimpleTcpCluster">
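To see why 3 disables multicast: the start levels are bit flags defined on the org.apache.catalina.tribes.Channel interface (the constant values below are taken from that interface; verify them against your Tomcat version). 3 = SND_RX_SEQ | SND_TX_SEQ, i.e. only the data receiver and sender are started, and neither multicast membership service is:

```java
// Channel start-level flags; values mirror org.apache.catalina.tribes.Channel
public class StartOptions {
    static final int SND_RX_SEQ = 1; // data message receiver
    static final int SND_TX_SEQ = 2; // data message sender
    static final int MBR_RX_SEQ = 4; // multicast membership listener
    static final int MBR_TX_SEQ = 8; // multicast membership broadcaster

    public static void main(String[] args) {
        int opts = SND_RX_SEQ | SND_TX_SEQ;           // = 3, as in channelStartOptions="3"
        System.out.println(opts);                      // 3
        System.out.println((opts & MBR_TX_SEQ) == 0);  // true: no multicast is broadcast
        System.out.println((opts & MBR_RX_SEQ) == 0);  // true: no multicast is listened for
    }
}
```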


The Receiver port of each server must match the port given in that server's Member element in the other servers' configurations. So if Server A has a Receiver port of 4000, then wherever Server A is defined as a Member on the other servers, port 4000 must be used:

Server A:
<Receiver address="127.0.0.1" className="org.apache.catalina.tribes.transport.nio.NioReceiver" maxThreads="6" port="4000" selectorTimeout="5000" />


Server A Member defined on all other cluster servers:
<Member className="org.apache.catalina.tribes.membership.StaticMember" port="4000" host="127.0.0.1" domain="test-domain" uniqueId="{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1}" />

Obviously the uniqueId for a Member needs to be unique for each cluster member. It must be specified, as shown, as a byte array of exactly 16 bytes.
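If you are generating these configurations, a small helper can build the uniqueId string. This is a hypothetical helper of my own, not part of Tomcat; it just varies the last of the 16 bytes per node:

```java
// Hypothetical helper (not part of Tomcat): builds the 16-byte uniqueId
// string that StaticMember expects, varying only the last byte per node.
public class UniqueIdBuilder {
    static String uniqueId(int node) {
        StringBuilder sb = new StringBuilder("{");
        for (int i = 0; i < 15; i++) {
            sb.append("0,");                 // first 15 bytes are zero
        }
        sb.append(node & 0xFF).append("}");  // last byte identifies the node
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(uniqueId(1)); // {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1}
    }
}
```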


The DisableMulticastInterceptor prevents multicast messages from being sent. This class is not included in the Tomcat distribution, so it needs to be provided, as below:

package com.dm.tomcat.interceptor;

import org.apache.catalina.tribes.Channel;
import org.apache.catalina.tribes.ChannelException;
import org.apache.catalina.tribes.group.ChannelInterceptorBase;

public class DisableMulticastInterceptor extends ChannelInterceptorBase {

    @Override
    public void start(int svc) throws ChannelException {
        // Strip the multicast membership broadcast flag before starting the channel
        svc = (svc & (~Channel.MBR_TX_SEQ));
        super.start(svc);
    }
}

3 comments:

mutina said...

One thing I don't understand here (and I couldn't find an explanation on the apache docs site) is where the uniqueId is defined. I presume that they're automatically getting registered with the host the replication data is being sent to, but that receiver wouldn't know what the id means, no?

magic said...

Hi Mark,

Thanks for informative post. I was curious to know, if there is any specific reason to choose MBR_TX_SEQ and not MBR_RX_SEQ .

Is there any reason to not to exclude both of these flags?

David Rehle said...

I don't understand: you are binding everything to 127.0.0.1, so this would only work for a cluster on a single machine.
How do you define the bindings if the cluster spans multiple machines?