3. Troubleshooting
Validating that caches are being replicated across the cluster correctly.
To test whether caches are being replicated correctly between the two nodes in the cluster:
a. Log in to one node in the cluster. Going directly to the node, bypassing any load balancer, is easiest.
b. Go to Administration / Issue Types and edit the name of an issue type.
c. Log in to the other node(s) in the cluster.
d. Go to Administration / Issue Types and check that the edited name from step b appears correctly.
If the new value is not seen on the other nodes, then the cluster is not communicating properly.
You may need to disable your firewall, or at least allow the ports configured above to pass through. Some systems, especially later versions of Linux, block these ports even on the internal localhost network.
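Before disabling a firewall outright, it can help to confirm whether a given port is actually reachable from the other node. The following is a minimal sketch in Python, not part of the product. The function name port_open and the host name node2.mycompany.com are hypothetical, and port 40001 is simply the listener port used in the ehcache.xml example in this section. It only tests a TCP connection, so it says nothing about whether multicast traffic is getting through.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Run from one node against the other, for example port_open("node2.mycompany.com", 40001). A False result while the application is running on the target node suggests that a firewall, or a listener bound to the wrong address, is in the way.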
You need to ensure multicast is supported. On Linux you may need to turn it on, because multicast is often not enabled on the loopback interface (lo):
# ifconfig lo multicast
# ifconfig lo
lo: flags=4169<UP,LOOPBACK,RUNNING,MULTICAST> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 0 (Local Loopback)
RX packets 4974487 bytes 3608495877 (3.3 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 4974487 bytes 3608495877 (3.3 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
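The numeric flags value in the output above is a bit mask. On Linux, 4169 is 0x1049, that is UP (0x1) + LOOPBACK (0x8) + RUNNING (0x40) + MULTICAST (0x1000). As a rough sketch, the check can be automated by decoding that number from captured ifconfig output. The function name multicast_enabled is hypothetical, and the code assumes the newer "flags=NNNN<...>" output format shown here.

```python
import re

# Linux interface flag bit for multicast support (IFF_MULTICAST).
IFF_MULTICAST = 0x1000


def multicast_enabled(ifconfig_output: str) -> bool:
    """Return True if the flags= value in ifconfig output has the multicast bit set."""
    match = re.search(r"flags=(\d+)", ifconfig_output)
    if match is None:
        raise ValueError("no flags= field found in the supplied output")
    return bool(int(match.group(1)) & IFF_MULTICAST)
```

For an interface without multicast, the loopback flags would typically read 73 (UP + LOOPBACK + RUNNING), for which the function returns False.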
Each server needs to be able to resolve its own host name correctly. This is not as obvious as it seems, and errors here can be difficult to detect.
Some Linux distributions add entries to /etc/hosts such as:
127.0.0.1 localhost.localdomain localhost
127.0.1.1 myhost.mycompany.com myhost
This may cause ehcache to announce itself to the other nodes in the cluster as being located at 127.0.1.1. Other nodes cannot reach it at that loopback address, so the result is cache inconsistency across the cluster. To help diagnose this sort of error, you can set the logging level for ehcache to TRACE in log4j.properties:
log4j.logger.net.sf.ehcache.distribution = TRACE, console, filelog
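As a quick complement to trace logging, the hosts file itself can be scanned for the problem described above. This is only an illustrative sketch: the function name loopback_mappings is hypothetical, and it looks at the hosts file text alone, not at DNS or any other resolver source.

```python
import ipaddress
from typing import List


def loopback_mappings(hosts_text: str, hostname: str) -> List[str]:
    """Return the loopback addresses that hosts_text maps to hostname."""
    found = []
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        address, *names = line.split()
        if hostname not in names:
            continue
        try:
            if ipaddress.ip_address(address).is_loopback:
                found.append(address)
        except ValueError:
            # Skip entries whose first field is not a plain IP address.
            continue
    return found
```

Passing the contents of /etc/hosts and the server's own host name should return an empty list on a correctly configured node. With the two example entries above and the name myhost.mycompany.com, it returns the 127.0.1.1 entry, since the whole 127.0.0.0/8 range is loopback.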
Try removing the line referring to 127.0.1.1 from /etc/hosts, or specify the hostName property for the cacheManagerPeerListenerFactory in ehcache.xml:
<cacheManagerPeerListenerFactory
class="net.sf.ehcache.distribution.RMICacheManagerPeerListenerFactory"
properties="hostName=myhost.mycompany.com, port=40001" />