I am trying to configure a Glassfish cluster following the official HA guide.
The app is a standard JSF app (with PrimeFaces). At first I thought the problem was related to JSF itself, but after digging deeper I realized it is more likely in the cluster configuration.
In fact, if the cluster contains only one node, everything works fine. As soon as I add another node, a new JSESSIONID is created on each request, even though the logs show nothing wrong.
These log entries confirm that the instances see each other correctly:
Instance 1:
[2020-04-16T09:11:43.201+0000] [glassfish 5.1] [INFO] [view.window.view.change] [ShoalLogger] [tid: _ThreadID=37 _ThreadName=GMS ViewWindowThread Group-my-cluster] [timeMillis: 1587028303201] [levelValue: 800] [[
GMS1092: GMS View Change Received for group: my-cluster : Members in view for ADD_EVENT(before change analysis) are :
1: MemberId: instance1, MemberType: CORE, Address: 10.0.20.9:9090:230.30.1.1:9090:my-cluster:instance1
2: MemberId: instance2, MemberType: CORE, Address: 10.0.10.14:9090:230.30.1.1:9090:my-cluster:instance2
3: MemberId: server, MemberType: SPECTATOR, Address: 10.0.10.4:9090:230.30.1.1:9090:my-cluster:server
]]
Instance 2:
[2020-04-16T09:11:43.136+0000] [glassfish 5.1] [INFO] [view.window.view.change] [ShoalLogger] [tid: _ThreadID=45 _ThreadName=GMS ViewWindowThread Group-my-cluster] [timeMillis: 1587028303136] [levelValue: 800] [[
GMS1092: GMS View Change Received for group: my-cluster : Members in view for ADD_EVENT(before change analysis) are :
1: MemberId: instance1, MemberType: CORE, Address: 10.0.20.9:9090:230.30.1.1:9090:my-cluster:instance1
2: MemberId: instance2, MemberType: CORE, Address: 10.0.10.14:9090:230.30.1.1:9090:my-cluster:instance2
3: MemberId: server, MemberType: SPECTATOR, Address: 10.0.10.4:9090:230.30.1.1:9090:my-cluster:server
]]
The DAS log also suggests that the overall situation is fine:
[2020-04-16T12:52:59.360+0000] [glassfish 5.1] [FINER] [] [ShoalLogger] [tid: _ThreadID=55 _ThreadName=GMS InDoubtPeerDetector Thread for Group-my-cluster] [timeMillis: 1587041579360] [levelValue: 400] [CLASSNAME: com.sun.enterprise.mgmt.HealthMonitor$InDoubtPeerDetector] [METHODNAME: processCacheUpdate] [[
processCacheUpdate : instance2 's state is aliveandready]]
......
[2020-04-16T12:52:59.359+0000] [glassfish 5.1] [FINER] [] [ShoalLogger] [tid: _ThreadID=55 _ThreadName=GMS InDoubtPeerDetector Thread for Group-my-cluster] [timeMillis: 1587041579359] [levelValue: 400] [CLASSNAME: com.sun.enterprise.mgmt.HealthMonitor$InDoubtPeerDetector] [METHODNAME: processCacheUpdate] [[
processCacheUpdate : instance1 's state is aliveandready]]
The web.xml contains the <distributable/> tag, the app is deployed with --availabilityenabled true, and I have added <property name="relaxCacheVersionSemantics" value="true"/> to the glassfish-web.xml.
Finally, the cookie properties are configured as follows, and I have verified in the browser inspector that the cookie is set correctly:
<cookie-properties>
<property name="cookieDomain" value=".mydomain.com" />
<property name="cookiePath" value="/myapp" />
</cookie-properties>
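For reference, the cookie scope can also be checked outside the browser inspector. The following is only a minimal sketch using the JDK's java.net.HttpCookie: the header value, the host names, and the pathMatches/wouldBeSent helpers are hypothetical illustrations (not taken from my actual responses), and the path rule is a simplification of what browsers do.

```java
import java.net.HttpCookie;
import java.util.List;

public class CookieScopeCheck {

    // Simplified path rule (assumption): the cookie path must be a prefix of the
    // request path and must end on a path-segment boundary.
    static boolean pathMatches(String cookiePath, String requestPath) {
        if (cookiePath == null || cookiePath.isEmpty() || cookiePath.equals("/")) {
            return true;
        }
        if (!requestPath.startsWith(cookiePath)) {
            return false;
        }
        return requestPath.length() == cookiePath.length()
                || cookiePath.endsWith("/")
                || requestPath.charAt(cookiePath.length()) == '/';
    }

    // Returns true if a client would send the cookie from this Set-Cookie header
    // back for the given host and request path.
    static boolean wouldBeSent(String setCookieHeader, String host, String requestPath) {
        List<HttpCookie> cookies = HttpCookie.parse(setCookieHeader);
        HttpCookie cookie = cookies.get(0);
        String domain = cookie.getDomain();
        boolean domainOk = domain == null || HttpCookie.domainMatches(domain, host);
        return domainOk && pathMatches(cookie.getPath(), requestPath);
    }

    public static void main(String[] args) {
        // Hypothetical header mirroring the cookie-properties above.
        String header = "Set-Cookie: JSESSIONID=abc123; Path=/myapp; Domain=.mydomain.com";
        System.out.println(wouldBeSent(header, "www.mydomain.com", "/myapp/index.xhtml"));
        System.out.println(wouldBeSent(header, "www.mydomain.com", "/other/index.xhtml"));
        System.out.println(wouldBeSent(header, "my-lb.example.com", "/myapp/index.xhtml"));
    }
}
```

If the load balancer is reached under a host name that does not fall under the configured cookieDomain, or the app is served under a different context path, the last two cases show how the cookie would silently not be sent back.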
I have spent almost a week trying to understand what's going on, with no luck. All the articles and blog posts I have read report the same resolution steps, which I have already applied. I have also increased logging to the maximum level, but there is no trace of an error or anything similar.
One key factor is that the cluster runs on Amazon AWS. Because I am not sure that multicast is fully supported there, I switched the cluster discovery to TCP by using the GMS_DISCOVERY_LIST. Since the instances are seeing each other, this setting apparently works.
I have tried both Elastic Load Balancer and an Apache HTTP load balancer, with the same effect in both cases. Enabling sticky sessions on the ALB does not help either, because the balancer sees a different JSESSIONID on each request and therefore routes to a different node each time.
I am trying to find a way to inspect the session mechanism, but I am not sure what specific logging I have to enable. Simply increasing javax logging results in an unreadable log.
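To make the symptom measurable from the client side, something like the following could be run against the Set-Cookie values captured from consecutive responses (for example, copied from the browser inspector). It is only a sketch: the class name, the helper names, and the sample header values are hypothetical, and it only counts how often the server replaces the session ID; it says nothing about why.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SessionIdTrace {

    // Matches the JSESSIONID cookie only (not other cookies whose name merely
    // starts with "JSESSIONID").
    private static final Pattern JSESSIONID =
            Pattern.compile("(?i)(?:^|;\\s*)JSESSIONID=([^;\\s]+)");

    // Returns the JSESSIONID from one Set-Cookie header value, or null if absent.
    static String sessionIdOf(String setCookieValue) {
        if (setCookieValue == null) {
            return null;
        }
        Matcher m = JSESSIONID.matcher(setCookieValue);
        return m.find() ? m.group(1) : null;
    }

    // Takes the Set-Cookie value seen on each response, in order (null when the
    // response carried none), and counts how many times the server issued a
    // session ID different from the one currently in use.
    static int countSessionChanges(List<String> setCookiePerResponse) {
        String current = null;
        int changes = 0;
        for (String header : setCookiePerResponse) {
            String id = sessionIdOf(header);
            if (id == null) {
                continue; // no new cookie: the existing session was accepted
            }
            if (current != null && !current.equals(id)) {
                changes++;
            }
            current = id;
        }
        return changes;
    }

    public static void main(String[] args) {
        // Hypothetical traces: a single-node run versus the two-node behaviour.
        List<String> singleNode = Arrays.asList("JSESSIONID=aaa; Path=/myapp", null, null);
        List<String> twoNodes = Arrays.asList(
                "JSESSIONID=aaa; Path=/myapp",
                "JSESSIONID=bbb; Path=/myapp",
                "JSESSIONID=ccc; Path=/myapp");
        System.out.println("single node changes: " + countSessionChanges(singleNode));
        System.out.println("two nodes changes:   " + countSessionChanges(twoNodes));
    }
}
```

Comparing the count with the number of requests that hit a different instance would at least show whether the session is lost only on node switches or on every request.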