
I am trying to build an Elasticsearch cluster on Azure. I set it up successfully for testing purposes with 3 VMs in the same virtual network, and it worked very well.

Because of my subscription limits, I distributed those 3 VMs across 3 different subscriptions. The only difference was that the VMs were no longer in the same virtual network; that wasn't possible because of the different subscriptions and the structure of Azure. I used public IPs for my publish_host settings. It works for 5-6 minutes, during which I can create indices and do CRUD operations. After a few minutes the child nodes become unresponsive, _cluster/health does not respond, and I can only reach the master node, which reports the cluster health as GREEN even though it is not.

If I try to create an index, it gets stuck and fails to create shards because of the unresponsive child nodes. I have tried many things over the past week, such as different configuration combinations, but I could not find a solution. I checked all the logs; they do not provide any information, even when I set them to DEBUG or TRACE level. All I get is unreachable-node errors from the master node after about 10 minutes. I am posting my configuration details:

Operating System: Ubuntu 16.04, default image provided by Canonical; ports 9200 and 9300 are open.
Java Version: oracle-java8
Elasticsearch: 2.3.3 and 2.3.4, both behave the same.

I can also provide my logs if you want, but as far as I can tell they contain no clues.

cluster.name: Alpha
node.name: Vulcan
network.host: _eth0_ #tried 0.0.0.0 too. eth0 is a local ip address like 10.0.0.4 or 192.168.0.4 assigned by the nic.
network.publish_host: myes1.westeurope.cloudapp.azure.com # I have to use this parameter when my nodes are not under the same network, but setting this variable creates the problem I explained.
discovery.zen.ping.unicast.hosts: ["myes1.westeurope.com","myes2.westeurope.cloudapp.azure.com","myes3.westeurope.cloudapp.azure.com"]
discovery.zen.minimum_master_nodes: 2
#Paths
path:
  logs: /var/log/elasticsearch
  data: /var/data/elasticsearch
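
To pin down when the nodes stop being reachable, a small probe like the one below could be run from each VM against the others. This is only a sketch and not part of the original setup: the host names are taken from the configuration above (with the publish_host name used for the first node), 9300 is the transport port mentioned earlier, and the 30-second interval and 40 rounds are arbitrary assumptions.

```python
import socket
import time

# Host names copied from the configuration above; adjust as needed.
NODES = [
    "myes1.westeurope.cloudapp.azure.com",
    "myes2.westeurope.cloudapp.azure.com",
    "myes3.westeurope.cloudapp.azure.com",
]
TRANSPORT_PORT = 9300  # Elasticsearch transport port opened on the VMs


def probe(host, port, timeout=5.0):
    """Return True if a new TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


def watch(nodes, port, interval=30.0, rounds=40):
    """Probe every node once per interval and print a timestamped result."""
    for _ in range(rounds):
        stamp = time.strftime("%H:%M:%S")
        for host in nodes:
            state = "open" if probe(host, port) else "UNREACHABLE"
            print(f"{stamp} {host}:{port} {state}")
        time.sleep(interval)
```

Calling `watch(NODES, TRANSPORT_PORT)` prints one line per node every 30 seconds for about 20 minutes. Note that this only tests whether new connections can be opened. If they keep succeeding while the master reports the nodes as unreachable, the port itself is probably not the problem, and the existing long-lived connections between the nodes would be the next thing to look at.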

P.S. Azure Discovery is not a suitable solution for me. I use ARM VMs, not Cloud Services.

iboware
  • What is the reason for using different subscriptions? There is no reason that I know of why these can't be on the same sub and use the same VNet, which would make life much easier – Sam Cogan Jul 12 '16 at 18:39
  • The limitation is my credits. Do you have any suggestions? – iboware Jul 13 '16 at 15:29
  • Well, my suggestion would be to use a single subscription and pay for the required services. MS aren't going to support running a service over multiple subs, so you are not going to get any SLAs or support from them. By the sounds of it you are using public IPs to communicate between nodes over the internet, which is a security concern as well. – Sam Cogan Jul 13 '16 at 15:30
  • Thanks. Security is not a big concern since I integrated the Search Guard SSL plugin for ES. Transport and HTTP are working with SSL and authentication. But you are right about the SLA. I also got an answer on this subject there, but no solution yet: https://discuss.elastic.co/t/elasticsearch-cluster-fails-5-minutes-after-starting-on-azure/55241/2 – iboware Jul 14 '16 at 21:32
