After installing Elasticsearch in production, many new developers leave it unconfigured. In this post, we will look at the important config file changes and their implications.
By default, Elasticsearch's cluster name is elasticsearch. Keep it unique per cluster (if you plan on one) or per single endpoint. Nodes that share a cluster name and can discover each other on the network join the same cluster, so in production there is a high chance that multiple VPSs still using the default name elasticsearch will form a cluster. Data that should be limited to a single endpoint or VPS then gets distributed among all of those nodes, and you might not even know it.
What happens if cluster.name is not changed:
- Elasticsearch takes longer to start.
- Indexing and search results are slower (since the data is distributed).
- If one of the endpoints that unknowingly joined your cluster goes missing, you are left with incomplete data.
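As a minimal sketch, the change in elasticsearch.yml might look like the following. The name prod-search-01 is a made-up placeholder; pick any value that is unique to your cluster or endpoint.

```yaml
# elasticsearch.yml
# "prod-search-01" is a hypothetical name used only for illustration.
# Any value other than the default "elasticsearch" that is unique to
# your cluster keeps unrelated nodes from joining it.
cluster.name: prod-search-01
```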
Although not mandatory on a single endpoint/VPS, in a cluster you need a way to identify the nodes that have joined. Set node.name in the config file and restart Elasticsearch, and the node will start with the name you gave it.
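A short sketch of that setting, assuming the option being described is node.name. The value es-node-1 is a hypothetical placeholder.

```yaml
# elasticsearch.yml
# "es-node-1" is a hypothetical name; use something that identifies
# the machine, such as its hostname or its role in the cluster.
node.name: es-node-1
```

After the restart, the name should show up in the startup log and in the node listing, which makes it easier to tell which nodes have joined.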
If you don't want to store Elasticsearch data in the default location, or if you're using NFS or an encrypted disk, set path.data to your mounted path.
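For illustration, assuming the option in question is path.data and the disk is mounted at the hypothetical path /mnt/es-data, the entry could look like this:

```yaml
# elasticsearch.yml
# "/mnt/es-data" is a hypothetical mount point for the NFS share or
# encrypted disk. The user that runs Elasticsearch needs write
# access to this directory.
path.data: /mnt/es-data
```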
The network.host option doesn't matter unless your machine has both private and public networks, as on AWS, or you want the endpoint to serve your instance on the public interface.
If you set it to a private address such as 192.168.x.x, access is restricted to endpoints in that private network.
**Note:** If you want to use it only on localhost, set it to 127.0.0.1; otherwise it will be exposed on the public network.
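The two cases above can be sketched as follows, assuming the setting being described is network.host. The private address 192.168.1.10 is a made-up example.

```yaml
# elasticsearch.yml
# Local-only access: bind to the loopback interface.
network.host: 127.0.0.1

# Alternative: bind to a private interface so that only machines on
# that private network can reach the node. "192.168.1.10" is a
# hypothetical address; replace it with your own private IP.
# network.host: 192.168.1.10
```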
- When you expose it on a public network, make sure an authentication and authorization plugin such as Shield is in use, or your data is at risk.
- When forming a cluster, if you don't set the proper number of master nodes, the chance of the cluster running into a split-brain issue is high and your cluster might lose data.
- If you're planning a cluster, keeping data nodes separate and having at least 1 replica keeps your cluster highly available and also serves as a disaster-management technique.
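The last two points can be sketched in elasticsearch.yml as shown below. This assumes an older release that uses Zen discovery (the same era as Shield) and a hypothetical cluster with three master-eligible nodes. Newer releases handle the master quorum differently and do not accept index-level settings in this file, so check the documentation for your version before copying any of it.

```yaml
# elasticsearch.yml
# Assumption: three master-eligible nodes and Zen discovery.
# Quorum = (master-eligible nodes / 2) + 1 = (3 / 2) + 1 = 2,
# using integer division.
discovery.zen.minimum_master_nodes: 2

# On a dedicated master-eligible node (holds no data):
node.master: true
node.data: false

# On a dedicated data node, flip the two values instead:
# node.master: false
# node.data: true

# Keep at least one replica of every shard so that losing a single
# data node does not lose data. Older releases accept this default
# here; otherwise set it per index through the index settings API.
index.number_of_replicas: 1
```

With a quorum of 2, a node that gets cut off from the other two cannot elect itself master, which is what guards against split brain.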