A three-node Elasticsearch cluster logs org.elasticsearch.discovery.MasterNotDiscoveredException

[WARN ][r.suppressed] [elasticsearch-01] path: /_cat/health, params: {pretty=, v=}
org.elasticsearch.discovery.MasterNotDiscoveredException: null
at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$4.onTimeout(TransportMasterNodeAction.java:259) [elasticsearch-7.1.0.jar:7.1.0]
at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:322) [elasticsearch-7.1.0.jar:7.1.0]
at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:249) [elasticsearch-7.1.0.jar:7.1.0]
at org.elasticsearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:555) [elasticsearch-7.1.0.jar:7.1.0]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:681) [elasticsearch-7.1.0.jar:7.1.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_11]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_11]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_11]

Solution

(1) Confirm that the hostname is configured correctly on each node

echo ${HOSTNAME}
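
Besides printing the local hostname, it can help to confirm that every node name resolves from each machine. The following is a minimal sketch, not part of the original steps: check_hosts is a hypothetical helper, and it assumes getent is available (glibc-based Linux).

```shell
# Hypothetical helper: report whether each given hostname resolves on this machine.
# Assumption: getent exists; on systems without it every name is reported as UNRESOLVED.
check_hosts() {
  rc=0
  for h in "$@"; do
    if addr=$(getent hosts "$h"); then
      # getent prints "<address> <name>" for a name that resolves
      echo "RESOLVED   $addr"
    else
      echo "UNRESOLVED $h"
      rc=1
    fi
  done
  return $rc
}
# Example call on a node (hostnames taken from the configuration in step (2)):
#   check_hosts es-01 es-02 es-03
```

A non-zero return value means at least one name did not resolve, in which case /etc/hosts or DNS is worth checking before going further.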

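(2) Confirm that the following settings in elasticsearch.yml are correct

# Must be identical on all three nodes
cluster.name: my-cluster
# Set to this node's ${HOSTNAME}
node.name: es-01
# Set to the ${HOSTNAME} values of all three nodes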
discovery.seed_hosts: ["es-01", "es-02", "es-03"]
cluster.initial_master_nodes: ["es-01", "es-02", "es-03"]
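
The entries in cluster.initial_master_nodes have to match each node's node.name exactly, so a quick consistency check on the file can catch a typo. This is only a sketch: check_node_name is a hypothetical helper, and it assumes both settings are written as single-line, top-level entries without trailing comments, as in the example above.

```shell
# Hypothetical helper: check that node.name is listed in cluster.initial_master_nodes.
# Assumption: both settings are single-line, top-level entries in the given file.
check_node_name() {
  conf="$1"
  # Take the value after the key, then drop quotes and spaces.
  name=$(sed -n 's/^node\.name:[[:space:]]*//p' "$conf" | head -n 1 | tr -d "\"' ")
  # Same for the list, additionally dropping the square brackets.
  masters=$(sed -n 's/^cluster\.initial_master_nodes:[[:space:]]*//p' "$conf" | head -n 1 | tr -d "[]\"' ")
  if [ -z "$name" ]; then
    echo "node.name not found in $conf"
    return 1
  fi
  # Wrap both sides in commas so that only whole names match.
  case ",$masters," in
    *",$name,"*) echo "ok: $name is listed in cluster.initial_master_nodes" ;;
    *) echo "mismatch: $name is not in [$masters]"; return 1 ;;
  esac
}
# Example call (the path is an assumption; adjust it to your installation):
#   check_node_name /etc/elasticsearch/elasticsearch.yml
```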

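(3) If everything above checks out and the error still occurs, raise the log level to get more detailed logs (these lines can go in elasticsearch.yml)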

logger.org.elasticsearch.cluster.coordination.ClusterBootstrapService: TRACE
logger.org.elasticsearch.discovery: TRACE

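My machines each have two network interfaces, enp0s3 and enp0s8. After raising the log level, the logs showed that Elasticsearch was discovering the cluster nodes through the address 10.0.2.15 on each machine's enp0s3 interface. Since every machine has that same address on enp0s3, the nodes apparently could not reach one another through it. The likely cause is that network.host is bound to 0.0.0.0.

# Add the network.publish_host setting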
network.publish_host: 192.168.1.100
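
To pick the value for network.publish_host, it helps to see which IPv4 address sits on which interface. The sketch below is an illustration rather than part of the original fix: iface_addrs is a hypothetical helper that reduces the one-line-per-address output of `ip -o -4 addr show` to interface/address pairs.

```shell
# Hypothetical helper: turn `ip -o -4 addr show` output (read from stdin)
# into "interface address" pairs, dropping the /prefix length.
iface_addrs() {
  awk '{ split($4, a, "/"); print $2, a[1] }'
}
# Example call on a node:
#   ip -o -4 addr show | iface_addrs
```

Choose the address on the interface that the other nodes can actually reach. After restarting the nodes, a request such as curl -s 'http://localhost:9200/_cat/nodes?v&h=name,ip,master' (assuming the default HTTP port and no authentication) should list all three nodes with their published addresses.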