
How to Set Up an Apache Zookeeper Cluster on Multiple Nodes in Linux

If you are running Apache Zookeeper in your infrastructure, you should set it up to run in cluster mode. A Zookeeper cluster is called an ensemble.

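For the cluster to stay up, a majority of the nodes in the cluster must be up. For this reason, it is always recommended to run a zookeeper cluster on an odd number of servers, for example a cluster with 3 nodes or a cluster with 5 nodes.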
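The majority rule can be worked through with a little arithmetic. The following is only an illustrative sketch; the function names quorum_size and tolerated_failures are made up for this example and are not part of zookeeper.

```shell
#!/bin/sh
# Smallest number of nodes that forms a majority of an ensemble of size $1.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# Number of nodes that can be down while a majority is still up.
tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 3 4 5 6; do
  echo "nodes=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

A 4-node cluster tolerates only one failed node, the same as a 3-node cluster, so the extra even-numbered server adds no fault tolerance.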

In this tutorial, we'll set up a 3-node zookeeper cluster on the following servers: node1, node2 and node3.

Java Prerequisite

Zookeeper requires Java to be installed on your system. JDK version 6 or later works with Zookeeper.

The following installs OpenJDK 1.8 on your system:

yum install java-1.8.0-openjdk

Verify that Java is installed properly.

# java -version
openjdk version "1.8.0_91"
OpenJDK Runtime Environment (build 1.8.0_91-b14)
OpenJDK 64-Bit Server VM (build 25.91-b14, mixed mode)
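If you want to check the "JDK 6 or later" requirement in a script, one possible approach is to extract the major version from the version banner. This is a rough sketch: java_major is a hypothetical helper, and the banner is hard-coded from the output above so that the example runs without Java installed.

```shell
#!/bin/sh
# Print the major Java version for a version string such as
# 1.8.0_91 (old "1.x" scheme) or 11.0.2 (newer scheme).
java_major() {
  case $1 in
    1.*) rest=${1#1.}; printf '%s\n' "${rest%%.*}" ;;
    *)   printf '%s\n' "${1%%[._-]*}" ;;
  esac
}

# On a real system the banner could be captured with:
#   banner=$(java -version 2>&1 | head -n 1)
banner='openjdk version "1.8.0_91"'

# Pull out the text between the double quotes.
version=$(printf '%s\n' "$banner" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
major=$(java_major "$version")

if [ "$major" -ge 6 ]; then
  echo "Java $major is new enough for zookeeper"
else
  echo "Java $major is too old for zookeeper"
fi
```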

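Verify: Start Zookeeper in Standalone Mode for Testing

Before starting the zookeeper cluster, first start zookeeper in a single-node configuration (without any cluster setup) on each machine to make sure it works properly.

This way, we'll isolate any issues that are not related to the cluster and fix them on the individual nodes first.

In this example, I've installed zookeeper under the /opt/zookeeper directory. This uses zookeeper version 3.4.9, the latest release at the time of writing: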

ZOOKEEPER_HOME=/opt/zookeeper

On node1, use zookeeper's sample configuration file zoo_sample.cfg as the baseline.

cd $ZOOKEEPER_HOME/conf
cp zoo_sample.cfg zoo.cfg

From now on, we'll use zoo.cfg as our configuration file. We'll modify it later for our cluster setup.

On node1, execute the following command to start a single-node zookeeper.

cd $ZOOKEEPER_HOME
java -cp zookeeper-3.4.9.jar:lib/log4j-1.2.16.jar:lib/slf4j-log4j12-1.6.1.jar:lib/slf4j-api-1.6.1.jar:conf \
  org.apache.zookeeper.server.quorum.QuorumPeerMain \
  conf/zoo.cfg

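In the above command:

  • The -cp option specifies the jar files that should be included when starting zookeeper. This includes the zookeeper jar file and the log4j, slf4j-log4j12 and slf4j-api jar files. All of these jar files come with the zookeeper installation, so you don't have to download them separately.
  • QuorumPeerMain is the name of the main class that should be launched to start zookeeper.
  • conf/zoo.cfg is the zookeeper configuration file.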
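Instead of typing the jar names by hand, the classpath can also be assembled from the install directory. The sketch below assumes the layout used in this article (zookeeper-*.jar in the home directory, the other jars under lib, and the configuration under conf); build_classpath is a hypothetical helper name. Note that it picks up every jar under lib, which is a superset of the jars listed above.

```shell
#!/bin/sh
# Build a classpath of absolute paths from a zookeeper install directory.
# $1 = zookeeper home (assumed layout: zookeeper-*.jar, lib/*.jar, conf/).
build_classpath() {
  home=$1
  classpath="$home/conf"
  for jar in "$home"/zookeeper-*.jar "$home"/lib/*.jar; do
    # An unmatched glob stays a literal pattern, so skip anything
    # that is not an existing file.
    if [ -f "$jar" ]; then
      classpath="$classpath:$jar"
    fi
  done
  printf '%s\n' "$classpath"
}

ZOOKEEPER_HOME=${ZOOKEEPER_HOME:-/opt/zookeeper}
echo "classpath: $(build_classpath "$ZOOKEEPER_HOME")"

# To start zookeeper with it (not executed here):
#   java -cp "$(build_classpath "$ZOOKEEPER_HOME")" \
#     org.apache.zookeeper.server.quorum.QuorumPeerMain \
#     "$ZOOKEEPER_HOME/conf/zoo.cfg"
```

Because every path is absolute, a command built this way does not depend on the current working directory.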

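If everything goes well, you'll get the following output on the screen. Each of the following lines is preceded by a timestamp, followed by "[myid:] - INFO "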

[main:[email protected]] - Reading configuration from: conf/zoo.cfg
[main:[email protected]] - autopurge.snapRetainCount set to 3
[main:[email protected]] - autopurge.purgeInterval set to 0
[main:[email protected]] - Purge task is not scheduled.
[main:[email protected]] - Either no config or no quorum defined in config, running in standalone mode
[main:[email protected]] - Reading configuration from: conf/zoo.cfg
[main:[email protected]] - Starting server
[main:[email protected]] - Server environment:zookeeper.version=3.4.9-1757313, built on 08/23/2016 06:50 GMT
[main:[email protected]] - Server environment:host.name=node1.thegeekstuff.com
[main:[email protected]] - Server environment:java.version=1.8.0_91
[main:[email protected]] - Server environment:java.vendor=Oracle Corporation
[main:[email protected]] - Server environment:java.home=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.91-0.b14.el7_2.x86_64/jre
[main:[email protected]] - Server environment:java.class.path=zookeeper-3.4.9.jar:lib/log4j-1.2.16.jar:lib/slf4j-log4j12-1.6.1.jar:lib/slf4j-api-1.6.1.jar:conf
[main:[email protected]] - Server environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
[main:[email protected]] - Server environment:java.io.tmpdir=/tmp
[main:[email protected]] - Server environment:java.compiler=NA
[main:[email protected]] - Server environment:os.name=Linux
[main:[email protected]] - Server environment:os.arch=amd64
[main:[email protected]] - Server environment:os.version=3.10.0-327.18.2.el7.x86_64
[main:[email protected]] - Server environment:user.name=root
[main:[email protected]] - Server environment:user.home=/root
[main:[email protected]] - Server environment:user.dir=/opt/zookeeper
[main:[email protected]] - tickTime set to 2000
[main:[email protected]] - minSessionTimeout set to -1
[main:[email protected]] - maxSessionTimeout set to -1
[main:[email protected]] - binding to port 0.0.0.0/0.0.0.0:2181
[main:[email protected]] - Reading snapshot /tmp/zookeeper/version-2/snapshot.363

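Note: Now that we know this works on a single node, press Ctrl-C to stop it.

Repeat the above test on node2 and node3 to make sure zookeeper works in standalone mode on all the nodes.

Possible Errors and Solutions During Zookeeper Startup

During the above standalone zookeeper startup test, you might run into the following errors:

Error 1: You might get the following "java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory" error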

Exception in thread "main" java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory
  at org.apache.zookeeper.server.quorum.QuorumPeerMain.<clinit>(QuorumPeerMain.java:64)
Caused by: java.lang.ClassNotFoundException: org.slf4j.LoggerFactory
  at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)

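Solution 1: The above error happens when the slf4j jar files are not in the classpath: slf4j-api, which contains org.slf4j.LoggerFactory, and the slf4j-log4j12 binding. Include these jar files during startup as shown below.

java -cp zookeeper-3.4.9.jar:lib/log4j-1.2.16.jar:lib/slf4j-log4j12-1.6.1.jar:lib/slf4j-api-1.6.1.jar:conf org.apache.zookeeper.server.quorum.QuorumPeerMain conf/zoo.cfg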

Error 2: You might get the following "ERROR [main:QuorumPeerMain@85] - Invalid config, exiting abnormally" error

[myid:] - INFO  [main:[email protected]] - Reading configuration from: zoo.cfg
[myid:] - ERROR [main:QuorumPeerMain@85] - Invalid config, exiting abnormally
org.apache.zookeeper.server.quorum.QuorumPeerConfig$ConfigException: Error processing zoo.cfg
  at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parse(QuorumPeerConfig.java:144)
  at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:101)
  at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
Caused by: java.lang.IllegalArgumentException: zoo.cfg file is missing
 at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parse(QuorumPeerConfig.java:128)
 ... 2 more

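Solution 2: The above error can happen when zookeeper cannot find its configuration file zoo.cfg. Make sure you've given the path "conf/zoo.cfg" at the end of the command line, as shown below.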

java -cp zookeeper-3.4.9.jar:lib/log4j-1.2.16.jar:lib/slf4j-log4j12-1.6.1.jar:lib/slf4j-api-1.6.1.jar:conf \
 org.apache.zookeeper.server.quorum.QuorumPeerMain \
 conf/zoo.cfg

Error 3: You might get the following error message saying that the main class could not be found.

Could not find or load main class org.apache.zookeeper.server.quorum.QuorumPeerMain

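Solution 3: Make sure you start zookeeper from the zookeeper home directory, because the jar paths in the classpath are relative to it. For example, if you've installed zookeeper under /opt/zookeeper, start it as shown below: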

export ZOOKEEPER_HOME=/opt/zookeeper
cd $ZOOKEEPER_HOME
java -cp zookeeper-3.4.9.jar:lib/log4j-1.2.16.jar:lib/slf4j-log4j12-1.6.1.jar:lib/slf4j-api-1.6.1.jar:conf \
 org.apache.zookeeper.server.quorum.QuorumPeerMain \
 conf/zoo.cfg

Set Up the Zookeeper Cluster: Modify the zoo.cfg File

Add the following lines to the $ZOOKEEPER_HOME/conf/zoo.cfg file. These parameters are required for the cluster setup.

initLimit=5
syncLimit=2
server.1=node1.thegeekstuff.com:2888:3888
server.2=node2.thegeekstuff.com:2888:3888
server.3=node3.thegeekstuff.com:2888:3888

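In the above:

  • initLimit is the timeout that indicates how long a zookeeper node in the quorum has to connect to the leader.
  • syncLimit specifies the limit on how far out of sync (i.e. out of date) an individual node can be from the leader.
  • Both the init and sync limits are expressed in ticks of tickTime. By default, tickTime in zoo.cfg is set to 2000, which means 2000 milliseconds. So, when we set initLimit to 5, we multiply it by tickTime to get the actual timeout: initLimit = 5 * 2000 ms = 10000 ms = 10 seconds, and syncLimit = 2 * 2000 ms = 4000 ms = 4 seconds.
  • server.1, server.2 and server.3 list all three nodes. Instead of giving the full hostname, you can also specify the ip-address of the node here.
  • Don't change the ":2888:3888" at the end of the node entries. Followers use the first port (2888) to connect to the leader node. The other port (3888) is used for leader election.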
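The tick arithmetic described above can be reproduced with a few lines of shell. This is just an illustration of the calculation; ticks_to_ms is a made-up helper name.

```shell
#!/bin/sh
# tickTime is in milliseconds; initLimit and syncLimit are counted in ticks.
# $1 = limit in ticks, $2 = tickTime in milliseconds
ticks_to_ms() {
  echo $(( $1 * $2 ))
}

tickTime=2000
initLimit=5
syncLimit=2

echo "initLimit: $(ticks_to_ms "$initLimit" "$tickTime") ms"
echo "syncLimit: $(ticks_to_ms "$syncLimit" "$tickTime") ms"
```

If you change tickTime, both timeouts scale with it, so revisit initLimit and syncLimit whenever you adjust tickTime.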

Also, in zoo.cfg, dataDir points to the /tmp/zookeeper directory by default. Change this to something else.

In zoo.cfg, set dataDir to the following:

dataDir=/var/zookeeper

Make sure this directory is created.

mkdir /var/zookeeper

Note: Make the above zoo.cfg changes on all the nodes, i.e. node1, node2 and node3.
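Since the same cluster lines go into zoo.cfg on every node, one option is to generate them from the list of hostnames rather than typing them three times. The following is a minimal sketch under the assumptions of this article (ports 2888 and 3888, initLimit=5, syncLimit=2); cluster_settings is a hypothetical function name.

```shell
#!/bin/sh
# Print the cluster settings for the given hostnames.
# Server ids are assigned in argument order, starting at 1.
cluster_settings() {
  echo "initLimit=5"
  echo "syncLimit=2"
  n=1
  for host in "$@"; do
    echo "server.$n=$host:2888:3888"
    n=$((n + 1))
  done
}

cluster_settings node1.thegeekstuff.com node2.thegeekstuff.com node3.thegeekstuff.com

# To append the result to the configuration file (not executed here,
# and only meant to be run once per node):
#   cluster_settings node1.thegeekstuff.com node2.thegeekstuff.com \
#     node3.thegeekstuff.com >> "$ZOOKEEPER_HOME/conf/zoo.cfg"
```

The hostnames must be passed in the same order on every node so that each server.N entry refers to the same machine everywhere.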

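Create a Unique Zookeeper ID on Individual Nodes

On node1, create a unique zookeeper id and store it in the "myid" file, which should be located under the directory specified by "dataDir" in zoo.cfg.

On node1, the unique id is "1", and it is stored in the /var/zookeeper/myid file.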

# echo "1" > /var/zookeeper/myid

# cat /var/zookeeper/myid 
1

On node2, the unique id is "2".

echo "2" > /var/zookeeper/myid

On node3, the unique id is "3".

echo "3" > /var/zookeeper/myid
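The id in each myid file has to match the number in the corresponding server.N line of zoo.cfg. As a sketch of how that lookup could be scripted, the hypothetical helper below prints the id for a given hostname; the temporary file only stands in for a real zoo.cfg.

```shell
#!/bin/sh
# Print the N of the "server.N=host:port:port" line whose host matches.
# $1 = hostname exactly as written in zoo.cfg, $2 = path to zoo.cfg
myid_for_host() {
  awk -F '=' -v host="$1" '
    /^server\.[0-9]+=/ {
      split($2, parts, ":")
      if (parts[1] == host) {
        sub(/^server\./, "", $1)
        print $1
      }
    }' "$2"
}

# Demonstration against a throwaway copy of the cluster settings.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
initLimit=5
syncLimit=2
server.1=node1.thegeekstuff.com:2888:3888
server.2=node2.thegeekstuff.com:2888:3888
server.3=node3.thegeekstuff.com:2888:3888
EOF

myid_for_host node2.thegeekstuff.com "$cfg"   # prints 2
rm -f "$cfg"
```

On a real node, the output could be redirected to /var/zookeeper/myid, provided the hostname you pass in is spelled exactly as it appears in zoo.cfg.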

Note: If you don't set the myid properly, when you start zookeeper you'll get the following "/var/zookeeper/myid file is missing" error message:

[myid:] - INFO  [main:[email protected]] - Reading configuration from: conf/zoo.cfg
[myid:] - INFO  [main:[email protected]] - Resolved hostname: node1.thegeekstuff.com to address: /192.168.101.1
[myid:] - INFO  [main:[email protected]] - Resolved hostname: node2.thegeekstuff.com to address: /192.168.101.2
[myid:] - INFO  [main:[email protected]] - Resolved hostname: node3.thegeekstuff.com to address: /192.168.101.3
[myid:] - WARN  [main:[email protected]] - No server failure will be tolerated. You need at least 3 servers.
[myid:] - INFO  [main:[email protected]] - Defaulting to majority quorums
[myid:] - ERROR [main:QuorumPeerMain@85] - Invalid config, exiting abnormally
org.apache.zookeeper.server.quorum.QuorumPeerConfig$ConfigException: Error processing conf/zoo.cfg
  at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parse(QuorumPeerConfig.java:144)
  at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:101)
  at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
Caused by: java.lang.IllegalArgumentException: /var/zookeeper/myid file is missing
  at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parseProperties(QuorumPeerConfig.java:362)
  at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parse(QuorumPeerConfig.java:140)
  ... 2 more
Invalid config, exiting abnormally

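Note: You'll see "[myid:]" above without any id number. But once the problem is fixed, in the log files on node1 you'll see "[myid:1]". On node2, you'll see "[myid:2]", and node3 will show "[myid:3]". This is a quick and easy way to identify which zookeeper node a log message came from.

Start the Zookeeper Cluster

Now, to start the cluster, start zookeeper on all the individual nodes one after another, as shown below: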

export ZOOKEEPER_HOME=/opt/zookeeper
cd $ZOOKEEPER_HOME
java -cp zookeeper-3.4.9.jar:lib/log4j-1.2.16.jar:lib/slf4j-log4j12-1.6.1.jar:lib/slf4j-api-1.6.1.jar:conf \
 org.apache.zookeeper.server.quorum.QuorumPeerMain \
 conf/zoo.cfg

Note: It's best to put the above lines in a zookeeper-start.sh script and start it in the background using the nohup command, as shown below:

nohup zookeeper-start.sh &

Note: To stop the zookeeper cluster, on each individual node, find the zookeeper process using the grep command and terminate it with the kill command.
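The grep-and-kill step could look roughly like the sketch below. zk_pids is a hypothetical helper that reads ps -ef output and prints the process ids of lines that mention QuorumPeerMain; the two sample ps lines are invented for illustration so the example runs without a live zookeeper.

```shell
#!/bin/sh
# Read `ps -ef` output on stdin and print the PID (second column) of
# every line that mentions QuorumPeerMain, skipping the grep process.
zk_pids() {
  grep 'QuorumPeerMain' | grep -v 'grep' | awk '{ print $2 }'
}

# Demonstration with made-up ps output:
zk_pids <<'EOF'
root      4321     1  0 10:02 ?        00:00:07 java -cp zookeeper-3.4.9.jar:conf org.apache.zookeeper.server.quorum.QuorumPeerMain conf/zoo.cfg
root      4400  4390  0 10:05 pts/0    00:00:00 grep QuorumPeerMain
EOF

# On a real node (not executed here):
#   ps -ef | zk_pids        # shows the zookeeper PID
#   kill <pid>              # terminates that process
```

Checking the printed process id before running kill avoids terminating an unrelated Java process by mistake.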

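At this stage, on node1, you'll start getting some error messages like the ones below. You can ignore them for now.

We get these errors because only node1 is up at this point. Once node2 and node3 are started, we'll no longer see this error message.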

[myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:[email protected]] - Resolved hostname: node1.thegeekstuff.com to address: /192.168.101.1
[myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:[email protected]] - Notification time out: 400
[myid:1] - WARN  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:[email protected]] - Cannot open channel to 2 at election address /192.168.101.2:3888
[myid:1] - WARN  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:[email protected]] - Cannot open channel to 3 at election address /192.168.101.3:3888
java.net.ConnectException: Connection refused
  at java.net.PlainSocketImpl.socketConnect(Native Method)
  at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
  at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
  at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
  at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
  at java.net.Socket.connect(Socket.java:589)
  at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:381)
  at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:426)
  at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:843)
  at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:822)

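Note: Just by looking at the log lines above, we know they come from node1, since each line starts with "[myid:1]".

After starting zookeeper on node2 and node3, we'll see the following in the logs on all the nodes, which indicates that the zookeeper cluster is up and running.

Each of the following lines is preceded by a timestamp, followed by "[myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:"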

[email protected]] - Resolved hostname: node1.thegeekstuff.com to address: /192.168.101.1
[email protected]] - Notification time out: 6400
[/192.168.101.1:3888:[email protected]] - Received connection request /192.168.101.2:56214
[WorkerReceiver[myid=1]:[email protected]] - Notification: 1 (message format version), 2 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 2 (n.sid), 0x0 (n.peerEpoch) LOOKING (my state)
[WorkerReceiver[myid=1]:[email protected]] - Notification: 1 (message format version), 2 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 1 (n.sid), 0x0 (n.peerEpoch) LOOKING (my state)
[email protected]] - FOLLOWING
[email protected]] - TCP NoDelay set to: true
[email protected]] - Server environment:zookeeper.version=3.4.9-1757313, built on 08/23/2016 06:50 GMT
[email protected]] - Server environment:host.name=node1.thegeekstuff.com
[email protected]] - Server environment:java.version=1.8.0_91
[email protected]] - Server environment:java.vendor=Oracle Corporation
[email protected]] - Server environment:java.home=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.91-0.b14.el7_2.x86_64/jre
[email protected]] - Server environment:java.class.path=zookeeper-3.4.9.jar:lib/log4j-1.2.16.jar:lib/slf4j-log4j12-1.6.1.jar:lib/slf4j-api-1.6.1.jar:conf
[email protected]] - Server environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
[email protected]] - Server environment:java.io.tmpdir=/tmp
[email protected]] - Server environment:java.compiler=NA
[email protected]] - Server environment:os.name=Linux
[email protected]] - Server environment:os.arch=amd64
[email protected]] - Server environment:os.version=3.10.0-327.18.2.el7.x86_64
[email protected]] - Server environment:user.name=root
[email protected]] - Server environment:user.home=/root
[email protected]] - Server environment:user.dir=/opt/zookeeper
[email protected]] - Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 40000 datadir /var/zookeeper/version-2 snapdir /var/zookeeper/version-2
[email protected]] - FOLLOWING - LEADER ELECTION TOOK - 8869
[email protected]] - Resolved hostname: node2.thegeekstuff.com to address: /192.168.101.2
[email protected]] - Resolved hostname: node3.thegeekstuff.com to address: /192.168.101.3
[email protected]] - Getting a diff from the leader 0x0
[email protected]] - Snapshotting: 0x0 to /var/zookeeper/version-2/snapshot.0
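Besides reading the logs, each node can be asked for its role with zookeeper's four-letter commands such as "srvr", whose reply contains a "Mode:" line (leader, follower or standalone). The sketch below assumes the nc utility is available for the real query; zk_mode is a hypothetical helper, and the sample reply is abbreviated and only illustrative.

```shell
#!/bin/sh
# Read the reply to the "srvr" (or "stat") four-letter command on stdin
# and print the value of its "Mode:" line.
zk_mode() {
  sed -n 's/^Mode: //p'
}

# Demonstration with an abbreviated, illustrative reply:
zk_mode <<'EOF'
Zookeeper version: 3.4.9-1757313, built on 08/23/2016 06:50 GMT
Mode: follower
Node count: 4
EOF

# Against a running node (not executed here; assumes nc is installed):
#   echo srvr | nc node1.thegeekstuff.com 2181 | zk_mode
```

In a healthy 3-node cluster, one node should report leader and the other two should report follower.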


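{ 3 comments }

  • Sankalp November 4, 2016, 6:33 am

    Thanks for sharing, this works great.
    I followed the Zookeeper website to set it up, but your article helped with the key things I had missed:
    1. Copy zoo.cfg to all the follower zookeeper instances
    2. Start zookeeper on all the followers after the leader
    3. Create the myid file for the leader zookeeper as well and add its entry there.

  • Güvenlik Kamera November 15, 2016, 6:37 am

    Thanks for sharing a good experience. It works.

  • Sankalp January 18, 2017, 10:28 pm

    This is really helpful, but is there a way in zookeeper to add nodes dynamically, plug and play?
    In the current setup, we need to know the cluster information in advance.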
