Some quick tips on how to get started using pdsh:
Set up your environment:
export PDSH_SSH_ARGS_APPEND="-o ConnectTimeout=5 -o CheckHostIP=no -o StrictHostKeyChecking=no" (Add this to your .bashrc to save time.)
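With those SSH options exported, commands can be fanned out to many hosts in parallel. A minimal sketch, assuming a hostlist like node[01-04] (adjust to your own hosts):
# Fan a command out over several hosts; -R ssh selects the ssh rcmd module, -w takes a hostlist
pdsh -R ssh -w 'node[01-04]' uptime
# Fold identical per-host output together for readability
pdsh -R ssh -w 'node[01-04]' cat /etc/redhat-release | dshbak -c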
Set up a local yum repository from the install DVD (handy when the machines have no Internet access):
mount -t iso9660 -o loop rhel-server-6.4-x86_64-dvd\[ED2000.COM\].iso iso/
ln -s iso rhel6.4
vi /etc/yum.repos.d/rhel.repo
[os]
name=Linux OS Packages
baseurl=file:///opt/rhel6.4
enabled=1
gpgcheck=0
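With the repo file in place, a quick sanity check (gcc is only an example package from the base DVD):
yum clean all
yum repolist                                   # the "os" repo should show a nonzero package count
yum --disablerepo='*' --enablerepo=os install -y gcc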
As for I/O: by default gmetad pulls XML data from gmond every 15 seconds, so if gmond and gmetad run on the same node this amounts to local I/O. After fetching the XML, gmetad must also parse it; with the defaults that means parsing an XML file on the order of 10 MB every 15 seconds, which puts considerable pressure on the CPU. On top of that, gmetad writes to the RRD databases and serves queries from web clients, which read the RRDs back. The combined I/O, CPU, and network load is therefore substantial, so this node should be a relatively idle and fairly powerful machine.
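One common mitigation is to lengthen the polling interval, which is set per data_source line in gmetad.conf; a sketch, with a placeholder cluster name and gmond host:
# gmetad.conf: poll this cluster's gmond every 60 seconds instead of the default 15
data_source "my-cluster" 60 gmond-node1:8649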
The official docs, [HDFSHighAvailabilityWithQJM.html] and [HdfsRollingUpgrade.html] (note that rolling upgrade is supported only from Hadoop 2.4.0 onwards), are quite detailed, but neither walks through a complete example. This post records one full run of the procedure.
[hadoop@hadoop-master1 hadoop-2.6.3]$ bin/hdfs namenode -upgrade
...
16/01/08 09:13:54 INFO namenode.NameNode: createNameNode [-upgrade]
...
16/01/08 09:13:57 INFO namenode.FSImage: Starting upgrade of local storage directories.
old LV = -47; old CTime = 0.
new LV = -60; new CTime = 1452244437060
16/01/08 09:13:57 INFO namenode.NNUpgradeUtil: Starting upgrade of storage directory /data/tmp/dfs/name
16/01/08 09:13:57 INFO namenode.FSImageTransactionalStorageInspector: No version file in /data/tmp/dfs/name
16/01/08 09:13:57 INFO namenode.NNUpgradeUtil: Performing upgrade of storage directory /data/tmp/dfs/name
16/01/08 09:13:57 INFO namenode.FSNamesystem: Need to save fs image? false (staleImage=false, haEnabled=true, isRollingUpgrade=false)
...
The official documentation says that, in addition to the namenode's local metadata, the shared edit log is upgraded as well. The journalnode log confirms that the journalnode was indeed upgraded:
[hadoop@hadoop-master1 hadoop-2.6.3]$ less logs/hadoop-hadoop-journalnode-hadoop-master1.log
...
2016-01-08 09:13:57,070 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Starting upgrade of edits directory /data/journal/zfcluster
2016-01-08 09:13:57,072 INFO org.apache.hadoop.hdfs.server.namenode.NNUpgradeUtil: Starting upgrade of storage directory /data/journal/zfcluster
2016-01-08 09:13:57,185 INFO org.apache.hadoop.hdfs.qjournal.server.Journal: Starting upgrade of edits directory: .
old LV = -47; old CTime = 0.
new LV = -60; new CTime = 1452244437060
2016-01-08 09:13:57,185 INFO org.apache.hadoop.hdfs.server.namenode.NNUpgradeUtil: Performing upgrade of storage directory /data/journal/zfcluster
2016-01-08 09:13:57,222 INFO org.apache.hadoop.hdfs.qjournal.server.Journal: Updating lastWriterEpoch from 2 to 3 for client /172.17.0.1
2016-01-08 09:16:57,731 INFO org.apache.hadoop.hdfs.qjournal.server.Journal: Updating lastPromisedEpoch from 3 to 4 for client /172.17.0.1
2016-01-08 09:16:57,735 INFO org.apache.hadoop.hdfs.qjournal.server.Journal: Scanning storage FileJournalManager(root=/data/journal/zfcluster)
...
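To double-check, the layout version recorded in the journal storage directory can be read straight from its VERSION file (path taken from the log above); it should now show the new values, i.e. layoutVersion=-60 and cTime=1452244437060:
[hadoop@hadoop-master1 hadoop-2.6.3]$ grep -E 'layoutVersion|cTime' /data/journal/zfcluster/current/VERSION
cTime=1452244437060
layoutVersion=-60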
The namenode performing the upgrade runs in the foreground; do not kill this process yet. The next step is to synchronize the other namenode.
Synchronize the other namenode
[hadoop@hadoop-master2 hadoop-2.6.3]$ bin/hdfs namenode -bootstrapStandby
...
=====================================================
About to bootstrap Standby ID nn2 from:
Nameservice ID: zfcluster
Other Namenode ID: nn1
Other NN's HTTP address: http://hadoop-master1:50070
Other NN's IPC address: hadoop-master1/172.17.0.1:8020
Namespace ID: 639021326
Block pool ID: BP-1695500896-172.17.0.1-1452152050513
Cluster ID: CID-7d5c31d8-5cd4-46c8-8e04-49151578e5bb
Layout version: -60
isUpgradeFinalized: false
=====================================================
16/01/08 09:15:19 INFO ha.BootstrapStandby: The active NameNode is in Upgrade. Prepare the upgrade for the standby NameNode as well.
16/01/08 09:15:19 INFO common.Storage: Lock on /data/tmp/dfs/name/in_use.lock acquired by nodename 5008@hadoop-master2
16/01/08 09:15:21 INFO namenode.TransferFsImage: Opening connection to http://hadoop-master1:50070/imagetransfer?getimage=1&txid=1126&storageInfo=-60:639021326:1452244437060:CID-7d5c31d8-5cd4-46c8-8e04-49151578e5bb
16/01/08 09:15:21 INFO namenode.TransferFsImage: Image Transfer timeout configured to 60000 milliseconds
16/01/08 09:15:21 INFO namenode.TransferFsImage: Transfer took 0.00s at 0.00 KB/s
16/01/08 09:15:21 INFO namenode.TransferFsImage: Downloaded file fsimage.ckpt_0000000000000001126 size 977 bytes.
16/01/08 09:15:21 INFO namenode.NNUpgradeUtil: Performing upgrade of storage directory /data/tmp/dfs/name
...
Restart the cluster
Press Ctrl+C to stop the namenode that was started with -upgrade on hadoop-master1, then start the whole cluster.
[hadoop@hadoop-master1 hadoop-2.6.3]$ sbin/start-dfs.sh
16/01/08 09:16:44 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [hadoop-master1 hadoop-master2]
hadoop-master1: starting namenode, logging to /home/hadoop/hadoop-2.6.3/logs/hadoop-hadoop-namenode-hadoop-master1.out
hadoop-master2: starting namenode, logging to /home/hadoop/hadoop-2.6.3/logs/hadoop-hadoop-namenode-hadoop-master2.out
hadoop-slaver3: starting datanode, logging to /home/hadoop/hadoop-2.6.3/logs/hadoop-hadoop-datanode-hadoop-slaver3.out
hadoop-slaver2: starting datanode, logging to /home/hadoop/hadoop-2.6.3/logs/hadoop-hadoop-datanode-hadoop-slaver2.out
hadoop-slaver1: starting datanode, logging to /home/hadoop/hadoop-2.6.3/logs/hadoop-hadoop-datanode-hadoop-slaver1.out
Starting journal nodes [hadoop-master1]
hadoop-master1: journalnode running as process 31047. Stop it first.
16/01/08 09:16:55 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting ZK Failover Controllers on NN hosts [hadoop-master1 hadoop-master2]
hadoop-master2: starting zkfc, logging to /home/hadoop/hadoop-2.6.3/logs/hadoop-hadoop-zkfc-hadoop-master2.out
hadoop-master1: starting zkfc, logging to /home/hadoop/hadoop-2.6.3/logs/hadoop-hadoop-zkfc-hadoop-master1.out
[hadoop@hadoop-master1 hadoop-2.6.3]$ jps
31047 JournalNode
244 QuorumPeerMain
31596 DFSZKFailoverController
31655 Jps
31294 NameNode
Note that the namenode log also recorded the following QJM error, where the journalnode refuses to serve edits starting from a transaction in the middle of a finalized segment:
2016-01-08 06:15:36,746 WARN org.apache.hadoop.hdfs.server.namenode.FSEditLog: Unable to determine input streams from QJM to [172.17.0.1:8485]. Skipping.
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many exceptions to achieve quorum size 1/1. 1 exceptions thrown:
172.17.0.1:8485: Asked for firstTxId 1022 which is in the middle of file /data/journal/zfcluster/current/edits_0000000000000001021-0000000000000001022
at org.apache.hadoop.hdfs.server.namenode.FileJournalManager.getRemoteEditLogs(FileJournalManager.java:198)
at org.apache.hadoop.hdfs.qjournal.server.Journal.getEditLogManifest(Journal.java:640)
at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.getEditLogManifest(JournalNodeRpcServer.java:181)
at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.getEditLogManifest(QJournalProtocolServerSideTranslatorPB.java:203)
at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:17453)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
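Once all daemons are up and any errors like the one above are resolved, it is worth confirming the HA state of both namenodes and, after the cluster has been verified, finalizing the upgrade. A sketch using the namenode IDs nn1/nn2 from the bootstrap output (isUpgradeFinalized was false above; keep in mind that rollback is no longer possible once the upgrade is finalized):
[hadoop@hadoop-master1 hadoop-2.6.3]$ bin/hdfs haadmin -getServiceState nn1
active
[hadoop@hadoop-master1 hadoop-2.6.3]$ bin/hdfs haadmin -getServiceState nn2
standby
[hadoop@hadoop-master1 hadoop-2.6.3]$ bin/hdfs dfsadmin -finalizeUpgrade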