This tormented me for more than a week, but in the end I got it working. The torment came from not knowing what I didn't know, from working alone, and from overconfidence; it was eventually solved by widening the search, reading up on the material, and asking colleagues. For a simple deployment, see README.md.
yum install docker-engine-1.12.6 docker-engine-selinux-1.12.6 -y
cd kube-deploy
vi hosts
vi k8s.profile
# sync the deploy directory to the other physical machines, and map k8s.profile into /etc/profile.d
./rsync-deploy.sh
cd docker-multinode/
./master.sh or ./worker.sh
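# (my own sanity check, not part of the kube-deploy scripts) once master.sh/worker.sh
# have finished, every node should show up as Ready:
kubectl get nodes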
docker save gcr.io/google_containers/etcd-amd64:3.0.4 | docker-bs load
docker save quay.io/coreos/flannel:v0.6.1-amd64 | docker-bs load
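# docker-bs is presumably a wrapper for the bootstrap docker daemon that docker-multinode
# starts, i.e. roughly: docker -H unix:///var/run/docker-bootstrap.sock "$@"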
cd kube-deploy/hadoop/kubenetes/
./prepare.sh
kubectl create -f hadoop-master2.yaml
kubectl create -f hadoop-slaver.yaml
Tip: a single set of configs is actually enough to start several clusters; just append -n namespace to the kubectl create commands. For example:
[root@cu2 kubenetes]# kubectl create namespace hd1
[root@cu2 kubenetes]# kubectl create namespace hd2
[root@cu2 kubenetes]# ./prepare.sh hd1
[root@cu2 kubenetes]# kubectl create -f hadoop-master2.yaml -n hd1
[root@cu2 kubenetes]# kubectl create -f hadoop-slaver.yaml -n hd1
[root@cu2 kubenetes]# ./prepare.sh hd2
[root@cu2 kubenetes]# kubectl create -f hadoop-master2.yaml -n hd2
[root@cu2 kubenetes]# kubectl create -f hadoop-slaver.yaml -n hd2
[root@cu2 kubenetes]# kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
hd1 hadoop-master2 1/1 Running 0 28s
hd1 slaver-rc-fdcsw 1/1 Running 0 18s
hd1 slaver-rc-qv964 1/1 Running 0 18s
hd2 hadoop-master2 1/1 Running 0 26s
hd2 slaver-rc-0vdfk 1/1 Running 0 17s
hd2 slaver-rc-r7g84 1/1 Running 0 17s
...
Looking back, it really came down to dockerd --ip-masq=false (every dockerd involved needs the flag). There was also the case of containers on one machine talking to each other and still seeing a wrong source IP; that was caused by having openvpn installed, which added a MASQUERADE rule for everything leaving eth0.
The root cause is that the request's source address gets replaced, i.e. the iptables forwarding path performs SNAT. This article explains iptables forwarding very clearly: IPtables之四:NAT原理和配置 (part 4 of an iptables series, on NAT principles and configuration).
Problems encountered
Before adding --ip-masq=false, when the namenode received a request from a datanode, the source address was the IP of flannel.0: 10.1.98.0.
The corresponding namenode log:
2017-04-09 07:22:06,920 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* registerDatanode: from DatanodeRegistration(10.1.98.0, datanodeUuid=5086c549-f3bb-4ef6-8f56-05b1f7adb7d3, infoPort=50075, ipcPort=50020, storageInfo=lv=-56;cid=CID-522174fa-6e7b-4c3f-ae99-23c3018e35d7;nsid=1613705851;c=0) storage 5086c549-f3bb-4ef6-8f56-05b1f7adb7d3
2017-04-09 07:22:06,920 INFO org.apache.hadoop.net.NetworkTopology: Removing a node: /default-rack/10.1.98.0:50010
2017-04-09 07:22:06,921 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/10.1.98.0:50010
At first I thought it was a flannel problem, so I reinstalled flannel via yum and also switched its backend to vxlan, but the problem stayed exactly the same.
Finally I asked a colleague who works on networking; he suggested the request's source address was being rewritten, which pointed me at iptables. I then went back to the documentation; I had actually seen the relevant articles earlier, but hadn't understood them well enough to see the cause.
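A quick way to confirm the rewrite on the wire is to capture traffic on the namenode's host and compare the source address with the datanode pod's own IP. This is only a sketch of the check; the interface name (flannel.0 vs flannel.1) and the RPC port (8020 assumed here) depend on your flannel backend and Hadoop config:
tcpdump -ni flannel.0 port 8020 -c 10
kubectl get pods -o wide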
Part of the relevant iptables state:
[root@cu2 ~]# iptables -S -t nat
...
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A PREROUTING -j PREROUTING_direct
-A PREROUTING -j PREROUTING_ZONES_SOURCE
-A PREROUTING -j PREROUTING_ZONES
-A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -j OUTPUT_direct
-A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
-A POSTROUTING -s 10.1.34.0/24 ! -o docker0 -j MASQUERADE
-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
-A POSTROUTING -j POSTROUTING_direct
-A POSTROUTING -j POSTROUTING_ZONES_SOURCE
-A POSTROUTING -j POSTROUTING_ZONES
-A KUBE-MARK-DROP -j MARK --set-xmark 0x8000/0x8000
-A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000
-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -m mark --mark 0x4000/0x4000 -j MASQUERADE
-A KUBE-SEP-75CPIAPDB4MAVFWI -s 10.1.40.3/32 -m comment --comment "kube-system/kube-dns:dns-tcp" -j KUBE-MARK-MASQ
-A KUBE-SEP-75CPIAPDB4MAVFWI -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp" -m tcp -j DNAT --to-destination 10.1.40.3:53
-A KUBE-SEP-IWNPEB4T46P6VG5J -s 192.168.0.148/32 -m comment --comment "default/kubernetes:https" -j KUBE-MARK-MASQ
-A KUBE-SEP-IWNPEB4T46P6VG5J -p tcp -m comment --comment "default/kubernetes:https" -m recent --set --name KUBE-SEP-IWNPEB4T46P6VG5J --mask 255.255.255.255 --rsource -m tcp -j DNAT --to-destination 192.168.0.148:6443
-A KUBE-SEP-UYUINV25NDNSKNUW -s 10.1.40.3/32 -m comment --comment "kube-system/kube-dns:dns" -j KUBE-MARK-MASQ
-A KUBE-SEP-UYUINV25NDNSKNUW -p udp -m comment --comment "kube-system/kube-dns:dns" -m udp -j DNAT --to-destination 10.1.40.3:53
-A KUBE-SEP-XDHL2OHX2ICPQHKI -s 10.1.40.2/32 -m comment --comment "kube-system/kubernetes-dashboard:" -j KUBE-MARK-MASQ
-A KUBE-SEP-XDHL2OHX2ICPQHKI -p tcp -m comment --comment "kube-system/kubernetes-dashboard:" -m tcp -j DNAT --to-destination 10.1.40.2:9090
-A KUBE-SERVICES -d 10.0.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
-A KUBE-SERVICES -d 10.0.0.95/32 -p tcp -m comment --comment "kube-system/kubernetes-dashboard: cluster IP" -m tcp --dport 80 -j KUBE-SVC-XGLOHA7QRQ3V22RZ
-A KUBE-SERVICES -d 10.0.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-SVC-TCOU7JCQXEZGVUNU
-A KUBE-SERVICES -d 10.0.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-SVC-ERIFXISQEP7F7OF4
-A KUBE-SERVICES -m comment --comment "kubernetes service nodeports; NOTE: this must be the last rule in this chain" -m addrtype --dst-type LOCAL -j KUBE-NODEPORTS
-A KUBE-SVC-ERIFXISQEP7F7OF4 -m comment --comment "kube-system/kube-dns:dns-tcp" -j KUBE-SEP-75CPIAPDB4MAVFWI
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -m recent --rcheck --seconds 10800 --reap --name KUBE-SEP-IWNPEB4T46P6VG5J --mask 255.255.255.255 --rsource -j KUBE-SEP-IWNPEB4T46P6VG5J
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -j KUBE-SEP-IWNPEB4T46P6VG5J
-A KUBE-SVC-TCOU7JCQXEZGVUNU -m comment --comment "kube-system/kube-dns:dns" -j KUBE-SEP-UYUINV25NDNSKNUW
-A KUBE-SVC-XGLOHA7QRQ3V22RZ -m comment --comment "kube-system/kubernetes-dashboard:" -j KUBE-SEP-XDHL2OHX2ICPQHKI
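A handy way to see which of these rules the traffic is actually hitting is to look at the packet counters:
iptables -t nat -L POSTROUTING -n -v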
After adding --ip-masq=false to the dockerd service script, the rule
-A POSTROUTING -s 10.1.34.0/24 ! -o docker0 -j MASQUERADE
is gone, so the source address is no longer rewritten and requests arrive at the namenode with the datanode container's IP. Problem solved; the cause was so simple it made me want to cry.
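One way to add the flag without hand-editing the unit file is a systemd drop-in; this is just a sketch — the file name below is arbitrary, and the ExecStart line assumes the stock docker-engine 1.12 unit (keep whatever options your existing unit or flannel drop-in already passes):
mkdir -p /etc/systemd/system/docker.service.d
cat > /etc/systemd/system/docker.service.d/ip-masq.conf <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd --ip-masq=false
EOF
systemctl daemon-reload && systemctl restart docker
# the MASQUERADE rule for the docker0 subnet should now be gone
iptables -S -t nat | grep MASQUERADE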
Some other problems hit while writing the YAML:
There were of course plenty more, but that's all for this post; I'll write up the optimization work once it's done.
Notes from the intermediate steps
This part mainly records the journey, so that if I run into the same problem later I can recall it quickly. If you only care about the deployment, skip this part and jump straight to the common commands at the end.
Here are the notes from installing etcd and flanneld via yum along the way. Installing flanneld on the physical machine adds the docker environment variables (/run/flannel/subnet.env) to docker's startup script.
Install docker v1.12
https://docs.docker.com/v1.12/
https://docs.docker.com/v1.12/engine/installation/linux/centos/
# remove the existing docker-ce packages first
yum-config-manager --disable docker-ce*
yum remove -y docker-ce*
sudo tee /etc/yum.repos.d/docker.repo <<-'EOF'
[dockerrepo]
name=Docker Repository
baseurl=https://yum.dockerproject.org/repo/main/centos/7/
enabled=1
gpgcheck=1
gpgkey=https://yum.dockerproject.org/gpg
EOF
https://yum.dockerproject.org/repo/main/centos/7/Packages/
[root@cu3 ~]# yum --showduplicates list docker-engine | expand
docker-engine.x86_64 1.12.6-1.el7.centos dockerrepo
[root@cu3 yum.repos.d]# yum install docker-engine-1.12.6 docker-engine-selinux-1.12.6
https://kubernetes.io/docs/getting-started-guides/centos/centos_manual_config/
cat > /etc/yum.repos.d/virt7-docker-common-release.repo <<EOF
[virt7-docker-common-release]
name=virt7-docker-common-release
baseurl=http://cbs.centos.org/repos/virt7-docker-common-release/x86_64/os/
gpgcheck=0
EOF
yum -y install --enablerepo=virt7-docker-common-release etcd flannel
yum -y install --enablerepo=virt7-docker-common-release flannel
- etcd configuration
[root@cu3 docker-multinode]#
etcdctl mkdir /kube-centos/network
etcdctl set /kube-centos/network/config "{ \"Network\": \"10.1.0.0/16\", \"SubnetLen\": 24, \"Backend\": { \"Type\": \"vxlan\" } }"
- flannel configuration
[root@cu3 ~]# cat /etc/sysconfig/flanneld
# Flanneld configuration options
# etcd url location. Point this to the server where etcd runs
FLANNEL_ETCD_ENDPOINTS="http://cu3:2379"
# etcd config key. This is the configuration key that flannel queries
# For address range assignment
FLANNEL_ETCD_PREFIX="/kube-centos/network"
# Any additional options that you want to pass
#FLANNEL_OPTIONS=""
[root@cu2 yum.repos.d]# systemctl daemon-reload
[root@cu2 yum.repos.d]# cat /run/flannel/subnet.env
[root@cu2 ~]# systemctl cat docker
...
# /usr/lib/systemd/system/docker.service.d/flannel.conf
[Service]
EnvironmentFile=-/run/flannel/docker
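For reference, the env file flanneld generates looks roughly like this (the values below are only illustrative, not taken from this cluster):
# /run/flannel/subnet.env (illustrative)
FLANNEL_NETWORK=10.1.0.0/16
FLANNEL_SUBNET=10.1.34.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=false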
During testing, the YAML config at one point just started sshd; once the containers were up, I tested by starting the namenode and datanode by hand:
cd hadoop-2.6.5
gosu hadoop mkdir /data/bigdata
gosu hadoop sbin/hadoop-daemon.sh start datanode
cd hadoop-2.6.5/
gosu hadoop bin/hadoop namenode -format
gosu hadoop sbin/hadoop-daemon.sh start namenode
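With both daemons up, you can check which address the datanode registered with; if SNAT is still happening this shows the flannel gateway IP instead of the pod IP (a check assuming the default HDFS setup):
gosu hadoop bin/hdfs dfsadmin -report | grep -E "^Name:|^Hostname:"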
After finding that the problem was in iptables, I went back to the original docker-bootstrap setup, which requires removing the flannel.1 interface first:
# stop the yum-installed flanneld first; see https://kubernetes.io/docs/getting-started-guides/scratch/
ip link set flannel.1 down
ip link delete flannel.1
route -n
rm /usr/lib/systemd/system/docker.service.d/flannel.conf
If the firewall is enabled, add the container subnets to the trusted zone:
systemctl enable firewalld && systemctl start firewalld
firewall-cmd --zone=trusted --add-source=10.0.0.0/8 --permanent
firewall-cmd --zone=trusted --add-source=192.168.0.0/16 --permanent
firewall-cmd --reload
Some handy commands
See which images are in use
[root@cu2 /]# kubectl get pods --all-namespaces -o jsonpath="{..image}" |\
tr -s '[[:space:]]' '\n' |\
sort |\
uniq -c
2 gcr.io/google_containers/dnsmasq-metrics-amd64:1.0
2 gcr.io/google_containers/exechealthz-amd64:1.2
12 gcr.io/google_containers/hyperkube-amd64:v1.5.5
2 gcr.io/google_containers/kube-addon-manager-amd64:v6.1
2 gcr.io/google_containers/kubedns-amd64:1.9
2 gcr.io/google_containers/kube-dnsmasq-amd64:1.4
2 gcr.io/google_containers/kubernetes-dashboard-amd64:v1.5.0
Modify the default kubectl config
[root@cu2 ~]# vi $KUBECONFIG
apiVersion: v1
kind: Config
preferences: {}
current-context: default
clusters:
- cluster:
    server: http://localhost:8080
  name: default
contexts:
- context:
    cluster: default
    user: ""
    namespace: kube-system
  name: default
users: {}
If kubectl hasn't been downloaded, you can grab it from one of the containers started from the images
[root@cu2 docker-multinode]# docker exec -ti 0c0360bcc2c3 bash
root@cu2:/# cp kubectl /var/run/
[root@cu2 run]# mv kubectl /data/kubernetes/kube-deploy/docker-multinode/
Get container IPs
https://kubernetes.io/docs/user-guide/jsonpath/
[root@cu2 ~]# kubectl get pods -o wide -l run=redis -o jsonpath={..podIP}
10.1.75.2 10.1.75.3 10.1.58.3 10.1.58.2 10.1.33.3
Sharing a container's network namespace: --net
docker run -ti --entrypoint=sh --net=container:8e9f21956469f4ef7e5b9d91798788ab83f380795d2825cdacae0ed28f5ba03b gcr.io/google_containers/skydns-amd64:1.0
Formatted output
kubectl get pods --all-namespaces -o jsonpath="{.items[*].spec.containers[*].image}"
[root@cu2 ~]# export POD_COL="custom-columns=NAME:.metadata.name,RESTARTS:.status.containerStatuses[*].restartCount,CONTAINERS:.spec.containers[*].name,IP:.status.podIP,HOST:.spec.nodeName"
[root@cu2 ~]# kubectl get pods -o $POD_COL
kubectl get po -l k8s-app=kube-dns -o=custom-columns=NAME:.metadata.name,CONTAINERS:.spec.containers[*].name
[root@cu2 kubernetes]# kubectl get po --all-namespaces -o=custom-columns=NAME:.metadata.name,CONTAINERS:.spec.containers[*].name
kubectl get po --all-namespaces -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{end}'
Backup
echo "$(docker ps | grep -v IMAGE | awk '{print $2}' )
$(docker-bs ps | grep -v IMAGE | awk '{print $2}' )" | sort -u | while read image ; do docker save $image>$(echo $image | tr '[/:]' _).tar ; done
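# counterpart to the save loop above (my sketch): restore the tarballs on another node
for f in *.tar ; do docker load -i "$f" ; done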
Add labels
cat /etc/hosts | grep -E "\scu[0-9]\s" | awk '{print "kubectl label nodes "$1" hostname="$2}' | while read line ; do sh -c "$line" ; done
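# verify the labels landed (they can then be used with -l selectors or a pod nodeSelector):
kubectl get nodes --show-labels
kubectl get nodes -l hostname=cu2   # the hostname value here is just an example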
Scaling out
[root@cu2 kubernetes]# kubectl run redis --image=redis:3.2.8
[root@cu2 kubernetes]# kubectl scale --replicas=9 deployment/redis
echo " $( kubectl describe pods hadoop-master2 | grep -E "Node|Container ID" | awk -F/ '{print $NF}' | tr '\n' ' ' | awk '{print "ssh "$1" \rdocker exec -ti "$2" bash"}' ) "
Test whether DNS works:
[root@cu2 kube-deploy]# vi busybox.yaml
apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: default
spec:
  containers:
  - image: busybox
    command:
    - sleep
    - "3600"
    imagePullPolicy: IfNotPresent
    name: busybox
  restartPolicy: Always
[root@cu3 kube-deploy]# kubectl create -f busybox.yaml
pod "busybox" created
[root@cu3 kube-deploy]# kubectl get pods
NAME READY STATUS RESTARTS AGE
busybox 0/1 ContainerCreating 0 11s
[root@cu3 kube-deploy]# kubectl get pods
NAME READY STATUS RESTARTS AGE
busybox 1/1 Running 0 1m
[root@cu3 kube-deploy]# kubectl exec -ti busybox -- nslookup kubernetes.default
Server: 10.0.0.10
Address 1: 10.0.0.10 kube-dns.kube-system.svc.cluster.local
Name: kubernetes.default
Address 1: 10.0.0.1 kubernetes.default.svc.cluster.local
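If the lookup fails instead, the first things worth checking are the resolver config inside the pod and the kube-dns pods themselves:
kubectl exec -ti busybox -- cat /etc/resolv.conf
kubectl get pods -n kube-system -l k8s-app=kube-dns -o wide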
Use a containerized MySQL as the client
kubectl run -it --rm --image=mysql:5.6 mysql-client -- mysql -h mysql -ppassword
One takeaway: logs matter!
[root@cu2 kubenetes]# docker ps -a | grep kubelet
[root@cu2 kubenetes]# docker logs --tail=200 7432da457558
E0417 11:39:40.194844 22528 configmap.go:174] Couldn't get configMap hadoop/dta-hadoop-config: configmaps "dta-hadoop-config" not found
E0417 11:39:40.194910 22528 configmap.go:174] Couldn't get configMap hadoop/dta-bin-config: configmaps "dta-bin-config" not found
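Those errors simply mean the pods reference ConfigMaps that were never created in that namespace; listing them makes it obvious (the hadoop namespace comes from the log lines above):
kubectl get configmaps -n hadoop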
Some errors from heapster monitoring; I haven't got it working yet
[root@cu2 ~]# kubectl exec -ti heapster-564189836-shn2q -n kube-system -- sh
/ #
/ #
(no pod data showing up)
/ # /heapster --source=https://kubernetes.default --sink=log --heapster-port=8083 -v 10
E0329 10:11:53.823641 1 reflector.go:203] k8s.io/heapster/metrics/processors/node_autoscaling_enricher.go:100: Failed to list *api.Node: Get https://kubernetes.default/api/v1/nodes?resourceVersion=0: dial tcp 10.0.0.1:443: i/o timeout
$heapster/metrics
$heapster/api/v1/model/debug/allkeys
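The i/o timeout to 10.0.0.1:443 looks like the same class of problem as above (firewall/NAT between the pods and the apiserver), so the checks I would start with are:
kubectl get endpoints kubernetes
iptables -S -t nat | grep 6443
firewall-cmd --zone=trusted --list-sources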
Some other configuration
other_args=" --registry-mirror=https://docker.mirrors.ustc.edu.cn "
--insecure-registry gcr.io
iptables -S -t nat
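The first two lines above are dockerd options; where they end up depends on how docker is launched on your hosts (a sysconfig file such as /etc/sysconfig/docker, or extra arguments on ExecStart). After changing them, restart docker and confirm they took effect:
systemctl restart docker
docker info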
Some other resources
statefulset
--END