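Deploying k8s with kubeadm (resources already downloaded)
The previous installation post needed a proxy throughout, and various problems cropped up along the way, so it read as somewhat messy. After getting everything working on local virtual machines, I upgraded the test environment today as well, so here is a second, more clearly organized summary.
The rpm packages and docker images needed for the installation can be downloaded from Baidu Pan: http://pan.baidu.com/s/1hrRs5MW .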
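Prerequisites, all of which were already configured:
- time synchronization,
- hostnames,
- /etc/hosts,
- firewall,
- selinux,
- passwordless SSH login,
- docker-1.12.6 installed
Cluster hosts:
- Machines: cu[1-5]
- Master node: cu3
- Jump host: cu2 (has a public IP)
# Set up a local YUM repository and load the docker images on all nodes
First, set up the local YUM repository on one host: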
[root@cu2 ~]# cd /var/www/html/kubernetes/
[root@cu2 kubernetes]# createrepo .
[root@cu2 kubernetes]# ll
total 42500
-rw-r--r-- 1 hadoop hadoop 8974214 Aug 10 15:22 1a6f5f73f43077a50d877df505481e5a3d765c979b89fda16b8b9622b9ebd9a4-kubeadm-1.7.2-0.x86_64.rpm
-rw-r--r-- 1 hadoop hadoop 17372710 Aug 10 15:22 1e508e26f2b02971a7ff5f034b48a6077d613e0b222e0ec973351117b4ff45ea-kubelet-1.7.2-0.x86_64.rpm
-rw-r--r-- 1 hadoop hadoop 9361006 Aug 10 15:22 dc8329515fc3245404fea51839241b58774e577d7736f99f21276e764c309db5-kubectl-1.7.2-0.x86_64.rpm
-rw-r--r-- 1 hadoop hadoop 7800562 Aug 10 15:22 e7a4403227dd24036f3b0615663a371c4e07a95be5fee53505e647fd8ae58aa6-kubernetes-cni-0.5.1-0.x86_64.rpm
drwxr-xr-x 2 root root 4096 Aug 10 15:58 repodata
(All nodes) Load the new images
Run on cu2 to load the docker images:
docker load </home/hadoop/kubeadm.tar
ssh cu1 docker load </home/hadoop/kubeadm.tar
ssh cu3 docker load </home/hadoop/kubeadm.tar
ssh cu4 docker load </home/hadoop/kubeadm.tar
ssh cu5 docker load </home/hadoop/kubeadm.tar
Loaded image: gcr.io/google_containers/etcd-amd64:3.0.17
Loaded image: gcr.io/google_containers/kubernetes-dashboard-amd64:v1.6.3
Loaded image: gcr.io/google_containers/kube-controller-manager-amd64:v1.7.2
Loaded image: gcr.io/google_containers/k8s-dns-sidecar-amd64:1.14.4
Loaded image: gcr.io/google_containers/heapster-amd64:v1.3.0
Loaded image: gcr.io/google_containers/kube-scheduler-amd64:v1.7.2
Loaded image: gcr.io/google_containers/heapster-grafana-amd64:v4.4.1
Loaded image: gcr.io/google_containers/k8s-dns-kube-dns-amd64:1.14.4
Loaded image: gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64:1.14.4
Loaded image: centos:centos6
Loaded image: gcr.io/google_containers/heapster-influxdb-amd64:v1.1.1
Loaded image: gcr.io/google_containers/pause-amd64:3.0
Loaded image: nginx:latest
Loaded image: gcr.io/google_containers/kube-apiserver-amd64:v1.7.2
Loaded image: gcr.io/google_containers/kube-proxy-amd64:v1.7.2
Loaded image: quay.io/coreos/flannel:v0.8.0-amd64
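The per-host `ssh ... docker load` lines above can also be written as a loop. This is only a sketch: `load_on_nodes` and the `DRY_RUN` switch are names made up for illustration, and the function defaults to printing the commands rather than running them.

```shell
# Load one image tarball on several nodes over ssh.
# load_on_nodes and DRY_RUN are hypothetical names, not part of docker or ssh.
# DRY_RUN defaults to 1, so the function only prints what it would run.
load_on_nodes() {
  tarball=$1
  shift
  for host in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      printf 'ssh %s docker load < %s\n' "$host" "$tarball"
    else
      # The redirection is local: the tarball is streamed to the remote
      # docker load through ssh's stdin, so no copy is left on the node.
      ssh "$host" docker load < "$tarball"
    fi
  done
}

# Example (prints the four commands):
#   load_on_nodes /home/hadoop/kubeadm.tar cu1 cu3 cu4 cu5
# To actually run them:
#   DRY_RUN=0 load_on_nodes /home/hadoop/kubeadm.tar cu1 cu3 cu4 cu5
```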
YUM repository configuration
Run on cu2:
cat > /etc/yum.repos.d/dta.repo <<EOF
[K8S]
name=K8S Local
baseurl=http://cu2:801/kubernetes
enabled=1
gpgcheck=0
EOF
for h in cu1 cu{3..5} ; do scp /etc/yum.repos.d/dta.repo $h:/etc/yum.repos.d/ ; done
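The copy loop needs every host except cu2, where the repo file was written. Here is a small sketch that builds that list explicitly instead of relying on shell expansion. `nodes_except` is a hypothetical helper, and the fixed range 1 to 5 just mirrors the cu[1-5] layout described earlier.

```shell
# Print cu1..cu5, one per line, leaving out the host given as $1.
# nodes_except is a hypothetical helper; the range 1..5 is this cluster's layout.
nodes_except() {
  skip=$1
  i=1
  while [ "$i" -le 5 ]; do
    [ "cu$i" = "$skip" ] || printf '%s\n' "cu$i"
    i=$((i + 1))
  done
}

# Example:
#   for h in $(nodes_except cu2); do
#     scp /etc/yum.repos.d/dta.repo "$h":/etc/yum.repos.d/
#   done
```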
# Install kubeadm and kubelet
pdsh -w cu[1-5] "yum clean all; yum install -y kubelet kubeadm; systemctl enable kubelet "
# Deploy the cluster with kubeadm
# Master node
Initialize
[root@cu3 ~]# kubeadm init --skip-preflight-checks --pod-network-cidr=10.244.0.0/16 --kubernetes-version=v1.7.2
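After starting, it hangs at **Created API client, waiting for the control plane to become ready**. Do not close the current window. Open a new window to find and fix the error:
Problem 1
In the new window, /var/log/messages shows the following error: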
Aug 12 23:40:10 cu3 kubelet: error: failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"
docker and kubelet use different cgroup drivers, so change the kubelet configuration. Change the docker startup parameter for masquerading (--ip-masq) at the same time.
[root@cu3 ~]# sed -i 's/KUBELET_CGROUP_ARGS=--cgroup-driver=systemd/KUBELET_CGROUP_ARGS=--cgroup-driver=cgroupfs/' /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
[root@cu3 ~]# sed -i 's#/usr/bin/dockerd.*#/usr/bin/dockerd --ip-masq=false#' /usr/lib/systemd/system/docker.service
[root@cu3 ~]# systemctl daemon-reload; systemctl restart docker kubelet
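Opening more windows to fix problems does not affect the running kubeadm. In other words, when some other problem stalls kubeadm midway, it carries on configuring until it succeeds once you fix that problem.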
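The cgroup driver mismatch from Problem 1 can be spotted before running kubeadm init. The sketch below assumes `docker info` prints a `Cgroup Driver:` line and that the kubelet drop-in carries a `--cgroup-driver=` flag, as in the sed command above. Both function names are made up for illustration.

```shell
# Read `docker info` output on stdin and print the cgroup driver name.
# Assumes a line of the form "Cgroup Driver: cgroupfs".
docker_cgroup_driver() {
  sed -n 's/^ *Cgroup Driver: *//p'
}

# Read the kubelet systemd drop-in on stdin and print the value of
# its --cgroup-driver flag.
kubelet_cgroup_driver() {
  sed -n 's/.*--cgroup-driver=\([a-z]*\).*/\1/p'
}

# Example:
#   d=$(docker info 2>/dev/null | docker_cgroup_driver)
#   k=$(kubelet_cgroup_driver < /etc/systemd/system/kubelet.service.d/10-kubeadm.conf)
#   [ "$d" = "$k" ] || echo "cgroup driver mismatch: docker=$d kubelet=$k"
```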
Once initialization finishes, the complete log in that window looks like this:
[root@cu3 ~]# kubeadm init --skip-preflight-checks --pod-network-cidr=10.244.0.0/16 --kubernetes-version=v1.7.2
[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[init] Using Kubernetes version: v1.7.2
[init] Using Authorization modes: [Node RBAC]
[preflight] Skipping pre-flight checks
[kubeadm] WARNING: starting in 1.8, tokens expire after 24 hours by default (if you require a non-expiring token use --token-ttl 0)
[certificates] Generated CA certificate and key.
[certificates] Generated API server certificate and key.
[certificates] API Server serving cert is signed for DNS names [cu3 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.0.148]
[certificates] Generated API server kubelet client certificate and key.
[certificates] Generated service account token signing key and public key.
[certificates] Generated front-proxy CA certificate and key.
[certificates] Generated front-proxy client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[apiclient] Created API client, waiting for the control plane to become ready
[apiclient] All control plane components are healthy after 494.001036 seconds
[token] Using token: ad430d.beff5be4b98dceec
[apiconfig] Created RBAC rules
[addons] Applied essential addon: kube-proxy
[addons] Applied essential addon: kube-dns
Your Kubernetes master has initialized successfully!
To start using your cluster, you need to run (as a regular user):
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
http://kubernetes.io/docs/admin/addons/
You can now join any number of machines by running the following on each node
as root:
kubeadm join --token ad430d.beff5be4b98dceec 192.168.0.148:6443
Then follow the instructions above to set up the config file that kubectl uses:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
At this point the basic K8S services (controller, apiserver, scheduler) are up, but dns still has a problem:
[root@cu3 kubeadm]# kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system etcd-cu3 1/1 Running 0 6m
kube-system kube-apiserver-cu3 1/1 Running 0 5m
kube-system kube-controller-manager-cu3 1/1 Running 0 6m
kube-system kube-dns-2425271678-wwnkp 0/3 Pending 0 6m
kube-system kube-proxy-ptnlx 1/1 Running 0 6m
kube-system kube-scheduler-cu3 1/1 Running 0 6m
The dns containers use a bridge network, so a pod network has to be configured before they can run. The log shows the following errors:
Aug 12 23:54:04 cu3 kubelet: W0812 23:54:04.800316 12886 cni.go:189] Unable to update cni config: No networks found in /etc/cni/net.d
Aug 12 23:54:04 cu3 kubelet: E0812 23:54:04.800472 12886 kubelet.go:2136] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Download the flannel configuration from the https://github.com/winse/docker-hadoop/tree/master/kube-deploy/kubeadm directory:
The flannel config files are slightly modified: on top of the official files, cni-conf.json adds "ipMasq": false,
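The kubelet warning quoted earlier comes down to /etc/cni/net.d having no network config yet. A rough sketch of a check for that state follows. `cni_configured` is a hypothetical helper, and the list of file extensions it looks for is an assumption about what the kubelet accepts, not something taken from this setup.

```shell
# Succeed if the directory holds at least one CNI network config file.
# cni_configured is a hypothetical helper; the extensions are an assumption.
cni_configured() {
  dir=${1:-/etc/cni/net.d}
  for f in "$dir"/*.conf "$dir"/*.conflist "$dir"/*.json; do
    # An unmatched glob stays literal, so -f filters it out.
    [ -f "$f" ] && return 0
  done
  return 1
}

# Example:
#   cni_configured || echo "no CNI config yet, the pod network is not deployed"
```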
# Configure the network
[root@cu3 kubeadm]# kubectl apply -f kube-flannel.yml
serviceaccount "flannel" created
configmap "kube-flannel-cfg" created
daemonset "kube-flannel-ds" created
[root@cu3 kubeadm]# kubectl apply -f kube-flannel-rbac.yml
clusterrole "flannel" created
clusterrolebinding "flannel" created
# After waiting a while, the dns pods are up as well
[root@cu3 kubeadm]# kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system etcd-cu3 1/1 Running 0 7m
kube-system kube-apiserver-cu3 1/1 Running 0 7m
kube-system kube-controller-manager-cu3 1/1 Running 0 7m
kube-system kube-dns-2425271678-wwnkp 3/3 Running 0 8m
kube-system kube-flannel-ds-dbvkj 2/2 Running 0 38s
kube-system kube-proxy-ptnlx 1/1 Running 0 8m
kube-system kube-scheduler-cu3 1/1 Running 0 7m
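Rather than scanning the table by eye while waiting, the listing can be filtered for pods that are not up yet. This is a sketch that assumes the column layout shown above (NAMESPACE, NAME, READY, STATUS, ...). `pods_not_ready` is a made-up name.

```shell
# Read `kubectl get pods --all-namespaces` output on stdin and print
# "namespace/name" for every pod that is not Running with all of its
# containers ready. Assumes READY is column 3 and STATUS is column 4.
pods_not_ready() {
  awk 'NR > 1 {
    split($3, ready, "/")
    if ($4 != "Running" || ready[1] != ready[2]) print $1 "/" $2
  }'
}

# Example: poll until the list is empty.
#   while [ -n "$(kubectl get pods --all-namespaces | pods_not_ready)" ]; do
#     sleep 5
#   done
```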
# Deploying the worker nodes
Configure kubelet and docker
sed -i 's/KUBELET_CGROUP_ARGS=--cgroup-driver=systemd/KUBELET_CGROUP_ARGS=--cgroup-driver=cgroupfs/' /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
sed -i 's#/usr/bin/dockerd.*#/usr/bin/dockerd --ip-masq=false#' /usr/lib/systemd/system/docker.service
systemctl daemon-reload; systemctl restart docker kubelet
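Note: with ip-masq=false added, docker0 can no longer reach the external network. That is, containers started on their own with the docker command can no longer reach the external network!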
ExecStart=/usr/bin/dockerd --ip-masq=false
Join the cluster
kubeadm join --token ad430d.beff5be4b98dceec 192.168.0.148:6443 --skip-preflight-checks
[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[preflight] Skipping pre-flight checks
[discovery] Trying to connect to API Server "192.168.0.148:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://192.168.0.148:6443"
[discovery] Cluster info signature and contents are valid, will use API Server "https://192.168.0.148:6443"
[discovery] Successfully established connection with API Server "192.168.0.148:6443"
[bootstrap] Detected server version: v1.7.2
[bootstrap] The server supports the Certificates API (certificates.k8s.io/v1beta1)
[csr] Created API client to obtain unique certificate for this node, generating keys and certificate signing request
[csr] Received signed certificate from the API server, generating KubeConfig...
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
Node join complete:
* Certificate signing request sent to master and response
received.
* Kubelet informed of new secure connection details.
Run 'kubectl get nodes' on the master to see this machine join.
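The join command used here is the one printed at the end of kubeadm init. If that output was saved to a file (for example by piping init through `tee`; the log path below is hypothetical), the command can be pulled back out instead of copied by hand. `join_command` is a made-up helper name.

```shell
# Read saved `kubeadm init` output on stdin and print the join command.
# Assumes the line starts with "kubeadm join --token", as in the log above.
join_command() {
  grep -E '^ *kubeadm join --token ' | tail -n 1 | sed 's/^ *//'
}

# Example (the log path is hypothetical):
#   join_command < /root/kubeadm-init.log
```

Since the token is a credential, keep any saved log file readable by root only.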
cu2 is the jump host. Copy kubectl's config over to it, and commands can then be run on cu2:
[root@cu2 kube-deploy]# kubectl get nodes
NAME STATUS AGE VERSION
cu2 NotReady <invalid> v1.7.2
cu3 Ready 25m v1.7.2
[root@cu2 kube-deploy]# kubectl proxy
Starting to serve on 127.0.0.1:8001
My SecureCRT SOCKS proxy runs through this machine, so my local browser can open http://localhost:8001/ui ... and it just works.
After all 5 machines have been added:
[root@cu3 ~]# kubectl get nodes
NAME STATUS AGE VERSION
cu1 Ready 32s v1.7.2
cu2 Ready 3m v1.7.2
cu3 Ready 29m v1.7.2
cu4 Ready 26s v1.7.2
cu5 Ready 20s v1.7.2
Firewall configuration on all nodes (these are cloud hosts, so add firewall rules):
firewall-cmd --zone=trusted --add-source=192.168.0.0/16 --permanent
firewall-cmd --zone=trusted --add-source=10.0.0.0/8 --permanent
firewall-cmd --complete-reload
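# SOURCE IP test
Last time there was a source IP problem; it should not exist now... I checked the iptables-save output and there are no entries related to cni0/cbr0.
Still, let's test it once more: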
kubectl run centos --image=cu.esw.cn/library/java:jdk8 --command -- vi
kubectl scale --replicas=4 deployment/centos
[root@cu2 kube-deploy]# pods
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
default centos-3954723268-62tpc 1/1 Running 0 <invalid> 10.244.2.2 cu1
default centos-3954723268-6cmf9 1/1 Running 0 <invalid> 10.244.1.2 cu2
default centos-3954723268-blfc4 1/1 Running 0 <invalid> 10.244.3.2 cu4
default centos-3954723268-tb1rn 1/1 Running 0 <invalid> 10.244.4.2 cu5
default nexus-djr9c 1/1 Running 0 2m 192.168.0.37 cu1
# TEST: pods can ping each other
[root@cu2 hadoop]# ./pod_bash centos-3954723268-62tpc default
[root@centos-3024873821-4490r /]# ping 10.244.4.2 -c 1
# TEST: source IP is correct
[root@centos-3954723268-62tpc opt]# yum install epel-release -y
[root@centos-3954723268-62tpc opt]# yum install -y nginx
[root@centos-3954723268-62tpc opt]# service nginx start
[root@centos-3954723268-blfc4 opt]# curl 10.244.2.2
[root@centos-3954723268-tb1rn opt]# curl 10.244.2.2
[root@centos-3954723268-62tpc opt]# less /var/log/nginx/access.log
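What the access log should show is the pod IPs of the two curl clients, not a node or gateway address. Here is a sketch of that check, assuming nginx's default log format (client address in the first field) and the 10.244.0.0/16 pod network passed to kubeadm init. Both function names are made up.

```shell
# Read an nginx access log on stdin and print the distinct client addresses.
# Assumes the default log format, where the first field is the client address.
client_ips() {
  awk '{ print $1 }' | sort -u
}

# Print distinct client addresses that fall outside 10.244.0.0/16,
# the pod network given to kubeadm init. An empty result means every
# request kept its pod source IP.
non_pod_clients() {
  awk '$1 !~ /^10\.244\./ { print $1 }' | sort -u
}

# Example:
#   client_ips < /var/log/nginx/access.log
#   non_pod_clients < /var/log/nginx/access.log
```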
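# DNS/heapster
Strangely, reinstalling this time hit no DNS problems, and heapster installed successfully on the first try.
Running nslookup kubernetes.default in pods started on cu3 also works!
# Monitoring
# -- heapster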
[root@cu2 kubeadm]# kubectl apply -f heapster/influxdb/
deployment "monitoring-grafana" created
service "monitoring-grafana" created
serviceaccount "heapster" created
deployment "heapster" created
service "heapster" created
deployment "monitoring-influxdb" created
service "monitoring-influxdb" created
[root@cu2 kubeadm]# kubectl apply -f heapster/rbac/
clusterrolebinding "heapster" created
# -- dashboard
[root@cu2 kubeadm]# kubectl apply -f kubernetes-dashboard.yaml
serviceaccount "kubernetes-dashboard" created
clusterrolebinding "kubernetes-dashboard" created
deployment "kubernetes-dashboard" created
service "kubernetes-dashboard" created
[root@cu2 kubeadm]# kubectl get service --all-namespaces
NAMESPACE NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes 10.96.0.1 <none> 443/TCP 18m
kube-system kube-dns 10.96.0.10 <none> 53/UDP,53/TCP 18m
kube-system kubernetes-dashboard 10.104.165.81 <none> 80/TCP 5m
Wait a short while, then list all services:
[root@cu2 kubeadm]# kubectl get services --all-namespaces
NAMESPACE NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes 10.96.0.1 <none> 443/TCP 2h
kube-system heapster 10.102.176.168 <none> 80/TCP 3m
kube-system kube-dns 10.96.0.10 <none> 53/UDP,53/TCP 2h
kube-system kubernetes-dashboard 10.110.2.118 <none> 80/TCP 2m
kube-system monitoring-grafana 10.106.251.155 <none> 80/TCP 3m
kube-system monitoring-influxdb 10.100.168.147 <none> 8086/TCP 3m
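Access 10.106.251.155 directly, or check the logs of the monitoring pods, to see heapster's status. Graphs on the dashboard take a little while to appear.
If the CLUSTER and POD monitoring graphs are visible through the monitoring-grafana IP but the dashboard graphs never appear, redeploy the dashboard: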
kubectl delete -f kubernetes-dashboard.yaml
kubectl create -f kubernetes-dashboard.yaml
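The grafana address used above changes whenever the service is recreated, so it helps to look it up by name. This sketch assumes the column layout of the service listing above (NAMESPACE, NAME, CLUSTER-IP, ...). `service_ip` is a made-up helper.

```shell
# Read `kubectl get services --all-namespaces` output on stdin and print
# the CLUSTER-IP of the service with namespace $1 and name $2.
# Assumes CLUSTER-IP is the third column.
service_ip() {
  awk -v ns="$1" -v name="$2" '$1 == ns && $2 == name { print $3 }'
}

# Example:
#   ip=$(kubectl get services --all-namespaces | service_ip kube-system monitoring-grafana)
#   curl -s "http://$ip/" >/dev/null && echo "grafana reachable at $ip"
```

kubectl can also return the field directly with `kubectl get service monitoring-grafana -n kube-system -o jsonpath='{.spec.clusterIP}'`, which avoids parsing the table.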
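With that, the whole K8S setup is running again on the test environment.
I am not installing harbor: I rarely use it, and with only 5 machines, doing save and then load directly is not much work.
# References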
- https://github.com/kubernetes/kubernetes/issues/40969
- http://mp.weixin.qq.com/s?__biz=MzI4MTQyMDAxMA==&mid=2247483665&idx=1&sn=d8b61666fe0a0965336d15250e2648cb&scene=0
- http://cizixs.com/2017/05/23/container-network-cni
- https://github.com/containernetworking/cni/blob/master/SPEC.md#network-configuration
–END