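Quick Deployment of k8s 1.5
This article may no longer apply to the latest version of k8s. See my other article: Kubernetes - Quick Deployment (1.7).
k8s is short for Kubernetes, an open-source container management framework from Google. It provides a full set of container management features, such as Docker container orchestration, service discovery, and status monitoring. It is said to build on Google's many years of experience running containers, which makes it the most mature option in container management so far. It is not easy to use, though: it is far more complex than Swarm, Docker's own container management framework, and some people call deploying a Kubernetes cluster hell-level difficulty, which is no exaggeration.
From version 1.5 onward, Google simplified the deployment process. According to the official site, running one or two commands on each machine is enough to bring up a k8s cluster, so I tried it right away. In practice it was not that easy, so I am recording the process here. Below we build a k8s cluster and install a web UI application (Dashboard) as an example.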
Environment
- Three machines: node1.docker.com, node2.docker.com, node3.docker.com
- Operating system: CentOS 7.2
- k8s version: v1.5.1
- Many steps require root privileges, so I simply use the root user throughout
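Install k8s on every machine
The hard part of this step is downloading the required resources. The images are hosted on Google's servers and cannot be downloaded from mainland China. There are a few options: first, use a proxy (accelerator); second, search online for k8s packages and images that someone else has already downloaded and shared; third, buy a few servers overseas and deploy k8s there. There are other ways as well, which I trust you can work out. I recommend the first option.
The following are all the images used in this article. If you go with the second option above, download them in advance: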
[root@node1 ~]# docker images
REPOSITORY                                               TAG              IMAGE ID       CREATED         SIZE
docker.io/weaveworks/weave-npc                           1.9.2            6d47d7ef52cf   3 days ago      58.23 MB
docker.io/weaveworks/weave-kube                          1.9.2            c187d4ccbf10   3 days ago      163.2 MB
gcr.io/google_containers/kube-proxy-amd64                v1.5.3           932ee3606ada   2 weeks ago     173.5 MB
gcr.io/google_containers/kube-scheduler-amd64            v1.5.3           cb0ce9bb60f9   2 weeks ago     54 MB
gcr.io/google_containers/kube-controller-manager-amd64   v1.5.3           25304c6f1bb2   2 weeks ago     102.8 MB
gcr.io/google_containers/kube-apiserver-amd64            v1.5.3           93d8b30a8f27   2 weeks ago     125.9 MB
gcr.io/google_containers/kubernetes-dashboard-amd64      v1.5.1           1180413103fd   7 weeks ago     103.6 MB
gcr.io/google_containers/etcd-amd64                      3.0.14-kubeadm   856e39ac7be3   3 months ago    174.9 MB
gcr.io/google_containers/kubedns-amd64                   1.9              26cf1ed9b144   3 months ago    47 MB
gcr.io/google_containers/dnsmasq-metrics-amd64           1.0              5271aabced07   4 months ago    14 MB
gcr.io/google_containers/kube-dnsmasq-amd64              1.4              3ec65756a89b   5 months ago    5.126 MB
gcr.io/google_containers/kube-discovery-amd64            1.0              c5e0c9a457fc   5 months ago    134.2 MB
gcr.io/google_containers/exechealthz-amd64               1.2              93a43bfb39bf   5 months ago    8.375 MB
gcr.io/google_containers/pause-amd64                     3.0              99e59f495ffa   10 months ago   746.9 kB
With the download problem solved, we can install the k8s components. I am on CentOS, so I run the following commands:
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://yum.kubernetes.io/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg
       https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF
setenforce 0
yum install -y docker kubelet kubeadm kubectl kubernetes-cni
systemctl enable docker && systemctl start docker
systemctl enable kubelet && systemctl start kubelet
On Ubuntu or HypriotOS, run the following commands instead:
apt-get update && apt-get install -y apt-transport-https
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
cat <<EOF > /etc/apt/sources.list.d/kubernetes.list
deb http://apt.kubernetes.io/ kubernetes-xenial main
EOF
apt-get update
# Install docker if you don't have it already.
apt-get install -y docker.io
apt-get install -y kubelet kubeadm kubectl kubernetes-cni
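After either install path, it can help to confirm that the four binaries are actually on the PATH before moving on. The following is a minimal sketch; the function name missing_tools is made up for illustration and is not part of any Kubernetes tooling.

```shell
# List which of the given commands are not found on PATH.
# missing_tools is a hypothetical helper name.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Usage on a node (prints nothing when everything is installed):
#   missing_tools docker kubelet kubeadm kubectl
```

Any name it prints points at a package that did not install and is worth fixing before running kubeadm.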
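Initialize the master node
The official docs say that if you plan to use flannel as the pod network later, you need to add the --pod-network-cidr 10.244.0.0/16 parameter to kubeadm init. I initially used flannel as the pod network, but the network never worked. I have not figured out why, and many other people have hit the same problem. It also breaks the later web UI installation, so the parameter is omitted here. If the images were not downloaded in advance, this step also needs to reach Google's servers; otherwise it hangs at the "Created API client" step.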
[root@node1 ~]# kubeadm init
[kubeadm] WARNING: kubeadm is in alpha, please do not use it for production clusters.
[preflight] Running pre-flight checks
[init] Using Kubernetes version: v1.5.3
[tokens] Generated token: "f69e65.6dffddf74bd6f4a6"
[certificates] Generated Certificate Authority key and certificate.
[certificates] Generated API Server key and certificate
[certificates] Generated Service Account signing keys
[certificates] Created keys and certificates in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[apiclient] Created API client, waiting for the control plane to become ready
[apiclient] All control plane components are healthy after 23.557702 seconds
[apiclient] Waiting for at least one node to register and become ready
[apiclient] First node is ready after 3.504181 seconds
[apiclient] Creating a test deployment
[apiclient] Test deployment succeeded
[token-discovery] Created the kube-discovery deployment, waiting for it to become ready
[token-discovery] kube-discovery is ready after 4.503198 seconds
[addons] Created essential addon: kube-proxy
[addons] Created essential addon: kube-dns
Your Kubernetes master has initialized successfully!
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
http://kubernetes.io/docs/admin/addons/
You can now join any number of machines by running the following on each node:
kubeadm join --token=f69e65.6dffddf74bd6f4a6 10.100.124.236
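If it succeeds, the output above includes a token that other machines can use to join this cluster, as shown in the last command. Note that command down; it is needed later.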
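Since the join command is needed later, one option is to capture the kubeadm init output in a file and pull the command back out of it. This is a sketch under assumptions: the log file name kubeadm-init.log and the helper name extract_join are hypothetical, and it relies only on the output format shown above.

```shell
# Print the "kubeadm join ..." line found in a saved kubeadm init log.
# extract_join is a hypothetical helper; $1 is the path to the log file.
extract_join() {
    grep -o 'kubeadm join .*' "$1" | tail -n 1
}

# Usage on the master (the log file name is an assumption):
#   kubeadm init | tee kubeadm-init.log
#   extract_join kubeadm-init.log
```

The log contains the cluster token, so it is worth keeping the file readable by root only.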
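Install a pod network
The official documentation lists six network options. Having already hit a wall with flannel, I go straight for Weave Net here, which is extremely simple to install: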
kubectl apply -f https://git.io/weave-kube
Then check the installation status:
[root@node1 ~]# kubectl get pods --all-namespaces
NAMESPACE     NAME                                         READY   STATUS    RESTARTS   AGE
kube-system   dummy-2088944543-fnrc3                       1/1     Running   0          2d
kube-system   etcd-leap236.lenovo.com                      1/1     Running   0          2d
kube-system   kube-apiserver-node1.docker.com              1/1     Running   0          2d
kube-system   kube-controller-manager-leap236.lenovo.com   1/1     Running   0          2d
kube-system   kube-discovery-1769846148-17vgk              1/1     Running   0          2d
kube-system   kube-dns-2924299975-92p6j                    4/4     Running   0          2d
kube-system   kube-proxy-d1mb3                             1/1     Running   0          2d
kube-system   kube-scheduler-node1.docker.com              1/1     Running   0          2d
kube-system   kubernetes-dashboard-3203831700-1tk4p        1/1     Running   0          1d
Once the READY column of the kube-dns row shows 4/4, the installation has succeeded.
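Instead of re-running the command by hand, the READY column can be checked in a loop. The sketch below assumes the column layout shown above (NAMESPACE, NAME, READY, ...); all_ready is a hypothetical helper name, not a kubectl feature.

```shell
# Succeed only if at least one pod line was read from stdin and every
# pod line has READY equal to x/x (for example 4/4).
# all_ready is a hypothetical helper name.
all_ready() {
    awk '$3 ~ /^[0-9]+\/[0-9]+$/ {
             seen = 1
             split($3, r, "/")
             if (r[1] != r[2]) bad = 1
         }
         END { if (bad || !seen) exit 1 }'
}

# Usage: poll every 5 seconds until everything is ready.
#   until kubectl get pods --all-namespaces | all_ready; do sleep 5; done
```

Empty input counts as not ready, so a failing kubectl call does not end the loop early.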
Join the other machines to the cluster
The kubeadm init step above printed a command on its last line. Copy it and run it on the other two machines:
[root@node2 ~]# kubeadm join --token=f69e65.6dffddf74bd6f4a6 10.100.124.236
[kubeadm] WARNING: kubeadm is in alpha, please do not use it for production clusters.
[preflight] Running pre-flight checks
[preflight] Starting the kubelet service
[tokens] Validating provided token
[discovery] Created cluster info discovery client, requesting info from "http://10.100.124.236:9898/cluster-info/v1/?token-id=f69e65"
[discovery] Cluster info object received, verifying signature using given token
[discovery] Cluster info signature and contents are valid, will use API endpoints [https://10.100.124.236:6443]
[bootstrap] Trying to connect to endpoint https://10.100.124.236:6443
[bootstrap] Detected server version: v1.5.3
[bootstrap] Successfully established connection with endpoint "https://10.100.124.236:6443"
[csr] Created API client to obtain unique certificate for this node, generating keys and certificate signing request
[csr] Received signed certificate from the API server:
Issuer: CN=kubernetes | Subject: CN=system:node:node2.docker.com | CA: false
Not before: 2017-03-02 12:15:00 +0000 UTC Not After: 2018-03-02 12:15:00 +0000 UTC
[csr] Generating kubelet configuration
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
Node join complete:
* Certificate signing request sent to master and response
received.
* Kubelet informed of new secure connection details.
Run 'kubectl get nodes' on the master to see this machine join.
Then list all nodes on node1:
[root@node1 ~]# kubectl get nodes
NAME               STATUS         AGE
node1.docker.com   Ready,master   2d
node2.docker.com   Ready          2d
node3.docker.com   Ready          2d
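To check the result from a script rather than by eye, the Ready nodes can be counted from the same output. This is a small sketch that assumes the three-column layout shown above; ready_nodes is a hypothetical helper name.

```shell
# Count nodes whose STATUS column starts with "Ready", so "Ready,master"
# counts and "NotReady" does not. Reads `kubectl get nodes` output on stdin.
# ready_nodes is a hypothetical helper name.
ready_nodes() {
    awk 'NR > 1 && $2 ~ /^Ready/ { n++ } END { print n + 0 }'
}

# Usage on the master:
#   kubectl get nodes | ready_nodes
```

For the three-machine cluster in this article, the expected count is 3 once both joins have completed.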
Install a Dashboard
Dashboard is a handy Kubernetes cluster management tool, and installing it is just as simple:
[root@node1 ~]# kubectl create -f https://rawgit.com/kubernetes/dashboard/master/src/deploy/kubernetes-dashboard.yaml
deployment "kubernetes-dashboard" created
service "kubernetes-dashboard" created
Check the installation:
[root@node1 ~]# kubectl get pods --all-namespaces
kube-system   dummy-2088944543-fnrc3                       1/1     Running   0          2d
kube-system   etcd-node1.docker.com                        1/1     Running   0          2d
kube-system   kube-apiserver-node1.docker.com              1/1     Running   0          2d
kube-system   kube-controller-manager-leap236.lenovo.com   1/1     Running   0          2d
kube-system   kube-discovery-1769846148-17vgk              1/1     Running   0          2d
kube-system   kube-dns-2924299975-92p6j                    4/4     Running   0          2d
kube-system   kube-proxy-54h8v                             1/1     Running   0          2d
kube-system   kube-proxy-5gqwb                             1/1     Running   0          2d
kube-system   kube-proxy-d1mb3                             1/1     Running   0          2d
kube-system   kube-scheduler-node1.docker.com              1/1     Running   0          2d
kube-system   kubernetes-dashboard-3203831700-1tk4p        1/1     Running   0          1d
kube-system   weave-net-f9xvq                              2/2     Running   0          2d
kube-system   weave-net-px4pt                              2/2     Running   0          2d
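At this point it downloads the images it needs again, so wait a while. If the kubernetes-dashboard row stays in the ContainerCreating state, the following command shows the pod's details and events, which helps with troubleshooting: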
[root@node1 ~]# kubectl describe -n kube-system pods kubernetes-dashboard-3203831700-1tk4p
...
QoS Class: BestEffort
Tolerations: <none>
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
54s 54s 1 {default-scheduler } Normal Scheduled Successfully assigned leap-n3 to leap238.lenovo.com
<invalid> <invalid> 1 {kubelet leap238.lenovo.com} spec.containers{leap3} Normal Started Started container with docker id c8d 804eea4
<invalid> <invalid> 1 {kubelet leap238.lenovo.com} spec.containers{leap2}
...
Start the Dashboard
There is one more thing to watch out for here. The official docs give this as the start command:
kubectl proxy
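Started this way, however, it only accepts local access, for example curl localhost:8001/ui, whereas you may need to reach it from a browser on another host.
That requires a few extra startup parameters. The command also blocks, so it needs to run in the background. This is the final command: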
kubectl proxy --address='10.100.124.236' --accept-hosts='.+' &
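If you leave out the two parameters --address='10.100.124.236' and --accept-hosts='.+', you get an Unauthorized page when accessing it.
When accessing it, remember not to drop the /ui at the end of the URL.
Now you can open your browser and visit http://10.100.124.236:8001/ui.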
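The '.+' pattern accepts any host value, which is convenient but broad. A narrower alternative is sketched below, on the assumption that kubectl proxy matches --accept-hosts as a regular expression against the host part of the request. This has not been verified against v1.5 here. The helper name host_regex and the file name proxy.log are hypothetical.

```shell
# Turn an IP address into an anchored regular expression by escaping the dots.
# host_regex is a hypothetical helper name.
host_regex() {
    printf '^%s$\n' "$(printf '%s' "$1" | sed 's/\./\\./g')"
}

# Usage on the master (proxy.log is an arbitrary file name):
#   ip=10.100.124.236
#   nohup kubectl proxy --address="$ip" --accept-hosts="$(host_regex "$ip")" >proxy.log 2>&1 &
```

Running the proxy under nohup with output sent to a file keeps it alive after the shell exits and leaves a log to inspect if access fails.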
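Summary of problems
I actually ran into many problems during installation, but the hardest was still the firewall. The most thorough fix is to use a proxy to download the required images first, then set up a private registry and sync the images to all machines. That way not every machine has to download the images, and startup is much faster.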
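The private-registry approach could look roughly like the sketch below. The registry address registry.example.com:5000 and the three function names are placeholders, not something the article set up. The pull step re-tags each image back to its original name, on the assumption that the cluster components keep referring to the images by the names listed earlier.

```shell
# Map an image reference to the same path under a private registry.
# mirror_name is a hypothetical helper; $1 = registry host[:port], $2 = image.
mirror_name() {
    printf '%s/%s\n' "$1" "${2#*/}"
}

# Push one locally available image to the private registry.
push_to_mirror() {
    docker tag "$2" "$(mirror_name "$1" "$2")" &&
    docker push "$(mirror_name "$1" "$2")"
}

# Pull it on another machine and restore the original name.
pull_from_mirror() {
    docker pull "$(mirror_name "$1" "$2")" &&
    docker tag "$(mirror_name "$1" "$2")" "$2"
}

# Usage (registry.example.com:5000 is a placeholder):
#   push_to_mirror registry.example.com:5000 gcr.io/google_containers/pause-amd64:3.0
#   pull_from_mirror registry.example.com:5000 gcr.io/google_containers/pause-amd64:3.0
```

Looping these two functions over the image list from the beginning of the article would cover every image the cluster needs.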
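Another one: after installing Dashboard, even though I had clearly stopped the proxy and run unset http_proxy https_proxy ftp_proxy no_proxy, access still failed with a proxy-related error. I then stopped everything (deleted Dashboard and the pod network, and ran kubeadm reset on every host), turned off everything proxy-related (everything had been downloaded by then, so the proxy was no longer needed), and reinstalled the network and Dashboard, which solved the problem.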
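The root cause is not stated above. One thing worth ruling out before reinstalling is whether any proxy variable is still set in the current shell, in either lowercase or uppercase form, since unset http_proxy does not clear HTTP_PROXY. A minimal sketch; proxy_vars is a hypothetical helper name.

```shell
# Print the names of proxy-related environment variables that are still set,
# matching both lowercase and uppercase spellings.
# proxy_vars is a hypothetical helper name.
proxy_vars() {
    env | grep -i -E '^(http|https|ftp|no|all)_proxy=' | cut -d= -f1
}

# Usage: clear whatever it reports, then check again.
#   proxy_vars
#   unset HTTP_PROXY HTTPS_PROXY
```

This only inspects the current shell. Services started earlier keep the environment they were started with.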