Troubleshooting and Solutions
Problem 1
# tail -f /var/log/messages
Failed to get system container stats for "/system.slice/kubelet.service": failed to get cgroup stats for "/system.slice/kubelet.service": failed to get container info for "/system.slice/kubelet.service": unknown container "/system.slice/kubelet.service"
Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.service": failed to get container info for "/system.slice/docker.service": unknown container "/system.slice/docker.service"
If the errors above appear, they can be resolved as follows.
For kubelet 1.10:
vim /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
Find KUBELET_CGROUP_ARGS=--cgroup-driver=systemd and append --runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice after it, so the line reads:
Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=systemd --runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice"
Save and exit, then reload systemd and restart kubelet:
systemctl daemon-reload
systemctl restart kubelet
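This problem is caused by an incompatibility between the Kubernetes version and the Docker version, which causes the cgroup functionality to fail. Source: https://itw01.com/2ZZ5ESH.html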
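The manual edit for kubelet 1.10 can also be scripted. The following is a minimal sketch, assuming the drop-in already contains --cgroup-driver=systemd as shown above; add_cgroup_flags is a hypothetical helper name, not something kubeadm provides. It keeps a .bak copy and does nothing if the flags are already present.

```shell
#!/bin/sh
# Sketch: append the two cgroup flags to the existing
# --cgroup-driver=systemd setting in a kubelet drop-in file.
# add_cgroup_flags is a hypothetical helper, not part of kubeadm.
add_cgroup_flags() {
  local conf="$1"
  local flags='--runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice'

  # Skip if the flags are already there, so the edit is safe to repeat.
  if grep -q -- '--runtime-cgroups=' "$conf"; then
    return 0
  fi

  # Keep a backup, then write the modified copy back to the original path.
  cp "$conf" "$conf.bak"
  sed "s|--cgroup-driver=systemd|--cgroup-driver=systemd $flags|" "$conf.bak" > "$conf"
}
```

A possible invocation is add_cgroup_flags /etc/systemd/system/kubelet.service.d/10-kubeadm.conf, followed by the same systemctl daemon-reload and systemctl restart kubelet as in the manual steps.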
For kubelet 1.11 and later:
# vim /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
Add the following line:
Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=systemd --runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice"
Find the ExecStart line and append $KUBELET_CGROUP_ARGS at the end:
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS $KUBELET_CGROUP_ARGS
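The ExecStart change for kubelet 1.11 and later can be sketched the same way. This assumes the kubelet binary is at /usr/bin/kubelet as in the line above; append_cgroup_var is a hypothetical helper name. It does not add the Environment line itself, only the variable reference.

```shell
#!/bin/sh
# Sketch: make the kubelet ExecStart line reference $KUBELET_CGROUP_ARGS.
# append_cgroup_var is a hypothetical helper, not part of kubeadm.
append_cgroup_var() {
  local conf="$1"

  # Nothing to do if ExecStart already references the variable.
  if grep -q '^ExecStart=/usr/bin/kubelet.*\$KUBELET_CGROUP_ARGS' "$conf"; then
    return 0
  fi

  cp "$conf" "$conf.bak"
  # Only the line that starts the kubelet binary is changed;
  # an empty "ExecStart=" reset line, if present, is left alone.
  sed '/^ExecStart=\/usr\/bin\/kubelet/ s/$/ $KUBELET_CGROUP_ARGS/' "$conf.bak" > "$conf"
}
```

After running it against the drop-in file, systemctl daemon-reload and systemctl restart kubelet are still needed for the change to take effect.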
Problem 2
If the following errors appear, apply the fix on every host:
# tail -f /var/log/messages
CPUAccounting not enabled for pid: 23283
MemoryAccounting not enabled for pid: 23283
Solution:
# systemctl show docker | grep Accounting    # both show "no": CPUAccounting=no, MemoryAccounting=no
# systemctl set-property docker.service MemoryAccounting=yes    # enable memory accounting
# systemctl set-property docker.service CPUAccounting=yes    # enable CPU accounting
# systemctl show docker | grep Accounting
# grep -Ri accounting /etc/systemd/
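Because the fix has to be repeated on every host, it may help to wrap the check in a small function. The sketch below is an illustration only: disabled_accounting and enable_accounting are hypothetical helper names, and the filter assumes the Key=value output format of systemctl show.

```shell
#!/bin/sh
# Print the names of accounting properties that are still disabled.
# Reads "Key=value" lines, as printed by `systemctl show <unit>`, from stdin.
disabled_accounting() {
  awk -F= '($1 == "CPUAccounting" || $1 == "MemoryAccounting") && $2 == "no" { print $1 }'
}

# Enable every disabled accounting property on the given unit.
# Hypothetical wrapper around the systemctl commands shown above.
enable_accounting() {
  local unit="$1"
  local prop
  for prop in $(systemctl show "$unit" | disabled_accounting); do
    systemctl set-property "$unit" "$prop=yes"
  done
}
```

Running enable_accounting docker.service as root on each host would then apply only the settings that are actually missing.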
Problem 3
Master init fails (kubeadm init --config /etc/kubernetes/config.yaml). Reset the node and clear the etcd data:
# kubeadm reset
# rm -rf /var/lib/etcd/*
Then run init again:
# kubeadm init --config /etc/kubernetes/config.yaml
Problem 4
# kubectl get pods --all-namespaces
coredns is stuck in ContainerCreating, kube-dns is missing, and kube-flannel is in CrashLoopBackOff. Enable IP forwarding:
# echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.conf ; sysctl -p
The host also needs a default gateway configured.
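Appending with >> adds a duplicate line every time the command is rerun. The following sketch sets the value once and also checks for a default route; ensure_ip_forward and has_default_gateway are hypothetical helper names, and the gateway check assumes the iproute2 ip command is installed.

```shell
#!/bin/sh
# Ensure "net.ipv4.ip_forward = 1" is set exactly once in a sysctl file.
# Defaults to /etc/sysctl.conf; pass another path as the first argument.
ensure_ip_forward() {
  local conf="${1:-/etc/sysctl.conf}"
  if grep -Eq '^[[:space:]]*net\.ipv4\.ip_forward[[:space:]]*=' "$conf"; then
    # Rewrite the existing setting instead of appending a duplicate.
    cp "$conf" "$conf.bak"
    sed -E 's/^[[:space:]]*net\.ipv4\.ip_forward[[:space:]]*=.*/net.ipv4.ip_forward = 1/' "$conf.bak" > "$conf"
  else
    echo 'net.ipv4.ip_forward = 1' >> "$conf"
  fi
}

# Succeeds only if the routing table contains a default route (a gateway).
has_default_gateway() {
  ip route show default | grep -q '^default'
}
```

After ensure_ip_forward, sysctl -p still has to be run to load the setting, and has_default_gateway || echo "no default gateway" gives a quick check of the second requirement.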
Problem 5
Installing metrics-server so that HPA can monitor CPU usage
https://github.com/kubernetes-incubator/metrics-server
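# git clone https://github.com/kubernetes-incubator/metrics-server
After cloning the repository (wget on the repository URL only fetches the HTML page), create the resources: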
# kubectl create -f metrics-server/deploy/1.8+/
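HPA can then see the CPU usage of the deployment.
If after a few minutes the CPU usage percentage still shows <unknown>, continue with the steps below.
metrics-server YAML issue
metrics-server fails to start because it cannot find port 10255. This problem only occurs on k8s 1.11. Solution: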
# kubectl -n kube-system edit deploy metrics-server
Find:
...
containers:
- name: metrics-server
  image: gcr.io/google_containers/metrics-server-amd64:v0.2.1
  imagePullPolicy: Always
  volumeMounts:
  - name: tmp-dir
    mountPath: /tmp
  command:
  - /metrics-server
  - --source=kubernetes.summary_api:''
...
Find the - --source=kubernetes line. If it is missing, add the command:, - /metrics-server, and - --source=kubernetes.summary_api: lines.
After --source=kubernetes.summary_api:, put https://kubernetes.default?kubeletHttps=true&kubeletPort=10250&insecure=true in place of the empty '' value:
...
containers:
- name: metrics-server
  image: gcr.io/google_containers/metrics-server-amd64:v0.2.1
  imagePullPolicy: Always
  volumeMounts:
  - name: tmp-dir
    mountPath: /tmp
  command:
  - /metrics-server
  - --source=kubernetes.summary_api:https://kubernetes.default?kubeletHttps=true&kubeletPort=10250&insecure=true
...
Save and exit.
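The same substitution can be tried non-interactively on a saved copy of the manifest before touching the live deployment. This is a sketch only: fix_summary_api_source is a hypothetical filter name, and it handles only the case where a --source=kubernetes.summary_api: line already exists (it does not add a missing command: section).

```shell
#!/bin/sh
# Rewrite the --source flag so metrics-server talks to the kubelet on port
# 10250 over HTTPS. Reads a manifest on stdin and writes the result to stdout.
# fix_summary_api_source is a hypothetical helper name.
fix_summary_api_source() {
  # "&" is escaped in the replacement because sed treats a bare "&"
  # as "the whole matched text".
  sed -E 's|(--source=kubernetes\.summary_api:).*|\1https://kubernetes.default?kubeletHttps=true\&kubeletPort=10250\&insecure=true|'
}
```

For example, kubectl -n kube-system get deploy metrics-server -o yaml | fix_summary_api_source > patched.yaml produces a file that can be reviewed by eye before deciding how to apply it.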
After about 3 minutes, kubectl get hpa shows the CPU usage.
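Instead of rerunning kubectl get hpa by hand, the wait can be scripted. The sketch below polls for up to roughly 3 minutes by default (18 attempts, 10 seconds apart); hpa_has_unknown and wait_for_hpa_metrics are hypothetical helper names, and the check assumes that missing metrics appear as the literal text <unknown> in the output.

```shell
#!/bin/sh
# Succeeds if the text on stdin still contains an <unknown> metric value.
hpa_has_unknown() {
  grep -q '<unknown>'
}

# Poll `kubectl get hpa` until no target is reported as <unknown>.
# Usage: wait_for_hpa_metrics [attempts] [delay_seconds]
wait_for_hpa_metrics() {
  local attempts="${1:-18}"
  local delay="${2:-10}"
  local i=0
  local out
  while [ "$i" -lt "$attempts" ]; do
    # A kubectl failure or empty output is treated as "not ready yet".
    if out=$(kubectl get hpa 2>/dev/null) && [ -n "$out" ]; then
      if ! printf '%s\n' "$out" | hpa_has_unknown; then
        return 0
      fi
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}
```

A call such as wait_for_hpa_metrics && kubectl get hpa prints the table once the CPU values are available, and returns a non-zero status if they never show up within the time limit.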