Kubernetes v1.27.6 Three-Node (Cilium + API Gateway) Complete Deployment Guide
Platform: Rocky Linux 8
Nodes (clean, freshly installed systems):
- k8s-node01 — 192.168.1.233 (control plane + worker)
- k8s-node02 — 192.168.1.234 (control plane + worker)
- k8s-node03 — 192.168.1.235 (control plane + worker)
API LB (HAProxy): 192.168.1.203 (HAProxy front-end address; see Section 3)
CNI: Cilium
LoadBalancer (bare metal): MetalLB (L2)
Ingress / API Gateway: Envoy Gateway (Gateway API)
Distributed storage: Rook v1.16 + Ceph v19.2 (OSDs use /dev/sdb on each node)
Goal: all control-plane nodes are also schedulable for Pods (the NoSchedule taint is removed), and the cluster provides dynamically expandable persistent volumes (RBD, with a StorageClass that supports expansion).
Note: this document is a copy-and-run operations manual; follow the steps in order. Rehearse in a test environment first. The Rook/Ceph part will erase all data on /dev/sdb.
Contents
- Prerequisites and preparation
- Common node initialization (system settings, containerd, kubeadm installation)
- HAProxy as the external LB (on 192.168.1.233)
- kubeadm init (controlPlaneEndpoint via 192.168.1.203) and joining the other control-plane nodes
- Making the control-plane nodes also act as workers (removing the taint)
- Installing Cilium (CNI)
- Installing MetalLB (L2) and configuring the IP pool
- Installing Gateway API + Envoy Gateway
- Installing Istio (service mesh)
- Deploying Rook v1.16 + Ceph v19.2 (dynamically expandable storage) — the most detailed part
- Test cases (network, LB, Ingress, Istio, storage, expansion)
- Common debugging commands and troubleshooting
- Backup, monitoring, and day-to-day operations advice
- Appendix: key configuration files (copy-ready)
1. Prerequisites and risk notes
- The three nodes can reach each other on the network and their clocks are synchronized (chrony/ntp recommended).
- All three machines run Rocky Linux 8 with root access.
- Each node has an empty disk /dev/sdb (it will be formatted and used as a Ceph OSD). Back up any data first.
- The cluster runs kubeadm v1.27.6.
- Several steps require kubectl, helm, istioctl, and cilium; make sure these tools are available from your management host.
- These operations change the system significantly. Run each step, confirm it succeeded, and only then move on to the next.
2. Common node initialization (run on all 3 nodes)
Run as root, or prefix the commands with sudo.
2.1 Hostname and hosts
Set the hostname on each node:
hostnamectl set-hostname k8s-node01 # on 192.168.1.233
hostnamectl set-hostname k8s-node02 # on 192.168.1.234
hostnamectl set-hostname k8s-node03 # on 192.168.1.235
Add the node names to /etc/hosts (on all three machines):
192.168.1.233 k8s-node01
192.168.1.234 k8s-node02
192.168.1.235 k8s-node03
# Set the timezone:
timedatectl set-timezone Asia/Shanghai
# Install rsyslog (check whether /var/log/messages exists by default)
dnf install rsyslog -y
systemctl start rsyslog && systemctl enable rsyslog
Set a system-wide proxy (only if your environment needs one)
# Create a global proxy configuration (placeholder values)
mkdir -p /etc/systemd/system.conf.d/
tee /etc/systemd/system.conf.d/proxy.conf <<EOF
[Manager]
DefaultEnvironment="http_proxy=http://proxy-server:port"
DefaultEnvironment="https_proxy=http://proxy-server:port"
# Local and in-cluster addresses must bypass the proxy
DefaultEnvironment="no_proxy=localhost,127.0.0.1,192.168.0.0/16,.svc,.svc.cluster.local,kubernetes.default,kubernetes.default.svc,kubernetes.default.svc.cluster.local"
EOF
# Example with a concrete proxy address (overwrites the file above):
tee /etc/systemd/system.conf.d/proxy.conf <<EOF
[Manager]
DefaultEnvironment="http_proxy=http://192.168.1.44:7890"
DefaultEnvironment="https_proxy=http://192.168.1.44:7890"
# Local and in-cluster addresses must bypass the proxy
DefaultEnvironment="no_proxy=localhost,127.0.0.1,192.168.0.0/16,.svc,.svc.cluster.local,kubernetes.default,kubernetes.default.svc,kubernetes.default.svc.cluster.local"
EOF
# Reload the systemd configuration
sudo systemctl daemon-reload
# Restart any services that need the proxy
sudo systemctl restart <service-name>
2.2 Disable swap, set kernel parameters, and load br_netfilter
swapoff -a
sed -i '/ swap / s/^/#/' /etc/fstab
cat >/etc/modules-load.d/k8s.conf <<EOF
br_netfilter
EOF
modprobe br_netfilter
cat >/etc/sysctl.d/k8s.conf <<EOF
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
net.ipv4.vs.ignore_no_rs_error=1
net.ipv4.ip_forward=1
# Note: tcp_tw_recycle was removed in Linux 4.12; on Rocky 8 (kernel 4.18) this entry is ignored with a warning and can be dropped
net.ipv4.tcp_tw_recycle=0
net.ipv4.tcp_tw_reuse=1
net.ipv4.tcp_timestamps=1
net.ipv4.neigh.default.gc_thresh1=1024
net.ipv4.neigh.default.gc_thresh2=2048
net.ipv4.neigh.default.gc_thresh3=4096
vm.swappiness=0
vm.overcommit_memory=1
vm.panic_on_oom=0
fs.inotify.max_user_instances=8192
fs.inotify.max_user_watches=1048576
fs.file-max=52706963
fs.nr_open=52706963
net.ipv6.conf.all.disable_ipv6=1
net.netfilter.nf_conntrack_max=2310720
EOF
sysctl --system
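A quick sanity check that the key forwarding and bridge settings actually took effect (all three values should print as 1):
sysctl net.ipv4.ip_forward net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables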
2.3 Install required tools and disable the firewall (as needed)
# Update system packages to the latest versions
dnf update -y
# Install tools and dependencies
dnf install -y dnf-utils ipvsadm telnet wget net-tools conntrack ipset jq iptables curl sysstat libseccomp socat nfs-utils fuse fuse-devel vim
# Disable SELinux
setenforce 0
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
# Disable the firewall
systemctl stop firewalld
systemctl disable --now firewalld # if you keep firewalld, open the ports kubeadm requires instead
2.4 Install containerd (systemd cgroup)
# Remove any existing container runtime
dnf remove docker \
docker-client \
docker-client-latest \
docker-common \
docker-latest \
docker-latest-logrotate \
docker-logrotate \
docker-engine
# Configure the repository
dnf -y install dnf-plugins-core
dnf config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
# Install containerd
dnf -y install containerd
# Configure containerd
mkdir -p /etc/containerd
containerd config default > /etc/containerd/config.toml
# Edit /etc/containerd/config.toml and replace the pause image (only needed if you cannot reach the default registry); the default sandbox_image points at the Google registry
sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.9" # adjust the version to match your kubeadm release
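Because kubelet is configured with the systemd cgroup driver later on (Section 4), containerd's runc runtime should use systemd cgroups as well. A minimal way to flip the flag in the generated default config (the sed pattern assumes the stock config.toml produced above):
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
grep SystemdCgroup /etc/containerd/config.toml # should now show: SystemdCgroup = true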
# Install the crictl CLI (pick a cri-tools release that matches the cluster's Kubernetes minor version, v1.27.x here)
wget https://github.com/kubernetes-sigs/cri-tools/releases/download/v1.27.1/crictl-v1.27.1-linux-amd64.tar.gz
tar -zxvf crictl-v1.27.1-linux-amd64.tar.gz -C /usr/local/bin
# Verify the installation
crictl --version
# Configure crictl; without this, crictl cannot talk to containerd and image pulls will fail:
cat <<EOF> /etc/crictl.yaml
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
debug: false
EOF
# Configure a network proxy for containerd (only if needed)
mkdir -p /etc/systemd/system/containerd.service.d
tee /etc/systemd/system/containerd.service.d/http-proxy.conf <<EOF
[Service]
Environment="HTTP_PROXY=http://proxy-server:port"
Environment="HTTPS_PROXY=http://proxy-server:port"
# NO_PROXY makes local and LAN addresses bypass the proxy
Environment="NO_PROXY=localhost,127.0.0.1,192.168.1.0/24"
EOF
# Example with a concrete proxy address (overwrites the file above):
tee /etc/systemd/system/containerd.service.d/http-proxy.conf <<EOF
[Service]
Environment="HTTP_PROXY=http://192.168.1.44:7890"
Environment="HTTPS_PROXY=http://192.168.1.44:7890"
Environment="NO_PROXY=localhost,127.0.0.1,192.168.1.0/24"
EOF
# Start containerd
systemctl daemon-reload
systemctl enable --now containerd
2.5 Install Helm
wget https://get.helm.sh/helm-v4.0.2-linux-amd64.tar.gz
tar -zxvf helm-v4.0.2-linux-amd64.tar.gz
mv linux-amd64/helm /usr/local/bin/helm
# Verify the installation
helm version
2.6 Install kubeadm / kubelet / kubectl v1.27.6
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
dnf install -y kubeadm-1.27.6-0 kubelet-1.27.6-0 kubectl-1.27.6-0 --disableexcludes=kubernetes
systemctl enable --now kubelet
# Optionally pin the package versions (see the example below)
# dnf install -y python3-jinja2  # only needed if you generate templated scripts
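One way to pin kubeadm/kubelet/kubectl so a routine dnf update does not upgrade them is the dnf versionlock plugin; a sketch, assuming the plugin is acceptable in your environment:
dnf install -y python3-dnf-plugin-versionlock
dnf versionlock add kubeadm kubelet kubectl
dnf versionlock list # confirm the three packages are locked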
3. HAProxy as the external LB
In this setup HAProxy runs on k8s-node01 (192.168.1.233), with 192.168.1.203 as its public-facing address. The example below assumes the HAProxy process listens on port 6443 and round-robins requests to the three apiservers on port 6443.
3.1 Install HAProxy on k8s-node01
dnf install -y haproxy
3.2 Edit /etc/haproxy/haproxy.cfg
global
    log /dev/log local0
    maxconn 2000
    user haproxy
    group haproxy
    daemon
    stats socket /run/haproxy/admin.sock mode 660 level admin

defaults
    mode tcp
    log global
    option tcplog
    timeout connect 10s
    timeout client 1m
    timeout server 1m

listen k8s-api
    bind 0.0.0.0:6443
    mode tcp
    option tcplog
    option tcp-check
    balance roundrobin
    server k8s-node01 192.168.1.233:6443 check fall 3 rise 2
    server k8s-node02 192.168.1.234:6443 check fall 3 rise 2
    server k8s-node03 192.168.1.235:6443 check fall 3 rise 2
Note: if HAProxy runs on 233 and the apiserver also runs on 233, the backend entry is still 192.168.1.233:6443. Keep in mind that kube-apiserver itself listens on port 6443 on every control-plane node, so an HAProxy co-located on node01 cannot bind 0.0.0.0:6443; either run HAProxy on a separate host that owns 192.168.1.203 (as the overview assumes), or have it listen on a different front-end port and use that port in controlPlaneEndpoint.
Enable and start HAProxy:
systemctl enable --now haproxy
systemctl status haproxy
Test:
telnet 192.168.1.203 6443
# or
ss -lnt | grep 6443
4. kubeadm init
(Run on k8s-node01; controlPlaneEndpoint points at the HAProxy address 192.168.1.203:6443.)
Create kubeadm-config.yaml:
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.27.6
controlPlaneEndpoint: "192.168.1.203:6443"
networking:
  podSubnet: "10.244.0.0/16"
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: "192.168.1.233"
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///run/containerd/containerd.sock
  kubeletExtraArgs:
    cgroup-driver: "systemd"
Run:
kubeadm init --config=kubeadm-config.yaml --upload-certs
Record the kubeadm join output (token, ca-cert-hash, certificate-key); it is needed when the other control-plane nodes and any workers join.
Configure kubectl (for a regular user):
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
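With kubectl working, a quick check that the API server is reachable through the HAProxy front end (the /healthz and /version endpoints are readable without authentication under default kubeadm RBAC):
curl -k https://192.168.1.203:6443/healthz # expect: ok
curl -k https://192.168.1.203:6443/version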
5. Joining the other nodes
Assuming all three machines are to be control-plane nodes: control plane (node02 / node03)
On node02/node03 run kubeadm join with the --control-plane --certificate-key <key> values from the init output:
kubeadm join 192.168.1.203:6443 \
--token <token> \
--discovery-token-ca-cert-hash sha256:<hash> \
--control-plane --certificate-key <cert-key>
Then verify on k8s-node01:
kubectl get nodes -o wide
kubectl get cs
6. Allow the control-plane nodes to schedule Pods (remove the NoSchedule taint)
By default kubeadm taints control-plane nodes with node-role.kubernetes.io/control-plane:NoSchedule. Remove it so each node acts as both control plane and worker:
kubectl taint nodes k8s-node01 node-role.kubernetes.io/control-plane:NoSchedule-
kubectl taint nodes k8s-node02 node-role.kubernetes.io/control-plane:NoSchedule-
kubectl taint nodes k8s-node03 node-role.kubernetes.io/control-plane:NoSchedule-
Verify:
kubectl get nodes -o wide
kubectl describe node k8s-node01 | grep -i taint -A2
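A throwaway Deployment is an easy way to confirm that Pods now land on all three nodes (the name taint-check is arbitrary):
kubectl create deployment taint-check --image=nginx --replicas=3
kubectl get pods -l app=taint-check -o wide # the NODE column should cover k8s-node01..03
kubectl delete deployment taint-check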
7. Installing the network plugin
On any node with kubectl:
7.1 Install Cilium
1️⃣ Install the CLI
curl -L --remote-name-all https://github.com/cilium/cilium-cli/releases/latest/download/cilium-linux-amd64.tar.gz
tar xzvf cilium-linux-amd64.tar.gz
mv cilium /usr/local/bin/
2️⃣ Install Cilium
# ipv4NativeRoutingCIDR must match the cluster's podSubnet configured earlier (10.244.0.0/16).
# Note: with routingMode=native on a shared L2 segment you typically also need --set autoDirectNodeRoutes=true so nodes install routes to each other's Pod CIDRs.
cilium install \
  --version 1.18.3 \
  --set kubeProxyReplacement=false \
  --set ipam.mode=kubernetes \
  --set routingMode=native \
  --set ipv4NativeRoutingCIDR=10.244.0.0/16 \
  --set enableIPv4Masquerade=true \
  --set enableNodePort=false \
  --set enableLoadBalancer=false \
  --set enableHostPort=false \
  --set hubble.enabled=false
# After the install completes, confirm that Cilium is deployed and running:
kubectl get pods -n kube-system -l k8s-app=cilium
NAME READY STATUS RESTARTS AGE
cilium-frd9m 1/1 Running 0 4h8m
cilium-jbkjb 1/1 Running 0 4h8m
cilium-kmvrc 1/1 Running 0 4h8m
kubectl get gatewayclass
NAME CONTROLLER ACCEPTED AGE
cilium io.cilium/gateway-controller True 23s
Note: if you later want to enable kube-proxy replacement, you can reconfigure the stable cluster with
cilium install --set kubeProxyReplacement=true ... (the old "strict" value was removed in recent Cilium releases). Before enabling it, make sure the kernel, containerd, and socket-LB requirements for Cilium are met.
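Optionally, the Cilium CLI ships a built-in end-to-end check; it creates (and cleans up) a cilium-test namespace and takes several minutes:
cilium status --wait
cilium connectivity test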
7.2 Install MetalLB (L2 mode)
Install the official manifest (the URL below tracks the main branch; consider pinning a released version) and create an IP address pool.
1️⃣ Install MetalLB
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/main/config/manifests/metallb-native.yaml
kubectl get pods -n metallb-system
NAME READY STATUS RESTARTS AGE
controller-5b46566d45-tpgvm 1/1 Running 0 39s
speaker-rcwsc 1/1 Running 0 39s
speaker-spsgf 1/1 Running 0 39s
speaker-zjb5p 1/1 Running 0 39s
2️⃣ Configure the IP pool
Save as metallb-pool.yaml:
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: external-pool
  namespace: metallb-system
spec:
  addresses:
  - 192.168.1.169/32 # the external IP you allocate to the cluster
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: l2
  namespace: metallb-system
spec:
  ipAddressPools:
  - external-pool
Apply:
kubectl apply -f metallb-pool.yaml
kubectl get ipaddresspools -n metallb-system
NAME AUTO ASSIGN AVOID BUGGY IPS ADDRESSES
external-pool true false ["192.168.1.169/32"]
Verify: deploy a LoadBalancer-type nginx Service and confirm it is assigned an IP, as in the example below.
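A quick throwaway check (the lb-test names are arbitrary; with a single /32 in the pool, the EXTERNAL-IP should come up as 192.168.1.169):
kubectl create deployment lb-test --image=nginx
kubectl expose deployment lb-test --port=80 --type=LoadBalancer
kubectl get svc lb-test -w # wait for EXTERNAL-IP 192.168.1.169
curl http://192.168.1.169
kubectl delete svc lb-test && kubectl delete deployment lb-test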
8. Installing the Gateway
8.1 Install the Gateway API CRDs
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/latest/download/standard-install.yaml
# Or download the YAML files locally:
# HTTP CRDs (standard channel)
wget -O standard-install.yaml https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.0.0/standard-install.yaml
# TCP CRDs (experimental channel — TCPRoute only exists here, so apply this if you use the TCP listeners below)
wget -O experimental-install.yaml https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.0.0/experimental-install.yaml
8.2 Install Envoy Gateway
kubectl apply --server-side -f https://github.com/envoyproxy/gateway/releases/download/latest/install.yaml
Verify the Envoy Gateway Pods are ready:
kubectl get pods -n envoy-gateway-system
kubectl get svc -n envoy-gateway-system
8.3 Create the GatewayClass
gatewayclass.yaml
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: envoy-gateway
spec:
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
kubectl apply -f gatewayclass.yaml
kubectl get gatewayclass
8.4 Create the Gateway (HTTP + TCP)
gateway.yaml
# gateway.yaml - multi-port configuration
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: eg-gateway
spec:
  gatewayClassName: envoy-gateway
  listeners:
  # HTTP listener
  - name: http
    protocol: HTTP
    port: 80
    allowedRoutes:
      namespaces:
        from: All
      kinds:
      - kind: HTTPRoute
  # TCP listeners
  - name: tcp-db
    protocol: TCP
    port: 3306
    allowedRoutes:
      namespaces:
        from: All
      kinds:
      - kind: TCPRoute
  - name: tcp-redis
    protocol: TCP
    port: 6379
    allowedRoutes:
      namespaces:
        from: All
      kinds:
      - kind: TCPRoute
  - name: tcp-ssh
    protocol: TCP
    port: 22
    allowedRoutes:
      namespaces:
        from: All
      kinds:
      - kind: TCPRoute
kubectl apply -f gateway.yaml
kubectl get gateway
NAME CLASS ADDRESS PROGRAMMED AGE
eg-gateway envoy-gateway 192.168.1.169 True 18h
kubectl describe gateway eg-gateway
8.5 Deploy an HTTP test case
kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --port=80
HTTPRoute
A ReferenceGrant is required when a route references a Service in another namespace:
referencegrant-grafana.yaml — alternatively, put the HTTPRoute in the same namespace as the Service (and point its parentRef at the Gateway's namespace), in which case this grant is not needed.
apiVersion: gateway.networking.k8s.io/v1beta1
kind: ReferenceGrant
metadata:
  name: allow-grafana
  namespace: monitoring
spec:
  from:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    namespace: default
  to:
  - group: "" # core API group
    kind: Service
    name: kube-prometheus-grafana
httproute.yaml
# Route by URL path to different Services
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: production-routes
  namespace: default # same namespace as the nginx Service, so no ReferenceGrant is needed for that backend
spec:
  parentRefs:
  - name: eg-gateway
    namespace: default
    sectionName: http
    port: 80
  rules:
  # nginx
  - matches:
    - path:
        type: PathPrefix
        value: /nginx
    filters:
    - type: URLRewrite
      urlRewrite:
        path:
          type: ReplacePrefixMatch
          replacePrefixMatch: /
    backendRefs:
    - name: nginx
      port: 80
  # grafana
  - matches:
    - path:
        type: PathPrefix
        value: /grafana
    filters:
    - type: RequestHeaderModifier
      requestHeaderModifier:
        add:
        - name: X-Forwarded-Prefix
          value: /grafana
    backendRefs:
    - name: kube-prometheus-grafana
      namespace: monitoring
      port: 80
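Apply the route and test it through the Gateway address from outside the cluster (192.168.1.169 is the address MetalLB assigned above; the /grafana path only works if kube-prometheus-grafana is actually installed in the monitoring namespace):
kubectl apply -f httproute.yaml
kubectl get httproute production-routes
curl http://192.168.1.169/nginx # expect the nginx welcome page
curl -I http://192.168.1.169/grafana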
8.6 Deploy a TCP test case
mysql_dy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql
  labels:
    app: mysql
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:8.0
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: "root123"
        ports:
        - containerPort: 3306
---
apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  selector:
    app: mysql
  ports:
  - name: tcp
    protocol: TCP
    port: 3306
    targetPort: 3306
  type: ClusterIP
TCPRoute
tcproute.yaml
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: TCPRoute
metadata:
  name: tcp-route
  namespace: default
spec:
  parentRefs:
  - name: eg-gateway    # the Gateway created above
    sectionName: tcp-db # matches the tcp-db listener on the Gateway
  rules:
  - backendRefs:
    - name: mysql       # Service name
      port: 3306
      weight: 1
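The manifests above are not applied anywhere else in this section, so apply them and test the TCP path end to end (this assumes a mysql client on the test host and uses the root123 password from the Deployment above):
kubectl apply -f mysql_dy.yaml
kubectl apply -f tcproute.yaml
kubectl get tcproute tcp-route
mysql -h 192.168.1.169 -P 3306 -uroot -proot123 -e 'SELECT 1;'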
9. Rook v1.16 + Ceph v19.2 (dynamically expandable storage) — detailed steps
Goal: create an RBD StorageClass with allowVolumeExpansion: true and walk through the PVC expansion flow.
9.0 Prerequisite: confirm again that /dev/sdb is empty and usable (on all three nodes)
Run on each node and confirm:
lsblk
fdisk -l /dev/sdb
# If partitions or old data exist, wipe the disk (destructive!)
sgdisk --zap-all /dev/sdb
wipefs -a /dev/sdb
9.1 Install the Rook CRDs, common resources, and operator (v1.16.8 example)
kubectl create namespace rook-ceph || true
kubectl apply -f https://raw.githubusercontent.com/rook/rook/v1.16.8/deploy/examples/crds.yaml
kubectl apply -f https://raw.githubusercontent.com/rook/rook/v1.16.8/deploy/examples/common.yaml
kubectl apply -f https://raw.githubusercontent.com/rook/rook/v1.16.8/deploy/examples/operator.yaml
kubectl -n rook-ceph get pods -w
Wait until rook-ceph-operator (and rook-discover, if the discovery daemon is enabled) is Running.
9.2 Create the CephCluster (Ceph v19.2.0 example)
Save as ceph-cluster.yaml (already tailored to these hosts and /dev/sdb):
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: quay.io/ceph/ceph:v19.2.0
  dataDirHostPath: /var/lib/rook
  mon:
    count: 3
    allowMultiplePerNode: false
  mgr:
    modules:
    - name: dashboard
      enabled: true
  dashboard:
    enabled: true
  network:
    hostNetwork: false
  storage:
    useAllNodes: false
    useAllDevices: false
    nodes:
    - name: k8s-node01
      devices:
      - name: "sdb"
    - name: k8s-node02
      devices:
      - name: "sdb"
    - name: k8s-node03
      devices:
      - name: "sdb"
Apply it and wait for the Pods to start and the OSDs to be prepared:
kubectl apply -f ceph-cluster.yaml
kubectl -n rook-ceph get cephcluster rook-ceph -o yaml
kubectl -n rook-ceph get pods -o wide
# watch the rook-ceph-osd-* prepare/activate logs
9.3 Create the CephBlockPool (replicated pool)
Save as ceph-block-pool.yaml:
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3
Apply:
kubectl apply -f ceph-block-pool.yaml
9.4 Create the RBD StorageClass (expansion enabled)
Save as storageclass-rbd.yaml:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  clusterID: rook-ceph
  pool: replicapool
  imageFormat: "2"
  imageFeatures: layering
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
reclaimPolicy: Delete
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
Apply:
kubectl apply -f storageclass-rbd.yaml
kubectl get sc
9.5 Install the toolbox (convenient for running ceph commands)
kubectl apply -f https://raw.githubusercontent.com/rook/rook/v1.16.8/deploy/examples/toolbox.yaml
kubectl -n rook-ceph get pods -l app=rook-ceph-tools
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
# Run inside the toolbox container
ceph -s
ceph osd tree
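Two more standard checks worth running from the toolbox once the three OSDs are up and in (both are plain ceph subcommands):
ceph osd status # all three OSDs should be up/in
ceph df         # confirms the replicapool exists and shows raw/usable capacity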
9.6 Create the initial PVC (5Gi)
pvc-test.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 5Gi
  storageClassName: rook-ceph-block
Create it and mount it from a test Pod:
kubectl apply -f pvc-test.yaml
# Create a Pod that mounts the PVC for testing
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: pvc-tester
spec:
  containers:
  - name: c
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - mountPath: /data
      name: vol
  volumes:
  - name: vol
    persistentVolumeClaim:
      claimName: test-pvc
EOF
kubectl exec -it pvc-tester -- sh -c "df -h /data; echo hello >/data/hello; cat /data/hello"
9.7 Expand the PVC (5Gi -> 10Gi)
Edit spec.resources.requests.storage on the PVC, or use kubectl patch:
kubectl patch pvc test-pvc -p '{"spec": {"resources": {"requests": {"storage": "10Gi"}}}}'
Check the PVC status:
kubectl get pvc test-pvc
kubectl describe pvc test-pvc
The CSI controller triggers the expansion. Filesystem growth normally happens online: if the Pod already has the volume mounted, the RBD CSI driver and kubelet grow the filesystem automatically; only in rare cases do you need to run a resize command (such as resize2fs) inside the Pod.
Verify the new size:
kubectl exec -it pvc-tester -- sh -c "df -h /data"
If the new size is not reflected, you may need to run the filesystem-specific grow command inside the Pod; the checks below help narrow down where the resize is stuck.
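A couple of hedged debugging steps (the deployment name csi-rbdplugin-provisioner and container name csi-resizer follow Rook's default CSI naming; adjust if your install differs):
kubectl describe pvc test-pvc | grep -A3 -i conditions # look for FileSystemResizePending
kubectl -n rook-ceph logs deploy/csi-rbdplugin-provisioner -c csi-resizer --tail=50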
10. Common checks and troubleshooting commands (quick reference)
# Kubernetes
kubectl get nodes
kubectl get pods -A
kubectl describe pod <pod> -n <ns>
kubectl logs <pod> -n <ns>
# HAProxy
systemctl status haproxy
echo "show stat" | socat stdio /run/haproxy/admin.sock
# Cilium
cilium status
kubectl -n kube-system get pods -l k8s-app=cilium
# MetalLB
kubectl -n metallb-system get all
# Istio
kubectl -n istio-system get pods
# Rook/Ceph
kubectl -n rook-ceph get pods
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph -s
kubectl -n rook-ceph get cephcluster
kubectl -n rook-ceph get cephblockpool
kubectl -n rook-ceph logs -l app=rook-ceph-osd -c prepare
