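Can't remember the commands in the middle of an incident? Bookmark this post and you're covered, or just learn them by chatting with an AI 🤪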

0️⃣ TL;DR: the whole cheat sheet at a glance

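Step         | Scenario                                      | Common commands
Observe      | Pod status / resource usage                   | kubectl get po -n <ns> ; kubectl top po ; kubectl top nodes
Operate      | CRUD / scaling                                | kubectl apply -f ... ; kubectl delete -f ... ; kubectl scale deploy ...
Troubleshoot | Get inside the container / test connectivity  | kubectl exec -it ... ; kubectl port-forward ...
Ops          | Rollback / cluster switching                  | kubectl rollout undo ... ; kubectl config use-context ...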

1️⃣ Observe | First figure out what is happening

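kubectl get po -n <ns>                        # list Pods and check their status at a glance
kubectl describe po <name> -n <ns>            # events and detailed fields
kubectl logs <pod> -c <container> -n <ns>     # logs (add -f to follow; rs/<name> or deploy/<name> also work in place of the Pod)
kubectl top po -n <ns>                        # per-Pod CPU / memory usage (requires metrics-server)
kubectl top nodes                             # per-node CPU / memory usage (requires metrics-server)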
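
As a rough sketch of the observe step, the two helpers below list Pods that are not in the Running phase and show recent logs for one Pod. The function names `unhealthy_pods` and `recent_logs` are made up for illustration; `--field-selector`, `--tail`, and `--previous` are standard kubectl flags. Note the assumption baked in: a Pod stuck in CrashLoopBackOff usually still reports the Running phase, so the phase filter is only a first pass and does not replace reading the STATUS column.

```shell
#!/usr/bin/env bash
# Illustrative helpers; the function names are not part of kubectl.

# List Pods whose phase is not Running (Pending, Failed, Succeeded, Unknown).
unhealthy_pods() {
  local ns="${1:-default}"
  kubectl get po -n "$ns" --field-selector='status.phase!=Running'
}

# Show the last 50 log lines of a Pod. Pass "previous" as the third argument
# to read the logs of the previous (terminated) container instance instead.
recent_logs() {
  local pod="$1" ns="${2:-default}"
  if [ "${3:-}" = "previous" ]; then
    kubectl logs "$pod" -n "$ns" --tail=50 --previous
  else
    kubectl logs "$pod" -n "$ns" --tail=50
  fi
}

# Usage (needs a reachable cluster):
#   unhealthy_pods my-namespace
#   recent_logs my-pod my-namespace previous
```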

2️⃣ Operate | CRUD on objects

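kubectl apply -f <file|dir>                        # create or update anything declaratively (IaC friendly)
kubectl delete -f <file|dir>                       # delete everything listed in the manifest
kubectl edit <kind> <name> -n <ns>                 # ad hoc live edit (opens your default editor)
kubectl scale deploy <name> --replicas=N -n <ns>   # scale manually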
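
Before applying a manifest in production it can help to preview the change first. The sketch below is one possible way to do that, assuming the documented exit codes of `kubectl diff` (0 for no differences, 1 for differences found, greater than 1 for an error). `safe_apply` is a hypothetical helper name, not a kubectl command.

```shell
#!/usr/bin/env bash
# Preview with `kubectl diff`, then apply only when something would change.
# `safe_apply` is an illustrative name chosen for this sketch.
safe_apply() {
  local manifest="$1" rc=0
  kubectl diff -f "$manifest" || rc=$?
  if [ "$rc" -eq 0 ]; then
    echo "no changes for $manifest"
    return 0
  elif [ "$rc" -gt 1 ]; then
    echo "kubectl diff failed (exit $rc)" >&2
    return "$rc"
  fi
  # Exit code 1 means differences were found, so go ahead and apply.
  kubectl apply -f "$manifest"
}

# Usage (needs a reachable cluster):
#   safe_apply deployment.yaml
```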

3️⃣ Troubleshoot | Get inside the container, test connectivity

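kubectl exec -it <po> -c <container> -n <ns> -- /bin/sh   # open a shell in the container (the SSH substitute)
kubectl port-forward svc/<svc> 8080:80 -n <ns>            # local debugging: local port 8080 goes to Service port 80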
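
A connectivity check often combines port-forward with a single HTTP request. The following is a minimal sketch under a few assumptions: the Service speaks HTTP on its root path, `curl` is installed, and a fixed two-second sleep is long enough for the tunnel to come up. `probe_service` and its default ports are illustrative, not part of kubectl.

```shell
#!/usr/bin/env bash
# Forward a local port to a Service, probe it once with curl, then clean up.
# `probe_service` and the default ports are assumptions made for this sketch.
probe_service() {
  local svc="$1" ns="${2:-default}" local_port="${3:-8080}" svc_port="${4:-80}"
  kubectl port-forward "svc/$svc" "$local_port:$svc_port" -n "$ns" >/dev/null 2>&1 &
  local pf_pid=$! rc=0
  sleep 2   # crude wait for the tunnel to come up
  curl -fsS -o /dev/null "http://127.0.0.1:$local_port/" || rc=$?
  # Stop the background port-forward whether or not the probe succeeded.
  kill "$pf_pid" 2>/dev/null || true
  wait "$pf_pid" 2>/dev/null || true
  return "$rc"
}

# Usage (needs a reachable cluster):
#   probe_service my-svc my-namespace && echo "service answers"
```

The function returns curl's exit code, so it can be chained with `&&` in a script.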

4️⃣ Ops | Rollouts & environment switching

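kubectl rollout history|status|undo deploy/<name> -n <ns>   # the version-management trio: pick one subcommand
kubectl config get-contexts                                 # list the available contexts
kubectl config use-context <ctx>                            # switch between clusters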
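
The rollout subcommands combine naturally into a "wait, and roll back on failure" step. The sketch below shows one way to wire that up; `rollout_or_undo` and the 120s default are illustrative choices, while `--timeout` is a standard flag of `kubectl rollout status`.

```shell
#!/usr/bin/env bash
# Wait for a Deployment rollout; roll back if it does not finish in time.
# `rollout_or_undo` and the 120s default are assumptions made for this sketch.
rollout_or_undo() {
  local deploy="$1" ns="${2:-default}" timeout="${3:-120s}"
  if kubectl rollout status "deploy/$deploy" -n "$ns" --timeout="$timeout"; then
    echo "rollout of $deploy succeeded"
  else
    echo "rollout of $deploy failed, rolling back" >&2
    kubectl rollout undo "deploy/$deploy" -n "$ns"
    return 1
  fi
}

# Usage (needs a reachable cluster):
#   kubectl apply -f deployment.yaml && rollout_or_undo my-app my-namespace 60s
```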

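Common flags cheat sheet

  • -n <ns>: select the namespace (defaults to the current context's namespace, usually default)
  • -o yaml|json: structured output that can be fed back into apply
  • --all-namespaces (short form -A): sweep the whole cluster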

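Commit the golden troubleshooting chain "get → describe/logs → exec" to memory, add the two poles of apply and delete, and you have the core skills to see what is wrong, change it, and rescue the service in production.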
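
The chain above can be sketched as a single helper that runs the read-only steps in order and leaves exec as a deliberate manual step. `triage_pod` is a hypothetical name and the 100-line log tail is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Walk the "get → describe/logs → exec" chain for a single Pod.
# `triage_pod` is an illustrative name; it stops before exec so that
# opening a shell in the container stays a conscious decision.
triage_pod() {
  local pod="$1" ns="${2:-default}"
  echo "== status =="
  kubectl get po "$pod" -n "$ns" -o wide
  echo "== events and conditions =="
  kubectl describe po "$pod" -n "$ns"
  echo "== last 100 log lines =="
  kubectl logs "$pod" -n "$ns" --tail=100
  echo "next: kubectl exec -it $pod -n $ns -- /bin/sh"
}

# Usage (needs a reachable cluster):
#   triage_pod my-pod my-namespace
```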


For fun, I also tried the Warp terminal and asked it the same question:

I'll tell you about the core kubectl commands that are essential for managing Kubernetes clusters:

Core kubectl Commands

Cluster and Context Management
• kubectl config view - View kubeconfig settings
• kubectl config current-context - Show current context
• kubectl config use-context <context> - Switch between contexts
• kubectl cluster-info - Display cluster information

Resource Management
• kubectl get <resource> - List resources (pods, services, deployments, etc.)
• kubectl describe <resource> <name> - Show detailed information about a resource
• kubectl create -f <file> - Create resources from a file
• kubectl apply -f <file> - Apply configuration changes
• kubectl delete <resource> <name> - Delete a resource

Pod Operations
• kubectl get pods - List all pods
• kubectl logs <pod-name> - View pod logs
• kubectl exec -it <pod-name> -- <command> - Execute commands in a pod
• kubectl port-forward <pod-name> <local-port>:<pod-port> - Forward ports

Deployment Management
• kubectl create deployment <name> --image=<image> - Create a deployment
• kubectl scale deployment <name> --replicas=<count> - Scale deployment
• kubectl rollout status deployment/<name> - Check rollout status
• kubectl rollout history deployment/<name> - View rollout history

Service and Networking
• kubectl expose deployment <name> --port=<port> - Expose deployment as service
• kubectl get services - List services
• kubectl get ingress - List ingress resources

Debugging and Troubleshooting
• kubectl top nodes - Show node resource usage
• kubectl top pods - Show pod resource usage
• kubectl get events - View cluster events
• kubectl describe node <node-name> - Get node details

Common Resource Types
• pods (po) - Running containers
• services (svc) - Network services
• deployments (deploy) - Deployment configurations
• replicasets (rs) - Replica management
• configmaps (cm) - Configuration data
• secrets - Sensitive data
• nodes (no) - Cluster nodes
• namespaces (ns) - Resource isolation

Useful Flags
• -n <namespace> - Specify namespace
• --all-namespaces - Show resources across all namespaces
• -o yaml/json - Output in YAML or JSON format
• --watch - Watch for changes
• --dry-run=client - Preview changes without applying

Would you like me to demonstrate any of these commands or help you with specific kubectl operations?