
# Troubleshooting a case where k8s pods could not reach each other by IP
![img.png](../../k8s-node-pod-network-k8snodepod.png)
### Symptoms
- Node-to-node communication works normally.
- Pods on some nodes cannot communicate with other pods.
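
To confirm this symptom by hand, a plain TCP connect test against a pod IP is usually enough. The following is a minimal sketch, assuming Python is available wherever the check runs (for example in a debug pod); the helper names `can_connect` and `probe_all` are made up for this illustration, and the target port would be whatever the pods listen on (7101 in the manifest later in this post).

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


def probe_all(targets: list[tuple[str, int]], timeout: float = 3.0) -> dict[str, bool]:
    """Probe every (host, port) pair and report reachability keyed by "host:port"."""
    return {f"{host}:{port}": can_connect(host, port, timeout) for host, port in targets}
```

Running `probe_all` from a pod on each node against the IPs of pods on the other nodes gives a small reachability matrix; rows that are all `False` point at the node whose pod network is broken.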
### Identifying the problematic nodes
#### Start a network test tool
##### Prerequisites
- Docker
- A MySQL database
##### Deploy it as a StatefulSet
```yaml
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  annotations:
    k8s.kuboard.cn/displayName: 有状态内网穿透集群
  labels:
    k8s.kuboard.cn/layer: svc
    k8s.kuboard.cn/name: network-server-cluster-start
  name: network-server-cluster-start
  namespace: default
spec:
  podManagementPolicy: OrderedReady
  # Set this to the number of nodes you can schedule onto.
  replicas: 10
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s.kuboard.cn/layer: svc
      k8s.kuboard.cn/name: network-server-cluster-start
  serviceName: network-server-cluster-start
  template:
    metadata:
      labels:
        k8s.kuboard.cn/layer: svc
        k8s.kuboard.cn/name: network-server-cluster-start
    spec:
      affinity:
        # Required anti-affinity on the hostname: at most one pod per node.
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  k8s.kuboard.cn/layer: svc
                  k8s.kuboard.cn/name: network-server-cluster-start
              topologyKey: kubernetes.io/hostname
      containers:
        - env:
            # Replace mysql-host and mysql-port with your own database address.
            - name: spring.datasource.url
              value: >-
                jdbc:mysql://mysql-host:mysql-port/wu_lazy_cloud_netty_server_cluster?allowMultiQueries=true&useUnicode=true&autoReconnect=true&useAffectedRows=true&useSSL=false&serverTimezone=Asia/Shanghai&allowPublicKeyRetrieval=true&databaseTerm=SCHEMA
            - name: JAVA_OPTS
              value: '-Xms64m -Xmx128m'
            - name: spring.datasource.username
              value: root
            - name: spring.datasource.password
              value: laihui
            # Each instance identifies itself by the name of the node it runs on.
            - name: spring.lazy.netty.server.node-id
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: spec.nodeName
            - name: spring.lazy.netty.server.node-port
              value: '7101'
          envFrom:
            - configMapRef:
                name: wu-lazy-cloud-heartbeat-server-cluster-start-conf
          image: >-
            registry.cn-hangzhou.aliyuncs.com/wu-lazy/wu-lazy-cloud-heartbeat-server-cluster-start:1.2.7-JDK17-NATIVE-SNAPSHOT
          imagePullPolicy: Always
          name: network-server-cluster-start
          ports:
            - containerPort: 7101
              hostPort: 7101
              name: tcp7101
              protocol: TCP
          resources: {}
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
  updateStrategy:
    rollingUpdate:
      partition: 0
    type: RollingUpdate
---
apiVersion: v1
kind: Service
metadata:
  annotations: {}
  labels:
    k8s.kuboard.cn/layer: svc
    k8s.kuboard.cn/name: network-server-cluster-start
  name: network-server-cluster-start
  namespace: default
spec:
  ipFamilyPolicy: SingleStack
  ports:
    - name: 6eqe4d
      port: 7101
      protocol: TCP
      targetPort: 7101
  selector:
    k8s.kuboard.cn/layer: svc
    k8s.kuboard.cn/name: network-server-cluster-start
  sessionAffinity: None
  type: ClusterIP
---
apiVersion: v1
data:
  spring.lazy.netty.server.node-host: '${HOSTNAME}.network-server-cluster-start.default.svc.cluster.local'
kind: ConfigMap
metadata:
  name: wu-lazy-cloud-heartbeat-server-cluster-start-conf
  namespace: default
---
apiVersion: v1
kind: Service
metadata:
  annotations: {}
  name: network-server-cluster-start-web
  namespace: default
spec:
  ports:
    - name: sjmxma
      # 33201 is outside the default NodePort range (30000-32767); the API
      # server must allow it via --service-node-port-range.
      nodePort: 33201
      port: 6101
      protocol: TCP
      targetPort: 6101
  selector:
    k8s.kuboard.cn/name: network-server-cluster-start
  sessionAffinity: None
  type: NodePort
```
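
Because the anti-affinity rule allows at most one of these pods per node, it helps to see which node each pod landed on and which pod IP it received. The sketch below is only an illustration, not part of the original setup: it assumes `kubectl` is configured for the cluster, and the helper names `fetch_pod_list` and `pods_by_node` are hypothetical.

```python
import json
import subprocess

# Label taken from the manifest above.
SELECTOR = "k8s.kuboard.cn/name=network-server-cluster-start"


def fetch_pod_list(namespace: str = "default", selector: str = SELECTOR) -> dict:
    """Run kubectl and return the parsed PodList (requires cluster access)."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace, "-l", selector, "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def pods_by_node(pod_list: dict) -> dict[str, list[dict]]:
    """Group pods by spec.nodeName; unscheduled pods go under "<unscheduled>"."""
    grouped: dict[str, list[dict]] = {}
    for pod in pod_list.get("items", []):
        node = pod.get("spec", {}).get("nodeName") or "<unscheduled>"
        status = pod.get("status", {})
        grouped.setdefault(node, []).append({
            "pod": pod["metadata"]["name"],
            "ip": status.get("podIP"),
            "phase": status.get("phase"),
        })
    return grouped
```

Calling `pods_by_node(fetch_pod_list())` returns one entry per node; a pod that stays under `"<unscheduled>"` usually means there are more replicas than nodes available to the scheduler.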
##### Parameters that need adjusting
::: tip
Replace `mysql-host` and `mysql-port` with the host and port of your own database.

Set the replica count to the number of nodes you can schedule onto.
:::
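
One way to arrive at that replica count is to count the nodes that are Ready, not cordoned, and free of scheduling taints. This is a rough sketch under those assumptions (the pod template declares no tolerations, so any `NoSchedule` or `NoExecute` taint is treated as blocking); `fetch_node_list` and `schedulable_node_names` are hypothetical helper names, and `kubectl` is assumed to be configured for the cluster.

```python
import json
import subprocess


def fetch_node_list() -> dict:
    """Run kubectl and return the parsed NodeList (requires cluster access)."""
    result = subprocess.run(
        ["kubectl", "get", "nodes", "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def schedulable_node_names(node_list: dict) -> list[str]:
    """Return names of nodes that are Ready, not cordoned, and untainted."""
    names = []
    for node in node_list.get("items", []):
        spec = node.get("spec", {})
        if spec.get("unschedulable"):
            continue  # cordoned node
        taints = spec.get("taints") or []
        if any(t.get("effect") in ("NoSchedule", "NoExecute") for t in taints):
            continue  # assumption: the pods tolerate no taints
        conditions = node.get("status", {}).get("conditions", [])
        if any(c.get("type") == "Ready" and c.get("status") == "True" for c in conditions):
            names.append(node["metadata"]["name"])
    return names
```

`len(schedulable_node_names(fetch_node_list()))` is then a reasonable value for `replicas`; with a higher value the extra pods cannot be placed, because the anti-affinity rule permits only one pod per node.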
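##### Open `http://<cluster-IP>:33201/netty-server-ui/index.html` (default username/password: admin/admin)
##### Initialize the menus, add a role, and grant the role to the user
##### Open the cluster management page (to see which nodes are in an abnormal state)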
![Cluster management page](https://img-blog.csdnimg.cn/direct/88a473b88ee24e86b4a4915c98652636.png)