Using Probes for Container Health Checks
Updated: 2024-09-25
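In Kubernetes, container health checks are performed periodically by the kubelet, which uses liveness, readiness, and startup probes to check the state and operation of each container. BCI currently supports the following probes: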
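Probe | Description | Use case |
---|---|---|
Liveness Probe | Checks whether the container is running normally. If the check succeeds, the container is considered healthy. If the check fails, the system handles the container according to the configured container restart policy. If no liveness probe is configured, the container is assumed to be running normally at all times. | When an application is running but cannot make further progress, the liveness probe catches the deadlock and restarts the container, so the application keeps running despite the bug. A long-running application may also eventually enter a broken state that only a restart can recover from. The liveness probe detects and remedies this situation. |
Readiness Probe | Checks whether the container is ready to serve requests. If the check succeeds, the container is ready and can receive traffic. If the check fails, the container is not ready, and the system stops sending requests to it until a later check succeeds. | The application is temporarily unable to serve external traffic, for example because it must load a large amount of data or configuration files during startup. If you want neither to terminate the application nor to send it requests, a readiness probe detects and mitigates this situation. |
Startup Probe | Checks whether the container has started successfully. Some existing applications need a long initialization time at startup. In that case, it is tricky to set liveness probe parameters without compromising the fast response to deadlocks. | Suppose service A starts slowly and needs 60s. With only a liveness probe, the container can fall into a restart loop: the service is not yet up when the liveness probe starts checking, the probe fails, and the restartPolicy is triggered. Simply increasing the liveness probe's initialDelaySeconds is inflexible and may cause further problems, so a startup probe is used to mitigate this situation. |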
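Configuration Examples
You can set a liveness, readiness, or startup probe through the container's livenessProbe, readinessProbe, and startupProbe fields. For configuration details, see Configure Liveness, Readiness and Startup Probes.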
1. Configure a Liveness Probe
```yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    myannotation: "myannotation"
  labels:
    app: bci-test-vk
    mylabel: "mylabel"
  name: liveness-test
  namespace: default
spec:
  enableServiceLinks: false
  nodeSelector:
    type: virtual-kubelet
  tolerations:
  - effect: NoSchedule
    key: virtual-kubelet.io/provider
    operator: Equal
    value: baidu
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  containers:
  - image: hub.baidubce.com/cce/nginx-alpine-go
    imagePullPolicy: IfNotPresent
    name: c01
    workingDir: /work
    ports:
    - containerPort: 8080
      protocol: TCP
    resources:
      limits:
        cpu: 250m
        memory: 512Mi
      requests:
        cpu: 250m
        memory: 512Mi
    livenessProbe:
      exec:
        command:
        - /bin/sh
        - '-c'
        - sleep 1 && exit 0     # placeholder check that always succeeds
      failureThreshold: 3       # act on the container after 3 consecutive failures
      initialDelaySeconds: 5    # wait 5s after the container starts before the first probe
      periodSeconds: 30         # probe every 30s
      successThreshold: 1       # must be 1 for a liveness probe
      timeoutSeconds: 10        # a probe that runs longer than 10s counts as a failure
```
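The table above says that a failed liveness check is handled according to the container restart policy. The following is a minimal sketch of where that policy sits in the manifest, assuming BCI honors the standard Kubernetes `restartPolicy` field in the same way as upstream Kubernetes (this page does not confirm it). The Pod name `liveness-restart-demo` is hypothetical, and the scheduling fields (`nodeSelector`, `tolerations`) from the example above are omitted for brevity.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: liveness-restart-demo   # hypothetical name, for illustration only
  namespace: default
spec:
  # Pod-level field; allowed values in Kubernetes are Always (default), OnFailure, Never.
  restartPolicy: Always
  containers:
  - name: c01
    image: hub.baidubce.com/cce/nginx-alpine-go
    livenessProbe:
      exec:
        command:
        - /bin/sh
        - '-c'
        - sleep 1 && exit 0   # placeholder check that always succeeds
      periodSeconds: 30
      failureThreshold: 3     # 3 consecutive failures, about 90s of failed checks
```

In upstream Kubernetes, a container that fails its liveness probe is killed in every case; `restartPolicy` only decides whether it is started again afterwards.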
2. Configure a Readiness Probe
```yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    myannotation: "myannotation"
  labels:
    app: bci-test-vk
    mylabel: "mylabel"
  name: readiness-test
  namespace: default
spec:
  enableServiceLinks: false
  nodeSelector:
    type: virtual-kubelet
  tolerations:
  - effect: NoSchedule
    key: virtual-kubelet.io/provider
    operator: Equal
    value: baidu
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  containers:
  - image: hub.baidubce.com/cce/nginx-alpine-go
    imagePullPolicy: IfNotPresent
    name: c01
    workingDir: /work
    ports:
    - containerPort: 8080
      protocol: TCP
    resources:
      limits:
        cpu: 250m
        memory: 512Mi
      requests:
        cpu: 250m
        memory: 512Mi
    readinessProbe:
      exec:
        command:
        - /bin/sh
        - '-c'
        - sleep 1 && exit 0     # placeholder check that always succeeds
      failureThreshold: 3       # mark the container not ready after 3 consecutive failures
      initialDelaySeconds: 5    # wait 5s after the container starts before the first probe
      periodSeconds: 30         # probe every 30s
      successThreshold: 1       # one success marks the container ready again
      timeoutSeconds: 10        # a probe that runs longer than 10s counts as a failure
```
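The placeholder command in the example above always succeeds, so it never actually holds traffic back. For the use case described in the table (an application that loads data or configuration files during startup), a readiness check could look like the sketch below. It assumes the application creates a marker file once loading has finished; the path `/tmp/ready` is hypothetical, and the image is assumed to provide `cat`. The fragment belongs under a container entry in `spec.containers`.

```yaml
readinessProbe:
  exec:
    command:
    - cat
    - /tmp/ready            # hypothetical marker file written by the application when loading is done
  initialDelaySeconds: 5    # wait 5s after the container starts before the first probe
  periodSeconds: 10         # probe every 10s so readiness is noticed quickly
  failureThreshold: 3       # not ready after 3 consecutive failures
  successThreshold: 1       # ready again after one success
  timeoutSeconds: 10
```

Until the file exists, `cat` exits with a non-zero status, the probe fails, and the container receives no requests. The container itself is not restarted.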
3. Configure a Startup Probe
```yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    myannotation: "myannotation"
  labels:
    app: bci-test-vk
    mylabel: "mylabel"
  name: startup-test
  namespace: default
spec:
  enableServiceLinks: false
  nodeSelector:
    type: virtual-kubelet
  tolerations:
  - effect: NoSchedule
    key: virtual-kubelet.io/provider
    operator: Equal
    value: baidu
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  containers:
  - image: hub.baidubce.com/cce/nginx-alpine-go
    imagePullPolicy: IfNotPresent
    name: c01
    workingDir: /work
    ports:
    - containerPort: 8080
      protocol: TCP
    resources:
      limits:
        cpu: 250m
        memory: 512Mi
      requests:
        cpu: 250m
        memory: 512Mi
    startupProbe:
      exec:
        command:
        - /bin/sh
        - '-c'
        - sleep 1 && exit 0     # placeholder check that always succeeds
      failureThreshold: 3       # startup is treated as failed after 3 consecutive failures
      initialDelaySeconds: 5    # wait 5s after the container starts before the first probe
      periodSeconds: 30         # probe every 30s
      successThreshold: 1       # must be 1 for a startup probe
      timeoutSeconds: 10        # a probe that runs longer than 10s counts as a failure
```
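The slow-start scenario from the table (service A needs about 60s to start) is usually handled by pairing a startup probe with a liveness probe. The sketch below illustrates that pairing under assumptions: the marker file `/tmp/started` and the chosen thresholds are hypothetical, and it relies on the upstream Kubernetes behavior that liveness and readiness probes do not run until the startup probe has succeeded. The fragment belongs under a container entry in `spec.containers`.

```yaml
startupProbe:
  exec:
    command:
    - cat
    - /tmp/started          # hypothetical marker file written once initialization is complete
  periodSeconds: 5          # check every 5s during startup
  failureThreshold: 18      # 18 x 5s = 90s startup budget, which leaves margin over the 60s startup time
livenessProbe:
  exec:
    command:
    - /bin/sh
    - '-c'
    - sleep 1 && exit 0     # placeholder check, replace with a real health check
  periodSeconds: 10         # after startup, a deadlock is detected within about 30s
  failureThreshold: 3
  timeoutSeconds: 10
```

The startup budget is `failureThreshold` multiplied by `periodSeconds`, so the liveness probe can keep a short period and a low threshold without restarting the container while it is still initializing.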