Using Probes for Container Health Checks
Last updated: 2024-09-25
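In Kubernetes, container health checks are performed periodically by the kubelet, which uses probes such as liveness and readiness probes to check the state and behavior of each container. BCI currently supports the following probes: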
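| Probe | Description | Use case |
|---|---|---|
| Liveness probe (Liveness Probe) | Checks whether the container is running normally. If the check succeeds, the container is considered healthy. If the check fails, the system handles the container according to the configured restart policy. If this probe is not configured, the container is always assumed to be running normally. | When the application is running but cannot make further progress, the liveness probe detects the deadlock and restarts the container, so the application stays available despite the bug. In addition, a long-running application may eventually enter a broken state that only a restart can recover from. The liveness probe detects and remedies this situation. |
| Readiness probe (Readiness Probe) | Checks whether the container is ready to serve requests. If the check succeeds, the container is ready and can receive business requests. If the check fails, the container is not ready, and the system stops sending requests to it until a later check succeeds. | If the application temporarily cannot serve external traffic, for example because it must load a large amount of data or configuration files during startup, and you want neither to terminate it nor to send it requests, the readiness probe can detect and mitigate this situation. |
| Startup probe (Startup Probe) | Checks whether the container has started successfully. Some existing applications need a long initialization time at startup. In that case, it is tricky to choose liveness probe parameters that tolerate the slow startup without giving up a fast response to deadlocks. | Suppose service A starts slowly and needs 60s. With only a liveness probe, the container can get stuck in a restart loop: when the liveness probe begins probing, the service is not yet up, the probe fails, and restartPolicy is triggered. Simply increasing the liveness probe's initialDelaySeconds is inflexible and may cause further problems, so a startup probe can be used to mitigate this situation. |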
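Configuration examples
You can configure the liveness, readiness, and startup probes through the container's livenessProbe, readinessProbe, and startupProbe fields. For details, see Configure Liveness, Readiness and Startup Probes.
1. Configure the liveness probe (Liveness Probe)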
```yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    myannotation: "myannotation"
  labels:
    app: bci-test-vk
    mylabel: "mylabel"
  name: liveness-test
  namespace: default
spec:
  enableServiceLinks: false
  # Select nodes labeled type=virtual-kubelet
  nodeSelector:
    type: virtual-kubelet
  tolerations:
    # Tolerate the virtual-kubelet provider taint
    - effect: NoSchedule
      key: virtual-kubelet.io/provider
      operator: Equal
      value: baidu
    - effect: NoExecute
      key: node.kubernetes.io/not-ready
      operator: Exists
      tolerationSeconds: 300
    - effect: NoExecute
      key: node.kubernetes.io/unreachable
      operator: Exists
      tolerationSeconds: 300
  containers:
  - image: hub.baidubce.com/cce/nginx-alpine-go
    imagePullPolicy: IfNotPresent
    name: c01
    workingDir: /work
    ports:
    - containerPort: 8080
      protocol: TCP
    resources:
      limits:
        cpu: 250m
        memory: 512Mi
      requests:
        cpu: 250m
        memory: 512Mi
    livenessProbe:
      exec:
        command:
          - /bin/sh
          - '-c'
          # Demo command: always exits 0, so this probe always succeeds
          - sleep 1 && exit 0
      failureThreshold: 3      # consecutive failures before the container is restarted
      initialDelaySeconds: 5   # delay after container start before the first probe
      periodSeconds: 30        # interval between probes
      successThreshold: 1      # must be 1 for a liveness probe
      timeoutSeconds: 10       # a probe that takes longer counts as a failure
```
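The demo command above always exits with 0, so it never triggers a restart. As a rough sketch of a check that can actually fail, the fragment below replaces only the probe of container c01 with an exec command that requests a health endpoint inside the container. It rests on assumptions not stated in this page: that the image ships a `wget` binary, and that the application serves a `/healthz` path on port 8080 (the path is hypothetical; the port is taken from the example). An exec probe is used because exec is the only probe handler this page demonstrates.

```yaml
# Fragment of spec.containers; all other fields stay as in the example above.
containers:
- name: c01
  image: hub.baidubce.com/cce/nginx-alpine-go
  livenessProbe:
    exec:
      command:
        - /bin/sh
        - '-c'
        # Assumption: wget exists in the image; /healthz is a hypothetical endpoint.
        # wget exits non-zero when the request fails, which fails the probe.
        - wget -q -O /dev/null http://127.0.0.1:8080/healthz
    initialDelaySeconds: 5
    periodSeconds: 30
    timeoutSeconds: 10
    failureThreshold: 3   # restart after 3 consecutive failed probes
    successThreshold: 1
```

With these values, a container that stops answering is restarted after three consecutive failed probes, that is, after roughly 3 × 30s of continuous failure.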
2. Configure the readiness probe (Readiness Probe)
```yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    myannotation: "myannotation"
  labels:
    app: bci-test-vk
    mylabel: "mylabel"
  name: readiness-test
  namespace: default
spec:
  enableServiceLinks: false
  # Select nodes labeled type=virtual-kubelet
  nodeSelector:
    type: virtual-kubelet
  tolerations:
    # Tolerate the virtual-kubelet provider taint
    - effect: NoSchedule
      key: virtual-kubelet.io/provider
      operator: Equal
      value: baidu
    - effect: NoExecute
      key: node.kubernetes.io/not-ready
      operator: Exists
      tolerationSeconds: 300
    - effect: NoExecute
      key: node.kubernetes.io/unreachable
      operator: Exists
      tolerationSeconds: 300
  containers:
  - image: hub.baidubce.com/cce/nginx-alpine-go
    imagePullPolicy: IfNotPresent
    name: c01
    workingDir: /work
    ports:
    - containerPort: 8080
      protocol: TCP
    resources:
      limits:
        cpu: 250m
        memory: 512Mi
      requests:
        cpu: 250m
        memory: 512Mi
    readinessProbe:
      exec:
        command:
          - /bin/sh
          - '-c'
          # Demo command: always exits 0, so this probe always succeeds
          - sleep 1 && exit 0
      failureThreshold: 3      # consecutive failures before the container is marked not ready
      initialDelaySeconds: 5   # delay after container start before the first probe
      periodSeconds: 30        # interval between probes
      successThreshold: 1      # consecutive successes needed to be marked ready
      timeoutSeconds: 10       # a probe that takes longer counts as a failure
```
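For the use case described in the table, where an application loads a large amount of data before it can serve traffic, one common pattern is to have the application create a marker file once loading is done and let the readiness probe test for that file. The fragment below is a minimal sketch under that assumption. The path `/tmp/ready` is hypothetical, and the application itself is responsible for creating it.

```yaml
# Fragment of spec.containers; all other fields stay as in the example above.
containers:
- name: c01
  image: hub.baidubce.com/cce/nginx-alpine-go
  readinessProbe:
    exec:
      command:
        - /bin/sh
        - '-c'
        # Hypothetical marker file written by the application after loading finishes
        - test -f /tmp/ready
    initialDelaySeconds: 5
    periodSeconds: 5      # probe often so the container receives traffic soon after it is ready
    timeoutSeconds: 10
    failureThreshold: 3
    successThreshold: 1
```

Until the file exists, the probe fails and the container receives no requests, but it is not restarted, which matches the behavior described for the readiness probe.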
3. Configure the startup probe (Startup Probe)
```yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    myannotation: "myannotation"
  labels:
    app: bci-test-vk
    mylabel: "mylabel"
  name: startup-test
  namespace: default
spec:
  enableServiceLinks: false
  # Select nodes labeled type=virtual-kubelet
  nodeSelector:
    type: virtual-kubelet
  tolerations:
    # Tolerate the virtual-kubelet provider taint
    - effect: NoSchedule
      key: virtual-kubelet.io/provider
      operator: Equal
      value: baidu
    - effect: NoExecute
      key: node.kubernetes.io/not-ready
      operator: Exists
      tolerationSeconds: 300
    - effect: NoExecute
      key: node.kubernetes.io/unreachable
      operator: Exists
      tolerationSeconds: 300
  containers:
  - image: hub.baidubce.com/cce/nginx-alpine-go
    imagePullPolicy: IfNotPresent
    name: c01
    workingDir: /work
    ports:
    - containerPort: 8080
      protocol: TCP
    resources:
      limits:
        cpu: 250m
        memory: 512Mi
      requests:
        cpu: 250m
        memory: 512Mi
    startupProbe:
      exec:
        command:
          - /bin/sh
          - '-c'
          # Demo command: always exits 0, so this probe always succeeds
          - sleep 1 && exit 0
      failureThreshold: 3      # consecutive failures before the container is killed
      initialDelaySeconds: 5   # delay after container start before the first probe
      periodSeconds: 30        # interval between probes
      successThreshold: 1      # must be 1 for a startup probe
      timeoutSeconds: 10       # a probe that takes longer counts as a failure
```
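The example above configures only a startup probe. For the slow-starting service described in the table (about 60s to start), a startup probe is typically paired with a liveness probe: Kubernetes holds back the liveness and readiness probes until the startup probe has succeeded, and the startup budget is roughly failureThreshold × periodSeconds. The fragment below is a sketch of that pairing, not a recommended setting. The marker file `/tmp/started` and the specific thresholds are assumptions chosen for illustration.

```yaml
# Fragment of spec.containers; all other fields stay as in the example above.
containers:
- name: c01
  image: hub.baidubce.com/cce/nginx-alpine-go
  startupProbe:
    exec:
      # Hypothetical marker file written by the application once startup completes
      command: ["/bin/sh", "-c", "test -f /tmp/started"]
    periodSeconds: 10
    failureThreshold: 12   # 12 x 10s = 120s budget, about twice the 60s startup time
    timeoutSeconds: 5
  livenessProbe:
    exec:
      # Same hypothetical check; it starts only after the startup probe has succeeded
      command: ["/bin/sh", "-c", "test -f /tmp/started"]
    periodSeconds: 10
    failureThreshold: 3    # after startup, react to failures within about 30s
    timeoutSeconds: 5
```

This keeps the liveness probe strict enough to respond quickly to deadlocks while still giving the service time to start, without relying on a large initialDelaySeconds. If the startup probe never succeeds within the budget, the container is killed and handled according to restartPolicy.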
            