Fault Diagnosis APIs
Last updated: 2025-08-04
Create a diagnosis task
Description
Starts a diagnosis task for a cluster.
Request structure
POST /v2/cluster/{clusterId}/diagnosis HTTP/1.1
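Request headers
No special headers other than the common headers.
Request parameters
Parameter | Type | Required | Location | Description |
---|---|---|---|---|
clusterId | String | Yes | Path | Cluster ID |
type | String | Yes | Body | Diagnosis type, such as pod or node |
target | Target | Yes | Body | Diagnosis target |
exsitedOption | ExistedOption | No | Body | Retry option |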
Response headers
No special headers other than the common headers.
Response parameters
Parameter | Type | Description |
---|---|---|
taskId | String | Diagnosis task ID |
Request example
POST /v2/cluster/{clusterId}/diagnosis HTTP/1.1
Host: cce.bj.baidubce.com
Content-Type: application/json
Authorization: bce-auth-v1/f81d3b34e48048fbb2634dc7882d7e21/2019-03-11T04:17:29Z/3600/host/74c506f68c65e26c633bfa104c863fffac5190fdec1ec24b7c03eb5d67d2e1de

{
  "type": "pod",
  "target": {
    "nodeName": "192.168.4.77",
    "namespace": "kube-system",
    "podName": "nfd-worker-kq8v8"
  }
}
Response example
Content-Type: application/json; charset=utf-8
Date: Thu, 28 Jul 2022 03:25:43 GMT
X-Bce-Gateway-Region: BJ
X-Bce-Request-Id: b42840ec-a200-49c9-86bd-58687b7009bb

{"taskId":"cce-3bbg57ai-20250123-ab912e9c"}
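As a rough illustration of the request shape described in this section, the Python sketch below assembles the method, path, headers, and JSON body of a create call without sending anything. The helper name `build_create_diagnosis_request` and the cluster ID `cce-3bbg57ai` are placeholders chosen for the example, and request signing (the `Authorization` header) is deliberately left out.

```python
import json


def build_create_diagnosis_request(cluster_id, diag_type, target):
    """Assemble (method, path, headers, body) for a create-diagnosis call.

    Hypothetical helper: it only builds the request pieces and does not
    sign or send them.
    """
    if not cluster_id:
        raise ValueError("cluster_id is required")
    path = f"/v2/cluster/{cluster_id}/diagnosis"
    headers = {"Content-Type": "application/json"}
    body = json.dumps({"type": diag_type, "target": target})
    return "POST", path, headers, body


# Placeholder cluster ID and the target values from the request example.
method, path, headers, body = build_create_diagnosis_request(
    "cce-3bbg57ai",
    "pod",
    {
        "nodeName": "192.168.4.77",
        "namespace": "kube-system",
        "podName": "nfd-worker-kq8v8",
    },
)
print(method, path)
print(body)
```

Keeping request construction separate from signing and transport makes the payload easy to inspect before any call is made.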
List cluster diagnosis tasks
Description
Returns the list of diagnosis tasks for a cluster.
Request structure
GET /v2/cluster/{clusterId}/diagnoses?type=node HTTP/1.1
Request headers
No special headers other than the common headers.
Request parameters
Parameter | Type | Required | Location | Description |
---|---|---|---|---|
clusterId | String | Yes | Path | Cluster ID |
type | String | Yes | Query | Diagnosis type: pod or node |
pageSize | Integer | No | Query | Number of diagnosis tasks per page. Range: 1-100. Default: 10 |
pageNo | Integer | No | Query | Page number. Default: 1 |
order | String | No | Query | Sort order of the list. Default: desc |
orderBy | String | No | Query | Sort field of the list: inspectStartTime or inspectEndTime. Defaults to the diagnosis start time |
resultFilter | String | No | Query | Filter by result. Values: "doing", "succeeded", "failed" |
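The query parameters above can be combined into a request path along the following lines. This is a minimal sketch: the function name `build_list_path` is hypothetical, and the client-side checks simply mirror the ranges and values stated in the table rather than any documented server behavior.

```python
from urllib.parse import urlencode


def build_list_path(cluster_id, diag_type, page_no=1, page_size=10,
                    result_filter=None):
    """Build the path and query string for listing diagnosis tasks."""
    if diag_type not in ("pod", "node"):
        raise ValueError("type must be 'pod' or 'node'")
    if not 1 <= page_size <= 100:
        raise ValueError("pageSize must be between 1 and 100")
    if result_filter not in (None, "doing", "succeeded", "failed"):
        raise ValueError("unsupported resultFilter")

    params = {"type": diag_type, "pageNo": page_no, "pageSize": page_size}
    if result_filter is not None:
        params["resultFilter"] = result_filter
    return f"/v2/cluster/{cluster_id}/diagnoses?{urlencode(params)}"


list_path = build_list_path("cce-e5kxhgpb", "node", result_filter="failed")
print(list_path)
```

Validating locally is optional; it only surfaces obvious mistakes before a request is signed and sent.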
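Response headers
No special headers other than the common headers.
Response parameters
Parameter | Type | Description |
---|---|---|
orderBy | String | Field the report list is sorted by. Defaults to the diagnosis start time |
order | String | Sort order of the report list: desc (descending, default) or asc (ascending) |
pageNo | Integer | Current page number of the report list |
pageSize | Integer | Number of diagnosis reports on the current page |
totalCount | Integer | Total number of diagnosis reports |
resultFilter | String | Result filter |
diagnosisReports | List | List of diagnosis reports |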
Request example
GET /v2/cluster/cce-e5kxhgpb/diagnoses?type=node HTTP/1.1
Host: cce.bj.baidubce.com
Content-Type: application/json
Authorization: bce-auth-v1/f81d3b34e48048fbb2634dc7882d7e21/2019-03-11T04:17:29Z/3600/host/74c506f68c65e26c633bfa104c863fffac5190fdec1ec24b7c03eb5d67d2e1de
Response example
Content-Type: application/json; charset=utf-8
Date: Thu, 28 Jul 2022 03:25:43 GMT
X-Bce-Gateway-Region: BJ
X-Bce-Request-Id: b42840ec-a200-49c9-86bd-58687b7009bb

{
  "diagnosisReports": [
    {
      "diagnosisType": "node",
      "endTime": "2025-03-07T20:59:04+08:00",
      "result": "succeeded",
      "startTime": "2025-03-07T20:58:35+08:00",
      "target": {
        "namespace": "",
        "nodeName": "10.0.0.128",
        "podName": ""
      },
      "taskId": "cce-e5kxhgpb-20250307-353aed59",
      "taskPhases": [
        {
          "phaseId": 1,
          "phaseName": "identifying",
          "phaseResult": "succeeded",
          "phaseTime": "2025-03-07T20:58:35+08:00"
        },
        {
          "phaseId": 2,
          "phaseName": "collecting",
          "phaseResult": "succeeded",
          "phaseTime": "2025-03-07T20:59:03.659762883+08:00"
        },
        {
          "phaseId": 3,
          "phaseName": "evaluating",
          "phaseResult": "succeeded",
          "phaseTime": "2025-03-07T20:59:03.659762883+08:00"
        },
        {
          "phaseId": 4,
          "phaseName": "analyzing",
          "phaseResult": "succeeded",
          "phaseTime": "2025-03-07T20:59:04.115190339+08:00"
        }
      ]
    },
    {
      "diagnosisType": "node",
      "endTime": "2025-03-07T20:37:21+08:00",
      "result": "succeeded",
      "startTime": "2025-03-07T20:36:45+08:00",
      "target": {
        "namespace": "",
        "nodeName": "10.0.0.128",
        "podName": ""
      },
      "taskId": "cce-e5kxhgpb-20250307-0523eb62",
      "taskPhases": [
        {
          "phaseId": 1,
          "phaseName": "identifying",
          "phaseResult": "succeeded",
          "phaseTime": "2025-03-07T20:36:45+08:00"
        },
        {
          "phaseId": 2,
          "phaseName": "collecting",
          "phaseResult": "succeeded",
          "phaseTime": "2025-03-07T20:37:20.155058924+08:00"
        },
        {
          "phaseId": 3,
          "phaseName": "evaluating",
          "phaseResult": "succeeded",
          "phaseTime": "2025-03-07T20:37:20.155058924+08:00"
        },
        {
          "phaseId": 4,
          "phaseName": "analyzing",
          "phaseResult": "succeeded",
          "phaseTime": "2025-03-07T20:37:20.72098931+08:00"
        }
      ]
    }
  ],
  "order": "desc",
  "orderBy": "task_start_time",
  "pageNo": 1,
  "pageSize": 10,
  "totalCount": 2
}
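One common use of the list response is to see how long each task ran. The sketch below computes that from the `startTime` and `endTime` fields of a single `diagnosisReports` entry. The function name is hypothetical. Only these two fields are parsed, because the `phaseTime` values carry nanosecond fractions that `datetime.fromisoformat` may not accept on every Python version.

```python
from datetime import datetime


def task_duration_seconds(report):
    """Return endTime minus startTime, in seconds, for one report entry."""
    start = datetime.fromisoformat(report["startTime"])
    end = datetime.fromisoformat(report["endTime"])
    return (end - start).total_seconds()


# Fields copied from the first entry of the response example.
sample = {
    "taskId": "cce-e5kxhgpb-20250307-353aed59",
    "startTime": "2025-03-07T20:58:35+08:00",
    "endTime": "2025-03-07T20:59:04+08:00",
}
duration = task_duration_seconds(sample)
print(f"{sample['taskId']} ran for {duration:.0f} s")
```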
Get cluster diagnosis report details
Description
Returns the details of a cluster diagnosis report.
Request structure
GET /v2/cluster/{clusterId}/diagnosis/{taskId}/report HTTP/1.1
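Request headers
No special headers other than the common headers.
Request parameters
Parameter | Type | Required | Location | Description |
---|---|---|---|---|
clusterId | String | Yes | Path | Cluster ID |
taskId | String | Yes | Path | Diagnosis task ID |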
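Response headers
No special headers other than the common headers.
Response parameters
Parameter | Type | Description |
---|---|---|
taskId | String | Diagnosis task ID |
diagnosisType | String | Diagnosis type: pod or node |
completed | Bool | Whether the diagnosis report is complete |
taskResult | String | Diagnosis task result: normal (healthy), abnormal (abnormal), doing (in progress), failed (diagnosis failed) |
startTime | DateTime | Task start time |
endTime | DateTime | Task end time |
itemsCount | Integer | Number of diagnosis items executed by the task |
target | Target | Diagnosis target |
reportItems | DiagnosisReportItem | List of diagnosis items |
conclusion | Conclusion | Diagnosis conclusion |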
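Because the report carries a `completed` flag, a client would typically poll until it becomes true. The following is a minimal polling sketch under stated assumptions: `fetch_report` is a hypothetical caller-supplied function that performs the signed GET request and returns the parsed JSON, and the interval and attempt limit are arbitrary illustrative values, not documented recommendations.

```python
import time


def wait_for_report(fetch_report, interval_s=5.0, max_attempts=60,
                    sleep=time.sleep):
    """Call fetch_report() until the report says completed, then return it."""
    for attempt in range(max_attempts):
        report = fetch_report()
        if report.get("completed"):
            return report
        # Do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError("diagnosis report did not complete in time")


# Demo with canned responses in place of real HTTP calls.
canned = iter([
    {"completed": False, "taskResult": "doing"},
    {"completed": True, "taskResult": "normal"},
])
final_report = wait_for_report(lambda: next(canned), interval_s=0.0)
print(final_report["taskResult"])
```

Passing `sleep` as a parameter keeps the loop testable without real delays.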
Request example
GET /v2/cluster/cce-bzbdgc04/diagnosis/cce-bzbdgc04-20250307-827d32d7/report HTTP/1.1
Host: cce.bj.baidubce.com
Content-Type: application/json
Authorization: bce-auth-v1/f81d3b34e48048fbb2634dc7882d7e21/2019-03-11T04:17:29Z/3600/host/74c506f68c65e26c633bfa104c863fffac5190fdec1ec24b7c03eb5d67d2e1de
Response example
Content-Type: application/json; charset=utf-8
Date: Thu, 28 Jul 2022 03:25:43 GMT
X-Bce-Gateway-Region: BJ
X-Bce-Request-Id: b42840ec-a200-49c9-86bd-58687b7009bb

{
  "completed": true,
  "conclusion": {
    "cause": "",
    "problem": "",
    "result": "normal",
    "suggestion": ""
  },
  "diagnosisType": "pod",
  "endTime": "2025-03-07T21:09:50+08:00",
  "itemsCount": 73,
  "reportItems": {
    "Cluster Component": {
      "ClusterBLBGroupDiag": {
        "composedResult": "normal",
        "description": "检查集群 API Server BLB 6443端口对应目标组是否配置正常。若配置异常,可能导致集群无法访问。",
        "enable": true,
        "exactMessage": "",
        "grade": "error",
        "itemId": 740004,
        "itemNameZH": "APIServer BLB 6443端口目标组配置是否正常",
        "result": true,
        "suggestion": "1. 前往BLB应用型实例页面找到集群关联的BLB实例,检查BLB实例目标组的配置。2. 如果找不到BLB实例,请提CCE工单。",
        "value": "Normal"
      },
      "ClusterBLBInstanceDiag": {
        "composedResult": "normal",
        "description": "检查集群 API Server负载均衡实例是否存在。若集群API Server负载均衡实例不存在,会造成集群不可用。",
        "enable": true,
        "exactMessage": "",
        "grade": "error",
        "itemId": 740002,
        "itemNameZH": "APIServer BLB 实例是否存在",
        "result": true,
        "suggestion": "1. 前往BLB应用型实例页面检查集群关联的BLB实例是否存在。2. 如果找不到BLB实例,请提CCE工单。",
        "value": "Normal"
      },
      "ClusterBLBInstanceStatusDiag": {
        "composedResult": "normal",
        "description": "检查集群 API Server BLB实例状态。若实例状态异常,将会影响集群可用性。",
        "enable": true,
        "exactMessage": "",
        "grade": "error",
        "itemId": 740003,
        "itemNameZH": "APIServer BLB 实例状态是否正常",
        "result": true,
        "suggestion": "1. 前往BLB应用型实例页面找到集群关联的BLB实例,在实例详情里检查BLB实例状态。2。 如果找不到BLB实例,请提CCE工单。",
        "value": "Normal"
      },
      "ClusterBLBPortConfigDiag": {
        "composedResult": "normal",
        "description": "检查集群 API Server BLB 6443端口监听配置。若配置异常,将导致集群无法访问。",
        "enable": true,
        "exactMessage": "",
        "grade": "error",
        "itemId": 740001,
        "itemNameZH": "APIServer BLB 6443端口监听配置是否正常",
        "result": true,
        "suggestion": "1. 前往BLB应用型实例页面找到集群关联的BLB实例,检查BLB实例监听设置。2. 如果找不到BLB实例,请提CCE工单。",
        "value": "Normal"
      },
      "DNSServiceAvailableDiag": {
        "composedResult": "normal",
        "description": "检查集群 DNS 服务的 Cluster IP 是否正常分配,集群 DNS 服务异常会造成集群功能异常,影响业务。",
        "enable": true,
        "exactMessage": "",
        "grade": "error",
        "itemId": 740005,
        "itemNameZH": "DNS Service是否正常",
        "result": true,
        "suggestion": "检查 CoreDNS Pod运行状态和运行日志,排查 DNS 问题。",
        "value": "Normal"
      },
      "VPCENINodeAvailableZoneDiag": {
        "composedResult": "normal",
        "description": "检查 VPC-ENI 模式下,检查节点和容器子网是否在同一可用区,并且同一可用区下的容器子网剩余 IP 数大于5个。不在同一可用区的节点将无法正常工作。",
        "enable": true,
        "exactMessage": "",
        "grade": "error",
        "itemId": 740007,
        "itemNameZH": "VPC-ENI模式节点和容器子网是否在同一可用区",
        "result": true,
        "suggestion": "查看 CCE 集群详情,找到容器网络,添加与节点同一可用区的子网。",
        "value": "Normal"
      },
      "VPCENIRemainingIPDiag": {
        "composedResult": "normal",
        "description": "检查 VPC-ENI 模式下,集群内配置的子网剩余IP是否小于10个,每个 Pod 占用一个IP。当可用IP耗尽后,新创建的 Pod 分配不到IP,所以无法正常启动。",
        "enable": true,
        "exactMessage": "",
        "grade": "warning",
        "itemId": 740008,
        "itemNameZH": "VPC-ENI模式子网剩余IP数不足",
        "result": true,
        "suggestion": "查看 CCE 集群详情,找到容器网络,添加与节点同一可用区的子网。",
        "value": "Normal"
      },
      "VPCRouteRemainingPodCIDRDiag": {
        "composedResult": "normal",
        "description": "检查 VPC 路由模式下,集群剩余可用 PodCIDR 网段是否少于5个。每个节点消耗一个 PodCIDR 网段,集群可添加的节点少于5个。Pod 网段耗尽后,新添加的节点将无法正常工作。",
        "enable": true,
        "exactMessage": "cluster mode is not VPCNetwork, current mode: vpc-eni",
        "grade": "error",
        "itemId": 740006,
        "itemNameZH": "VPC路由模式剩余Pod网段数不足",
        "result": true,
        "suggestion": "CCE提工单扩容。",
        "value": "Normal"
      }
    }
  },
  "startTime": "2025-03-07T21:09:11+08:00",
  "target": {
    "namespace": "kube-system",
    "nodeName": "192.168.0.142",
    "podName": "coredns-795c547975-mbqvw"
  },
  "taskId": "cce-bzbdgc04-20250307-827d32d7",
  "taskResult": "normal"
}
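In the response example, `reportItems` is a two-level mapping: a category name (such as "Cluster Component") maps to check names, and each check name maps to its result object. A sketch of how a client might pull out the checks that need attention follows. The function name is hypothetical, and treating any `composedResult` other than `"normal"` as a finding is an assumption, since this document does not list the possible values of that field.

```python
def non_normal_items(report):
    """Collect (category, check name, grade, itemId) for non-normal checks."""
    findings = []
    for category, checks in report.get("reportItems", {}).items():
        for name, item in checks.items():
            # Assumption: anything other than "normal" deserves a look.
            if item.get("composedResult") != "normal":
                findings.append(
                    (category, name, item.get("grade"), item.get("itemId"))
                )
    return findings


# Trimmed, partly invented sample: the "abnormal" value is illustrative only.
sample_report = {
    "reportItems": {
        "Cluster Component": {
            "DNSServiceAvailableDiag": {
                "composedResult": "normal", "grade": "error", "itemId": 740005,
            },
            "VPCENIRemainingIPDiag": {
                "composedResult": "abnormal", "grade": "warning", "itemId": 740008,
            },
        }
    }
}
findings = non_normal_items(sample_report)
for category, name, grade, item_id in findings:
    print(f"[{grade}] {category}/{name} (itemId {item_id})")
```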
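Create diagnosis tasks in batch
Description
Creates node diagnosis tasks in batch.
Note:
- A batch can include at most 20 nodes.
- Pod diagnosis does not support batch creation.
Request structure
POST /v2/cluster/{clusterId}/diagnoses HTTP/1.1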
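Request headers
No special headers other than the common headers.
Request parameters
Parameter | Type | Required | Location | Description |
---|---|---|---|---|
clusterId | String | Yes | Path | Cluster ID |
type | String | Yes | Body | Diagnosis type: node |
targets | List | Yes | Body | Diagnosis targets |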
Response headers
No special headers other than the common headers.
Response parameters
Parameter | Type | Description |
---|---|---|
taskIds | List | Diagnosis task IDs |
Request example
POST /v2/cluster/{clusterId}/diagnoses HTTP/1.1
Host: cce.bj.baidubce.com
Content-Type: application/json
Authorization: bce-auth-v1/f81d3b34e48048fbb2634dc7882d7e21/2019-03-11T04:17:29Z/3600/host/74c506f68c65e26c633bfa104c863fffac5190fdec1ec24b7c03eb5d67d2e1de

{
  "type": "node",
  "targets": [
    {
      "nodeName": "192.168.4.77"
    },
    {
      "nodeName": "192.168.4.78"
    }
  ]
}
Response example
Content-Type: application/json; charset=utf-8
Date: Thu, 28 Jul 2022 03:25:43 GMT
X-Bce-Gateway-Region: BJ
X-Bce-Request-Id: b42840ec-a200-49c9-86bd-58687b7009bb

{"taskIds": ["cce-3bbg57ai-20250123-ab912e9c", "cce-3bbg57ai-20250123-sa61279t"]}
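Since a batch accepts at most 20 nodes, a caller with a longer node list has to split it into several requests. The sketch below shows one way to build the request bodies. The function name `batch_bodies` is hypothetical, and how the resulting requests are signed and sent is left out.

```python
MAX_NODES_PER_BATCH = 20  # limit stated in the note for this API


def batch_bodies(node_names, limit=MAX_NODES_PER_BATCH):
    """Split node names into request bodies of at most `limit` targets."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return [
        {
            "type": "node",
            "targets": [{"nodeName": n} for n in node_names[i:i + limit]],
        }
        for i in range(0, len(node_names), limit)
    ]


# 45 made-up node names produce three bodies: 20, 20, and 5 targets.
nodes = [f"192.168.4.{i}" for i in range(1, 46)]
bodies = batch_bodies(nodes)
print([len(b["targets"]) for b in bodies])
```

Each body would then be posted to `/v2/cluster/{clusterId}/diagnoses`, and the `taskIds` from the responses concatenated.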
Error codes
Error code | HTTP code | Description |
---|---|---|
cce.warning.AccessDenied | 403 | Insufficient permissions |
cce.warning.InvalidParam | 400 | Invalid request parameter |
cce.warning.NoSuchObject | 404 | Resource not found |
cce.warning.MalformedJSON | 400 | Failed to parse request parameters |
cce.warning.ClusterNotFound | 404 | Cluster not found |
cce.warning.IAMUnauthorized | 403 | Authentication failed |
cce.warning.OperationNotAllowed | 403 | Operation not allowed |
cce.error.InternalServerError | 500 | All other undefined errors |
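A client may find it convenient to keep the table above as a lookup, for example to distinguish caller mistakes from server-side failures. The following is a small sketch with hypothetical helper names. It encodes only the code-to-status pairs listed above and makes no assumption about the shape of the error response body.

```python
# Error code -> HTTP status, copied from the table above.
ERROR_HTTP_STATUS = {
    "cce.warning.AccessDenied": 403,
    "cce.warning.InvalidParam": 400,
    "cce.warning.NoSuchObject": 404,
    "cce.warning.MalformedJSON": 400,
    "cce.warning.ClusterNotFound": 404,
    "cce.warning.IAMUnauthorized": 403,
    "cce.warning.OperationNotAllowed": 403,
    "cce.error.InternalServerError": 500,
}


def http_status_for(code):
    """Return the documented HTTP status for a code, or None if unlisted."""
    return ERROR_HTTP_STATUS.get(code)


def is_client_error(code):
    """True when the documented status for the code is in the 4xx range."""
    status = http_status_for(code)
    return status is not None and 400 <= status < 500


print(http_status_for("cce.warning.ClusterNotFound"))
print(is_client_error("cce.error.InternalServerError"))
```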