Zone spreading and affinity scheduling
Last updated: 2025-07-01
High availability and high performance are key requirements for distributed workloads. In a cluster with virtual-kubelet installed, you can use native Kubernetes scheduling semantics to spread a distributed workload across availability zones (for high availability), or to place it in a specific availability zone with affinity rules (for high performance). This topic describes how to spread BCI Pods across zones and how to schedule them with affinity.
Background
To spread Pods across multiple availability zones, or to place them in a specific zone, you can use the native Kubernetes scheduling semantics: Pod topology spread constraints (topologySpreadConstraints), node affinity (nodeAffinity), and Pod affinity (podAffinity).
Important:
- Before using this feature, check whether the cluster already uses topologySpreadConstraints, nodeAffinity, or podAffinity to schedule Pods onto non-BCI virtual nodes. If it does, do not enable topology or affinity scheduling on BCI.
- To use this feature, enable the handleTopology and handleAffinity options when installing the virtual-kubelet Helm chart.
For more information, see the official Kubernetes documentation.
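As a sketch only: if the chart exposes these switches as top-level values (the exact key names and nesting are an assumption here; confirm them against the chart's values.yaml), enabling them could look like this:

```yaml
# Hypothetical values.yaml fragment -- verify the real keys in the chart.
handleTopology: true   # let virtual-kubelet handle topologySpreadConstraints
handleAffinity: true   # let virtual-kubelet handle nodeAffinity/podAffinity
```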
Prerequisites and notes
- To use zone spreading, configure subnets from multiple availability zones in the Pod annotation. See the examples below for the annotation format.
- Zone spreading and affinity scheduling currently support only topologyKey: topology.kubernetes.io/zone.
- Before using zone spreading or affinity scheduling, notify the BCI team through a support ticket; BCI must provision dedicated resources for your account.
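For reference, the multi-zone subnet annotation from the first note takes the following shape in a Pod template (sbn-A, sbn-B, and sbn-C are placeholders; substitute real subnet IDs, each from a different availability zone):

```yaml
metadata:
  annotations:
    # Comma-separated subnet IDs, one per availability zone (placeholders).
    bci.virtual-kubelet.io/bci-subnet-ids: "sbn-A,sbn-B,sbn-C"
```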
Configuration examples
Example 1: Spread Pods across zones with topologySpreadConstraints
Tip: pod.spec.topologySpreadConstraints currently supports the following fields: labelSelector, maxSkew, topologyKey, and whenUnsatisfiable.
The following example configures a topology spread constraint.
- Add a topology spread constraint to the workload manifest. You can declare a constraint in the Pod's spec field, or in the PodTemplate spec of a workload such as a Deployment or Job, as follows.
```yaml
topologySpreadConstraints:
- maxSkew: <integer>
  topologyKey: <string>
  whenUnsatisfiable: <string>
  labelSelector: <object>
```
- This example creates a Deployment whose Pods are spread evenly across multiple availability zones. For details about the parameters, see the topologySpreadConstraints field in the Kubernetes documentation. The Deployment's YAML file is as follows.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    name: spread-nginx
    type: pod-perf-test
  name: spread-nginx-test
spec:
  replicas: 4
  selector:
    matchLabels:
      name: spread-nginx
      type: pod-perf-test
  template:
    metadata:
      labels:
        name: spread-nginx
        type: pod-perf-test
      annotations:
        bci.virtual-kubelet.io/bci-subnet-ids: "sbn-A,sbn-B,sbn-C" # replace with your real subnet IDs
    spec:
      containers:
      - image: registry.baidubce.com/qatest/nginx:1.23.0
        imagePullPolicy: IfNotPresent
        name: nginx-test
        resources:
          limits:
            cpu: 1
            memory: 2Gi
          requests:
            cpu: 1
            memory: 2Gi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      nodeSelector:
        type: virtual-kubelet
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            name: spread-nginx
            type: pod-perf-test
      tolerations:
      - effect: NoSchedule
        key: virtual-kubelet.io/provider
        operator: Equal
        value: baidu
```
- Create the workload. Save the code above as deployment.yaml and run the following command to submit the Deployment to the cluster.

```shell
kubectl apply -f deployment.yaml
```
- Verify the scheduling result. Run the following command to check which zone each Pod landed in.

```shell
kubectl get po -o yaml | grep "bci.virtual-kubelet.io/bci-logical-zone"
# Output
    bci.virtual-kubelet.io/bci-logical-zone: zoneF
    bci.virtual-kubelet.io/bci-logical-zone: zoneD
    bci.virtual-kubelet.io/bci-logical-zone: zoneF
    bci.virtual-kubelet.io/bci-logical-zone: zoneD
```
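The even split above follows from maxSkew: 1, which requires that the number of matching Pods in any two zones differ by at most one; here the four replicas ended up balanced 2/2 across zoneD and zoneF. The four supported fields, annotated for reference (field semantics are standard Kubernetes):

```yaml
topologySpreadConstraints:
- maxSkew: 1                                # max allowed Pod-count difference between zones
  topologyKey: topology.kubernetes.io/zone  # the only topologyKey BCI supports
  whenUnsatisfiable: DoNotSchedule          # hard constraint; ScheduleAnyway would make it soft
  labelSelector:                            # which Pods are counted toward the skew
    matchLabels:
      name: spread-nginx
```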
Example 2: Zone affinity with nodeAffinity and podAffinity
- Add an affinity constraint to the workload manifest. This example creates a Deployment whose Pods are gathered in a single availability zone. For details about the parameters, see node affinity in the Kubernetes documentation. The Deployment's YAML files are as follows.
podAffinity example:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    name: pod-perf-test
    type: pod-perf-test
  name: pod-perf-test
spec:
  replicas: 3
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      name: pod-perf-test
      type: pod-perf-test
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        name: pod-perf-test
        type: pod-perf-test
      annotations:
        bci.virtual-kubelet.io/bci-subnet-ids: "sbn-A,sbn-B,sbn-C" # replace with your real subnet IDs
    spec:
      containers:
      - args:
        - /bin/sh
        - -c
        - sleep 3600000
        image: registry.baidubce.com/qatest/nginx:1.23.0
        imagePullPolicy: IfNotPresent
        name: pod-perf-test
        resources:
          limits:
            cpu: 1
            memory: 2Gi
          requests:
            cpu: 1
            memory: 2Gi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: name
                operator: In
                values:
                - spread-nginx
            topologyKey: topology.kubernetes.io/zone
      nodeSelector:
        type: virtual-kubelet
      restartPolicy: Always
      tolerations:
      - effect: NoSchedule
        key: virtual-kubelet.io/provider
        operator: Equal
        value: baidu
```
nodeAffinity example:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
  labels:
    name: pod-perf-test
    type: pod-perf-test
  name: pod-perf-test
  namespace: default
spec:
  replicas: 3
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      name: pod-perf-test
      type: pod-perf-test
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        name: pod-perf-test
        type: pod-perf-test
      annotations:
        bci.virtual-kubelet.io/bci-subnet-ids: "sbn-A,sbn-B,sbn-C" # replace with your real subnet IDs
    spec:
      containers:
      - args:
        - /bin/sh
        - -c
        - sleep 3600000
        image: registry.baidubce.com/qatest/nginx:1.23.0
        name: pod-perf-test
        resources:
          limits:
            cpu: 1
            memory: 2Gi
          requests:
            cpu: 1
            memory: 2Gi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: Default
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values:
                - zoneD
      nodeSelector:
        type: virtual-kubelet
      tolerations:
      - effect: NoSchedule
        key: virtual-kubelet.io/provider
        operator: Equal
        value: baidu
```
- Create the workload. Save the code above as deployment.yaml and run the following command to submit the Deployment to the cluster.

```shell
kubectl apply -f deployment.yaml
```
- Verify the scheduling result. Run the following command to check which zone each Pod landed in.

```shell
kubectl get po -o yaml | grep "bci.virtual-kubelet.io/bci-logical-zone"
# Output
    bci.virtual-kubelet.io/bci-logical-zone: zoneD
    bci.virtual-kubelet.io/bci-logical-zone: zoneD
    bci.virtual-kubelet.io/bci-logical-zone: zoneD
```
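The examples above use the hard form, requiredDuringSchedulingIgnoredDuringExecution. Standard Kubernetes also defines a soft form, preferredDuringSchedulingIgnoredDuringExecution, which biases the scheduler toward a zone instead of blocking placement; whether virtual-kubelet's affinity handling honors the soft form is not covered here, so treat the following as a standard-Kubernetes sketch to verify before relying on it:

```yaml
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100              # higher weight = stronger preference
      preference:
        matchExpressions:
        - key: topology.kubernetes.io/zone
          operator: In
          values:
          - zoneD              # placeholder zone; Pods still schedule if it is unavailable
```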