Create a GPU Instance
Last updated: 2025-02-26
This topic describes how to create and use BCI GPU instances.
BCI GPU specifications
BCI offers the GPU Pod specifications listed below. Each GPU model and card count maps to a fixed set of CPU and memory options; when creating a workload, choose the specification that best matches your actual needs and allocate resources accordingly.
| BCI spec name | CPU (cores) | Memory (GiB) | GPU type | GPU memory (GiB) | GPU count |
| --- | --- | --- | --- | --- | --- |
| bci.gna2.c8m36.1a10 | 8 | 36 | Nvidia A10 PCIE | 24 × 1 | 1 |
| bci.gna2.c18m74.1a10 | 18 | 74 | Nvidia A10 PCIE | 24 × 1 | 1 |
| bci.gna2.c30m118.2a10 | 30 | 118 | Nvidia A10 PCIE | 24 × 2 | 2 |
| bci.gna2.c62m240.4a10 | 62 | 240 | Nvidia A10 PCIE | 24 × 4 | 4 |
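The table above can be encoded as a small lookup when you need to pick a specification programmatically. The spec names and values below come straight from the table; `pick_spec` itself is a hypothetical helper for illustration, not part of any BCI SDK:

```python
# Illustrative only: the BCI A10 spec table as data, smallest spec first.
# pick_spec is a hypothetical helper, not a BCI API.
A10_SPECS = [
    # (spec name, vCPU cores, memory GiB, GPU count)
    ("bci.gna2.c8m36.1a10",    8,  36, 1),
    ("bci.gna2.c18m74.1a10",  18,  74, 1),
    ("bci.gna2.c30m118.2a10", 30, 118, 2),
    ("bci.gna2.c62m240.4a10", 62, 240, 4),
]

def pick_spec(gpus: int, min_cpu: int = 0):
    """Return the first (smallest) spec with at least `gpus` GPUs and `min_cpu` cores."""
    for name, cpu, mem_gib, gpu_count in A10_SPECS:
        if gpu_count >= gpus and cpu >= min_cpu:
            return name
    return None  # no A10 spec satisfies the request
```

Because the list is ordered from smallest to largest, the first match is also the cheapest spec that satisfies the request.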
Create an instance
Configuration notes:
- Specify the GPU model via an annotation. Note that the annotation must be set in the Pod spec (the pod template), not in the Deployment spec.
```yaml
annotations:
  bci.virtual-kubelet.io/bci-gpu-type: "Nvidia A10 PCIE"
```
- Specify the resource configuration: GPU count, CPU cores, and memory.
```yaml
resources:
  limits:
    nvidia.com/gpu: 1   # number of GPUs
    cpu: 8              # CPU cores
    memory: 36Gi        # memory size
  requests:
    nvidia.com/gpu: 1   # number of GPUs
    cpu: 8              # CPU cores
    memory: 36Gi        # memory size
```
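The resource numbers you request should line up with one of the spec rows above. A quick sanity check of a resources block before applying the manifest can be sketched as follows; this is illustrative Python, not a BCI tool, and it assumes the values are written exactly as in the fragment above:

```python
# Illustrative sanity check: does a resources block match a given BCI spec row?
# The spec values passed in would come from the table above; the function is hypothetical.
def matches_spec(resources: dict, cpu: int, mem_gib: int, gpus: int) -> bool:
    limits = resources["limits"]
    return (
        int(limits["cpu"]) == cpu
        and limits["memory"] == f"{mem_gib}Gi"
        and int(limits["nvidia.com/gpu"]) == gpus
        # BCI specs are fixed sizes, so in this example requests equal limits
        and resources["requests"] == limits
    )

resources = {
    "limits":   {"nvidia.com/gpu": 1, "cpu": 8, "memory": "36Gi"},
    "requests": {"nvidia.com/gpu": 1, "cpu": 8, "memory": "36Gi"},
}
```

For instance, the block above matches the bci.gna2.c8m36.1a10 row (8 cores, 36 GiB, 1 GPU) but no other row in the table.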
Complete workload YAML example
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spot-deployment-test-gpu-wzy
  labels:
    run: ooo
spec:
  replicas: 1
  selector:
    matchLabels:
      run: ooo
  template:
    metadata:
      creationTimestamp: null
      labels:
        run: ooo
      annotations:
        bci.virtual-kubelet.io/bci-gpu-type: "Nvidia A10 PCIE"
        bci.virtual-kubelet.io/bci-logical-zone: "zoneF" # availability zone of the resources
        bci.virtual-kubelet.io/bci-subnet-id: "xxxxxx"   # the subnet must match the availability zone
      name: spot-deployment-test-wzy-bid
    spec:
      volumes:
        - name: podinfo
          downwardAPI:
            defaultMode: 420
            items:
              - fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.labels['mylabel']
                path: mylabel
              - fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.annotations['myannotation']
                path: myannotation
              - fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.labels
                path: labels
              - fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.annotations
                path: annotations
              - path: workload_cpu_limit
                resourceFieldRef:
                  containerName: ooo1
                  divisor: 1m
                  resource: limits.cpu
              - path: workload_cpu_request
                resourceFieldRef:
                  containerName: ooo1
                  divisor: 1m
                  resource: requests.cpu
              - path: workload_mem_limit
                resourceFieldRef:
                  containerName: ooo1
                  divisor: 1Mi
                  resource: limits.memory
              - path: workload_mem_request
                resourceFieldRef:
                  containerName: ooo1
                  divisor: 1Mi
                  resource: requests.memory
      nodeSelector:
        type: "virtual-kubelet"
      tolerations:
        - key: "virtual-kubelet.io/provider"
          operator: "Equal"
          value: "baidu"
          effect: "NoSchedule"
      containers:
        - image: hub.baidubce.com/cce/nginx-alpine-go
          name: ooo1
          env:
            - name: "MY_CPU_LIMIT"
              valueFrom:
                resourceFieldRef:
                  containerName: ooo1
                  resource: limits.cpu
            - name: "MY_CPU_REQ"
              valueFrom:
                resourceFieldRef:
                  containerName: ooo1
                  resource: requests.cpu
            - name: "MY_IP"
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: status.podIP
          volumeMounts:
            - name: podinfo
              mountPath: /etc/podinfo
          resources:
            limits:
              nvidia.com/gpu: 1   # number of GPUs
              cpu: 8              # CPU cores
              memory: 36Gi        # memory size
            requests:
              nvidia.com/gpu: 1   # number of GPUs
              cpu: 8              # CPU cores
              memory: 36Gi        # memory size
```
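A common mistake called out earlier is putting the GPU-type annotation on the Deployment's own metadata instead of the pod template's metadata. Treating the manifest as a plain nested dictionary (field paths follow the standard Kubernetes Deployment schema) makes the distinction concrete; the checker function is illustrative only:

```python
# Illustrative: the GPU-type annotation must live at spec.template.metadata.annotations
# (the Pod spec), not at the Deployment's top-level metadata.annotations.
GPU_TYPE_KEY = "bci.virtual-kubelet.io/bci-gpu-type"

def gpu_annotation_ok(deployment: dict) -> bool:
    """Return True only if the annotation is on the pod template's metadata."""
    pod_meta = deployment.get("spec", {}).get("template", {}).get("metadata", {})
    return GPU_TYPE_KEY in pod_meta.get("annotations", {})

# Mirrors the placement in the YAML example above (irrelevant fields omitted).
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "spot-deployment-test-gpu-wzy"},
    "spec": {
        "template": {
            "metadata": {
                "annotations": {GPU_TYPE_KEY: "Nvidia A10 PCIE"},
            },
        },
    },
}
```

If the annotation is attached to the Deployment's top-level `metadata` instead, the check fails, which is exactly the misplacement the configuration notes warn against.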