synchronization

2025-08-25 17:53:08 +08:00
commit c201eb5ef9
318 changed files with 23092 additions and 0 deletions

@@ -0,0 +1,90 @@
> Author: 丁辉
# Preparing the Base Environment for GPU Containerization
## Download and Install the GPU Driver on Linux (adapt to your environment)
[See this document](https://gitee.com/offends/Kubernetes/blob/main/GPU/Linux%E4%B8%8B%E8%BD%BD%E5%B9%B6%E5%AE%89%E8%A3%85GPU%E9%A9%B1%E5%8A%A8.md)
## Install the NVIDIA Container Toolkit (nvidia-container-toolkit)
[Official documentation](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)
- **CentOS**
Configure the production repository
```bash
curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | \
tee /etc/yum.repos.d/nvidia-container-toolkit.repo
```
Configure the repository to use experimental packages (optional)
```bash
yum-config-manager --enable nvidia-container-toolkit-experimental
```
Install the NVIDIA Container Toolkit packages
```bash
yum install -y nvidia-container-toolkit
```
- **Ubuntu**
Configure the production repository
```bash
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
```
Configure the repository to use experimental packages (optional)
```bash
sudo sed -i -e '/experimental/ s/^#//g' /etc/apt/sources.list.d/nvidia-container-toolkit.list
```
Install the NVIDIA Container Toolkit packages
```bash
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
```
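After either installation path, it can help to confirm that the toolkit's CLI actually landed on `PATH` before configuring a runtime. The following is a minimal sketch, not part of the original guide; the helper names `have_cmd` and `check_toolkit` are hypothetical and introduced here only for illustration.

```bash
# Return success if the named command can be found on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Print the nvidia-ctk version if the CLI is installed;
# otherwise report the problem on stderr and return non-zero.
check_toolkit() {
  if have_cmd nvidia-ctk; then
    nvidia-ctk --version
  else
    echo "nvidia-ctk not found on PATH; the toolkit may not be installed" >&2
    return 1
  fi
}
```

Calling `check_toolkit` right after the install step should print a version string on a correctly provisioned host.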
## Connecting Containers to the GPU
> The Docker runtime is used as the example
1. Use `nvidia-ctk` to modify the configuration file
```bash
nvidia-ctk runtime configure --nvidia-set-as-default
```
> This command does not overwrite the existing contents of your configuration file; it only modifies the current configuration in place
**Parameters**
| Parameter | Description | Example |
| :-----------------------: | :----------------------------------------------------------: | :----------------------------------------------------------: |
| `--runtime=` | Target container runtime: `docker`, `containerd`, `crio`, etc. (defaults to `docker`) | `nvidia-ctk runtime configure --runtime=docker` |
| `--config=` | Path to the container runtime's configuration file | `nvidia-ctk runtime configure --config=/etc/docker/daemon.json` |
| `--nvidia-set-as-default` | Set the NVIDIA container runtime as the default runtime | `nvidia-ctk runtime configure --nvidia-set-as-default` |
2. Restart the service
```bash
systemctl restart docker
```
3. Test
```bash
docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
```
> Check that the GPU information is printed successfully
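
If the test container fails to start, one place to look is the Docker daemon configuration that `nvidia-ctk` edited. The sketch below is an illustration under stated assumptions: it assumes the config lives at `/etc/docker/daemon.json` (the path used in the parameter table) and that the file contains a `"nvidia"` runtime entry and, when `--nvidia-set-as-default` was used, a `"default-runtime": "nvidia"` key. The function names are hypothetical, and the check is a plain text match rather than a real JSON parse.

```bash
# Return success if the given daemon config mentions an "nvidia" runtime entry.
# Assumption: the config path defaults to /etc/docker/daemon.json.
has_nvidia_runtime() {
  local config="${1:-/etc/docker/daemon.json}"
  [ -f "$config" ] && grep -q '"nvidia"' "$config"
}

# Return success if the config sets "default-runtime" to "nvidia".
# This is a text match only, so unusual formatting could defeat it.
nvidia_is_default() {
  local config="${1:-/etc/docker/daemon.json}"
  [ -f "$config" ] && \
    grep -Eq '"default-runtime"[[:space:]]*:[[:space:]]*"nvidia"' "$config"
}
```

For example, `has_nvidia_runtime && nvidia_is_default && echo "configured"` gives a quick yes/no answer; running `docker info --format '{{.DefaultRuntime}}'` after the restart is another way to see which runtime the daemon actually picked up.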