> Author: 丁辉 (Ding Hui)

# Preparing the Base Environment for GPU Containerization

## Download and Install the GPU Driver on Linux (Depending on Your Environment)

[See this document](https://gitee.com/offends/Kubernetes/blob/main/GPU/Linux%E4%B8%8B%E8%BD%BD%E5%B9%B6%E5%AE%89%E8%A3%85GPU%E9%A9%B1%E5%8A%A8.md)

## Install the NVIDIA Container Toolkit (nvidia-container-toolkit)

[Official documentation](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)

- **CentOS**

Configure the production repository:

```bash
# Download the repository definition and save it where yum can find it
curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | \
  tee /etc/yum.repos.d/nvidia-container-toolkit.repo
```

Optionally, configure the repository to use experimental packages:

```bash
yum-config-manager --enable nvidia-container-toolkit-experimental
```

Install the NVIDIA Container Toolkit packages:

```bash
yum install -y nvidia-container-toolkit
```

- **Ubuntu**

Configure the production repository:

```bash
# Import the repository signing key, then register the apt source with signed-by set
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
```

Optionally, configure the repository to use experimental packages:

```bash
# Uncomment the experimental entries in the apt source list
sed -i -e '/experimental/ s/^#//g' /etc/apt/sources.list.d/nvidia-container-toolkit.list
```

Install the NVIDIA Container Toolkit packages:

```bash
apt-get update && apt-get install -y nvidia-container-toolkit
```

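Before wiring the toolkit into a container runtime, it can help to confirm that the expected binaries are on `PATH`. The following is a minimal sketch, not part of the toolkit: `missing_commands` is a hypothetical helper name, and the list of binaries in the usage example is an assumption about what the driver and toolkit packages normally install.

```bash
# Hypothetical helper (not provided by NVIDIA): print every command from the
# argument list that cannot be found on PATH.
# Returns 0 if all commands are present, 1 if at least one is missing.
missing_commands() {
    local missing=0
    local cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}
```

On the GPU host this could be used as `missing_commands nvidia-smi nvidia-ctk nvidia-container-runtime && echo "all present"`. A missing `nvidia-smi` usually points at the driver, while the other two would point at the toolkit installation.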
## Connecting Containers to the GPU

> The Docker runtime is used as the example here.

1. Use `nvidia-ctk` to modify the configuration file

```bash
nvidia-ctk runtime configure --nvidia-set-as-default
```

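> You do not need to worry about this command overwriting your existing configuration file. It only modifies the relevant settings in the current file and leaves the rest in place.
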
**Parameter reference**

| Parameter | Description | Usage |
| :-----------------------: | :----------------------------------------------------------: | :----------------------------------------------------------: |
| `--runtime=` | The target container runtime: `docker`, `containerd`, `crio`, etc. (defaults to `docker`) | `nvidia-ctk runtime configure --runtime=docker` |
| `--config=` | Path to the container runtime's configuration file | `nvidia-ctk runtime configure --config=/etc/docker/daemon.json` |
| `--nvidia-set-as-default` | Set the NVIDIA container runtime as the default runtime | `nvidia-ctk runtime configure --nvidia-set-as-default` |

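To see what the configure step changed, you can inspect the Docker daemon configuration. For Docker, the command typically adds an `nvidia` entry under `runtimes` and, with `--nvidia-set-as-default`, a `"default-runtime": "nvidia"` key. The sketch below checks for both with `grep`. `nvidia_is_default_runtime` is a hypothetical helper name, the default path `/etc/docker/daemon.json` is an assumption, and the check is a rough text match rather than a real JSON parse.

```bash
# Hypothetical helper: succeed if the given Docker daemon config declares an
# "nvidia" runtime and sets it as the default runtime.
# Return codes: 0 = both found, 1 = not configured, 2 = file not readable.
nvidia_is_default_runtime() {
    local config="${1:-/etc/docker/daemon.json}"
    [ -r "$config" ] || return 2
    # Text-level match only; assumes the usual pretty-printed JSON layout.
    grep -Eq '"default-runtime"[[:space:]]*:[[:space:]]*"nvidia"' "$config" &&
        grep -Eq '"nvidia"[[:space:]]*:[[:space:]]*\{' "$config"
}
```

For example, `nvidia_is_default_runtime && echo "nvidia is the default runtime"` checks the default path, and a different file can be passed as the first argument. If `jq` is available, a proper JSON query would be more robust than this text match.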
2. Restart the service

```bash
systemctl restart docker
```

3. Test

```bash
docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
```

> Check whether the GPU information is printed successfully.
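If this check needs to run unattended, for example in a provisioning script, the visual inspection can be approximated by scanning the output for the banner that `nvidia-smi` normally prints. This is only a sketch: `looks_like_nvidia_smi` is a hypothetical helper name, and the two strings it searches for are an assumption about the usual summary table header, which may differ between driver versions.

```bash
# Hypothetical helper: read command output on stdin and succeed if it looks
# like the nvidia-smi summary table (tool banner plus driver version field).
looks_like_nvidia_smi() {
    local output
    output="$(cat)"
    printf '%s\n' "$output" | grep -q 'NVIDIA-SMI' &&
        printf '%s\n' "$output" | grep -q 'Driver Version'
}
```

A possible use is `docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi | looks_like_nvidia_smi && echo "GPU visible in container"`. A non-zero exit status suggests that the container did not see the GPU or that the runtime is not configured.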