synchronization
90
Docker/Docs/Docker使用GPU.md
Normal file
@@ -0,0 +1,90 @@
> Author: 丁辉

# Preparing the Base Environment for GPU Containerization

## Download and install the GPU driver on Linux (depending on your environment)

[See this document](https://gitee.com/offends/Kubernetes/blob/main/GPU/Linux%E4%B8%8B%E8%BD%BD%E5%B9%B6%E5%AE%89%E8%A3%85GPU%E9%A9%B1%E5%8A%A8.md)

## Install the NVIDIA Container Toolkit (`nvidia-container-toolkit`)

[Official documentation](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)

- **CentOS**

Configure the production repository:

```bash
# Download the repository definition and save it for yum (run as root).
curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | \
  tee /etc/yum.repos.d/nvidia-container-toolkit.repo
```

Configure the repository to use experimental packages (optional):

```bash
# yum-config-manager is provided by the yum-utils package.
yum-config-manager --enable nvidia-container-toolkit-experimental
```

Install the NVIDIA Container Toolkit package:

```bash
yum install -y nvidia-container-toolkit
```

- **Ubuntu**

Configure the production repository:

```bash
# Import the NVIDIA GPG key, then register the repository with a signed-by option.
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
```
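
To see what the `sed` step in the pipeline above does, the following sketch applies the same substitution to a single sample line. The sample line and the helper name `add_signed_by` are assumptions made for illustration; the downloaded `.list` file may contain different entries.

```bash
#!/usr/bin/env bash
# Sample repository entry (assumed for illustration; the real file may differ).
sample_line='deb https://nvidia.github.io/libnvidia-container/stable/deb/$(ARCH) /'

# Same substitution as in the install command: insert a signed-by option
# so that apt checks this repository against the NVIDIA keyring file.
add_signed_by() {
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g'
}

rewritten_line=$(printf '%s\n' "$sample_line" | add_signed_by)
printf '%s\n' "$rewritten_line"
```

The rewritten entry keeps the original URL and only gains the `[signed-by=...]` option between `deb` and `https://`.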

Configure the repository to use experimental packages (optional):

```bash
# Uncomment the experimental entries (run as root or prefix with sudo).
sed -i -e '/experimental/ s/^#//g' /etc/apt/sources.list.d/nvidia-container-toolkit.list
```
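
The effect of that `sed` expression can be tried safely on a temporary copy first. In this sketch the file contents are an assumed example of a `.list` file with a commented-out experimental entry, and `sed -i` is used in its GNU form, as found on Ubuntu.

```bash
#!/usr/bin/env bash
# Work on a temporary file so the real apt configuration is untouched.
list_file=$(mktemp)
cat > "$list_file" <<'EOF'
deb https://nvidia.github.io/libnvidia-container/stable/deb/$(ARCH) /
#deb https://nvidia.github.io/libnvidia-container/experimental/deb/$(ARCH) /
EOF

# Same expression as above: on lines matching "experimental",
# remove the leading "#" so the entry becomes active.
sed -i -e '/experimental/ s/^#//g' "$list_file"

# Count the active "deb" entries and show the resulting file.
enabled_lines=$(grep -c '^deb ' "$list_file")
cat "$list_file"
```

Only lines containing `experimental` are touched, so any other commented lines in the file stay commented.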

Install the NVIDIA Container Toolkit package:

```bash
apt-get update && apt-get install -y nvidia-container-toolkit
```
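
Since the steps differ between CentOS and Ubuntu, a script that automates them needs to know which branch to follow. The sketch below is one possible way to decide, based on the `ID` and `ID_LIKE` fields of `/etc/os-release`. The function name `detect_pkg_manager` is hypothetical, and the list of distribution IDs is an assumption that may need extending.

```bash
#!/usr/bin/env bash
# Print "yum" or "apt" depending on the ID / ID_LIKE fields of an os-release file.
# The file path is a parameter so the function can be tried on a sample file.
detect_pkg_manager() {
  local os_release="${1:-/etc/os-release}"
  local id id_like
  # Source the file inside command substitutions (subshells) so that
  # its variables do not leak into the calling shell.
  id=$( unset ID; . "$os_release" && printf '%s' "${ID:-}" )
  id_like=$( unset ID_LIKE; . "$os_release" && printf '%s' "${ID_LIKE:-}" )
  case " $id $id_like " in
    *" centos "*|*" rhel "*|*" fedora "*) echo yum ;;
    *" ubuntu "*|*" debian "*)            echo apt ;;
    *) echo unsupported; return 1 ;;
  esac
}
```

Calling `detect_pkg_manager` with no argument reads `/etc/os-release` on the current host; the result can then select between the `yum` and `apt-get` commands shown above.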

## Connecting containers to the GPU

> The Docker runtime is used as the example here.

1. Use `nvidia-ctk` to modify the configuration file

```bash
nvidia-ctk runtime configure --nvidia-set-as-default
```
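
> There is no need to worry about this command overwriting your existing configuration file. It edits the current file in place and leaves the other settings intact.

**Parameters**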

| Parameter | Description | Usage |
| :-----------------------: | :----------------------------------------------------------: | :----------------------------------------------------------: |
| `--runtime=` | Specifies the target container runtime: docker, containerd, or crio (defaults to `docker` if omitted) | `nvidia-ctk runtime configure --runtime=docker` |
| `--config=` | Specifies the location of the container runtime's configuration file | `nvidia-ctk runtime configure --config=/etc/docker/daemon.json` |
| `--nvidia-set-as-default` | Sets the NVIDIA container runtime as the default runtime | `nvidia-ctk runtime configure --nvidia-set-as-default` |

2. Restart the service

```bash
systemctl restart docker
```
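
Before moving on to the test, it can help to confirm that the configuration file really contains the NVIDIA runtime. The sketch below assumes the file is `/etc/docker/daemon.json` and that `nvidia-ctk` wrote a `"default-runtime": "nvidia"` entry plus an `"nvidia"` object under `"runtimes"`. The function name `check_nvidia_default` is hypothetical, and the check is a plain text match, not a JSON parser.

```bash
#!/usr/bin/env bash
# Return 0 if the given Docker daemon config declares an "nvidia" runtime
# and sets it as the default runtime; return 1 otherwise.
check_nvidia_default() {
  local config="${1:-/etc/docker/daemon.json}"
  [ -r "$config" ] || { echo "cannot read $config" >&2; return 1; }
  # "default-runtime": "nvidia"
  grep -Eq '"default-runtime"[[:space:]]*:[[:space:]]*"nvidia"' "$config" || return 1
  # "nvidia": { ... } entry (expected under "runtimes")
  grep -Eq '"nvidia"[[:space:]]*:[[:space:]]*\{' "$config" || return 1
}
```

For example, `check_nvidia_default && echo "daemon.json looks right"` prints the message only when both entries are present. The output of `docker info` also lists the registered runtimes and the default runtime once the service has restarted.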

3. Test

```bash
docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
```

> Check that the GPU information is printed successfully.
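
For scripted setups, the same test can be wrapped so that the result is reported through the exit status instead of being read by eye. This is a minimal sketch; the function name `gpu_smoke_test` and the messages are hypothetical, and the image defaults to `ubuntu` as in the command above.

```bash
#!/usr/bin/env bash
# Run the test container and report whether nvidia-smi succeeded inside it.
gpu_smoke_test() {
  local image="${1:-ubuntu}"
  if docker run --rm --runtime=nvidia --gpus all "$image" nvidia-smi; then
    echo "GPU is visible inside the container"
  else
    echo "GPU test failed; check the driver, the toolkit, and the Docker configuration" >&2
    return 1
  fi
}
```

Running `gpu_smoke_test` prints the `nvidia-smi` table followed by the confirmation line on success, and returns a non-zero status otherwise.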