NVIDIA GPU A800 (80G) Bare-Metal Two-Node Deployment Guide

1.4 Download the model files

The DeepSeek model is large, so download it through Hugging Face or ModelScope and store it on the NVMe disk of each compute node, for example under /mnt/nvme1n1/model. The DeepSeek-R1-Channel-INT8 model is recommended; an example storage path is /mnt/nvme1n1/model/DeepSeek-R1-Channel-INT8.

2. Starting and Stopping the Service

2.1 Configure DeepSeek

Create a file named srun.sh under /home/sglang with the following content:

```shell
#!/bin/bash
#SBATCH -N 2
#SBATCH --ntasks=2
#SBATCH --exclusive
#SBATCH --partition=batch
#SBATCH -J deepseek
#SBATCH -o log/log_ds_%J.out
#SBATCH --gres=gpu:8

export NCCL_DEBUG=INFO
export NCCL_SOCKET_IFNAME=bond0
export NCCL_IB_HCA=mlx5_0,mlx5_1,mlx5_5,mlx5_6
export OMP_NUM_THREADS=8
export HF_DATASETS_NUM_THREADS=8
export TRANSFORMERS_OFFLINE=1

export SGLANG_IMG=sglang-v0.4.5.post3-cu125.sif
export MODEL_DIR=/mnt/nvme1n1/model/DeepSeek-R1-Channel-INT8/
export MODEL_NAME=DeepSeek-R1

export NODES=$(scontrol show hostnames "$SLURM_JOB_NODELIST")
# The first node in the allocation serves as the rendezvous host
export MASTER_ADDR=$(scontrol show hostnames "$SLURM_JOB_NODELIST" | head -n 1)

# Assign each node a zero-based rank and launch one server process per node
readarray -t NODE_ARRAY <<< "$NODES"
for i in "${!NODE_ARRAY[@]}"; do
    NODE_NAME="${NODE_ARRAY[$i]}"
    NODE_RANK="$i"
    echo "$NODE_NAME,$NODE_RANK,$SLURM_NNODES"
    srun --nodes=1 --nodelist="$NODE_NAME" --ntasks=1 --gres=gpu:8 \
         --cpus-per-task=64 \
         --output="log/log_ds_%J.out" --error="log/log_ds_%J.err" \
         apptainer exec --nv --no-home --writable-tmpfs \
             -B "$MODEL_DIR":/root/.cache/huggingface "$SGLANG_IMG" \
             python3 -m sglang.launch_server \
                 --model-path /root/.cache/huggingface \
                 --served-model-name "$MODEL_NAME" \
                 --host 0.0.0.0 --port 8000 \
                 --trust-remote-code \
                 --tensor-parallel-size 16 \
                 --enable-torch-compile --torch-compile-max-bs 8 \
                 --quantization w8a8_int8 \
                 --dist-init-addr "$MASTER_ADDR":5000 \
                 --nnodes "$SLURM_NNODES" --node-rank "$NODE_RANK" &
done
wait
```

You can adjust the following three settings to match your environment:

- SGLANG_IMG: the SGLang Apptainer image; it can be upgraded or replaced later as needed. Example: export SGLANG_IMG=sglang-v0.4.5.post3-cu125.sif
- MODEL_DIR: the storage location of the model. Example: export MODEL_DIR=/mnt/nvme1n1/model/DeepSeek-R1-Channel-INT8/
- MODEL_NAME: the model name to expose to clients. Example: export MODEL_NAME=DeepSeek-R1
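The launch loop in srun.sh gives each hostname returned by scontrol a zero-based rank, which it then passes to --node-rank. The sketch below isolates just that derivation, using hypothetical hostnames (node01, node02) in place of a real Slurm allocation:

```shell
#!/bin/bash
# Stand-in for `scontrol show hostnames "$SLURM_JOB_NODELIST"`,
# which prints one hostname per line (hypothetical two-node allocation)
NODES="node01
node02"

# Split the newline-separated list into a bash array
readarray -t NODE_ARRAY <<< "$NODES"

# Each array index becomes that node's rank
for i in "${!NODE_ARRAY[@]}"; do
    echo "${NODE_ARRAY[$i]},$i"
done
# prints:
#   node01,0
#   node02,1
```

Rank 0 belongs to the first node in the allocation, which is the same node the script selects as MASTER_ADDR, so the rendezvous host and node rank 0 always coincide.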