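Inference is a widely used machine learning framework that helps model developers run workloads across multiple machines and GPUs. In a federated cluster, you can submit an Inference job to run machine learning tasks under the Inference framework.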

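Prerequisites

1. The Inference machine learning framework is installed on the member clusters.

2. The federated cluster version is v1.14.8 or later.

3. The member clusters have the resources required to run Inference jobs.

4. The member clusters have been added to the federated cluster.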
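
The version requirement above can be checked with a small shell helper. This is a minimal sketch under stated assumptions: `version_ge` is a hypothetical helper name, it relies on `sort -V` (available in GNU coreutils), and obtaining the federated cluster's version string is left to you, since this guide does not give a command for it.

```shell
# Succeeds (exit status 0) when version $1 is greater than or equal to
# version $2. Either argument may carry a leading "v", e.g. v1.14.8.
# The body runs in a subshell so no variables leak into the caller.
version_ge() (
  have="${1#v}"
  need="${2#v}"
  # sort -V orders version strings numerically per component; if the
  # lowest of the pair is the required version, the check passes.
  lowest=$(printf '%s\n%s\n' "$have" "$need" | sort -V | head -n 1)
  [ "$lowest" = "$need" ]
)
```

For example, `version_ge "$cluster_version" v1.14.8 || echo "federated cluster is too old"`, where `cluster_version` is a placeholder for the value you looked up.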

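Procedure

Step 1: Create the custom resource definition (CRD) for Inference jobs in the federated cluster

1. Download the Inference CRD from the official website, then use the federation access configuration (kubeconfig) to create it on the federation control plane: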

kubectl --kubeconfig karmada_kubeconfig apply -f inference_crd.yaml

2. View the Inference CRD:

kubectl --kubeconfig karmada_kubeconfig get crd inferences.inference.isuite.ctyun.cn

Expected output:

[root@ccseagent-hxk4joo11x cceone]# kubectl --kubeconfig karmada_kubeconfig get crd inferences.inference.isuite.ctyun.cn
NAME                                   CREATED AT
inferences.inference.isuite.ctyun.cn   2026-04-02T09:26:26Z
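
CRD registration is asynchronous, so a script that moves straight on to creating Inference objects may want to wait until the CRD reports the Established condition. The following is a sketch using the standard `kubectl wait` command; the function name `wait_for_inference_crd` and the 60-second timeout are illustrative assumptions, not values from this guide.

```shell
# Block until the Inference CRD reports the Established condition on the
# federation control plane, or fail after the timeout.
# $1: path to the federation kubeconfig (defaults to karmada_kubeconfig).
wait_for_inference_crd() {
  kubectl --kubeconfig "${1:-karmada_kubeconfig}" wait \
    --for=condition=Established --timeout=60s \
    crd/inferences.inference.isuite.ctyun.cn
}
```

Calling `wait_for_inference_crd` after the `apply` command returns a non-zero status if the CRD is not ready in time, which lets a script stop before the next step.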

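Step 2: Create the Inference job on the federation control plane

1. Using the access configuration, create the custom Inference job on the federation control plane: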

kubectl --kubeconfig karmada_kubeconfig apply -f inference-sample.yaml

The content of inference-sample.yaml is as follows:

apiVersion: inference.isuite.ctyun.cn/v1
kind: Inference
metadata:
  name: inference-sample
  namespace: default
spec:
  framework:
    jobMode: Single
    type: vLLM
  replicaSpecs:
    Master:
      replicas: 1
      template:
        metadata:
          annotations:
            prometheus.io/app-metrics: "true"
            prometheus.io/app-metrics-path: /metrics
            prometheus.io/app-metrics-port: "8000"
            prometheus.io/scrape: "true"
        spec:
          containers:
            - args:
                - vllm serve /data/models --served-model-name inference-btgkji --host
                  0.0.0.0 --trust-remote-code --tensor-parallel-size 1
              command:
                - sh
                - -c
              env:
                - name: POD_IP
                  valueFrom:
                    fieldRef:
                      apiVersion: v1
                      fieldPath: status.podIP
                - name: PYTHONHASHSEED
                  value: "42"
              image: isuite-pub-registry-xinan1.crs-internal.ctyun.cn/isuite/nvidia-vllm:v0.10.0-py312-cu128-ubuntu22.04-amd64
              name: master
              resources:
                limits:
                  nvidia.com/gpu: "1"
                requests:
                  nvidia.com/gpu: "1"
              volumeMounts:
                - mountPath: /data/models
                  name: vllm-models
                - mountPath: /dev/shm
                  name: shm
            - command:
                - /isuite/isuite-adapter
              env:
                - name: ISUITE_INFERENCE_ENGINE
                  value: vLLM
                - name: ISUITE_INFERENCE_ADAPTER_PORT
                  value: "9099"
                - name: ISUITE_INFERENCE_ENGINE_ENDPOINT
                  value: http://localhost:8000/metrics
              image: registry-vpc-crs-xinan1.cnsp-internal.ctyun.cn/icce/isuite-adapter:20260206
              name: inference-adapter
              resources: {}
          hostPID: true
          nodeSelector:
            isuite.ctyun.cn/model-4uwpcaisqaff72bt-v1: cached
          volumes:
            - hostPath:
                path: /data/isuite/models/model-4uwpcaisqaff72bt/v1
              name: vllm-models
            - emptyDir:
                medium: Memory
              name: shm
  replicas: 1
  serviceConfig:
    elbID: ""
    name: inference-btgkji-svc
    ports:
      - name: inference-btgkji-svc-0
        port: 8000
        protocol: TCP
        targetPort: 8000
    type: ClusterIP
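
If you want to check the manifest against the CRD schema without persisting anything, a server-side dry run is one option. This sketch uses the standard `kubectl apply --dry-run=server` flag; `validate_manifest` is a hypothetical helper name, and whether the dry run succeeds in your environment depends on the admission webhooks configured on the control plane.

```shell
# Ask the federation control plane to validate a manifest without
# creating or changing any object.
# $1: path to the manifest file, e.g. inference-sample.yaml
validate_manifest() {
  kubectl --kubeconfig karmada_kubeconfig apply --dry-run=server -f "$1"
}
```

For example, `validate_manifest inference-sample.yaml` reports schema errors such as unknown or mistyped fields before the real `apply`.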

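Step 3: Create the propagation policy for the Inference job on the federation control plane

1. Using the access configuration, create a propagation policy on the federation control plane for the Inference job from Step 2: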

kubectl --kubeconfig karmada_kubeconfig apply -f inference-sample-pp.yaml

The content of inference-sample-pp.yaml is as follows:

apiVersion: policy.karmada.io/v1alpha1
kind: PropagationPolicy
metadata:
  name: inference-sample
spec:
  resourceSelectors:
    - apiVersion: inference.isuite.ctyun.cn/v1
      kind: Inference
      name: inference-sample
  placement:
    replicaScheduling:
      replicaDivisionPreference: Aggregated
      replicaSchedulingType: Divided

2. Check the status of the Inference job:

kubectl --kubeconfig karmada_kubeconfig get Inference inference-sample
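
To see which member clusters the policy selected, one option is to read the ResourceBinding that Karmada creates for a propagated resource. This is a sketch that assumes the binding follows Karmada's usual `<resource name>-<lowercase kind>` naming, which would give `inference-sample-inference` here; `scheduled_clusters` is a hypothetical helper name. If the assumption does not hold, `kubectl --kubeconfig karmada_kubeconfig get resourcebinding -n default` lists the actual binding names.

```shell
# Print the names of the member clusters recorded in a ResourceBinding's
# spec.clusters field on the federation control plane.
# $1: ResourceBinding name (assumed: inference-sample-inference)
# $2: namespace (defaults to default)
scheduled_clusters() {
  kubectl --kubeconfig karmada_kubeconfig get resourcebinding "$1" \
    -n "${2:-default}" -o jsonpath='{.spec.clusters[*].name}'
}
```

For example, `scheduled_clusters inference-sample-inference` prints a space-separated list of cluster names; an empty result suggests the job has not been scheduled yet.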
