Custom Profiles#

AMD Inference Microservice (AIM) supports custom profile configurations that extend beyond the built-in optimized and general profiles. Custom profiles enable users to define specialized configurations for unique hardware setups, model variants not supported by AIM, or specific performance requirements not covered by standard profiles.

Overview#

Custom profiles follow the same YAML structure as standard profiles but are placed in the /workspace/aim-runtime/profiles/custom/ directory inside the container. On the host, custom profiles can live in any directory, as long as that directory is mounted into the container at the path above. When AIM starts, it scans the custom profiles directory first, so custom profiles take precedence over both model-specific and general profiles.

Key Features:

  • Highest Search Precedence: Custom profiles are prioritized over model-specific and general profiles

  • Flexible Deployment: Mount custom profiles via volumes

  • Safe Experimentation: Test new configurations without building new AIM images

Custom profiles are ideal for performance tuning, hardware-specific optimizations, or deploying models that are not yet supported by AIM but are compatible with supported engines.

Creating Custom Profiles#

A profile is defined as a YAML file that adheres to the AIM profile schema. Refer to the existing profiles for examples.
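
For example, one possible on-host layout that matches the folder and file names used throughout this chapter looks like this; the custom-profiles/ directory is what gets mounted into the container in the examples that follow:

custom-profiles/
└── deepseek-ai/
    └── DeepSeek-R1-Distill-Qwen-32B/
        └── vllm-mi300x-fp16-tp1-latency-custom.yaml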

Using Custom Profiles#

Assume you have a custom profile YAML for the DeepSeek R1 Distill Qwen 32B model, named vllm-mi300x-fp16-tp1-latency-custom.yaml and placed in the folder deepseek-ai/DeepSeek-R1-Distill-Qwen-32B.

It contains the following:

aim_id: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
model_id: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
metadata:
  engine: vllm
  gpu: MI300X
  precision: fp16
  gpu_count: 1
  metric: latency
  manual_selection_only: false
  type: unoptimized
engine_args:
  gpu-memory-utilization: 0.95
  distributed-executor-backend: mp
  no-enable-chunked-prefill: null
  max-model-len: 32768
  dtype: float16
  tensor-parallel-size: 1
env_vars:
  VLLM_DO_NOT_TRACK: "1"
  VLLM_USE_TRITON_FLASH_ATTN: "0"
  HIP_FORCE_DEV_KERNARG: "1"
  NCCL_MIN_NCHANNELS: "112"
  TORCH_BLAS_PREFER_HIPBLASLT: "1"
  PYTORCH_TUNABLEOP_ENABLED: "1"
  PYTORCH_TUNABLEOP_VERBOSE: "1"
  PYTORCH_TUNABLEOP_TUNING: "0"

See the Profile Structure chapter in the development documentation for details on each field.

Usage with Docker#

To use a custom profile with Docker, mount the directory containing the profile to /workspace/aim-runtime/profiles/custom/ in the container. Place the profile in a directory of your choice; the examples below assume it lives under custom-profiles/ in the current working directory. All profiles, including custom ones, are validated against the AIM profile schema at runtime.
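
Before mounting, you can catch basic YAML syntax errors early with a quick local check (this assumes Python with PyYAML is available on the host; it only verifies that the file parses, not that it satisfies the AIM schema):

python3 -c 'import sys, yaml; yaml.safe_load(open(sys.argv[1])); print("OK")' \
  custom-profiles/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B/vllm-mi300x-fp16-tp1-latency-custom.yaml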

Running the base image with a custom profile#

docker run \
  -e AIM_MODEL_ID=deepseek-ai/DeepSeek-R1-Distill-Qwen-32B \
  -v $(pwd)/custom-profiles:/workspace/aim-runtime/profiles/custom \
  --device=/dev/kfd --device=/dev/dri \
  -p 8000:8000 \
  amdenterpriseai/aim-base:0.8
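
To confirm that the container sees the mounted profile, list the custom profiles directory inside it (docker ps -lq resolves to the most recently started container; adjust if you run several):

docker exec $(docker ps -lq) ls -R /workspace/aim-runtime/profiles/custom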

Usage with Kubernetes#

To use custom profiles in Kubernetes, you need to create a ConfigMap or volume containing your custom profiles and mount it to the /workspace/aim-runtime/profiles/custom/ path in the container.

Creating ConfigMap with Custom Profile#

First, create a ConfigMap containing your custom profile:

kubectl create configmap custom-profiles \
  --from-file=deepseek-ai/DeepSeek-R1-Distill-Qwen-32B/vllm-mi300x-fp16-tp1-latency-custom.yaml \
  -n YOUR_K8S_NAMESPACE
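
Note that kubectl keys the ConfigMap data by the file's base name, so when mounted the profile appears directly at /workspace/aim-runtime/profiles/custom/vllm-mi300x-fp16-tp1-latency-custom.yaml rather than under a model subdirectory. If you need to reproduce the deepseek-ai/DeepSeek-R1-Distill-Qwen-32B/ folder layout inside the container, one option is to map the key to a nested path via the items field of the configMap volume, as sketched here:

volumes:
  - name: custom-profiles
    configMap:
      name: custom-profiles
      items:
        - key: vllm-mi300x-fp16-tp1-latency-custom.yaml
          path: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B/vllm-mi300x-fp16-tp1-latency-custom.yaml

The deployment example below mounts the ConfigMap without an items mapping, i.e. with the flat layout.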

Example Deployment with Custom Profile#

Here’s an example Kubernetes deployment that uses a custom profile:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: aim-custom-profile-deployment
  labels:
    app: aim-custom-profile
spec:
  progressDeadlineSeconds: 3600
  replicas: 1
  selector:
    matchLabels:
      app: aim-custom-profile
  template:
    metadata:
      labels:
        app: aim-custom-profile
    spec:
      containers:
        - name: aim-custom-profile
          image: "amdenterpriseai/aim-base:0.8"
          imagePullPolicy: Always
          env:
            - name: AIM_MODEL_ID
              value: "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"
            - name: HF_TOKEN
              valueFrom:
                secretKeyRef:
                  name: hf-token
                  key: hf-token
          ports:
            - name: http
              containerPort: 8000
          resources:
            requests:
              memory: "32Gi"
              cpu: "4"
              amd.com/gpu: "1"
            limits:
              memory: "32Gi"
              cpu: "4"
              amd.com/gpu: "1"
          startupProbe:
            httpGet:
              path: /v1/models
              port: http
            periodSeconds: 10
            failureThreshold: 120
          livenessProbe:
            httpGet:
              path: /health
              port: http
          readinessProbe:
            httpGet:
              path: /v1/models
              port: http
          volumeMounts:
            - name: ephemeral-storage
              mountPath: /tmp
            - name: dshm
              mountPath: /dev/shm
            - name: custom-profiles
              mountPath: /workspace/aim-runtime/profiles/custom
              readOnly: true
      volumes:
        - name: ephemeral-storage
          emptyDir:
            sizeLimit: 512Gi
        - name: dshm
          emptyDir:
            medium: Memory
            sizeLimit: 64Gi
        - name: custom-profiles
          configMap:
            name: custom-profiles

Example Service#

apiVersion: v1
kind: Service
metadata:
  name: aim-custom-profile-service
  labels:
    app: aim-custom-profile
spec:
  type: ClusterIP
  ports:
    - name: http
      port: 80
      targetPort: 8000
  selector:
    app: aim-custom-profile

Deployment and test commands#

Deploy the pod and service configured in the previous steps:

kubectl apply -f . -n YOUR_K8S_NAMESPACE
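
Watch the rollout and verify that the custom profile is visible inside the pod (model download and engine start-up can take a while, which is why the deployment uses a generous startupProbe):

kubectl rollout status deployment/aim-custom-profile-deployment -n YOUR_K8S_NAMESPACE

kubectl exec -n YOUR_K8S_NAMESPACE deploy/aim-custom-profile-deployment -- \
  ls -R /workspace/aim-runtime/profiles/custom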

Port-forward the service to access it locally:

kubectl port-forward service/aim-custom-profile-service 8000:80 -n YOUR_K8S_NAMESPACE

Test the inference endpoint by making a request with curl:

curl http://localhost:8000/v1/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
        "prompt": "San Francisco is a",
        "max_tokens": 7,
        "temperature": 0
    }'
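
You can also query the models endpoint (the same path used by the startup and readiness probes) to confirm which model the server has registered:

curl http://localhost:8000/v1/models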