AIM Engine#

AIM (AMD Inference Microservice) Engine is a Kubernetes operator that simplifies the deployment and management of AI inference workloads on AMD GPUs. It provides a declarative, cloud-native approach to running ML models at scale.

Quick Example#

Deploy an inference service with a single resource:

apiVersion: aim.eai.amd.com/v1alpha1
kind: AIMService
metadata:
  name: qwen-chat
spec:
  model:
    image: amdenterpriseai/aim-qwen-qwen3-32b:0.8.5

AIM images (like amdenterpriseai/aim-qwen-qwen3-32b) are container images that package open-source models optimized for AMD Instinct GPUs. Each image includes the model weights and a serving runtime tuned for specific GPU configurations and precision modes.

AIM Engine automatically resolves the model, selects an optimal runtime configuration for your hardware, deploys a KServe InferenceService, and optionally creates HTTP routing through Gateway API.
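To make the reconciliation concrete, the sketch below shows roughly the kind of KServe InferenceService the operator could derive from the Quick Example above. This is an illustrative sketch, not the operator's exact output: the container layout and the `amd.com/gpu` resource name (which assumes the AMD GPU device plugin is installed) are assumptions.

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: qwen-chat
spec:
  predictor:
    containers:
      - name: kserve-container
        # Serving runtime and weights come packaged in the AIM image
        image: amdenterpriseai/aim-qwen-qwen3-32b:0.8.5
        resources:
          limits:
            amd.com/gpu: 1  # assumes the AMD GPU device plugin exposes this resource

In practice you never write this resource yourself; AIM Engine selects the runtime profile and GPU count for you based on the hardware it detects.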

Where to Start#

  • Cluster Administrators: the Installation guide covers prerequisites, KServe setup, GPU configuration, and cluster-wide defaults.

  • Developers & Integrators: the Quickstart gets you from zero to a running inference endpoint in 5 minutes.

  • Data Scientists: the Model Catalog lets you browse available models and deploy them for experimentation.

Key Features#

  • Simple Service Deployment — Deploy inference endpoints with minimal configuration using AIMService resources

  • Automatic Optimization — Smart template selection picks the best runtime profile based on GPU availability, precision, and optimization goals

  • Model Catalog — Maintain a catalog of available models with automatic discovery from container registries

  • Model Caching — Pre-download model artifacts to shared PVCs for faster startup and reduced bandwidth

  • HTTP Routing — Expose services through Gateway API with customizable path templates

  • Autoscaling — KEDA integration with OpenTelemetry metrics for demand-based scaling

  • Multi-tenancy — Namespace-scoped and cluster-scoped resources for flexible team isolation
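As a rough illustration of how several of these features might combine in one resource, the sketch below extends the Quick Example with routing and caching sections. Everything beyond spec.model.image uses hypothetical placeholder field names, not documented API; consult the actual AIMService CRD schema for the real spec.

apiVersion: aim.eai.amd.com/v1alpha1
kind: AIMService
metadata:
  name: qwen-chat
spec:
  model:
    image: amdenterpriseai/aim-qwen-qwen3-32b:0.8.5
  # The sections below are hypothetical, for illustration only.
  routing:        # expose the service through Gateway API
    enabled: true
  caching:        # pre-download model artifacts to a shared PVC
    enabled: true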