Installation#
This guide covers installing AIM Engine on a Kubernetes cluster.
Prerequisites#
Component |
Minimum Version |
Notes |
|---|---|---|
Kubernetes |
1.28+ |
Cluster with AMD GPU nodes |
— |
Advertises |
|
KServe |
v0.16.1 |
|
Gateway API |
v1.3.0 |
Required for HTTP routing |
cert-manager |
v1.16+ |
Required by KServe and optional metrics TLS |
Optional components:
Component |
Version |
Purpose |
|---|---|---|
KEDA |
2.18+ |
Autoscaling with OpenTelemetry metrics |
OpenTelemetry Operator |
0.101+ |
Custom metrics collection for autoscaling |
Longhorn or similar CSI |
— |
ReadWriteMany storage for model caching |
Install with Helm#
1. Install CRDs#
CRDs are distributed separately from the Helm chart and must be installed first:
helm install aim-engine-crds oci://docker.io/amdenterpriseai/aim-engine-crds-chart \
--version <version> \
--namespace aim-system \
--create-namespace
Or from a local file:
kubectl apply -f crds.yaml
kubectl wait --for=condition=Established crd --all --timeout=60s
2. Install the Operator#
helm install aim-engine oci://docker.io/amdenterpriseai/aim-engine-chart \
--version <version> \
--namespace aim-system \
--create-namespace
See Helm Chart Values for all configurable values (replicas, resources, metrics, CRD management, etc.).
3. Enable model discovery (optional)#
The Helm chart does not create an AIMClusterModelSource. To populate cluster models from a registry, apply an AIMClusterModelSource manifest yourself (for example from the samples under config/samples/ in this repository). See Model Catalog for details.
Install from Source#
Build and install from the repository:
git clone https://github.com/amd-enterprise-ai/aim-engine.git
cd aim-engine
# Generate CRDs and Helm chart
make crds
make helm
# Install CRDs
kubectl apply -f dist/crds.yaml
kubectl wait --for=condition=Established crd --all --timeout=60s
# Install the operator
helm install aim-engine ./dist/chart \
--namespace aim-system \
--create-namespace
Tip
This project uses mise to manage tool versions (Go, controller-gen, etc.). Run mise install and eval "$(mise activate bash)" to get the correct versions on your PATH. See Development Setup for details.
Common Configuration#
Enable Cluster Runtime Defaults#
Set up cluster-wide routing and storage defaults:
helm upgrade aim-engine oci://docker.io/amdenterpriseai/aim-engine-chart \
--namespace aim-system \
--set clusterRuntimeConfig.enable=true \
--set clusterRuntimeConfig.spec.routing.enabled=true \
--set clusterRuntimeConfig.spec.routing.gatewayRef.name=aim-gateway \
--set clusterRuntimeConfig.spec.routing.gatewayRef.namespace=kgateway-system
See Helm Chart Values for all available options.
Verify Installation#
Check that the operator is running:
kubectl get pods -n aim-system
Expected output:
NAME READY STATUS RESTARTS AGE
aim-engine-controller-manager-xxxxx-yyyyy 1/1 Running 0 30s
Verify CRDs are installed:
kubectl get crds | grep aim.eai.amd.com
Uninstalling#
# Remove the operator
helm uninstall aim-engine -n aim-system
# Remove CRDs
helm uninstall aim-engine-crds -n aim-system
Warning
Uninstalling the CRDs release deletes all AIM custom resources from the cluster. Remove the operator first, then the CRDs only if you want a full cleanup.
Next Steps#
Quickstart — Deploy your first inference service
KServe Configuration — Configure KServe for AIM Engine
Helm Chart Values — Full reference for all chart values