Installation Reference#

This page covers advanced installation options for AIM Engine. For a basic install, see the Getting Started guide.

Helm Chart Configuration#

All configuration is done through Helm values. See Helm Chart Values for the complete reference.

Operator Resources#

Adjust the operator's memory requests and limits for larger clusters:

helm install aim-engine oci://docker.io/amdenterpriseai/aim-engine-chart \
  --version <version> \
  --namespace aim-system \
  --create-namespace \
  --set manager.resources.limits.memory=8Gi \
  --set manager.resources.requests.memory=512Mi
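
The same overrides can be kept in a values file rather than on the command line, which is easier to review and version. This is a minimal sketch: the file name `resources-values.yaml` is an arbitrary choice, and the keys simply mirror the `--set` paths shown above.

```shell
# Write the resource overrides to a values file.
# The file name is hypothetical; the keys mirror the --set paths above.
cat > resources-values.yaml <<'EOF'
manager:
  resources:
    limits:
      memory: 8Gi
    requests:
      memory: 512Mi
EOF

# Print the file for review before passing it to Helm.
cat resources-values.yaml
```

To use the file, replace the two `--set` flags in the command above with `-f resources-values.yaml`.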

Leader Election#

Leader election is enabled by default (--leader-elect in manager.args). This ensures only one operator instance is active when running multiple replicas for high availability.

Metrics#

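The metrics endpoint is enabled by default on port 8443 with TLS. To disable TLS for the metrics endpoint, override manager.args. A list set with --set replaces the chart's default list rather than appending to it, so --leader-elect is listed again alongside --metrics-secure=false: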

helm install aim-engine oci://docker.io/amdenterpriseai/aim-engine-chart \
  --version <version> \
  --namespace aim-system \
  --set 'manager.args={--leader-elect,--metrics-secure=false}'
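
The brace-list syntax in `--set` can be awkward to quote correctly in some shells. As a sketch of an equivalent approach, the same argument list can be written to a values file; the file name `metrics-values.yaml` is an arbitrary choice, and the list contains only the two arguments shown in the command above.

```shell
# Write the manager argument list to a values file.
# The file name is hypothetical; the arguments are the ones from the --set example.
cat > metrics-values.yaml <<'EOF'
manager:
  args:
    - "--leader-elect"
    - "--metrics-secure=false"
EOF

# Print the file for review before passing it to Helm.
cat metrics-values.yaml
```

To use the file, replace the `--set` flag in the command above with `-f metrics-values.yaml`.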

Accelerator Detection#

The Helm chart deploys an AcceleratorDetector, which runs as DaemonSets on cluster nodes. It detects GPUs and CPUs and publishes the results as node labels through Node Feature Discovery (NFD); AIM Engine uses these labels for workload scheduling. NFD must be installed on the cluster (it is included with the AMD GPU Operator).

CRD Management#

CRDs are distributed as a separate Helm chart and should be installed before the operator. See Installation.

Next Steps#