API Reference#

Packages#

aim.eai.amd.com/v1alpha2

aim.eai.amd.com/v1alpha2#

Package v1alpha2 contains API Schema definitions for the aim v1alpha2 API group.

Resource Types#

AIMClusterModel
AIMClusterModelList
AIMClusterProfile
AIMClusterProfileList
AIMClusterProfileSet
AIMClusterProfileSetList
AIMModel
AIMModelList
AIMProfile
AIMProfileCache
AIMProfileCacheList
AIMProfileList
AIMProfileSet
AIMProfileSetList
AIMService
AIMServiceList

AIMClusterModel#

AIMClusterModel is the Schema for cluster-scoped v1alpha2 AIM model resources. It mirrors the namespace-scoped AIMModel contract; see AIMModel (api/v1alpha2/aimmodel_types.go) for the rationale behind its CEL validation rules.

Appears in:

AIMClusterModelList

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha2`
`kind` string	`AIMClusterModel`
`metadata` ObjectMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`spec` AIMModelSpec
`status` AIMModelStatus

AIMClusterModelList#

AIMClusterModelList contains a list of AIMClusterModel.

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha2`
`kind` string	`AIMClusterModelList`
`metadata` ListMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`items` AIMClusterModel array

AIMClusterProfile#

AIMClusterProfile is the Schema for cluster-scoped AIM profiles. Cluster profiles are visible across all namespaces. They can be created manually or, in the future, automatically during model discovery by a v1alpha2 model controller. Unlike namespace-scoped AIMProfiles, cluster profiles do not support caching configuration since caches are namespace-scoped.

Deployable profiles have both aimId and modelSources populated; base profiles (custom-model derivation source material) have neither. A mixed state (only one of the two set) is rejected so that status.deployable stays derivable from spec.
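The deployable / base / mixed rule above can be sketched as a small classifier. This is a minimal illustration, not the controller's code: the function name `classify_profile` is hypothetical, and the spec is modelled as a plain dictionary keyed by the documented field names.

```python
from typing import Any, Mapping


def classify_profile(spec: Mapping[str, Any]) -> str:
    """Classify a profile spec as "deployable" or "base".

    Hypothetical helper: a field counts as populated when it is present
    and non-empty. Raises ValueError for the mixed state, which the API
    is documented to reject.
    """
    has_aim_id = bool(spec.get("aimId"))
    has_sources = bool(spec.get("modelSources"))
    if has_aim_id and has_sources:
        return "deployable"
    if not has_aim_id and not has_sources:
        return "base"
    raise ValueError("aimId and modelSources must be set together or both omitted")
```

Because the result depends only on spec, a status field such as status.deployable can be recomputed at any time without extra state, which is the property the rejection of the mixed state protects.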

Appears in:

AIMClusterProfileList

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha2`
`kind` string	`AIMClusterProfile`
`metadata` ObjectMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`spec` AIMClusterProfileSpec
`status` AIMProfileStatus

AIMClusterProfileList#

AIMClusterProfileList contains a list of AIMClusterProfile.

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha2`
`kind` string	`AIMClusterProfileList`
`metadata` ListMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`items` AIMClusterProfile array

AIMClusterProfileSet#

AIMClusterProfileSet is the Schema for cluster-scoped profile derivation resources.

Appears in:

AIMClusterProfileSetList

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha2`
`kind` string	`AIMClusterProfileSet`
`metadata` ObjectMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`spec` AIMProfileSetSpec
`status` AIMProfileSetStatus

AIMClusterProfileSetList#

AIMClusterProfileSetList contains a list of AIMClusterProfileSet.

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha2`
`kind` string	`AIMClusterProfileSetList`
`metadata` ListMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`items` AIMClusterProfileSet array

AIMClusterProfileSpec#

AIMClusterProfileSpec defines the desired state of a cluster-scoped AIMClusterProfile.

Appears in:

AIMClusterProfile

Field	Description	Default	Validation
`aimId` string	AimId is the model architecture identifier (e.g., “qwen/qwen3-32b”). Primary matching axis for profile selection and custom weight onboarding. AimId is required for deployable profiles. Iteration 1 producers always emit deployable profiles, so AimId is effectively required there. Empty AimId is reserved for base profiles emitted by base-image discovery (custom-model derivation source material), which are not deployable until derived. Once set, AimId is immutable.		Optional: {}
`modelId` string	ModelId is the specific model / HuggingFace URI (e.g., “qwen/qwen3-32b-fp8”). Determines the cache path (/workspace/cache/{modelId}) and serves as a secondary discriminator for custom weight matching.		Optional: {}
`profileId` string	ProfileId is the on-disk profile identifier from the AIM image (e.g., “vllm-mi300x-fp8-tp1-latency”). Populated during discovery to link this CRD back to the profile YAML inside the container. Not required for manually created profiles.		Optional: {}
`engine` string	Engine identifies the inference engine (e.g., “vllm”, “tgi”).		Optional: {}
`metric` AIMMetric	Metric is the optimization target for this profile.		Enum: [latency throughput] Optional: {}
`precision` AIMPrecision	Precision is the numeric precision used by this profile.		Enum: [fp4 fp8 fp16 fp32 bf16 int4 int8] Optional: {}
`type` AIMProfileType	Type indicates the optimization level. Hierarchy: optimized > general > preview > unoptimized.		Enum: [optimized general preview unoptimized] Optional: {}
`primary` boolean	Primary marks this as a default/recommended profile. When true, the profile is advertised for standard deployment and copied automatically for custom weight models. Defaults to false when not specified.	false
`manualSelectionOnly` boolean	ManualSelectionOnly is DEPRECATED and no longer honored by the resolver. It was a binary gate excluding a profile from automatic AIMService selection; that intent is now expressed through the graded `type` hierarchy (optimized > general > preview > unoptimized) combined with the selector’s `minimumType` floor. The field is retained for backward compatibility (existing objects and aim-build profile YAMLs still set it) but has no effect on selection; it will be removed in a future API version. Use `type: unoptimized` (+ a selector `minimumType`) instead. Deprecated: superseded by `type` + selector `minimumType`; ignored by the resolver.	false
`engineArgs` JSON	EngineArgs contains inference engine CLI arguments as a free-form JSON object. Passed to the inference engine (e.g., vLLM) at startup.		Schemaless: {} Optional: {}
`engineEnv` object (keys:string, values:string)	EngineEnv contains environment variables for the inference engine subprocess. Applied via os.execv, distinct from container-level ContainerEnv.		Optional: {}
`acceleratorModel` string	AcceleratorModel is the accelerator identifier for node selection. Maps to a node label key using the Exists operator: feature.node.kubernetes.io/aim-accelerator.{value}: Exists Supports both specific models (e.g., “MI300X”) and architecture-level fallbacks (e.g., “EPYC_ZEN5”) — the AcceleratorDetector labels nodes with all applicable identifiers.		MaxLength: 63 Pattern: `^[A-Za-z0-9]([A-Za-z0-9._-]*[A-Za-z0-9])?$` Optional: {}
`acceleratorType` AcceleratorType	AcceleratorType determines the resource derivation strategy: gpu or cpu. AIM Engine computes default resource requests from this field combined with AcceleratorCount and cluster-level configuration.		Enum: [gpu cpu] Optional: {}
`acceleratorCount` integer	AcceleratorCount is the number of accelerator units required. For AcceleratorType=gpu, this is the device count (e.g., 1, 2, 4, 8 for tensor-parallel sizes). For AcceleratorType=cpu, this is the number of CPU cores (e.g., 128 for EPYC_ZEN5, 192 for EPYC_9965). Combined with cluster-level configuration to compute default resource requests in status.resources. For AcceleratorType=gpu, the per-unit interpretation depends on AcceleratorPartitioningMode: under “unpartitioned” (default) one unit is one whole GPU; under “partitioned” or a specific scheme one unit is one partition slice (e.g. CPX-NPS4 = 1/8 of a GPU).		Minimum: 0 Optional: {}
`acceleratorPartitioningMode` string	AcceleratorPartitioningMode declares the GPU partition state the profile requires. Free-form string with reserved values: “” (omitted): CRD-defaulted to “unpartitioned”. “unpartitioned”: hardware-default partition state. Matches unpartitioned MI300X (canonical SPX-NPS1) and non-partitionable hardware (Radeon, etc.), that is, any node whose detector stamped aim-accelerator.partitioning-scheme.default. “partitioned”: any actively partitioned mode. Excludes both unpartitioned partitionable hardware and non-partitionable hardware. “<compute>-<memory>”: a specific compute+memory scheme, e.g. “CPX-NPS4”. Matches only nodes carrying that exact scheme label; does not match non-partitionable hardware. Other values (e.g. “CPX” alone, or typos) are accepted but fail safe to zero matching nodes under this iteration’s label schema; compute-only and memory-only matching are not supported here. AIM Engine does NOT validate against AMD’s hardware compatibility matrix; invalid combinations simply report MatchingNodes == 0. The mode resolves to a single Exists or DoesNotExist node-affinity term on the aim-accelerator.partitioning-scheme.* labels published by the AcceleratorDetector, AND-ed with the acceleratorModel term.	unpartitioned	MaxLength: 63 Pattern: `^[A-Za-z0-9]([A-Za-z0-9._-]*[A-Za-z0-9])?$` Optional: {}
`resources` ResourceRequirements	Resources is an optional override for K8s resource requests/limits. When set, merged on top of the defaults that AIM Engine computes from AcceleratorType, AcceleratorCount, and cluster-level configuration. The resolved result is written to status.resources.		Optional: {}
`image` string	Image is the deployment container image. Required. For purpose-built profiles: the full AIM image. For custom weight profiles: the base image (e.g., aim-base:0.8.5).		MinLength: 1
`modelSources` AIMModelSource array	ModelSources specifies model artifact sources for this profile. Populated during discovery or set by user.		Optional: {}
`containerEnv` EnvVar array	ContainerEnv specifies container-level env vars for the AIM runtime process (K8s pod spec).		Optional: {}
`imagePullSecrets` LocalObjectReference array	ImagePullSecrets lists secrets for pulling container images.		Optional: {}
`serviceAccountName` string	ServiceAccountName specifies the service account for workloads.		Optional: {}
`features` string array	Features lists optional capabilities the profile’s image honours, e.g. “adapters” for LoRA serving. A service declaring spec.adapters is rejected (ConfigValid=False) unless its resolved profile lists “adapters” here.		Optional: {}
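To make the acceleratorModel mapping concrete, the sketch below validates a value against the documented MaxLength and Pattern, then builds the node-affinity match expression it maps to. The function name is hypothetical; the label prefix, the pattern, and the Exists operator are taken from the table above, and the dictionary shape follows the Kubernetes matchExpressions convention.

```python
import re

# Label prefix and validation constraints as documented for acceleratorModel.
LABEL_PREFIX = "feature.node.kubernetes.io/aim-accelerator."
ACCELERATOR_PATTERN = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9._-]*[A-Za-z0-9])?$")
MAX_LENGTH = 63


def accelerator_match_expression(accelerator_model: str) -> dict:
    """Build an Exists match expression for the given accelerator identifier.

    Hypothetical helper: raises ValueError when the value breaks the
    documented length or pattern constraints.
    """
    if len(accelerator_model) > MAX_LENGTH:
        raise ValueError("acceleratorModel exceeds 63 characters")
    if not ACCELERATOR_PATTERN.fullmatch(accelerator_model):
        raise ValueError("acceleratorModel does not match the required pattern")
    # Exists takes no values: the node only needs to carry the label key.
    return {"key": LABEL_PREFIX + accelerator_model, "operator": "Exists"}
```

Using Exists rather than an equality match is what lets a single node satisfy both a specific identifier such as MI300X and an architecture-level fallback, since the detector is described as stamping one label per applicable identifier.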

AIMModel#

AIMModel is the Schema for the v1alpha2 AIMModel API.

AIMModel mirrors AIMService: the canonical Spec/Status types live in v1alpha1 and the v1alpha2 wrapper is a thin re-export so JSON wire format is identical between versions and the None conversion strategy works without a webhook.

v1alpha2 onboarding contract: exactly one of spec.image (official discovery) or spec.profiles (fine-tune / custom-model derivation) must be set. Legacy v1alpha1 onboarding fields and the deprecated flat spec.derivedFrom shape are forbidden on NEW v1alpha2 objects. Each “forbidden” rule uses optionalOldSelf so that objects originally created via v1alpha1 (which legally carry spec.custom, spec.modelSources, spec.customTemplates, or spec.profileCopy) and early-iteration v1alpha2 objects (which carry spec.derivedFrom) can still be updated through the v1alpha2 surface: the reconciler must be able to add finalizers and patch the spec of legacy-shaped objects without being blocked by the v1alpha2 schema. Adding a legacy or deprecated field to an object that did not previously have it is still rejected.
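The transition behaviour described above can be approximated in a few lines. This is a sketch of the rule's intent, not the actual CEL expressions: the function name is hypothetical, specs are plain dictionaries, and `old_spec=None` stands in for a create request (no old object). Applying the "exactly one" check to updates of legacy objects as well is an assumption of this sketch.

```python
from typing import Any, List, Mapping, Optional

# Legacy / deprecated spec fields named in the onboarding contract.
LEGACY_FIELDS = ("custom", "modelSources", "customTemplates", "profileCopy", "derivedFrom")


def validate_model_spec(
    new_spec: Mapping[str, Any],
    old_spec: Optional[Mapping[str, Any]] = None,
) -> List[str]:
    """Return a list of validation errors (empty when the spec is accepted)."""
    errors = []
    # Exactly one of spec.image / spec.profiles (assumed to apply to every write).
    if ("image" in new_spec) == ("profiles" in new_spec):
        errors.append("exactly one of spec.image or spec.profiles must be set")
    for field in LEGACY_FIELDS:
        already_present = old_spec is not None and field in old_spec
        # A legacy field is tolerated only if the stored object already had it.
        if field in new_spec and not already_present:
            errors.append("spec.%s is forbidden on new v1alpha2 objects" % field)
    return errors
```

For example, an update that keeps an existing spec.custom passes, while a create request carrying spec.custom, or an update that newly adds spec.derivedFrom, is rejected.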

Appears in:

AIMModelList

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha2`
`kind` string	`AIMModel`
`metadata` ObjectMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`spec` AIMModelSpec
`status` AIMModelStatus

AIMModelList#

AIMModelList contains a list of AIMModel.

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha2`
`kind` string	`AIMModelList`
`metadata` ListMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`items` AIMModel array

AIMProfile#

AIMProfile is the Schema for namespace-scoped AIM profiles. A profile is a self-contained runtime configuration that answers five questions without consulting any other resource: model architecture, accelerator, K8s resources, runtime config, and container image.

Deployable profiles have both aimId and modelSources populated; base profiles (custom-model derivation source material) have neither. A mixed state (only one of the two set) is rejected so that status.deployable stays derivable from spec.

Appears in:

AIMProfileList

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha2`
`kind` string	`AIMProfile`
`metadata` ObjectMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`spec` AIMProfileSpec
`status` AIMProfileStatus

AIMProfileCache#

AIMProfileCache pre-warms model artifacts for a specified profile’s model sources.

Appears in:

AIMProfileCacheList

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha2`
`kind` string	`AIMProfileCache`
`metadata` ObjectMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`spec` AIMProfileCacheSpec
`status` AIMProfileCacheStatus

AIMProfileCacheList#

AIMProfileCacheList contains a list of AIMProfileCache.

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha2`
`kind` string	`AIMProfileCacheList`
`metadata` ListMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`items` AIMProfileCache array

AIMProfileCacheMode#

Underlying type: string

AIMProfileCacheMode controls the ownership behavior of artifacts created by a profile cache.

Validation:

Enum: [Dedicated Shared]

Appears in:

AIMProfileCacheSpec

Field	Description
`Dedicated`	ProfileCacheModeDedicated means artifacts are owned by this profile cache and garbage collected when it is deleted.
`Shared`	ProfileCacheModeShared means artifacts have no owner references and persist independently of the profile cache lifecycle. This is the default mode.
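The difference between the two modes comes down to whether the created artifact carries an owner reference. The sketch below shows one plausible way to express that; the function name and the exact set of owner reference fields the controller writes are assumptions, while the field names themselves (apiVersion, kind, name, uid) are the standard Kubernetes ownerReference fields.

```python
from typing import Dict, List


def artifact_owner_references(mode: str, cache_name: str, cache_uid: str) -> List[Dict[str, str]]:
    """Return the ownerReferences an artifact would carry for a cache mode.

    Hypothetical helper illustrating Dedicated vs Shared ownership.
    """
    if mode == "Dedicated":
        # Owned by the cache: Kubernetes garbage-collects the artifact
        # when the owning AIMProfileCache is deleted.
        return [{
            "apiVersion": "aim.eai.amd.com/v1alpha2",
            "kind": "AIMProfileCache",
            "name": cache_name,
            "uid": cache_uid,
        }]
    if mode == "Shared":
        # No owner: the artifact outlives the cache.
        return []
    raise ValueError("mode must be Dedicated or Shared")
```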

AIMProfileCacheSpec#

AIMProfileCacheSpec defines the desired state of AIMProfileCache.

Appears in:

AIMProfileCache

Field	Description	Default	Validation
`profileName` string	ProfileName is the name of the AIMProfile or AIMClusterProfile to cache. The controller resolves model sources from the referenced profile’s spec.modelSources.		MinLength: 1
`profileScope` AIMResolutionScope	ProfileScope indicates whether the profile is namespace-scoped or cluster-scoped.		Enum: [Namespace Cluster] Required: {}
`storageClassName` string	StorageClassName specifies the storage class for cache volumes. When not specified, uses the cluster default storage class.		Optional: {}
`env` EnvVar array	Env specifies environment variables for authentication when downloading models. These variables are used for authentication with model registries (e.g., HuggingFace tokens).		Optional: {}
`runtimeConfigName` string	RuntimeConfigName is the name of the runtime config to use for this resource. If a runtime config with this name exists as both a namespace and a cluster runtime config, their values are merged, with the namespace config taking priority over the cluster config when there are conflicts. If this field is empty or set to `default`, the namespace / cluster runtime config named `default` is used, if it exists.		Optional: {}
`mode` AIMProfileCacheMode	Mode controls the ownership behavior of artifacts created by this profile cache. - Dedicated: artifacts are owned by this profile cache and garbage collected when it’s deleted. - Shared (default): artifacts have no owner references and persist independently.	Shared	Enum: [Dedicated Shared] Optional: {}
`requiresAdapterDisk` boolean	RequiresAdapterDisk requests that the backing model artifact carry a shared ReadWriteMany adapter disk for LoRA serving. Set by the AIMService planner when the service serves adapters. When set, the cache stamps an adapterDisk onto the artifact it creates and won’t adopt one lacking a disk; size and class come from AIMRuntimeConfig.Storage.		Optional: {}
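The runtimeConfigName lookup and merge order can be illustrated as follows. This is a simplified sketch: the function name is hypothetical, configs are modelled as flat dictionaries keyed by config name, and a shallow merge is assumed (the real merge may be deeper). Returning None when no config of that name exists is also an assumption.

```python
from typing import Any, Dict, Mapping, Optional


def resolve_runtime_config(
    name: str,
    namespace_configs: Mapping[str, Mapping[str, Any]],
    cluster_configs: Mapping[str, Mapping[str, Any]],
) -> Optional[Dict[str, Any]]:
    """Merge the cluster and namespace runtime configs that share a name."""
    # An empty name falls back to the config called "default".
    effective_name = name or "default"
    cluster = cluster_configs.get(effective_name)
    namespaced = namespace_configs.get(effective_name)
    if cluster is None and namespaced is None:
        return None
    merged = dict(cluster or {})
    # Namespace values win on conflicting keys.
    merged.update(namespaced or {})
    return merged
```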

AIMProfileCacheStatus#

AIMProfileCacheStatus defines the observed state of AIMProfileCache.

Appears in:

AIMProfileCache

Field	Description	Default	Validation
`observedGeneration` integer	ObservedGeneration is the most recent generation observed by the controller.
`conditions` Condition array	Conditions represent the latest observations of the profile cache state.
`status` AIMStatus	Status represents the current high-level status of the profile cache.	Pending	Enum: [Pending Progressing Ready Failed Degraded NotAvailable]
`artifacts` object (keys:string, values:AIMResolvedArtifact)	Artifacts maps artifact names to their resolved AIMArtifact resources.		Optional: {}

AIMProfileCachingConfig#

AIMProfileCachingConfig configures model caching behavior for namespace-scoped profiles.

Appears in:

AIMProfileSpec

Field	Description	Default	Validation
`enabled` boolean	Enabled controls whether caching is enabled for this profile.	false
`env` EnvVar array	Env specifies environment variables for model download during caching. If not set, falls back to the profile’s ContainerEnv.		Optional: {}

AIMProfileList#

AIMProfileList contains a list of AIMProfile.

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha2`
`kind` string	`AIMProfileList`
`metadata` ListMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`items` AIMProfile array

AIMProfileSet#

AIMProfileSet is the Schema for namespace-scoped profile derivation resources.

Appears in:

AIMProfileSetList

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha2`
`kind` string	`AIMProfileSet`
`metadata` ObjectMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`spec` AIMProfileSetSpec
`status` AIMProfileSetStatus

AIMProfileSetList#

AIMProfileSetList contains a list of AIMProfileSet.

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha2`
`kind` string	`AIMProfileSetList`
`metadata` ListMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`items` AIMProfileSet array

AIMProfileSpec#

AIMProfileSpec defines the desired state of a namespace-scoped AIMProfile.

Appears in:

AIMProfile

Field	Description	Default	Validation
`aimId` string	AimId is the model architecture identifier (e.g., “qwen/qwen3-32b”). Primary matching axis for profile selection and custom weight onboarding. AimId is required for deployable profiles. Iteration 1 producers always emit deployable profiles, so AimId is effectively required there. Empty AimId is reserved for base profiles emitted by base-image discovery (custom-model derivation source material), which are not deployable until derived. Once set, AimId is immutable.		Optional: {}
`modelId` string	ModelId is the specific model / HuggingFace URI (e.g., “qwen/qwen3-32b-fp8”). Determines the cache path (/workspace/cache/{modelId}) and serves as a secondary discriminator for custom weight matching.		Optional: {}
`profileId` string	ProfileId is the on-disk profile identifier from the AIM image (e.g., “vllm-mi300x-fp8-tp1-latency”). Populated during discovery to link this CRD back to the profile YAML inside the container. Not required for manually created profiles.		Optional: {}
`engine` string	Engine identifies the inference engine (e.g., “vllm”, “tgi”).		Optional: {}
`metric` AIMMetric	Metric is the optimization target for this profile.		Enum: [latency throughput] Optional: {}
`precision` AIMPrecision	Precision is the numeric precision used by this profile.		Enum: [fp4 fp8 fp16 fp32 bf16 int4 int8] Optional: {}
`type` AIMProfileType	Type indicates the optimization level. Hierarchy: optimized > general > preview > unoptimized.		Enum: [optimized general preview unoptimized] Optional: {}
`primary` boolean	Primary marks this as a default/recommended profile. When true, the profile is advertised for standard deployment and copied automatically for custom weight models. Defaults to false when not specified.	false
`manualSelectionOnly` boolean	ManualSelectionOnly is DEPRECATED and no longer honored by the resolver. It was a binary gate excluding a profile from automatic AIMService selection; that intent is now expressed through the graded `type` hierarchy (optimized > general > preview > unoptimized) combined with the selector’s `minimumType` floor. The field is retained for backward compatibility (existing objects and aim-build profile YAMLs still set it) but has no effect on selection; it will be removed in a future API version. Use `type: unoptimized` (+ a selector `minimumType`) instead. Deprecated: superseded by `type` + selector `minimumType`; ignored by the resolver.	false
`engineArgs` JSON	EngineArgs contains inference engine CLI arguments as a free-form JSON object. Passed to the inference engine (e.g., vLLM) at startup.		Schemaless: {} Optional: {}
`engineEnv` object (keys:string, values:string)	EngineEnv contains environment variables for the inference engine subprocess. Applied via os.execv, distinct from container-level ContainerEnv.		Optional: {}
`acceleratorModel` string	AcceleratorModel is the accelerator identifier for node selection. Maps to a node label key using the Exists operator: feature.node.kubernetes.io/aim-accelerator.{value}: Exists Supports both specific models (e.g., “MI300X”) and architecture-level fallbacks (e.g., “EPYC_ZEN5”) — the AcceleratorDetector labels nodes with all applicable identifiers.		MaxLength: 63 Pattern: `^[A-Za-z0-9]([A-Za-z0-9._-]*[A-Za-z0-9])?$` Optional: {}
`acceleratorType` AcceleratorType	AcceleratorType determines the resource derivation strategy: gpu or cpu. AIM Engine computes default resource requests from this field combined with AcceleratorCount and cluster-level configuration.		Enum: [gpu cpu] Optional: {}
`acceleratorCount` integer	AcceleratorCount is the number of accelerator units required. For AcceleratorType=gpu, this is the device count (e.g., 1, 2, 4, 8 for tensor-parallel sizes). For AcceleratorType=cpu, this is the number of CPU cores (e.g., 128 for EPYC_ZEN5, 192 for EPYC_9965). Combined with cluster-level configuration to compute default resource requests in status.resources. For AcceleratorType=gpu, the per-unit interpretation depends on AcceleratorPartitioningMode: under “unpartitioned” (default) one unit is one whole GPU; under “partitioned” or a specific scheme one unit is one partition slice (e.g. CPX-NPS4 = 1/8 of a GPU).		Minimum: 0 Optional: {}
`acceleratorPartitioningMode` string	AcceleratorPartitioningMode declares the GPU partition state the profile requires. Free-form string with reserved values: “” (omitted): CRD-defaulted to “unpartitioned”. “unpartitioned”: hardware-default partition state. Matches unpartitioned MI300X (canonical SPX-NPS1) and non-partitionable hardware (Radeon, etc.), that is, any node whose detector stamped aim-accelerator.partitioning-scheme.default. “partitioned”: any actively partitioned mode. Excludes both unpartitioned partitionable hardware and non-partitionable hardware. “<compute>-<memory>”: a specific compute+memory scheme, e.g. “CPX-NPS4”. Matches only nodes carrying that exact scheme label; does not match non-partitionable hardware. Other values (e.g. “CPX” alone, or typos) are accepted but fail safe to zero matching nodes under this iteration’s label schema; compute-only and memory-only matching are not supported here. AIM Engine does NOT validate against AMD’s hardware compatibility matrix; invalid combinations simply report MatchingNodes == 0. The mode resolves to a single Exists or DoesNotExist node-affinity term on the aim-accelerator.partitioning-scheme.* labels published by the AcceleratorDetector, AND-ed with the acceleratorModel term.	unpartitioned	MaxLength: 63 Pattern: `^[A-Za-z0-9]([A-Za-z0-9._-]*[A-Za-z0-9])?$` Optional: {}
`resources` ResourceRequirements	Resources is an optional override for K8s resource requests/limits. When set, merged on top of the defaults that AIM Engine computes from AcceleratorType, AcceleratorCount, and cluster-level configuration. The resolved result is written to status.resources.		Optional: {}
`image` string	Image is the deployment container image. Required. For purpose-built profiles: the full AIM image. For custom weight profiles: the base image (e.g., aim-base:0.8.5).		MinLength: 1
`modelSources` AIMModelSource array	ModelSources specifies model artifact sources for this profile. Populated during discovery or set by user.		Optional: {}
`containerEnv` EnvVar array	ContainerEnv specifies container-level env vars for the AIM runtime process (K8s pod spec).		Optional: {}
`imagePullSecrets` LocalObjectReference array	ImagePullSecrets lists secrets for pulling container images.		Optional: {}
`serviceAccountName` string	ServiceAccountName specifies the service account for workloads.		Optional: {}
`features` string array	Features lists optional capabilities the profile’s image honours, e.g. “adapters” for LoRA serving. A service declaring spec.adapters is rejected (ConfigValid=False) unless its resolved profile lists “adapters” here.		Optional: {}
`caching` AIMProfileCachingConfig	Caching configures model caching behavior for this namespace-scoped profile.		Optional: {}
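The `type` hierarchy combined with a selector's `minimumType` floor amounts to a rank comparison. The sketch below shows one way to express it; the function names are hypothetical, and profiles are assumed to carry an explicit `type` (how the resolver treats profiles without one is not covered here).

```python
from typing import Any, Dict, List

# Documented ordering: optimized > general > preview > unoptimized.
TYPE_RANK = {"unoptimized": 0, "preview": 1, "general": 2, "optimized": 3}


def meets_minimum_type(profile_type: str, minimum_type: str) -> bool:
    """True when profile_type ranks at or above the minimumType floor."""
    return TYPE_RANK[profile_type] >= TYPE_RANK[minimum_type]


def eligible_profiles(profiles: List[Dict[str, Any]], minimum_type: str) -> List[Dict[str, Any]]:
    """Keep only the profiles that clear the floor, preserving input order."""
    return [p for p in profiles if meets_minimum_type(p["type"], minimum_type)]
```

With a floor of `general`, profiles typed `preview` or `unoptimized` drop out of automatic selection, which covers the case the deprecated manualSelectionOnly flag used to handle.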

AIMProfileSpecCommon#

AIMProfileSpecCommon contains spec fields shared between AIMProfile and AIMClusterProfile. A profile answers five questions without consulting any other resource: model architecture (aimId), accelerator (acceleratorModel/Type/Count), K8s resources (status.resources), runtime config (engineArgs, engineEnv), and container image (image).

Appears in:

AIMClusterProfileSpec
AIMProfileSpec

Field	Description	Default	Validation
`aimId` string	AimId is the model architecture identifier (e.g., “qwen/qwen3-32b”). Primary matching axis for profile selection and custom weight onboarding. AimId is required for deployable profiles. Iteration 1 producers always emit deployable profiles, so AimId is effectively required there. Empty AimId is reserved for base profiles emitted by base-image discovery (custom-model derivation source material), which are not deployable until derived. Once set, AimId is immutable.		Optional: {}
`modelId` string	ModelId is the specific model / HuggingFace URI (e.g., “qwen/qwen3-32b-fp8”). Determines the cache path (/workspace/cache/{modelId}) and serves as a secondary discriminator for custom weight matching.		Optional: {}
`profileId` string	ProfileId is the on-disk profile identifier from the AIM image (e.g., “vllm-mi300x-fp8-tp1-latency”). Populated during discovery to link this CRD back to the profile YAML inside the container. Not required for manually created profiles.		Optional: {}
`engine` string	Engine identifies the inference engine (e.g., “vllm”, “tgi”).		Optional: {}
`metric` AIMMetric	Metric is the optimization target for this profile.		Enum: [latency throughput] Optional: {}
`precision` AIMPrecision	Precision is the numeric precision used by this profile.		Enum: [fp4 fp8 fp16 fp32 bf16 int4 int8] Optional: {}
`type` AIMProfileType	Type indicates the optimization level. Hierarchy: optimized > general > preview > unoptimized.		Enum: [optimized general preview unoptimized] Optional: {}
`primary` boolean	Primary marks this as a default/recommended profile. When true, the profile is advertised for standard deployment and copied automatically for custom weight models. Defaults to false when not specified.	false
`manualSelectionOnly` boolean	ManualSelectionOnly is DEPRECATED and no longer honored by the resolver. It was a binary gate excluding a profile from automatic AIMService selection; that intent is now expressed through the graded `type` hierarchy (optimized > general > preview > unoptimized) combined with the selector’s `minimumType` floor. The field is retained for backward compatibility (existing objects and aim-build profile YAMLs still set it) but has no effect on selection; it will be removed in a future API version. Use `type: unoptimized` (+ a selector `minimumType`) instead. Deprecated: superseded by `type` + selector `minimumType`; ignored by the resolver.	false
`engineArgs` JSON	EngineArgs contains inference engine CLI arguments as a free-form JSON object. Passed to the inference engine (e.g., vLLM) at startup.		Schemaless: {} Optional: {}
`engineEnv` object (keys:string, values:string)	EngineEnv contains environment variables for the inference engine subprocess. Applied via os.execv, distinct from container-level ContainerEnv.		Optional: {}
`acceleratorModel` string	AcceleratorModel is the accelerator identifier for node selection. Maps to a node label key using the Exists operator: feature.node.kubernetes.io/aim-accelerator.{value}: Exists Supports both specific models (e.g., “MI300X”) and architecture-level fallbacks (e.g., “EPYC_ZEN5”) — the AcceleratorDetector labels nodes with all applicable identifiers.		MaxLength: 63 Pattern: `^[A-Za-z0-9]([A-Za-z0-9._-]*[A-Za-z0-9])?$` Optional: {}
`acceleratorType` AcceleratorType	AcceleratorType determines the resource derivation strategy: gpu or cpu. AIM Engine computes default resource requests from this field combined with AcceleratorCount and cluster-level configuration.		Enum: [gpu cpu] Optional: {}
`acceleratorCount` integer	AcceleratorCount is the number of accelerator units required. For AcceleratorType=gpu, this is the device count (e.g., 1, 2, 4, 8 for tensor-parallel sizes). For AcceleratorType=cpu, this is the number of CPU cores (e.g., 128 for EPYC_ZEN5, 192 for EPYC_9965). Combined with cluster-level configuration to compute default resource requests in status.resources. For AcceleratorType=gpu, the per-unit interpretation depends on AcceleratorPartitioningMode: under “unpartitioned” (default) one unit is one whole GPU; under “partitioned” or a specific scheme one unit is one partition slice (e.g. CPX-NPS4 = 1/8 of a GPU).		Minimum: 0 Optional: {}
`acceleratorPartitioningMode` string	AcceleratorPartitioningMode declares the GPU partition state the profile requires. Free-form string with reserved values: “” (omitted): CRD-defaulted to “unpartitioned”. “unpartitioned”: hardware-default partition state. Matches unpartitioned MI300X (canonical SPX-NPS1) and non-partitionable hardware (Radeon, etc.), that is, any node whose detector stamped aim-accelerator.partitioning-scheme.default. “partitioned”: any actively partitioned mode. Excludes both unpartitioned partitionable hardware and non-partitionable hardware. “<compute>-<memory>”: a specific compute+memory scheme, e.g. “CPX-NPS4”. Matches only nodes carrying that exact scheme label; does not match non-partitionable hardware. Other values (e.g. “CPX” alone, or typos) are accepted but fail safe to zero matching nodes under this iteration’s label schema; compute-only and memory-only matching are not supported here. AIM Engine does NOT validate against AMD’s hardware compatibility matrix; invalid combinations simply report MatchingNodes == 0. The mode resolves to a single Exists or DoesNotExist node-affinity term on the aim-accelerator.partitioning-scheme.* labels published by the AcceleratorDetector, AND-ed with the acceleratorModel term.	unpartitioned	MaxLength: 63 Pattern: `^[A-Za-z0-9]([A-Za-z0-9._-]*[A-Za-z0-9])?$` Optional: {}
`resources` ResourceRequirements	Resources is an optional override for K8s resource requests/limits. When set, merged on top of the defaults that AIM Engine computes from AcceleratorType, AcceleratorCount, and cluster-level configuration. The resolved result is written to status.resources.		Optional: {}
`image` string	Image is the deployment container image. Required. For purpose-built profiles: the full AIM image. For custom weight profiles: the base image (e.g., aim-base:0.8.5).		MinLength: 1
`modelSources` AIMModelSource array	ModelSources specifies model artifact sources for this profile. Populated during discovery or set by user.		Optional: {}
`containerEnv` EnvVar array	ContainerEnv specifies container-level env vars for the AIM runtime process (K8s pod spec).		Optional: {}
`imagePullSecrets` LocalObjectReference array	ImagePullSecrets lists secrets for pulling container images.		Optional: {}
`serviceAccountName` string	ServiceAccountName specifies the service account for workloads.		Optional: {}
`features` string array	Features lists optional capabilities the profile’s image honours, e.g. “adapters” for LoRA serving. A service declaring spec.adapters is rejected (ConfigValid=False) unless its resolved profile lists “adapters” here.		Optional: {}
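
The validation rules and reserved values of `acceleratorPartitioningMode` can be checked client-side before a profile is applied. The following is a minimal sketch, not AIM Engine code: the pattern and length limit are copied from the Validation column above, while `is_valid_mode`, `classify_mode`, and the category names they return are hypothetical. It also assumes that a specific scheme is exactly two non-empty parts joined by a single hyphen, as in “CPX-NPS4”.

```python
import re

# Pattern and length limit copied from the Validation column of
# acceleratorPartitioningMode.
MODE_PATTERN = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9._-]*[A-Za-z0-9])?$")
MODE_MAX_LENGTH = 63


def is_valid_mode(value: str) -> bool:
    """Return True if value passes the documented MaxLength and Pattern checks."""
    return len(value) <= MODE_MAX_LENGTH and MODE_PATTERN.fullmatch(value) is not None


def classify_mode(value: str) -> str:
    """Hypothetical helper: bucket a value into the documented categories."""
    if value == "":
        # Omitted field: the CRD default "unpartitioned" applies.
        return "unpartitioned"
    if not is_valid_mode(value):
        return "invalid"
    if value in ("unpartitioned", "partitioned"):
        return value
    # Assumption: a specific scheme is two non-empty parts joined by one hyphen.
    parts = value.split("-")
    if len(parts) == 2 and all(parts):
        return "scheme"
    # Accepted by the schema, but expected to match zero nodes.
    return "no-match"


if __name__ == "__main__":
    for mode in ("", "unpartitioned", "partitioned", "CPX-NPS4", "CPX", "-CPX"):
        print(repr(mode), "->", classify_mode(mode))
```

A value classified as `no-match` here corresponds to the documented fail-safe case: the API accepts it, and the profile would be expected to report zero matching nodes.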

AIMProfileStatus#

AIMProfileStatus defines the observed state of AIMProfile / AIMClusterProfile.

Appears in:

AIMClusterProfile
AIMProfile

Field	Description	Default	Validation
`observedGeneration` integer	ObservedGeneration is the most recent generation observed by the controller.
`status` AIMStatus	Status represents the current high-level status of this profile. Ready: at least one cluster node matches the profile’s accelerator labels and resource requests. NotAvailable: no matching nodes found.	Pending	Enum: [Pending Progressing Ready Degraded Failed NotAvailable]
`deployable` boolean	Deployable reports whether the profile is materialised enough to back an AIMService: true when spec.aimId and spec.modelSources are both populated, false for base profiles awaiting derivation. Image discovery of a deployable AIM image emits deployable profiles; base-image discovery emits base profiles (no aimId/modelSources) that a custom-model AIMModel derives into deployable copies.	false
`sourceModel` ProfileSourceModel	SourceModel identifies the producing AIM(Cluster)Model for profiles owned by AIMModel reconcilers. Empty for user-authored profiles.		Optional: {}
`origin` ProfileOrigin	Origin classifies how this profile was produced. discovered: emitted by image discovery (AIMModel.spec.image). derived: emitted by an AIMProfileSet or AIMModel.spec.profiles.derivedFrom. user-authored: created independently by a user. Backfilled by the AIMProfile reconciler when not stamped at creation time; user-authored profiles default to `user-authored`.		Enum: [discovered derived user-authored] Optional: {}
`version` string	Version is extracted from the spec.image tag during reconciliation (e.g., “0.8.5”).		Optional: {}
`baseImage` string	BaseImage is the AIM_BASE_IMAGE_REF the inspector extracted from the source image when this profile was materialised by AIMModel discovery. Used by derivation flows (AIMService overlays, AIMProfileSet) to rebase the deployment image onto the source’s base when overriding model sources, so private mirrors stay self-contained. Empty for user-authored profiles.		Optional: {}
`matchingNodes` integer	MatchingNodes is the count of cluster nodes matching both the accelerator model label and status.resources requests. Zero means NotAvailable.		Optional: {}
`hardwareSummary` string	HardwareSummary is a human-readable string describing the hardware requirements. Format: “{count} x {model}” for GPU (e.g., “1 x MI300X”) or “CPU” for CPU-only.		Optional: {}
`resources` ResourceRequirements	Resources contains the definitive K8s resource requests/limits used for deployment. Computed by AIM Engine from AcceleratorType, AcceleratorCount, and cluster-level configuration, then merged with any spec.resources override.		Optional: {}
`resolvedNodeAffinity` NodeAffinity	ResolvedNodeAffinity contains the computed node affinity rules derived from spec.acceleratorModel. Used by AIMService when building InferenceService pods.		Optional: {}
`conditions` Condition array	Conditions represent the latest observations of profile state.
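
Several status fields above are described as derivable from other values. The sketch below illustrates three of those rules under stated assumptions; it is not the reconciler's implementation. All function names are hypothetical, `availability` covers only the Ready and NotAvailable cases from the `status` description, and `hardware_summary` assumes that a CPU-only profile is one with no accelerator model or a count of zero.

```python
from typing import Optional, Sequence


def is_deployable(aim_id: Optional[str], model_sources: Sequence[object]) -> bool:
    """True when both aimId and modelSources are populated, False when neither is.

    The mixed case (only one of the two set) is rejected, matching the rule
    stated for profile specs.
    """
    has_id = bool(aim_id)
    has_sources = len(model_sources) > 0
    if has_id != has_sources:
        raise ValueError("aimId and modelSources must be set together or not at all")
    return has_id


def hardware_summary(count: int, model: Optional[str]) -> str:
    """Format the summary as "{count} x {model}" for GPU, or "CPU" for CPU-only."""
    # Assumption: no model or a non-positive count means a CPU-only profile.
    if count <= 0 or not model:
        return "CPU"
    return f"{count} x {model}"


def availability(matching_nodes: int) -> str:
    """Simplified mapping: zero matching nodes means NotAvailable."""
    return "Ready" if matching_nodes > 0 else "NotAvailable"


if __name__ == "__main__":
    print(is_deployable("example-aim-id", ["example-source"]))
    print(hardware_summary(1, "MI300X"))
    print(availability(0))
```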

AIMService#

AIMService manages a KServe-based AIM inference service for the selected model and template. Note: KServe uses a {name}-{namespace} format, which must not exceed 63 characters. This constraint is validated at runtime because CEL cannot access metadata.namespace.

Appears in:

AIMServiceList

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha2`
`kind` string	`AIMService`
`metadata` ObjectMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`spec` AIMServiceSpec
`status` AIMServiceStatus
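
Because the 63-character limit on the {name}-{namespace} string is only enforced at runtime, a client may want to check it before creating an AIMService. The following is a minimal sketch of such a pre-flight check; `kserve_name` and `fits_kserve_limit` are hypothetical helpers, not part of AIM Engine or KServe.

```python
# Limit taken from the AIMService description.
KSERVE_NAME_LIMIT = 63


def kserve_name(name: str, namespace: str) -> str:
    """Build the {name}-{namespace} string described for AIMService."""
    return f"{name}-{namespace}"


def fits_kserve_limit(name: str, namespace: str) -> bool:
    """Return True if the combined string stays within the 63-character limit."""
    return len(kserve_name(name, namespace)) <= KSERVE_NAME_LIMIT


if __name__ == "__main__":
    print(fits_kserve_limit("llama", "default"))
    print(fits_kserve_limit("a" * 60, "prod"))
```

The hyphen counts toward the limit, so a 58-character name in a 4-character namespace is the longest combination that still fits.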

AIMServiceList#

AIMServiceList contains a list of AIMService.

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha2`
`kind` string	`AIMServiceList`
`metadata` ListMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`items` AIMService array

ProfileSourceModel#

ProfileSourceModel identifies the producing AIM(Cluster)Model for a reconciler-produced profile. Stamped from owner references during reconciliation; left unset for user-authored profiles.

Appears in:

AIMProfileStatus

Field	Description	Validation
`name` string	Name is the producing model’s name.
`kind` ProfileSourceModelKind	Kind is the producing model’s kind (“AIMModel” or “AIMClusterModel”).	Enum: [AIMModel AIMClusterModel]
`namespace` string	Namespace is the producing model’s namespace. Empty when Kind is AIMClusterModel (cluster-scoped).	Optional: {}
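
The field rules above (an enumerated `kind`, and an empty `namespace` for cluster-scoped models) can be modelled in a small client-side type. This is an illustrative sketch only: the Python class mirrors the documented fields, but its name reuse, the `VALID_KINDS` constant, and the error messages are assumptions rather than part of the API.

```python
from dataclasses import dataclass

# Enum values from the Validation column of `kind`.
VALID_KINDS = ("AIMModel", "AIMClusterModel")


@dataclass(frozen=True)
class ProfileSourceModel:
    """Hypothetical client-side mirror of the documented fields."""

    name: str
    kind: str
    namespace: str = ""

    def __post_init__(self) -> None:
        if self.kind not in VALID_KINDS:
            raise ValueError(f"kind must be one of {VALID_KINDS}")
        # Cluster-scoped models carry no namespace.
        if self.kind == "AIMClusterModel" and self.namespace:
            raise ValueError("namespace must be empty when kind is AIMClusterModel")


if __name__ == "__main__":
    print(ProfileSourceModel(name="example-model", kind="AIMModel", namespace="team-a"))
    print(ProfileSourceModel(name="example-model", kind="AIMClusterModel"))
```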

ProfileSourceModelKind#

Underlying type: string

ProfileSourceModelKind identifies whether a profile’s source model is namespace-scoped (AIMModel) or cluster-scoped (AIMClusterModel).

Validation:

Enum: [AIMModel AIMClusterModel]

Appears in:

ProfileSourceModel

Field	Description
`AIMModel`
`AIMClusterModel`
