Multi-Tenancy#
AIM Engine supports multi-tenant deployments through a combination of cluster-scoped and namespace-scoped resources.
Resource Scoping#
Resource |
Cluster-Scoped |
Namespace-Scoped |
|---|---|---|
Models |
|
|
Templates |
|
|
Runtime Config |
|
|
Model Sources |
|
— |
Services |
— |
|
Artifacts |
— |
|
Namespace-scoped resources always take precedence over cluster-scoped ones during resolution.
Team Isolation#
A typical multi-tenant setup:
Cluster admin creates cluster-scoped resources shared by all teams:
AIMClusterModelSourcefor model discoveryAIMClusterRuntimeConfigfor default routing, storage, and policiesAIMClusterServiceTemplatefor validated runtime profiles
Teams work in their own namespaces with:
AIMServiceresources for their inference endpointsAIMRuntimeConfigfor team-specific credentials and overridesAIMModelfor custom models only their team needs
Example: Namespace Configuration#
apiVersion: aim.eai.amd.com/v1alpha1
kind: AIMRuntimeConfig
metadata:
name: default
namespace: ml-team-a
spec:
env:
- name: HF_TOKEN
valueFrom:
secretKeyRef:
name: team-a-hf-token
key: token
routing:
pathTemplate: "/team-a/{.metadata.name}"
Override Hierarchy#
Configuration is resolved with the most specific scope winning:
Setting |
Resolution Order |
|---|---|
Model |
|
Template |
|
Runtime config |
|
Environment vars |
Service → RuntimeConfig (namespace) → ClusterRuntimeConfig |
Label Propagation#
AIM Engine can propagate labels from AIMService resources to managed child resources (InferenceService, HTTPRoute, PVCs). This is useful for cost tracking, team attribution, and policy enforcement.
Enable via runtime configuration:
apiVersion: aim.eai.amd.com/v1alpha1
kind: AIMClusterRuntimeConfig
metadata:
name: default
spec:
labelPropagation:
enabled: true
match:
- "aim.eai.amd.com/*"
- "team-*"
- "cost-center"
Labels on the AIMService matching these patterns are automatically applied to all child resources. Labels matching aim.eai.amd.com/* are always propagated regardless of this setting.
RBAC#
AIM Engine creates helper ClusterRoles for each CRD when rbacHelpers.enable is true (default):
Role |
Permissions |
|---|---|
|
Full access to AIMService resources |
|
Create, update, delete AIMService resources |
|
Read-only access to AIMService resources |
Similar roles exist for all CRDs. Bind these to team groups or service accounts:
kubectl create rolebinding team-a-aimservice \
--clusterrole=aimservice-editor \
--group=team-a \
--namespace=ml-team-a
Next Steps#
Runtime Configuration — Configuration resolution details
Security — RBAC and security configuration
Private Registries — Per-team credential management