Agent CRD

Every field on the Agent spec.

apiVersion: krypton.ai/v1alpha1, kind: Agent, namespaced (short name ag).

Minimal example

apiVersion: krypton.ai/v1alpha1
kind: Agent
metadata:
  name: travel
  namespace: agents
spec:
  image: ghcr.io/org/travel-agent:latest

That’s the smallest valid Agent. Everything else has a default.

Full example

apiVersion: krypton.ai/v1alpha1
kind: Agent
metadata:
  name: travel
  namespace: agents
spec:
  # Container
  image: ghcr.io/org/travel-agent:latest
  imagePullPolicy: IfNotPresent
  imagePullSecrets:
    - name: ghcr-secret

  # Metadata (informational — surfaced in UI + metrics)
  runtime: python
  framework: langgraph

  # Wire protocol the agent speaks
  protocol: a2a               # a2a | mcp | http

  # Mode (always-on is the supported MVP default)
  mode: always-on             # always-on | serverless (paused)

  # Scaling
  minReplicas: 1
  maxReplicas: 10
  concurrency: 8              # in-flight requests per pod

  # Networking
  port: 8080
  invocationPath: /a2a

  # Pod
  resources:
    requests: { cpu: 100m, memory: 256Mi }
    limits:   { cpu: 1000m, memory: 1Gi }
  env:
    - { name: LOG_LEVEL, value: info }
  envFrom:
    - secretRef: { name: travel-secrets }
  serviceAccountName: ""      # blank = auto-create

  # Lifecycle
  timeout: 60s                # per-invocation
  startupTimeout: 30s         # cold-start grace

Spec reference

Container

FieldTypeDefaultNotes
imagestring (required)Container image, including registry + tag
imagePullPolicystringIfNotPresentStandard K8s pull policy
imagePullSecrets[]corev1.LocalObjectReferenceSame shape as a pod’s imagePullSecrets

Metadata

FieldTypeDefaultNotes
runtimestringInformational (python, node, go, …)
frameworkstringInformational (langgraph, crewai, …)
protocolstringa2aOne of a2a, mcp, http

Mode + scaling

FieldTypeDefaultNotes
modestringalways-onalways-on (MVP default) or serverless (paused — see below)
minReplicasint32 (≥ 0)1Always-on floor — pin to 0 only for serverless
maxReplicasint32 (≥ 1)10Must be ≥ minReplicas
concurrencyint32 (≥ 1)8In-flight requests per pod cap (enforced by sidecar)
scaleToZeroAfterduration300sIdle window before scale-to-zero — only consulted in mode: serverless (paused)

Networking

FieldTypeDefaultNotes
portint32 (1–65535)8080User container’s listen port
invocationPathstring/Path the gateway forwards invocations to (prefix-stripped)

Pod

FieldTypeDefaultNotes
resourcescorev1.ResourceRequirementsSame as a pod’s resources
env[]corev1.EnvVarPassed to user container
envFrom[]corev1.EnvFromSourceSame
serviceAccountNamestring""Empty = auto-created SA with minimal permissions

Lifecycle

FieldTypeDefaultNotes
timeoutduration60sBounds a single invocation
startupTimeoutduration30sActivator’s cold-start grace; gateway returns 504 if exceeded

Status (read-only)

FieldTypeWritten by
phaseenumManager
replicasint32Manager
readyReplicasint32Manager
desiredReplicasint32Scaler + activator
urlstringManager
lastInvocationAttimeGateway
observedGenerationint64Manager
conditions[]Cond.Manager

Phases

PhaseMeaning
PendingAt least one desired replica, none ready yet
ReadyAt least one replica ready, OR scaled to zero
Scaling(reserved; not currently emitted)
FailedPersistent reconcile errors (e.g. crashloop)

Validation

Beyond OpenAPI defaults, the (optional) validating webhook enforces:

  • image non-empty
  • mode: always-onminReplicas >= 1
  • concurrency >= 1
  • maxReplicas >= minReplicas
  • port in [1, 65535]

Webhooks are off by default (require cert plumbing). The OpenAPI validation catches everything except the cross-field rules.

Serverless mode (paused)

mode: serverless is implemented but paused in the MVP. To opt in for an individual agent:

spec:
  mode: serverless
  minReplicas: 0
  scaleToZeroAfter: 60s

What happens then: the activator catches requests when no pods are ready, patches desiredReplicas = 1, polls Endpoints for readiness, forwards once a pod is up. The scaler drops the agent back to zero after scaleToZeroAfter of idle time. See Architecture → Components → Serverless mode (paused) for the current status.