Architecture¶
This page explains how sikifanso sets up and manages Kubernetes clusters for AI agent infrastructure.
Directory structure¶
Each cluster's state lives under ~/.sikifanso/clusters/<name>/:
```
~/.sikifanso/clusters/<name>/
+-- session.yaml                 # Cluster metadata, credentials, ports
+-- gitops/                      # Local git repo (mounted into cluster)
    +-- bootstrap/
    |   +-- root-app.yaml        # ApplicationSet for custom apps
    |   +-- root-catalog.yaml    # ApplicationSet for catalog apps
    +-- apps/                    # Custom user-supplied Helm apps
    |   +-- coordinates/
    |   |   +-- <app>.yaml       # Helm chart coordinates (repo, chart, version, namespace)
    |   +-- values/
    |       +-- <app>.yaml       # Helm values overrides
    +-- catalog/                 # AI agent infrastructure catalog
    |   +-- <app>.yaml           # App definition with enabled flag
    |   +-- values/
    |       +-- <app>.yaml       # Helm values overrides
    +-- agents/                  # Agent sandbox definitions
    |   +-- <agent>.yaml         # Agent definition (name, quotas)
    |   +-- values/
    |       +-- <agent>.yaml     # Helm values for agent-template chart
    +-- infra/                   # Infrastructure config overrides
        +-- *.yaml               # Deep-merged with compiled-in defaults
```
How the local gitops repo works¶
The gitops/ directory is a regular git repository on your filesystem. During cluster creation, it is scaffolded from a bootstrap template repo.
This directory is mounted into the k3d cluster at /local-gitops via a hostPath volume. ArgoCD's repo-server reads from it directly -- no remote git server needed.
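In k3d terms, such a mount can be expressed as a volume mapping. A hedged illustration using k3d's declarative config format (the paths and cluster name are examples, not what sikifanso actually generates):

```yaml
# Illustrative k3d config fragment: mount the host gitops directory
# into the server node at /local-gitops.
apiVersion: k3d.io/v1alpha5
kind: Simple
metadata:
  name: dev
volumes:
  - volume: /home/user/.sikifanso/clusters/dev/gitops:/local-gitops
    nodeFilters:
      - server:0
```

With the directory present inside the node, ArgoCD's repo-server can be pointed at the mounted path, so commits to the host directory become visible on the next sync without any push step.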
AI Agent Infrastructure Catalog¶
The catalog contains curated tools organized by function:
| Category | Tools | Purpose |
|---|---|---|
| gateway | LiteLLM Proxy | LLM API routing, cost tracking, multi-provider support |
| observability | Langfuse, Prometheus+Grafana, Loki, Tempo | LLM tracing, metrics, logs, distributed tracing |
| guardrails | Guardrails AI, NeMo Guardrails, Presidio | Output validation, safety rails, PII redaction |
| rag | Qdrant, Text Embeddings Inference, Unstructured | Vector storage, embeddings, document parsing |
| runtime | Temporal, External Secrets, OPA | Workflow orchestration, secrets management, policy |
| models | Ollama | Local LLM inference and model management |
| storage | PostgreSQL, Valkey (Redis) | Supporting data stores for other tools |
Each tool is a self-contained YAML file with Helm chart coordinates and an enabled flag. ArgoCD's ApplicationSet deploys only enabled entries.
Root ApplicationSets¶
The bootstrap template includes two ApplicationSet manifests that manage two tracks of apps:
Custom apps (root-app.yaml)¶
Uses the git file generator to watch apps/coordinates/*.yaml. Each coordinate file defines a Helm chart source:
```yaml
name: podinfo
repoURL: https://stefanprodan.github.io/podinfo
chart: podinfo
targetRevision: 6.10.1
namespace: podinfo
```
The ApplicationSet creates a multi-source ArgoCD Application for every matching coordinate file, pairing it with the corresponding values file at apps/values/<name>.yaml. Adding or removing a coordinate file is all it takes to deploy or undeploy.
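A hedged sketch of how such a generator and multi-source template could fit together. The repo URL, project, and template wiring below are assumptions for illustration, not the shipped root-app.yaml:

```yaml
# Illustrative ApplicationSet: one Application per coordinate file,
# pairing the chart source with a values file from the gitops repo.
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: root-app
spec:
  generators:
    - git:
        repoURL: file:///local-gitops
        revision: HEAD
        files:
          - path: apps/coordinates/*.yaml
  template:
    metadata:
      name: '{{name}}'
    spec:
      project: default
      destination:
        server: https://kubernetes.default.svc
        namespace: '{{namespace}}'
      sources:
        - repoURL: '{{repoURL}}'
          chart: '{{chart}}'
          targetRevision: '{{targetRevision}}'
          helm:
            valueFiles:
              - $values/apps/values/{{name}}.yaml
        - repoURL: file:///local-gitops
          targetRevision: HEAD
          ref: values
```

The second source uses ArgoCD's `$values` ref pattern so Helm values can come from the gitops repo while the chart itself comes from an external Helm repository.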
Catalog apps (root-catalog.yaml)¶
Uses the git file generator to watch catalog/*.yaml, but only generates Applications for entries where enabled: true. Each catalog entry has additional metadata:
```yaml
name: litellm-proxy
category: gateway
description: LLM API gateway with multi-provider routing, cost tracking, and rate limiting
repoURL: https://litellm.github.io/helm-charts
chart: litellm
targetRevision: "0.2.1"
namespace: gateway
enabled: false
```
Setting enabled: true and committing causes ArgoCD to create and sync the Application. Setting it back to false causes ArgoCD to prune/delete it. Values overrides live at catalog/values/<name>.yaml.
How app sync works¶
ArgoCD's default reconciliation interval is 180 seconds. The sikifanso app sync command bypasses this by sending a webhook push event (mimicking a GitHub push notification) to two endpoints:
- ArgoCD server -- invalidates the repo-server's git revision cache, causing it to re-read the local gitops repo
- ApplicationSet controller -- triggers immediate re-evaluation of the git generator, picking up new or removed coordinate files
The ApplicationSet controller webhook is reached via the Kubernetes API server proxy, so no extra ports are exposed.
Cluster components¶
When you run sikifanso cluster create, the following happens in order:
1. k3d cluster is created as a single-node cluster running k3s v1.29. The flannel CNI and kube-proxy are disabled (Cilium replaces both).
2. Cilium is installed via Helm as a full kube-proxy replacement with ingress controller and Hubble UI enabled.
3. ArgoCD is installed via Helm, configured to use the hostPath-mounted gitops repo as its source.
4. GitOps repo is cloned from the bootstrap template, and both root ApplicationSets are applied.
5. Ports are mapped from the cluster to your host (ArgoCD UI, Hubble UI).
State management¶
Cluster metadata is persisted to ~/.sikifanso/clusters/<name>/session.yaml and includes:
- Cluster state (running / stopped)
- ArgoCD URL, username, password
- Hubble UI URL
- GitOps repo path
- k3d configuration (image, node counts)
- Port mappings
- Bootstrap template URL and version
This file is read on every CLI command to locate and interact with the cluster.
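For illustration, a session file with the fields listed above might be shaped like this; the key names are guesses derived from that list, not sikifanso's actual schema, and angle-bracket values are placeholders:

```yaml
# Illustrative session.yaml shape (field names are assumptions)
name: dev
state: running
argocd:
  url: http://localhost:30080
  username: admin
  password: <generated>
hubble_url: http://localhost:30081
gitops_path: ~/.sikifanso/clusters/dev/gitops
k3d:
  image: <k3s image>
  servers: 1
  agents: 0
ports:
  argocd: 30080
  hubble: 30081
bootstrap:
  template: <template URL>
  version: <version>
```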
Snapshot storage¶
Snapshots are stored at ~/.sikifanso/snapshots/<name>.tar.gz. Each archive contains the session metadata and the full gitops repo directory, allowing a cluster's configuration to be captured and restored independently of the running infrastructure.
Port allocation¶
Each cluster gets its own set of ports. Defaults are:
- 30080 -- ArgoCD UI
- 30081 -- Hubble UI
If defaults are taken by another cluster, sikifanso automatically finds free ports. Port assignments are stored in session.yaml.
The dashboard listens on :9090 by default, independent of cluster port mappings (configurable via --addr).
Profile system¶
Profiles are named presets that map to a set of catalog apps. When --profile is passed to cluster create, the CLI enables each app in the profile's set after the cluster is bootstrapped. Profiles are composable -- --profile agent-dev,rag takes the union of both app sets with no duplicates.
Profile definitions are compiled into the binary (in internal/profile/), not stored in the gitops repo. This means profiles are versioned with the CLI, and the same profile name always produces the same set of apps for a given CLI version.
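The union semantics can be sketched as an order-preserving set merge. The profile names and app sets below are illustrative, not the compiled-in definitions:

```go
package main

// mergeProfiles returns the union of the named profiles' app lists,
// preserving first-seen order and dropping duplicates -- the behavior
// described for --profile agent-dev,rag.
func mergeProfiles(profiles map[string][]string, names ...string) []string {
	seen := map[string]bool{}
	var out []string
	for _, n := range names {
		for _, app := range profiles[n] {
			if !seen[app] {
				seen[app] = true
				out = append(out, app)
			}
		}
	}
	return out
}
```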
Agent sandbox architecture¶
Agent sandboxes use the sikifanso-agent-template Helm chart to create isolated namespaces. Each sandbox includes:
- Namespace -- agent-<name>, dedicated to one agent workload
- ResourceQuota -- limits CPU, memory, and pod count (configurable per-agent)
- NetworkPolicy -- Cilium-enforced rules: default-deny egress, allowlisted access to LiteLLM Proxy, Qdrant, PostgreSQL, and Valkey; no cross-agent traffic; no Kubernetes API access
- ServiceAccount -- dedicated identity for the agent workload
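As an illustration of what the agent-template chart might render, here is a hedged sketch of a per-agent ResourceQuota and one Cilium egress allow rule; the resource names, namespace, limits, and port are invented for the example:

```yaml
# Illustrative quota for one agent sandbox
apiVersion: v1
kind: ResourceQuota
metadata:
  name: agent-quota
  namespace: agent-demo
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 4Gi
    pods: "10"
---
# Illustrative allowlist rule: permit egress only to the gateway namespace
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-litellm-egress
  namespace: agent-demo
spec:
  endpointSelector: {}
  egress:
    - toEndpoints:
        - matchLabels:
            k8s:io.kubernetes.pod.namespace: gateway
      toPorts:
        - ports:
            - port: "4000"
              protocol: TCP
```

Because Cilium policies default-deny once any egress rule selects an endpoint, traffic not matched by an allow rule (including cross-agent and Kubernetes API traffic) is dropped.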
Agent definitions live in the gitops repo at agents/<name>.yaml with Helm values at agents/values/<name>.yaml. The root ApplicationSet picks up these files and creates ArgoCD Applications that deploy the agent-template chart.
MCP server architecture¶
The MCP server (internal/mcp/) exposes 25 tools across 6 categories: cluster management, catalog/profiles, agents, ArgoCD, Kubernetes, and health. It uses the Model Context Protocol Go SDK with stdio transport.
Each MCP tool calls the same internal functions that the CLI commands use -- there are no separate code paths or elevated privileges. The server is designed to be launched as a subprocess by MCP clients (Claude Code, Claude Desktop, Cursor, etc.) via sikifanso mcp serve.
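MCP clients launch the server from a small config entry. For example, a Claude Desktop claude_desktop_config.json entry might look like the following (the "sikifanso" server key is an arbitrary label chosen here, and the binary is assumed to be on PATH):

```json
{
  "mcpServers": {
    "sikifanso": {
      "command": "sikifanso",
      "args": ["mcp", "serve"]
    }
  }
}
```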
Infrastructure config¶
Infrastructure configuration (internal/infraconfig/) controls the versions and settings of cluster components (Cilium, ArgoCD). The system uses a deep-merge approach:
- Compiled-in defaults (internal/infraconfig/defaults/) -- shipped with the binary
- User overrides (gitops/infra/*.yaml) -- optional per-cluster customizations
Users only need to specify the fields they want to change. The merge produces the final configuration used during cluster creation and upgrades.
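The merge rule can be sketched in Go: nested maps merge key-by-key, and any other value in the override replaces the default. This is a minimal sketch of the deep-merge behavior described above, not the actual infraconfig code:

```go
package main

// deepMerge overlays src onto dst without mutating either: when both
// sides hold a nested map the merge recurses, otherwise the src value
// wins. Keys absent from src keep their dst (default) value.
func deepMerge(dst, src map[string]any) map[string]any {
	out := map[string]any{}
	for k, v := range dst {
		out[k] = v
	}
	for k, v := range src {
		if sv, ok := v.(map[string]any); ok {
			if dv, ok := out[k].(map[string]any); ok {
				out[k] = deepMerge(dv, sv)
				continue
			}
		}
		out[k] = v
	}
	return out
}
```

So an infra override that sets only a component version leaves every other default (for example, a nested feature toggle) untouched.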