Deployment
Plano can be deployed in two ways: natively on the host (default) or inside a Docker container.
Native Deployment (Default)
Plano runs natively by default. Pre-compiled binaries (Envoy, WASM plugins, brightstaff) are automatically downloaded on the first run and cached at ~/.plano/.
Supported platforms: Linux (x86_64, aarch64), macOS (Apple Silicon).
Start Plano
planoai up plano_config.yaml
Options:
- --foreground: stay attached and stream logs (Ctrl+C to stop)
- --with-tracing: start a local OTLP trace collector
Runtime files (rendered configs, logs, PID file) are stored in ~/.plano/run/.
Stop Plano
planoai down
Build from Source (Developer)
If you want to build from source instead of using pre-compiled binaries, you need:
- Rust with the wasm32-wasip1 target
- OpenSSL dev headers (libssl-dev on Debian/Ubuntu, openssl on macOS)
planoai build --native
Docker Deployment
Below is a minimal, production-ready example showing how to deploy the Plano Docker image directly and run basic runtime checks. Adjust image names, tags, and the plano_config.yaml path to match your environment.
Note
You will need to pass all required environment variables that are referenced in your plano_config.yaml file.
For plano_config.yaml, you can use any sample configuration defined earlier in the documentation. For example, you can try the LLM Routing sample config.
Docker Compose Setup
Create a docker-compose.yml file with the following configuration:
# docker-compose.yml
services:
  plano:
    image: katanemo/plano:0.4.14
    container_name: plano
    ports:
      - "10000:10000" # ingress (client -> plano)
      - "12000:12000" # egress (plano -> upstream/llm proxy)
    volumes:
      - ./plano_config.yaml:/app/plano_config.yaml:ro
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY:?error}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:?error}
Starting the Stack
Start the services from the directory containing docker-compose.yml and plano_config.yaml:
# Set required environment variables and start services
OPENAI_API_KEY=xxx ANTHROPIC_API_KEY=yyy docker compose up -d
Check container health and logs:
docker compose ps
docker compose logs -f plano
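docker compose ps reports whether the container is up, but a script usually needs to wait until the gateway actually answers. The following is a minimal sketch of a retry loop for that purpose. retry_until is a hypothetical helper, not part of the planoai CLI, and the commented usage assumes that the /healthz path on port 12000 (the one used by the Kubernetes probes in this guide) is also reachable through the published Compose port.

```bash
# Hypothetical helper: run a command until it succeeds or attempts run out.
# Usage: retry_until <attempts> <delay_seconds> <command> [args...]
retry_until() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0          # command succeeded, stop retrying
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"    # wait before the next attempt
    fi
    i=$((i + 1))
  done
  return 1              # every attempt failed
}

# Example usage (assumes /healthz is served on the published port 12000):
#   retry_until 30 2 curl -sf -o /dev/null http://localhost:12000/healthz
```

With the values in the usage comment, the loop gives up after roughly one minute. Adjust both numbers to your startup time.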
You can also use the CLI with Docker mode:
planoai up plano_config.yaml --docker
planoai down --docker
Kubernetes Deployment
Plano runs as a single container in Kubernetes. The container bundles Envoy, WASM plugins, and brightstaff, managed by supervisord internally. Deploy it as a standard Kubernetes Deployment with your plano_config.yaml mounted via a ConfigMap and API keys injected via a Secret.
Note
All environment variables referenced in your plano_config.yaml (e.g. $OPENAI_API_KEY) must be set in the container environment. Use Kubernetes Secrets for API keys.
Step 1: Create the Config
Store your plano_config.yaml in a ConfigMap:
kubectl create configmap plano-config --from-file=plano_config.yaml=./plano_config.yaml
Step 2: Create API Key Secrets
Store your LLM provider API keys in a Secret:
kubectl create secret generic plano-secrets \
  --from-literal=OPENAI_API_KEY=sk-... \
  --from-literal=ANTHROPIC_API_KEY=sk-ant-...
Step 3: Deploy Plano
Create a plano-deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: plano
  labels:
    app: plano
spec:
  replicas: 1
  selector:
    matchLabels:
      app: plano
  template:
    metadata:
      labels:
        app: plano
    spec:
      containers:
        - name: plano
          image: katanemo/plano:0.4.14
          ports:
            - containerPort: 12000 # LLM gateway (chat completions, model routing)
              name: llm-gateway
          envFrom:
            - secretRef:
                name: plano-secrets
          env:
            - name: LOG_LEVEL
              value: "info"
          volumeMounts:
            - name: plano-config
              mountPath: /app/plano_config.yaml
              subPath: plano_config.yaml
              readOnly: true
          readinessProbe:
            httpGet:
              path: /healthz
              port: 12000
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /healthz
              port: 12000
            initialDelaySeconds: 10
            periodSeconds: 30
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "1000m"
      volumes:
        - name: plano-config
          configMap:
            name: plano-config
---
apiVersion: v1
kind: Service
metadata:
  name: plano
spec:
  selector:
    app: plano
  ports:
    - name: llm-gateway
      port: 12000
      targetPort: 12000
Apply it:
kubectl apply -f plano-deployment.yaml
Step 4: Verify
# Check pod status
kubectl get pods -l app=plano
# Check logs
kubectl logs -l app=plano -f
# Test routing (port-forward for local testing; it blocks, so run curl from a second terminal)
kubectl port-forward svc/plano 12000:12000
curl -s -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"tell me a joke"}], "model":"none"}' \
  http://localhost:12000/v1/chat/completions | jq .model
Updating Configuration
To update plano_config.yaml, replace the ConfigMap and restart the pod:
kubectl create configmap plano-config \
  --from-file=plano_config.yaml=./plano_config.yaml \
  --dry-run=client -o yaml | kubectl apply -f -
kubectl rollout restart deployment/plano
Enabling OTEL Tracing
Plano emits OpenTelemetry traces for every request, including routing decisions, model selection, and upstream latency. To export traces to an OTEL collector in your cluster, add the tracing section to your plano_config.yaml:
tracing:
  opentracing_grpc_endpoint: "http://otel-collector.monitoring:4317"
  random_sampling: 100 # percentage of requests to trace (1-100)
  trace_arch_internal: true # include internal Plano spans
  span_attributes:
    header_prefixes: # capture request headers as span attributes
      - "x-"
    static: # add static attributes to all spans
      environment: "production"
      service: "plano"
Set the OTEL_TRACING_GRPC_ENDPOINT environment variable or configure it directly in the config. Plano propagates the traceparent header end-to-end, so traces correlate across your upstream and downstream services.
Environment Variables Reference
The following environment variables can be set on the container:
| Variable | Description | Default |
|---|---|---|
| LOG_LEVEL | Log verbosity | |
| OPENAI_API_KEY | OpenAI API key (if referenced in config) | |
| ANTHROPIC_API_KEY | Anthropic API key (if referenced in config) | |
| OTEL_TRACING_GRPC_ENDPOINT | OTEL collector endpoint for trace export | |
Any environment variable referenced in plano_config.yaml with $VAR_NAME syntax will be substituted at startup. Use Kubernetes Secrets for sensitive values and ConfigMaps or env entries for non-sensitive configuration.
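Because substitution happens at startup, a missing variable may only surface once Plano is already starting. The sketch below is one way to catch that earlier: it scans a config file for $VAR_NAME references and reports any that are not exported in the current environment. referenced_vars and missing_vars are hypothetical helpers, not part of the planoai CLI. The scan is a plain text match, so references inside YAML comments are reported as well.

```bash
# Hypothetical helpers (not part of planoai): pre-flight check for env vars.

# Print each $VAR_NAME reference found in the given file,
# one name per line, de-duplicated.
referenced_vars() {
  grep -oE '\$[A-Za-z_][A-Za-z0-9_]*' "$1" | tr -d '$' | sort -u
}

# Print every referenced variable that is unset or empty in the environment.
# Returns 1 if at least one is missing, 0 otherwise.
missing_vars() {
  local name missing=0
  for name in $(referenced_vars "$1"); do
    if [ -z "$(printenv "$name")" ]; then
      echo "$name"
      missing=1
    fi
  done
  return "$missing"
}

# Example usage:
#   missing_vars plano_config.yaml || { echo "set the variables listed above" >&2; exit 1; }
```

The check uses printenv rather than shell variable lookup because only exported variables are passed on to docker compose or to a child process.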
Runtime Tests
Perform basic runtime tests to verify routing and functionality.
Gateway Smoke Test
Test the chat completion endpoint with automatic routing:
# Request handled by the gateway. 'model: "none"' lets Plano decide routing
curl --header 'Content-Type: application/json' \
  --data '{"messages":[{"role":"user","content":"tell me a joke"}], "model":"none"}' \
  http://localhost:12000/v1/chat/completions | jq .model
Expected output (the exact model depends on the providers and routing rules in your plano_config.yaml):
"gpt-5.2"
Model-Based Routing
Test explicit provider and model routing:
curl -s -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Explain quantum computing"}], "model":"anthropic/claude-sonnet-4-5"}' \
  http://localhost:12000/v1/chat/completions | jq .model
Expected output:
"claude-sonnet-4-5"
Troubleshooting
Common Issues and Solutions
- Environment Variables
  Ensure all environment variables (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.) used by plano_config.yaml are set before starting services.
- TLS/Connection Errors
  If you encounter TLS or connection errors to upstream providers:
  - Check DNS resolution
  - Verify proxy settings
  - Confirm correct protocol and port in your plano_config endpoints
- Verbose Logging
  To enable more detailed logs for debugging:
  - Run plano with a higher component log level
  - See the Observability guide for logging and monitoring details
  - Rebuild the image if required with updated log configuration
- CI/Automated Checks
  For continuous integration or automated testing, you can use the curl commands above as health checks in your deployment pipeline.
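When the curl checks run in a pipeline, the step should fail whenever the response is not what you expect. With automatic routing ("model": "none") the reported model depends on the configuration, so comparing against a single hard-coded string can be brittle. The sketch below checks the returned model against an allow-list instead. model_allowed is a hypothetical helper, and the model names in the usage comment are simply the ones from the examples in this guide.

```bash
# Hypothetical helper: succeed if the first argument equals any later argument.
# Usage: model_allowed <actual_model> <allowed_model>...
model_allowed() {
  local actual="$1" allowed
  shift
  for allowed in "$@"; do
    if [ "$actual" = "$allowed" ]; then
      return 0
    fi
  done
  return 1
}

# Example usage in a pipeline step (curl -f makes HTTP errors fail the command,
# --max-time bounds the wait, jq -r prints the model without quotes):
#   actual="$(curl -sf --max-time 30 -H 'Content-Type: application/json' \
#     -d '{"messages":[{"role":"user","content":"ping"}], "model":"none"}' \
#     http://localhost:12000/v1/chat/completions | jq -r .model)"
#   model_allowed "$actual" "gpt-5.2" "claude-sonnet-4-5" || exit 1
```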