Production Deployment

Deploying the Conserver in Production

This guide covers deploying the Conserver in production environments with considerations for scalability, reliability, and security.

Prerequisites

  • Docker and Docker Compose

  • Redis server (or Redis cluster for high availability)

  • Storage backends configured (PostgreSQL, S3, etc.)

  • Domain name and TLS certificates

  • Monitoring infrastructure (optional but recommended)

Image strategy: two Dockerfiles

As of the May 2026 image optimization (docker/Dockerfile.api and docker/Dockerfile.conserver), the Conserver ships two separate images:

Image                | What it contains                                                               | Use it for
---------------------|--------------------------------------------------------------------------------|-----------
Dockerfile.api       | FastAPI app, storage backends, vCon library; no audio/ML stack                 | The API tier. Light, starts fast, scales horizontally.
Dockerfile.conserver | Everything in the API image plus transformers, openai, deepgram, ffmpeg, pydub | The worker tier. Heavy but only needed where audio processing and LLM calls run.

Both images use uv (Astral's Python package manager) for reproducible builds and place the virtualenv at /opt/venv so it survives volume mounts.

In production, deploy the API image for the conserver-api service and the conserver image for the conserver-worker service. The example below does exactly that.

Architecture Overview

                    ┌─────────────────┐
                    │  Load Balancer  │
                    │   (nginx/ALB)   │
                    └────────┬────────┘
                             │
         ┌───────────────────┼───────────────────┐
         │                   │                   │
    ┌────▼────┐         ┌────▼────┐         ┌────▼────┐
    │Conserver│         │Conserver│         │Conserver│
    │   API   │         │   API   │         │   API   │
    └────┬────┘         └────┬────┘         └────┬────┘
         │                   │                   │
         └───────────────────┼───────────────────┘
                             │
                    ┌────────▼────────┐
                    │      Redis      │
                    │  (Queues/Cache) │
                    └────────┬────────┘
                             │
         ┌───────────────────┼───────────────────┐
         │                   │                   │
   ┌─────▼─────┐       ┌─────▼─────┐       ┌─────▼─────┐
   │PostgreSQL │       │    S3     │       │  Milvus   │
   └───────────┘       └───────────┘       └───────────┘

Docker Compose Production Setup

docker-compose.yml

nginx.conf


Scaling Considerations

Horizontal Scaling

The Conserver supports horizontal scaling because:

  • All state is stored in Redis

  • Multiple instances can process from the same queues

  • API requests are stateless

Scale workers based on queue depth:
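As a rough illustration of sizing the worker tier from backlog, the sketch below maps a queue depth to a worker count. The per-worker capacity and the minimum and maximum bounds are hypothetical parameters, not Conserver settings; replace them with values derived from your own measured throughput.

```python
import math


def desired_workers(queue_depth: int, per_worker: int = 200,
                    min_workers: int = 1, max_workers: int = 10) -> int:
    """Return a worker count proportional to backlog, clamped to bounds.

    per_worker, min_workers and max_workers are illustrative assumptions.
    """
    if queue_depth < 0:
        raise ValueError("queue_depth must be non-negative")
    needed = math.ceil(queue_depth / per_worker)
    return max(min_workers, min(max_workers, needed))


if __name__ == "__main__":
    for depth in (0, 150, 1000, 5000):
        print(depth, "->", desired_workers(depth))
```

The result can then be handed to whatever sets replica counts in your environment, for example the `--scale` option of `docker compose up` for the conserver-worker service.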

Redis Configuration

For production Redis deployments:

Queue Monitoring

Monitor queue lengths to detect backlogs:
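One way to check for backlogs programmatically is sketched below. It assumes the queues are Redis lists and that the client exposes an `llen` method, as redis-py clients do; the queue names passed in are whatever your configuration defines, and the names in any example call are hypothetical.

```python
from typing import Iterable, Protocol


class SupportsLlen(Protocol):
    """Anything with a Redis-style LLEN call (e.g. a redis-py client)."""

    def llen(self, name: str) -> int: ...


def queue_depths(client: SupportsLlen, queues: Iterable[str]) -> dict[str, int]:
    """Return the current length of each named Redis list."""
    return {name: int(client.llen(name)) for name in queues}


def backlogged(depths: dict[str, int], threshold: int) -> list[str]:
    """Return the queues whose depth exceeds the threshold, sorted by name."""
    return sorted(name for name, depth in depths.items() if depth > threshold)
```

Because the client is only required to provide `llen`, the same functions work with a real Redis connection in production and with a small stub in tests.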


Security Hardening

API Token Management

  1. Use token files instead of environment variables:

  2. Rotate tokens regularly:

  3. Use separate tokens for different purposes:

    • Internal API token for system operations

    • Partner-specific tokens via ingress_auth
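Reading a token from a file keeps the secret out of the process environment, where it would otherwise be visible to tools such as `docker inspect`. The sketch below shows the general pattern only; the variable names `API_TOKEN_FILE` and `API_TOKEN` are hypothetical placeholders, not the Conserver's actual settings.

```python
import os
from pathlib import Path
from typing import Optional


def load_token(file_var: str = "API_TOKEN_FILE",
               env_var: str = "API_TOKEN") -> Optional[str]:
    """Prefer a token stored in a file; fall back to a plain env var.

    Both variable names are hypothetical placeholders.
    """
    path = os.environ.get(file_var)
    if path:
        # Strip the trailing newline that editors and `echo` usually add.
        return Path(path).read_text(encoding="utf-8").strip()
    return os.environ.get(env_var)
```

Rotation then becomes a matter of replacing the file contents and restarting (or signalling) the service, without touching the service definition.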

Network Security

  1. Isolate Redis:

  2. Enable Redis AUTH:

  3. Use TLS for external connections

Secret Management

Consider using:

  • Docker secrets (as shown above)

  • HashiCorp Vault

  • AWS Secrets Manager

  • Kubernetes secrets


Monitoring and Observability

Health Checks

The Conserver exposes three public system endpoints (no auth required) for monitoring:

These endpoints intentionally live at the application root (not under API_ROOT_PATH), so they're stable regardless of how you've configured the API prefix.

Metrics Integration

Every standard link and storage emits OpenTelemetry spans and metrics (latency, errors, cache hits where applicable). Wire up an OTLP collector (see vCon MCP Adapters for a turnkey integration) and the Conserver will populate your dashboards out of the box.

Per-link metrics include:

Log Aggregation

Configure structured JSON logging:

Logs include:

  • Request IDs for tracing

  • Processing times

  • Error details with stack traces

  • vCon UUIDs for correlation
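For readers unfamiliar with structured logging, the following is a minimal sketch of a JSON formatter built on Python's standard `logging` module. It is not the Conserver's own logging configuration; the field names `request_id`, `vcon_uuid`, and `duration_ms` are assumptions chosen to match the list above.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as a single JSON object per line."""

    # Extra fields are illustrative; pass them via `extra={...}`.
    EXTRA_FIELDS = ("request_id", "vcon_uuid", "duration_ms")

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        for field in self.EXTRA_FIELDS:
            if hasattr(record, field):
                payload[field] = getattr(record, field)
        if record.exc_info:
            # Include the stack trace as a string for error records.
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def configure_logging(level: int = logging.INFO) -> None:
    """Route all root-logger output through the JSON formatter."""
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logging.basicConfig(level=level, handlers=[handler], force=True)
```

A call such as `logging.getLogger("worker").info("stored", extra={"vcon_uuid": uuid})` then produces one JSON line that a log aggregator can index by vCon UUID.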

Alerting

Set up alerts for:

Condition            | Severity | Action
---------------------|----------|---------------------
Queue depth > 1000   | Warning  | Scale workers
DLQ depth > 100      | Critical | Investigate failures
API latency p99 > 5s | Warning  | Check resources
Error rate > 5%      | Critical | Check logs
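The thresholds in the alert table can be expressed as data and evaluated against a snapshot of metrics. This is a sketch only: the metric keys are hypothetical names, and in practice these rules would normally live in your alerting system rather than in application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    metric: str
    threshold: float
    severity: str
    action: str


# Thresholds mirror the alert table; metric keys are hypothetical names.
RULES = (
    Rule("queue_depth", 1000, "warning", "Scale workers"),
    Rule("dlq_depth", 100, "critical", "Investigate failures"),
    Rule("api_latency_p99_seconds", 5.0, "warning", "Check resources"),
    Rule("error_rate", 0.05, "critical", "Check logs"),
)


def firing(metrics: dict[str, float]) -> list[Rule]:
    """Return the rules whose metric is strictly above its threshold."""
    return [r for r in RULES if metrics.get(r.metric, 0) > r.threshold]
```

Missing metrics are treated as zero here, so an absent data source never fires an alert; a stricter setup might alert on missing data instead.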


Graceful Shutdown

The Conserver handles SIGTERM for graceful shutdown:

  1. Stops accepting new vCons

  2. Completes in-flight processing

  3. Returns unprocessed items to queues

  4. Closes connections cleanly
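The first three steps above follow a common pattern: the signal handler only sets a flag, and the processing loop checks that flag between items. The sketch below illustrates the pattern with an in-memory deque standing in for a Redis queue. The `Worker` class and its methods are hypothetical and are not the Conserver's actual implementation.

```python
import signal
import threading
from collections import deque
from typing import Callable, Deque


class Worker:
    def __init__(self, queue: Deque[str], handle: Callable[[str], None]) -> None:
        self.queue = queue
        self.handle = handle
        self.stopping = threading.Event()

    def request_stop(self, signum=None, frame=None) -> None:
        """Signal handler: only set a flag; the loop does the real work."""
        self.stopping.set()

    def run(self) -> None:
        # Items still in the queue when the loop exits stay there for
        # another worker to pick up.
        while self.queue and not self.stopping.is_set():
            item = self.queue.popleft()
            self.handle(item)  # an item already taken runs to completion


def install(worker: Worker) -> None:
    """Register the worker's stop request as the SIGTERM handler."""
    signal.signal(signal.SIGTERM, worker.request_stop)
```

Because the flag is checked only between items, the stop timeout given to the container runtime needs to be at least as long as the slowest single item takes to process.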

Configure Docker stop timeout:


Backup and Recovery

Redis Persistence

Enable AOF for durability:

Backup Strategy

  1. Redis RDB snapshots:

  2. Storage backend backups:

    • PostgreSQL: pg_dump

    • S3: Enable versioning

    • Elasticsearch: Snapshot API

  3. Configuration backup:

Disaster Recovery

  1. Deploy Redis with persistence

  2. Use storage backends with replication

  3. Keep configuration in version control

  4. Document recovery procedures


Deployment Checklist

Pre-deployment

Deployment

Post-deployment


Troubleshooting

Common Issues

Workers not processing:

High DLQ count:

Memory issues:

See Troubleshooting for more detailed solutions.
