
Deployment Options

This guide covers different deployment strategies for AnonDocs, from simple single-server setups to scalable production deployments.

Deployment Methods

Docker Compose

Best for: Single-server deployments, easy management, quick setup

Basic Setup

# Clone repository
git clone https://github.com/AI-SmartTalk/AnonDocs.git
cd AnonDocs

# Configure environment
cp .env.example .env
nano .env

# Start services
docker-compose up -d

Production Docker Compose

Use docker-compose.prod.yml for production:

docker-compose -f docker-compose.prod.yml up -d

This includes:

  • Proper restart policies
  • Health checks
  • Resource limits
  • Logging configuration
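
A minimal sketch of what such a production override might contain is shown below; the concrete values are illustrative assumptions, not taken from the shipped docker-compose.prod.yml:

services:
  anondocs:
    restart: unless-stopped
    healthcheck:
      # Assumes curl is available in the image and the /health endpoint described later in this guide
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 30s
      timeout: 5s
      retries: 3
    deploy:
      resources:
        limits:
          cpus: "2"
          memory: 2G
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"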

Docker Compose with Ollama

version: '3.8'

services:
  ollama:
    image: ollama/ollama:latest
    volumes:
      - ollama_data:/root/.ollama
    ports:
      - "11434:11434"

  anondocs:
    build: .
    ports:
      - "3000:3000"
    environment:
      - DEFAULT_LLM_PROVIDER=ollama
      - OLLAMA_BASE_URL=http://ollama:11434
      - OLLAMA_MODEL=mistral-nemo
    depends_on:
      - ollama
    restart: unless-stopped

volumes:
  ollama_data:
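
The compose file points OLLAMA_MODEL at mistral-nemo, but Ollama only serves models that have been pulled into its volume. One way to pull the model once the stack is running:

docker compose exec ollama ollama pull mistral-nemo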

Kubernetes Deployment

Best for: Large-scale deployments, high availability, cloud-native infrastructure

Prerequisites

  • Kubernetes cluster (1.20+)
  • kubectl configured
  • Storage class for persistent volumes

Deploy Ollama

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ollama
  template:
    metadata:
      labels:
        app: ollama
    spec:
      containers:
        - name: ollama
          image: ollama/ollama:latest
          ports:
            - containerPort: 11434
          volumeMounts:
            - name: ollama-data
              mountPath: /root/.ollama
      volumes:
        - name: ollama-data
          persistentVolumeClaim:
            claimName: ollama-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: ollama
spec:
  selector:
    app: ollama
  ports:
    - port: 11434
      targetPort: 11434

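The Deployment above mounts a PersistentVolumeClaim named ollama-pvc that is not shown here. A minimal claim might look like the following; the access mode and requested size are assumptions to adapt to your cluster's storage class:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ollama-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
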
Deploy AnonDocs

apiVersion: apps/v1
kind: Deployment
metadata:
  name: anondocs
spec:
  replicas: 3
  selector:
    matchLabels:
      app: anondocs
  template:
    metadata:
      labels:
        app: anondocs
    spec:
      containers:
        - name: anondocs
          image: your-registry/anondocs:latest
          ports:
            - containerPort: 3000
          env:
            - name: PORT
              value: "3000"
            - name: DEFAULT_LLM_PROVIDER
              value: "ollama"
            - name: OLLAMA_BASE_URL
              value: "http://ollama:11434"
            - name: OLLAMA_MODEL
              value: "mistral-nemo"
          resources:
            requests:
              memory: "512Mi"
              cpu: "500m"
            limits:
              memory: "2Gi"
              cpu: "2000m"
---
apiVersion: v1
kind: Service
metadata:
  name: anondocs
spec:
  type: LoadBalancer
  selector:
    app: anondocs
  ports:
    - port: 80
      targetPort: 3000

Apply Kubernetes Manifests

Example manifests are available in the k8s/ directory:

kubectl apply -f k8s/
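
To confirm that the rollout succeeded and that the LoadBalancer Service received an external address, you can check the pods and the service created above:

kubectl get pods -l app=anondocs
kubectl get svc anondocs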

Cloud Platform Deployments

AWS ECS / Fargate

{
  "family": "anondocs",
  "containerDefinitions": [{
    "name": "anondocs",
    "image": "your-account.dkr.ecr.region.amazonaws.com/anondocs:latest",
    "portMappings": [{
      "containerPort": 3000,
      "protocol": "tcp"
    }],
    "environment": [
      {"name": "PORT", "value": "3000"},
      {"name": "DEFAULT_LLM_PROVIDER", "value": "ollama"},
      {"name": "OLLAMA_BASE_URL", "value": "http://ollama:11434"}
    ],
    "memory": 2048,
    "cpu": 1024
  }]
}

Google Cloud Run

# Build and deploy
gcloud builds submit --tag gcr.io/PROJECT_ID/anondocs
gcloud run deploy anondocs \
  --image gcr.io/PROJECT_ID/anondocs \
  --platform managed \
  --region us-central1 \
  --allow-unauthenticated \
  --set-env-vars DEFAULT_LLM_PROVIDER=ollama

Azure Container Instances

az container create \
  --resource-group myResourceGroup \
  --name anondocs \
  --image your-registry.azurecr.io/anondocs:latest \
  --dns-name-label anondocs \
  --ports 3000 \
  --environment-variables \
    PORT=3000 \
    DEFAULT_LLM_PROVIDER=ollama

Reverse Proxy Setup

Nginx Configuration

# Rate limiting (limit_req_zone must be declared at the http level, outside the server block)
limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/s;

server {
    listen 80;
    server_name anondocs.example.com;

    location / {
        limit_req zone=api_limit burst=20 nodelay;

        proxy_pass http://localhost:3000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_cache_bypass $http_upgrade;

        # Timeouts for long-running anonymization requests
        proxy_read_timeout 300s;
        proxy_connect_timeout 75s;
    }
}
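
After placing this configuration in your Nginx setup (for example under /etc/nginx/conf.d/ or as an enabled site), validate and reload it:

sudo nginx -t
sudo systemctl reload nginx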

Caddy Configuration

anondocs.example.com {
    reverse_proxy localhost:3000 {
        transport http {
            read_timeout 300s
        }
    }

    # Rate limiting (requires a Caddy build that includes the rate_limit plugin)
    rate_limit {
        zone api_limit {
            key {remote_host}
            events 10
            window 1m
        }
    }
}

Systemd Service (Linux)

For direct Node.js deployment on Linux:

[Unit]
Description=AnonDocs API Server
After=network.target

[Service]
Type=simple
User=anondocs
WorkingDirectory=/opt/anondocs
Environment=NODE_ENV=production
EnvironmentFile=/opt/anondocs/.env
ExecStart=/usr/bin/node /opt/anondocs/dist/index.js
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target

Install and enable:

sudo cp anondocs.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable anondocs
sudo systemctl start anondocs
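
Check that the service is running and follow its logs with:

sudo systemctl status anondocs
journalctl -u anondocs -f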

Scaling Strategies

Horizontal Scaling

Deploy multiple AnonDocs instances behind a load balancer:

Load Balancer
├── AnonDocs Instance 1 ──┐
├── AnonDocs Instance 2 ──┼── Shared LLM Service (Ollama/vLLM)
└── AnonDocs Instance 3 ──┘
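
One way to realize this layout with the Nginx reverse proxy shown earlier is an upstream block; the instance addresses below are assumptions for illustration:

upstream anondocs_backend {
    least_conn;
    server 10.0.0.11:3000;
    server 10.0.0.12:3000;
    server 10.0.0.13:3000;
}

server {
    listen 80;
    server_name anondocs.example.com;

    location / {
        proxy_pass http://anondocs_backend;
    }
}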

Vertical Scaling

Increase resources for single-instance deployments:

  • CPU: More cores for parallel processing
  • RAM: More memory for larger chunks and parallel processing
  • GPU: Accelerate LLM inference with CUDA support
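
For GPU acceleration with Docker Compose, the Ollama service can reserve an NVIDIA device, assuming the NVIDIA Container Toolkit is installed on the host; a sketch:

services:
  ollama:
    image: ollama/ollama:latest
    deploy:
      resources:
        reservations:
          devices:
            # Requests one NVIDIA GPU for LLM inference
            - driver: nvidia
              count: 1
              capabilities: [gpu]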

LLM Provider Scaling

Shared LLM Service

Run a single LLM instance and connect multiple AnonDocs instances:

# Multiple AnonDocs instances sharing one Ollama service
services:
  anondocs-1:
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
  anondocs-2:
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
  ollama:
    # Shared LLM service

Health Checks and Monitoring

Health Check Endpoint

curl http://localhost:3000/health

Kubernetes Liveness Probe

livenessProbe:
  httpGet:
    path: /health
    port: 3000
  initialDelaySeconds: 30
  periodSeconds: 10
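
A readiness probe can reuse the same endpoint so that a pod only receives traffic once it responds; this sketch assumes /health is also suitable as a readiness signal:

readinessProbe:
  httpGet:
    path: /health
    port: 3000
  initialDelaySeconds: 5
  periodSeconds: 5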

Monitoring Integration

  • Prometheus: Expose a metrics endpoint and scrape it (see the sketch below)
  • Grafana: Visualize metrics and performance
  • Logging: Centralized logging with ELK stack or similar
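
If AnonDocs is configured to expose metrics, a Prometheus scrape job could look like the following; the job name, metrics path, and target are assumptions for illustration:

scrape_configs:
  - job_name: anondocs
    metrics_path: /metrics
    static_configs:
      - targets: ["anondocs:3000"]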
