
Deployment Options

This guide covers different deployment strategies for AnonDocs, from simple single-server setups to scalable production deployments.

Deployment Methods

Docker Compose

Best for: Single-server deployments, easy management, quick setup

Basic Setup

# Clone repository
git clone https://github.com/AI-SmartTalk/AnonDocs.git
cd AnonDocs

# Configure environment
cp .env.example .env
nano .env

# Start services
docker-compose up -d

Production Docker Compose

Use docker-compose.prod.yml for production:

docker-compose -f docker-compose.prod.yml up -d

This includes:

  • Proper restart policies
  • Health checks
  • Resource limits
  • Logging configuration
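
A minimal sketch of what such a production override might contain is shown below; the concrete values are illustrative assumptions, not taken from the shipped docker-compose.prod.yml:

services:
  anondocs:
    restart: unless-stopped
    healthcheck:
      # Assumes curl is available in the image and the /health endpoint described later in this guide
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 30s
      timeout: 5s
      retries: 3
    deploy:
      resources:
        limits:
          cpus: "2"
          memory: 2G
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"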

Docker Compose with Ollama

version: '3.8'

services:
  ollama:
    image: ollama/ollama:latest
    volumes:
      - ollama_data:/root/.ollama
    ports:
      - "11434:11434"

  anondocs:
    build: .
    ports:
      - "3000:3000"
    environment:
      - DEFAULT_LLM_PROVIDER=ollama
      - OLLAMA_BASE_URL=http://ollama:11434
      - OLLAMA_MODEL=mistral-nemo
    depends_on:
      - ollama
    restart: unless-stopped

volumes:
  ollama_data:
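
The compose file points OLLAMA_MODEL at mistral-nemo, but Ollama only serves models that have been pulled into its volume. One way to pull the model once the stack is running:

docker compose exec ollama ollama pull mistral-nemo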

Kubernetes Deployment

Best for: Large-scale deployments, high availability, cloud-native infrastructure

Prerequisites

  • Kubernetes cluster (1.20+)
  • kubectl configured
  • Storage class for persistent volumes

Deploy Ollama

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ollama
  template:
    metadata:
      labels:
        app: ollama
    spec:
      containers:
        - name: ollama
          image: ollama/ollama:latest
          ports:
            - containerPort: 11434
          volumeMounts:
            - name: ollama-data
              mountPath: /root/.ollama
      volumes:
        - name: ollama-data
          persistentVolumeClaim:
            claimName: ollama-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: ollama
spec:
  selector:
    app: ollama
  ports:
    - port: 11434
      targetPort: 11434

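The Deployment above mounts a PersistentVolumeClaim named ollama-pvc that is not shown here. A minimal claim might look like the following; the access mode and requested size are assumptions to adapt to your cluster's storage class:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ollama-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
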
Deploy AnonDocs

apiVersion: apps/v1
kind: Deployment
metadata:
  name: anondocs
spec:
  replicas: 3
  selector:
    matchLabels:
      app: anondocs
  template:
    metadata:
      labels:
        app: anondocs
    spec:
      containers:
        - name: anondocs
          image: your-registry/anondocs:latest
          ports:
            - containerPort: 3000
          env:
            - name: PORT
              value: "3000"
            - name: DEFAULT_LLM_PROVIDER
              value: "ollama"
            - name: OLLAMA_BASE_URL
              value: "http://ollama:11434"
            - name: OLLAMA_MODEL
              value: "mistral-nemo"
          resources:
            requests:
              memory: "512Mi"
              cpu: "500m"
            limits:
              memory: "2Gi"
              cpu: "2000m"
---
apiVersion: v1
kind: Service
metadata:
  name: anondocs
spec:
  type: LoadBalancer
  selector:
    app: anondocs
  ports:
    - port: 80
      targetPort: 3000

Apply Kubernetes Manifests

Example manifests are available in the k8s/ directory:

kubectl apply -f k8s/
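
To confirm that the rollout succeeded and that the LoadBalancer Service received an external address, you can check the pods and the service created above:

kubectl get pods -l app=anondocs
kubectl get svc anondocs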

Cloud Platform Deployments

AWS ECS / Fargate

{
  "family": "anondocs",
  "containerDefinitions": [{
    "name": "anondocs",
    "image": "your-account.dkr.ecr.region.amazonaws.com/anondocs:latest",
    "portMappings": [{
      "containerPort": 3000,
      "protocol": "tcp"
    }],
    "environment": [
      {"name": "PORT", "value": "3000"},
      {"name": "DEFAULT_LLM_PROVIDER", "value": "ollama"},
      {"name": "OLLAMA_BASE_URL", "value": "http://ollama:11434"}
    ],
    "memory": 2048,
    "cpu": 1024
  }]
}

Google Cloud Run

# Build and deploy
gcloud builds submit --tag gcr.io/PROJECT_ID/anondocs
gcloud run deploy anondocs \
  --image gcr.io/PROJECT_ID/anondocs \
  --platform managed \
  --region us-central1 \
  --allow-unauthenticated \
  --set-env-vars DEFAULT_LLM_PROVIDER=ollama

Azure Container Instances

az container create \
  --resource-group myResourceGroup \
  --name anondocs \
  --image your-registry.azurecr.io/anondocs:latest \
  --dns-name-label anondocs \
  --ports 3000 \
  --environment-variables \
    PORT=3000 \
    DEFAULT_LLM_PROVIDER=ollama

Reverse Proxy Setup

Nginx Configuration

# Rate limiting (limit_req_zone must be declared at the http level, outside the server block)
limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/s;

server {
    listen 80;
    server_name anondocs.example.com;

    location / {
        limit_req zone=api_limit burst=20 nodelay;

        proxy_pass http://localhost:3000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_cache_bypass $http_upgrade;

        # Timeouts for long-running anonymization requests
        proxy_read_timeout 300s;
        proxy_connect_timeout 75s;
    }
}
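
After placing this configuration in your Nginx setup (for example under /etc/nginx/conf.d/ or as an enabled site), validate and reload it:

sudo nginx -t
sudo systemctl reload nginx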

Caddy Configuration

anondocs.example.com {
    reverse_proxy localhost:3000 {
        transport http {
            read_timeout 300s
        }
    }

    # Rate limiting (requires a Caddy build that includes the rate_limit plugin)
    rate_limit {
        zone api_limit {
            key {remote_host}
            events 10
            window 1m
        }
    }
}

Systemd Service (Linux)

For direct Node.js deployment on Linux:

[Unit]
Description=AnonDocs API Server
After=network.target

[Service]
Type=simple
User=anondocs
WorkingDirectory=/opt/anondocs
Environment=NODE_ENV=production
EnvironmentFile=/opt/anondocs/.env
ExecStart=/usr/bin/node /opt/anondocs/dist/index.js
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target

Install and enable:

sudo cp anondocs.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable anondocs
sudo systemctl start anondocs
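
Check that the service is running and follow its logs with:

sudo systemctl status anondocs
journalctl -u anondocs -f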

Scaling Strategies

Horizontal Scaling

Deploy multiple AnonDocs instances behind a load balancer:

Load Balancer
├── AnonDocs Instance 1 ──┐
├── AnonDocs Instance 2 ──┼── Shared LLM Service (Ollama/vLLM)
└── AnonDocs Instance 3 ──┘
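
One way to realize this layout with the Nginx reverse proxy shown earlier is an upstream block; the instance addresses below are assumptions for illustration:

upstream anondocs_backend {
    least_conn;
    server 10.0.0.11:3000;
    server 10.0.0.12:3000;
    server 10.0.0.13:3000;
}

server {
    listen 80;
    server_name anondocs.example.com;

    location / {
        proxy_pass http://anondocs_backend;
    }
}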

Vertical Scaling

Increase resources for single-instance deployments:

  • CPU: More cores for parallel processing
  • RAM: More memory for larger chunks and parallel processing
  • GPU: Accelerate LLM inference with CUDA support
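
For GPU acceleration with Docker Compose, the Ollama service can reserve an NVIDIA device, assuming the NVIDIA Container Toolkit is installed on the host; a sketch:

services:
  ollama:
    image: ollama/ollama:latest
    deploy:
      resources:
        reservations:
          devices:
            # Requests one NVIDIA GPU for LLM inference
            - driver: nvidia
              count: 1
              capabilities: [gpu]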

LLM Provider Scaling

Shared LLM Service

Run a single LLM instance and connect multiple AnonDocs instances:

# Multiple AnonDocs instances sharing one Ollama service
services:
  anondocs-1:
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
  anondocs-2:
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
  ollama:
    # Shared LLM service

Health Checks and Monitoring

Health Check Endpoint

curl http://localhost:3000/health

Kubernetes Liveness Probe

livenessProbe:
  httpGet:
    path: /health
    port: 3000
  initialDelaySeconds: 30
  periodSeconds: 10
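
A readiness probe can reuse the same endpoint so that a pod only receives traffic once it responds; this sketch assumes /health is also suitable as a readiness signal:

readinessProbe:
  httpGet:
    path: /health
    port: 3000
  initialDelaySeconds: 5
  periodSeconds: 5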

Monitoring Integration

  • Prometheus: Expose a metrics endpoint and scrape it (see the sketch below)
  • Grafana: Visualize metrics and performance
  • Logging: Centralized logging with ELK stack or similar
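
If AnonDocs is configured to expose metrics, a Prometheus scrape job could look like the following; the job name, metrics path, and target are assumptions for illustration:

scrape_configs:
  - job_name: anondocs
    metrics_path: /metrics
    static_configs:
      - targets: ["anondocs:3000"]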
