LLM Provider Setup

AnonDocs supports multiple LLM providers for PII detection and anonymization. All options can run locally on your infrastructure, ensuring complete data privacy.

Option 1: Ollama (Recommended)

Best for: Quick setup, ease of use, automatic model management

Ollama is the easiest way to get started with local LLMs. It handles model downloading and management automatically.

Installation

# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Or download from https://ollama.ai

Pull a Model

# Recommended models for PII detection
ollama pull mistral-nemo # Best accuracy (12B parameters)
ollama pull llama3.1 # Good balance (8B parameters)
ollama pull mistral # Fastest option (7B parameters)

Configuration

Add to your .env file:

DEFAULT_LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=mistral-nemo

Verify Setup

# Check if Ollama is running
curl http://localhost:11434/api/tags

# List available models
ollama list
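
Beyond listing models, a one-shot generation request confirms the model actually loads and responds. A minimal sketch using Ollama's REST API, assuming mistral-nemo has already been pulled:

# Run a single non-streaming generation request
curl http://localhost:11434/api/generate \
  -d '{"model": "mistral-nemo", "prompt": "Reply with OK", "stream": false}'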

Option 2: vLLM (High Performance)

Best for: Production deployments, high throughput, GPU acceleration

vLLM provides high-performance inference with GPU acceleration support.

Installation

# Install vLLM (requires Python 3.8+)
pip install vllm

Start vLLM Server

python -m vllm.entrypoints.openai.api_server \
  --model mistralai/Mistral-7B-Instruct-v0.2 \
  --host 0.0.0.0 \
  --port 8000
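
If you have multiple GPUs or need to cap GPU memory, vLLM accepts tuning flags on the same command; the values below are illustrative, not required:

# Optional: shard the model across 2 GPUs and cap GPU memory usage at 90%
python -m vllm.entrypoints.openai.api_server \
  --model mistralai/Mistral-7B-Instruct-v0.2 \
  --host 0.0.0.0 \
  --port 8000 \
  --tensor-parallel-size 2 \
  --gpu-memory-utilization 0.9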

Configuration

vLLM exposes an OpenAI-compatible API, so AnonDocs connects to it through the openai provider:

DEFAULT_LLM_PROVIDER=openai
OPENAI_BASE_URL=http://localhost:8000/v1
OPENAI_MODEL=mistralai/Mistral-7B-Instruct-v0.2
OPENAI_API_KEY=not-required
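
Because the endpoint is OpenAI-compatible, a standard chat-completions request makes a quick end-to-end check before pointing AnonDocs at it:

# Smoke-test the vLLM server with a chat completion
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistralai/Mistral-7B-Instruct-v0.2",
    "messages": [{"role": "user", "content": "Reply with OK"}]
  }'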

Option 3: LM Studio (GUI-Based)

Best for: Desktop users, visual interface, model testing

LM Studio provides a user-friendly GUI for managing local LLMs.

Installation

  1. Download LM Studio from https://lmstudio.ai/
  2. Install and open the application
  3. Download a model through the GUI (e.g., Mistral 7B Instruct)
  4. Start the local server (default port: 1234)

Configuration

DEFAULT_LLM_PROVIDER=openai
OPENAI_BASE_URL=http://localhost:1234/v1
OPENAI_MODEL=mistral-7b-instruct
OPENAI_API_KEY=not-required
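
The OPENAI_MODEL value should match the identifier LM Studio's server reports, which can differ from the display name in the GUI; listing the models endpoint shows the exact id to copy:

# List the model ids the LM Studio server exposes
curl http://localhost:1234/v1/models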

Option 4: LocalAI

Best for: Self-hosted, Docker deployments, OpenAI API compatibility

LocalAI provides OpenAI-compatible API endpoints with Docker support.

Docker Setup

docker run -p 8080:8080 \
  -v "$PWD/models:/models" \
  localai/localai:latest

Configuration

DEFAULT_LLM_PROVIDER=openai
OPENAI_BASE_URL=http://localhost:8080/v1
OPENAI_MODEL=your-model-name
OPENAI_API_KEY=not-required
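
Replace your-model-name with the name of a model you have placed in the mounted models directory; querying the models endpoint confirms the server is up and shows which names LocalAI has discovered:

# Confirm the server is running and list available model names
curl http://localhost:8080/v1/models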

Model Recommendations

| Model        | Size | Quality | Speed | Best For              | Resource Usage |
|--------------|------|---------|-------|-----------------------|----------------|
| mistral-nemo | 12B  | ⭐⭐⭐⭐⭐   | ⭐⭐⭐   | Best overall accuracy | High           |
| llama3.1     | 8B   | ⭐⭐⭐⭐    | ⭐⭐⭐⭐  | Good balance          | Medium         |
| mistral      | 7B   | ⭐⭐⭐⭐    | ⭐⭐⭐⭐⭐ | Fastest, good quality | Medium         |
| phi-3        | 3.8B | ⭐⭐⭐     | ⭐⭐⭐⭐⭐ | Low resource usage    | Low            |

Choosing the Right Model

  • High Accuracy Required: Use mistral-nemo or llama3.1
  • Limited Resources: Use phi-3 or mistral (see the sketch after this list)
  • Production/GPU Available: Use vLLM with larger models
  • Quick Testing: Use mistral with Ollama
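
As an example of the limited-resources case, switching an Ollama-based setup to a lighter model takes two steps. Note that Ollama's registry tag for phi-3 is phi3:

# Pull the smaller model, then point .env at it
# (GNU sed shown; on macOS use: sed -i '' ...)
ollama pull phi3
sed -i 's/^OLLAMA_MODEL=.*/OLLAMA_MODEL=phi3/' .env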

Configuration Details

Processing Options

# Chunk size for text processing (characters)
CHUNK_SIZE=1500

# Overlap between chunks (characters)
CHUNK_OVERLAP=0

# Process chunks in parallel (faster but uses more memory)
ENABLE_PARALLEL_CHUNKS=false
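
With CHUNK_OVERLAP=0, the number of chunks (and therefore LLM calls) per document is roughly its length divided by CHUNK_SIZE, rounded up. A quick back-of-envelope check in the shell:

# A 6,000-character document at CHUNK_SIZE=1500 yields 4 chunks
chars=$(wc -c < document.txt)          # byte count; ≈ characters for ASCII text
echo $(( (chars + 1500 - 1) / 1500 ))  # -> 4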

Performance Tuning

  • Sequential Processing (ENABLE_PARALLEL_CHUNKS=false): Safer, uses less memory, processes one chunk at a time
  • Parallel Processing (ENABLE_PARALLEL_CHUNKS=true): Faster for large documents, requires more memory

Testing Your Setup

Test Ollama

# Test model directly
ollama run mistral-nemo "Extract PII from: John Doe, john@example.com, 555-1234"

Test AnonDocs Integration

# Test text anonymization
curl -X POST http://localhost:3000/api/anonymize \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Contact John Doe at john@example.com",
    "provider": "ollama"
  }'
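
The same endpoint should work against any of the OpenAI-compatible backends above; assuming the provider field mirrors DEFAULT_LLM_PROVIDER, only that value changes:

# Same test routed through an OpenAI-compatible backend (vLLM, LM Studio, LocalAI)
curl -X POST http://localhost:3000/api/anonymize \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Contact John Doe at john@example.com",
    "provider": "openai"
  }'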

Troubleshooting

Ollama Connection Failed

# Check if Ollama is running
curl http://localhost:11434/api/tags

# Start Ollama if not running
ollama serve

# Verify model is available
ollama list

vLLM/OpenAI-Compatible API Connection Failed

# Check if server is running
curl http://localhost:8000/v1/models

# Verify URL includes /v1 suffix
# Check OPENAI_BASE_URL in .env

Poor PII Detection Quality

  • Try larger models (mistral-nemo, llama3.1)
  • Adjust CHUNK_SIZE - smaller chunks can improve accuracy
  • Try different models - some are better at entity recognition

Out of Memory

  • Reduce CHUNK_SIZE
  • Disable ENABLE_PARALLEL_CHUNKS
  • Use smaller models (mistral 7B, phi-3)
  • Increase system RAM or switch to a model with fewer parameters (a combined example follows this list)
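
Putting these together, a low-memory configuration might look like the following. Illustrative values only, using settings documented above:

# .env tuned for constrained machines
CHUNK_SIZE=800
ENABLE_PARALLEL_CHUNKS=false
OLLAMA_MODEL=phi3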
