LLM Provider Setup
AnonDocs supports multiple LLM providers for PII detection and anonymization. All of them can run locally on your own infrastructure, so document contents never leave your environment.
Recommended Options
Option 1: Ollama (Recommended - Easiest Setup)
Best for: Quick setup, ease of use, automatic model management
Ollama is the easiest way to get started with local LLMs. It handles model downloading and management automatically.
Installation
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
# Or download from https://ollama.ai
Pull a Model
# Recommended models for PII detection
ollama pull mistral-nemo # Best accuracy (12B parameters)
ollama pull llama3.1 # Good balance (8B parameters)
ollama pull mistral # Fastest option (7B parameters)
Configuration
Add to your .env file:
DEFAULT_LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=mistral-nemo
Verify Setup
# Check if Ollama is running
curl http://localhost:11434/api/tags
# List available models
ollama list
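Optionally, confirm that the model itself responds and not just the server. The request below uses Ollama's generate endpoint and assumes you pulled mistral-nemo; substitute whichever model you chose.
# Optional: send a one-off prompt to the model (assumes mistral-nemo was pulled)
curl http://localhost:11434/api/generate \
  -d '{"model": "mistral-nemo", "prompt": "Reply with OK", "stream": false}'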
Option 2: vLLM (High Performance)
Best for: Production deployments, high throughput, GPU acceleration
vLLM provides high-performance inference with GPU acceleration support.
Installation
# Install vLLM (requires Python 3.8+)
pip install vllm
Start vLLM Server
python -m vllm.entrypoints.openai.api_server \
--model mistralai/Mistral-7B-Instruct-v0.2 \
--host 0.0.0.0 \
--port 8000
Configuration
DEFAULT_LLM_PROVIDER=openai
OPENAI_BASE_URL=http://localhost:8000/v1
OPENAI_MODEL=mistralai/Mistral-7B-Instruct-v0.2
OPENAI_API_KEY=not-required
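Before wiring AnonDocs to the server, it is worth confirming that the OpenAI-compatible endpoint answers. The check below assumes the server was started with the command above, so the model name matches the --model flag:
# Optional: sanity-check vLLM's OpenAI-compatible endpoint
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistralai/Mistral-7B-Instruct-v0.2",
    "messages": [{"role": "user", "content": "Reply with OK"}]
  }'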
Option 3: LM Studio (GUI-Based)
Best for: Desktop users, visual interface, model testing
LM Studio provides a user-friendly GUI for managing local LLMs.
Installation
- Download LM Studio from https://lmstudio.ai/
- Install and open the application
- Download a model through the GUI (e.g., Mistral 7B Instruct)
- Start the local server (default port: 1234)
Configuration
DEFAULT_LLM_PROVIDER=openai
OPENAI_BASE_URL=http://localhost:1234/v1
OPENAI_MODEL=mistral-7b-instruct
OPENAI_API_KEY=not-required
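LM Studio's local server exposes an OpenAI-compatible API, so a quick request should list whatever model you loaded in the GUI (assuming the default port of 1234):
# Optional: check that the LM Studio server is reachable and a model is loaded
curl http://localhost:1234/v1/models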
Option 4: LocalAI
Best for: Self-hosted, Docker deployments, OpenAI API compatibility
LocalAI provides OpenAI-compatible API endpoints with Docker support.
Docker Setup
docker run -p 8080:8080 \
-v $PWD/models:/models \
localai/localai:latest
Configuration
DEFAULT_LLM_PROVIDER=openai
OPENAI_BASE_URL=http://localhost:8080/v1
OPENAI_MODEL=your-model-name
OPENAI_API_KEY=not-required
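The value of OPENAI_MODEL must match a model LocalAI has loaded. One way to find it, assuming the container is running on the default port, is to query the models endpoint:
# List the models LocalAI exposes; use one of the returned IDs as OPENAI_MODEL
curl http://localhost:8080/v1/models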
Model Recommendations
| Model | Size | Quality | Speed | Best For | Resource Usage |
|---|---|---|---|---|---|
| mistral-nemo | 12B | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | Best overall accuracy | High |
| llama3.1 | 8B | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | Good balance | Medium |
| mistral | 7B | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | Fastest, good quality | Medium |
| phi-3 | 3.8B | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | Low resource usage | Low |
Choosing the Right Model
- High Accuracy Required: Use mistral-nemo or llama3.1
- Limited Resources: Use phi-3 or mistral (see the example below)
- Production/GPU Available: Use vLLM with larger models
- Quick Testing: Use mistral with Ollama
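For example, moving to a lighter model on a constrained machine is a pull plus a one-line .env change. The phi3 tag below is an assumption about how the model is published in the Ollama library; verify the exact tag with ollama list after pulling.
# Example: switch Ollama to a lighter model (tag assumed; confirm with `ollama list`)
ollama pull phi3
# then update .env
OLLAMA_MODEL=phi3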
Configuration Details
Processing Options
# Chunk size for text processing (characters)
CHUNK_SIZE=1500
# Overlap between chunks (characters)
CHUNK_OVERLAP=0
# Process chunks in parallel (faster but uses more memory)
ENABLE_PARALLEL_CHUNKS=false
Performance Tuning
- Sequential Processing (ENABLE_PARALLEL_CHUNKS=false): Safer, uses less memory, processes one chunk at a time (an example configuration follows below)
- Parallel Processing (ENABLE_PARALLEL_CHUNKS=true): Faster for large documents, requires more memory
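As an illustration, a low-memory machine might combine sequential processing with a smaller chunk size. The values below are starting points, not tuned recommendations:
# Example: conservative processing settings for a low-memory machine (illustrative values)
CHUNK_SIZE=1000
CHUNK_OVERLAP=0
ENABLE_PARALLEL_CHUNKS=false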
Testing Your Setup
Test Ollama
# Test model directly
ollama run mistral-nemo "Extract PII from: John Doe, john@example.com, 555-1234"
Test AnonDocs Integration
# Test text anonymization
curl -X POST http://localhost:3000/api/anonymize \
-H "Content-Type: application/json" \
-d '{
"text": "Contact John Doe at john@example.com",
"provider": "ollama"
}'
Troubleshooting
Ollama Connection Failed
# Check if Ollama is running
curl http://localhost:11434/api/tags
# Start Ollama if not running
ollama serve
# Verify model is available
ollama list
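If the model named in OLLAMA_MODEL is missing from the ollama list output, re-pulling it usually resolves model-not-found errors:
# Re-pull the model referenced by OLLAMA_MODEL if it does not appear in `ollama list`
ollama pull mistral-nemo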
vLLM/OpenAI-Compatible API Connection Failed
# Check if server is running
curl http://localhost:8000/v1/models
# Verify URL includes /v1 suffix
# Check OPENAI_BASE_URL in .env
Poor PII Detection Quality
- Try larger models (mistral-nemo, llama3.1)
- Adjust CHUNK_SIZE: smaller chunks can improve accuracy
- Try different models: some are better at entity recognition
Out of Memory
- Reduce CHUNK_SIZE
- Disable ENABLE_PARALLEL_CHUNKS
- Use smaller models (mistral 7B, phi-3)
- Increase system RAM or choose a model with fewer parameters (a combined example follows below)
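Putting those suggestions together, a memory-saving setup might look like the following. The values are illustrative and assume Ollama with the mistral model pulled:
# Example: memory-saving configuration (illustrative values)
DEFAULT_LLM_PROVIDER=ollama
OLLAMA_MODEL=mistral
CHUNK_SIZE=800
ENABLE_PARALLEL_CHUNKS=false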
Next Steps
- ⚙️ Configuration Guide - Advanced configuration options
- 🚀 Deployment Options - Production deployment strategies