LLM Provider Setup

AnonDocs supports multiple LLM providers for PII detection and anonymization. All options can run locally on your infrastructure, ensuring complete data privacy.

Option 1: Ollama (Recommended)

Best for: Quick setup, ease of use, automatic model management

Ollama is the easiest way to get started with local LLMs. It handles model downloading and management automatically.

Installation

# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Or download from https://ollama.ai

Pull a Model

# Recommended models for PII detection
ollama pull mistral-nemo # Best accuracy (12B parameters)
ollama pull llama3.1 # Good balance (8B parameters)
ollama pull mistral # Fastest option (7B parameters)

Configuration

Add to your .env file:

DEFAULT_LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=mistral-nemo

Verify Setup

# Check if Ollama is running
curl http://localhost:11434/api/tags

# List available models
ollama list
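
Beyond listing models, a one-shot generation request confirms the model actually loads and responds. A minimal sketch using Ollama's REST API, assuming mistral-nemo has already been pulled:

# Run a single non-streaming generation request
curl http://localhost:11434/api/generate \
  -d '{"model": "mistral-nemo", "prompt": "Reply with OK", "stream": false}'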

Option 2: vLLM (High Performance)

Best for: Production deployments, high throughput, GPU acceleration

vLLM provides high-performance inference with GPU acceleration support.

Installation

# Install vLLM (requires Python 3.8+)
pip install vllm

Start vLLM Server

python -m vllm.entrypoints.openai.api_server \
  --model mistralai/Mistral-7B-Instruct-v0.2 \
  --host 0.0.0.0 \
  --port 8000
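
If you have multiple GPUs or need to cap GPU memory, vLLM accepts tuning flags on the same command; the values below are illustrative, not required:

# Optional: shard the model across 2 GPUs and cap GPU memory usage at 90%
python -m vllm.entrypoints.openai.api_server \
  --model mistralai/Mistral-7B-Instruct-v0.2 \
  --host 0.0.0.0 \
  --port 8000 \
  --tensor-parallel-size 2 \
  --gpu-memory-utilization 0.9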

Configuration

vLLM exposes an OpenAI-compatible API, so AnonDocs connects to it through the openai provider:

DEFAULT_LLM_PROVIDER=openai
OPENAI_BASE_URL=http://localhost:8000/v1
OPENAI_MODEL=mistralai/Mistral-7B-Instruct-v0.2
OPENAI_API_KEY=not-required
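
Because the endpoint is OpenAI-compatible, a standard chat-completions request makes a quick end-to-end check before pointing AnonDocs at it:

# Smoke-test the vLLM server with a chat completion
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistralai/Mistral-7B-Instruct-v0.2",
    "messages": [{"role": "user", "content": "Reply with OK"}]
  }'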

Option 3: LM Studio (GUI-Based)

Best for: Desktop users, visual interface, model testing

LM Studio provides a user-friendly GUI for managing local LLMs.

Installation

  1. Download LM Studio from https://lmstudio.ai/
  2. Install and open the application
  3. Download a model through the GUI (e.g., Mistral 7B Instruct)
  4. Start the local server (default port: 1234)

Configuration

DEFAULT_LLM_PROVIDER=openai
OPENAI_BASE_URL=http://localhost:1234/v1
OPENAI_MODEL=mistral-7b-instruct
OPENAI_API_KEY=not-required
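
The OPENAI_MODEL value should match the identifier LM Studio's server reports, which can differ from the display name in the GUI; listing the models endpoint shows the exact id to copy:

# List the model ids the LM Studio server exposes
curl http://localhost:1234/v1/models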

Option 4: LocalAI

Best for: Self-hosted, Docker deployments, OpenAI API compatibility

LocalAI provides OpenAI-compatible API endpoints with Docker support.

Docker Setup

docker run -p 8080:8080 \
  -v "$PWD/models:/models" \
  localai/localai:latest

Configuration

DEFAULT_LLM_PROVIDER=openai
OPENAI_BASE_URL=http://localhost:8080/v1
OPENAI_MODEL=your-model-name
OPENAI_API_KEY=not-required
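
Replace your-model-name with the name of a model you have placed in the mounted models directory; querying the models endpoint confirms the server is up and shows which names LocalAI has discovered:

# Confirm the server is running and list available model names
curl http://localhost:8080/v1/models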

Model Recommendations

| Model        | Size | Quality | Speed | Best For              | Resource Usage |
|--------------|------|---------|-------|-----------------------|----------------|
| mistral-nemo | 12B  | ⭐⭐⭐⭐⭐   | ⭐⭐⭐   | Best overall accuracy | High           |
| llama3.1     | 8B   | ⭐⭐⭐⭐    | ⭐⭐⭐⭐  | Good balance          | Medium         |
| mistral      | 7B   | ⭐⭐⭐⭐    | ⭐⭐⭐⭐⭐ | Fastest, good quality | Medium         |
| phi-3        | 3.8B | ⭐⭐⭐     | ⭐⭐⭐⭐⭐ | Low resource usage    | Low            |

Choosing the Right Model

  • High Accuracy Required: Use mistral-nemo or llama3.1
  • Limited Resources: Use phi-3 or mistral (see the sketch after this list)
  • Production/GPU Available: Use vLLM with larger models
  • Quick Testing: Use mistral with Ollama
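
As an example of the limited-resources case, switching an Ollama-based setup to a lighter model takes two steps. Note that Ollama's registry tag for phi-3 is phi3:

# Pull the smaller model, then point .env at it
# (GNU sed shown; on macOS use: sed -i '' ...)
ollama pull phi3
sed -i 's/^OLLAMA_MODEL=.*/OLLAMA_MODEL=phi3/' .env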

Configuration Details

Processing Options

# Chunk size for text processing (characters)
CHUNK_SIZE=1500

# Overlap between chunks (characters)
CHUNK_OVERLAP=0

# Process chunks in parallel (faster but uses more memory)
ENABLE_PARALLEL_CHUNKS=false
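
With CHUNK_OVERLAP=0, the number of chunks (and therefore LLM calls) per document is roughly its length divided by CHUNK_SIZE, rounded up. A quick back-of-envelope check in the shell:

# A 6,000-character document at CHUNK_SIZE=1500 yields 4 chunks
chars=$(wc -c < document.txt)          # byte count; ≈ characters for ASCII text
echo $(( (chars + 1500 - 1) / 1500 ))  # -> 4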

Performance Tuning

  • Sequential Processing (ENABLE_PARALLEL_CHUNKS=false): Safer, uses less memory, processes one chunk at a time
  • Parallel Processing (ENABLE_PARALLEL_CHUNKS=true): Faster for large documents, requires more memory

Testing Your Setup

Test Ollama

# Test model directly
ollama run mistral-nemo "Extract PII from: John Doe, john@example.com, 555-1234"

Test AnonDocs Integration

# Test text anonymization
curl -X POST http://localhost:3000/api/anonymize \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Contact John Doe at john@example.com",
    "provider": "ollama"
  }'
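
The same endpoint should work against any of the OpenAI-compatible backends above; assuming the provider field mirrors DEFAULT_LLM_PROVIDER, only that value changes:

# Same test routed through an OpenAI-compatible backend (vLLM, LM Studio, LocalAI)
curl -X POST http://localhost:3000/api/anonymize \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Contact John Doe at john@example.com",
    "provider": "openai"
  }'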

Troubleshooting

Ollama Connection Failed

# Check if Ollama is running
curl http://localhost:11434/api/tags

# Start Ollama if not running
ollama serve

# Verify model is available
ollama list

vLLM/OpenAI-Compatible API Connection Failed

# Check if server is running
curl http://localhost:8000/v1/models

# Verify URL includes /v1 suffix
# Check OPENAI_BASE_URL in .env

Poor PII Detection Quality

  • Try larger models (mistral-nemo, llama3.1)
  • Adjust CHUNK_SIZE - smaller chunks can improve accuracy
  • Try different models - some are better at entity recognition

Out of Memory

  • Reduce CHUNK_SIZE
  • Disable ENABLE_PARALLEL_CHUNKS
  • Use smaller models (mistral 7B, phi-3)
  • Increase system RAM or switch to a model with fewer parameters (a combined example follows this list)
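
Putting these together, a low-memory configuration might look like the following. Illustrative values only, using settings documented above:

# .env tuned for constrained machines
CHUNK_SIZE=800
ENABLE_PARALLEL_CHUNKS=false
OLLAMA_MODEL=phi3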
