
LLM Provider Setup

AnonDocs supports multiple LLM providers for PII detection and anonymization. Every option below runs locally on your own infrastructure, so document contents never leave your environment.

Option 1: Ollama (Recommended)

Best for: Quick setup, ease of use, automatic model management

Ollama is the easiest way to get started with local LLMs. It handles model downloading and management automatically.

Installation

# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Or download from https://ollama.ai

Pull a Model

# Recommended models for PII detection
ollama pull mistral-nemo # Best accuracy (12B parameters)
ollama pull llama3.1 # Good balance (8B parameters)
ollama pull mistral # Fastest option (7B parameters)

Configuration

Add to your .env file:

DEFAULT_LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=mistral-nemo

Verify Setup

# Check if Ollama is running
curl http://localhost:11434/api/tags

# List available models
ollama list
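
To confirm that the model itself responds, you can send a single prompt to Ollama's generate endpoint (a minimal sketch; swap in whichever model you pulled):

# Send a one-off, non-streaming prompt to the local Ollama API
curl http://localhost:11434/api/generate -d '{
  "model": "mistral-nemo",
  "prompt": "Reply with OK",
  "stream": false
}'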

Option 2: vLLM (High Performance)

Best for: Production deployments, high throughput, GPU acceleration

vLLM provides high-performance inference with GPU acceleration support.

Installation

# Install vLLM (requires Python 3.8+)
pip install vllm

Start vLLM Server

python -m vllm.entrypoints.openai.api_server \
  --model mistralai/Mistral-7B-Instruct-v0.2 \
  --host 0.0.0.0 \
  --port 8000
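
Once the server reports that it is listening, a quick end-to-end check against its OpenAI-compatible endpoint looks like this (a sketch; the model name must match the one passed to --model):

# Request a short completion to confirm the vLLM server is serving requests
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistralai/Mistral-7B-Instruct-v0.2",
    "messages": [{"role": "user", "content": "Reply with OK"}]
  }'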

Configuration

DEFAULT_LLM_PROVIDER=openai
OPENAI_BASE_URL=http://localhost:8000/v1
OPENAI_MODEL=mistralai/Mistral-7B-Instruct-v0.2
OPENAI_API_KEY=not-required

Option 3: LM Studio (GUI-Based)

Best for: Desktop users, visual interface, model testing

LM Studio provides a user-friendly GUI for managing local LLMs.

Installation

  1. Download LM Studio from https://lmstudio.ai/
  2. Install and open the application
  3. Download a model through the GUI (e.g., Mistral 7B Instruct)
  4. Start the local server (default port: 1234)

Configuration

DEFAULT_LLM_PROVIDER=openai
OPENAI_BASE_URL=http://localhost:1234/v1
OPENAI_MODEL=mistral-7b-instruct
OPENAI_API_KEY=not-required
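
To verify that the LM Studio server is reachable and to see the exact model identifier to put into OPENAI_MODEL, you can query its OpenAI-compatible model list (assuming the default port 1234):

# List models exposed by the local LM Studio server
curl http://localhost:1234/v1/models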

Option 4: LocalAI

Best for: Self-hosted, Docker deployments, OpenAI API compatibility

LocalAI provides OpenAI-compatible API endpoints with Docker support.

Docker Setup

docker run -p 8080:8080 \
  -v $PWD/models:/models \
  localai/localai:latest

Configuration

DEFAULT_LLM_PROVIDER=openai
OPENAI_BASE_URL=http://localhost:8080/v1
OPENAI_MODEL=your-model-name
OPENAI_API_KEY=not-required
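
LocalAI exposes the same OpenAI-style endpoints, so you can discover the value to use for OPENAI_MODEL by listing the models the container has loaded (assuming the default port mapping above):

# List models served by the LocalAI container
curl http://localhost:8080/v1/models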

Model Recommendations

Model         Size  Quality  Speed  Best For               Resource Usage
mistral-nemo  12B   ⭐⭐⭐⭐⭐    ⭐⭐⭐    Best overall accuracy  High
llama3.1      8B    ⭐⭐⭐⭐     ⭐⭐⭐⭐   Good balance           Medium
mistral       7B    ⭐⭐⭐⭐     ⭐⭐⭐⭐⭐  Fastest, good quality  Medium
phi-3         3.8B  ⭐⭐⭐      ⭐⭐⭐⭐⭐  Low resource usage     Low

Choosing the Right Model

  • High Accuracy Required: Use mistral-nemo or llama3.1
  • Limited Resources: Use phi-3 or mistral
  • Production/GPU Available: Use vLLM with larger models
  • Quick Testing: Use mistral with Ollama

Configuration Details

Processing Options

# Chunk size for text processing (characters)
CHUNK_SIZE=1500

# Overlap between chunks (characters)
CHUNK_OVERLAP=0

# Process chunks in parallel (faster but uses more memory)
ENABLE_PARALLEL_CHUNKS=false
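
As a rough way to estimate how many LLM calls a document will trigger with the defaults above (CHUNK_SIZE=1500, CHUNK_OVERLAP=0), divide its character count by the chunk size. A small sketch, where document.txt is a hypothetical input file:

# Estimate the number of chunks for a document with CHUNK_SIZE=1500 and no overlap
chars=$(wc -c < document.txt)
echo "$(( (chars + 1499) / 1500 )) chunks for $chars characters"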

Performance Tuning

  • Sequential Processing (ENABLE_PARALLEL_CHUNKS=false): Safer, uses less memory, processes one chunk at a time
  • Parallel Processing (ENABLE_PARALLEL_CHUNKS=true): Faster for large documents, requires more memory

Testing Your Setup

Test Ollama

# Test model directly
ollama run mistral-nemo "Extract PII from: John Doe, john@example.com, 555-1234"

Test AnonDocs Integration

# Test text anonymization
curl -X POST http://localhost:3000/api/anonymize \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Contact John Doe at john@example.com",
    "provider": "ollama"
  }'

Troubleshooting

Ollama Connection Failed

# Check if Ollama is running
curl http://localhost:11434/api/tags

# Start Ollama if not running
ollama serve

# Verify model is available
ollama list
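
On Linux installs where Ollama runs as a system service (the install script typically registers it with systemd), the service logs can help pinpoint startup or model-loading failures; the unit name ollama below is an assumption based on that standard setup:

# Inspect recent Ollama service logs (systemd-managed Linux installs)
journalctl -u ollama --no-pager | tail -n 50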

vLLM/OpenAI-Compatible API Connection Failed

# Check if server is running
curl http://localhost:8000/v1/models

# Verify URL includes /v1 suffix
# Check OPENAI_BASE_URL in .env

Poor PII Detection Quality

  • Try larger models (mistral-nemo, llama3.1)
  • Adjust CHUNK_SIZE - smaller chunks can improve accuracy
  • Try different models - some are better at entity recognition

Out of Memory

  • Reduce CHUNK_SIZE
  • Disable ENABLE_PARALLEL_CHUNKS
  • Use smaller models (mistral 7B, phi-3)
  • Increase system RAM or switch to a model with fewer parameters
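
If you are running models through Ollama, you can check how much memory the currently loaded model occupies before deciding which of the steps above to take (assumes a reasonably recent Ollama release that includes the ps subcommand):

# Show which models are loaded and how much memory they use
ollama ps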

