Introduction
Welcome to AnonDocs, an open-source document anonymization tool that helps you protect privacy while sharing knowledge. Developed by AI SmartTalk, AnonDocs lets individuals and organizations remove sensitive information from documents before sharing them, supporting compliance with privacy regulations such as GDPR while keeping documents readable (structure is preserved for DOCX files).
What is AnonDocs?
AnonDocs is a privacy-first, self-hostable microservice that uses AI to automatically detect and anonymize Personally Identifiable Information (PII) in documents. It supports multiple file formats (PDF, DOCX, TXT) and can process both uploaded files and raw text input.
Key Features
- Privacy-First: All processing happens locally on your infrastructure - no data ever leaves your control
- AI-Powered: Uses advanced LLM models (Ollama, OpenAI-compatible APIs) for intelligent PII detection
- Multi-Format Support: Handles PDF, DOCX, and plain text files
- Real-Time Progress: Server-Sent Events (SSE) for live progress updates during anonymization
- Open Source: Fully open-source under MIT license, transparent and auditable
- Self-Hostable: Deploy on your own infrastructure for maximum control and compliance
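To make the Real-Time Progress feature more concrete: a Server-Sent Events stream is plain text in which each event is a group of `event:` and `data:` lines terminated by a blank line. The following is a minimal Python sketch of parsing such a stream. The `progress` event name and the `percent` field are hypothetical, chosen only for illustration; the actual event shape should be taken from the API reference.

```python
import json


def parse_sse(stream_text):
    """Split a Server-Sent Events stream into (event, data) pairs."""
    events = []
    event_name, data_lines = "message", []
    for line in stream_text.splitlines():
        if line == "":
            # A blank line ends the current event.
            if data_lines:
                events.append((event_name, "\n".join(data_lines)))
            event_name, data_lines = "message", []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        elif line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format drops a single leading space after the colon.
            data_lines.append(value[1:] if value.startswith(" ") else value)
    return events


# Hypothetical stream: event names and JSON fields are made up for illustration.
sample = (
    "event: progress\n"
    'data: {"percent": 50}\n'
    "\n"
    "event: progress\n"
    'data: {"percent": 100}\n'
    "\n"
)
for name, data in parse_sse(sample):
    print(name, json.loads(data)["percent"])
```

In a real client the text would come from a streaming HTTP response rather than a string, but the line-by-line handling stays the same.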
How It Works
AnonDocs runs as a microservice that processes each request in five steps:
- Upload/Input: Documents or text are sent to the API endpoints
- Parsing: Files are parsed to extract text content (PDF, DOCX, TXT)
- Detection: LLM models analyze the text to detect PII (names, emails, phones, addresses, IDs)
- Anonymization: Detected PII is replaced with generic placeholders
- Output: Anonymized text is returned (DOCX structure is preserved, PDFs are converted to plain text)
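The detection and anonymization steps above can be sketched in a few lines of Python. This is an illustration of the detect-then-replace flow only, not the AnonDocs implementation: AnonDocs uses an LLM for detection, while this sketch swaps in two simple regular expressions as a stand-in, and the function names are hypothetical.

```python
import re

# Stand-in detectors: AnonDocs uses an LLM for this step. These regexes only
# illustrate the flow on simple patterns and are not part of AnonDocs.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\b\d{3}-\d{4}\b"),
}


def detect_pii(text):
    """Return sorted (start, end, label) spans for every pattern match."""
    spans = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            spans.append((match.start(), match.end(), label))
    return sorted(spans)


def anonymize(text):
    """Replace each detected span with a [LABEL] placeholder."""
    pieces = []
    cursor = 0
    for start, end, label in detect_pii(text):
        if start < cursor:
            continue  # skip a span that overlaps one already replaced
        pieces.append(text[cursor:start])
        pieces.append(f"[{label}]")
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


print(anonymize("Contact John Doe at john@example.com or call 555-1234"))
# Output: Contact John Doe at [EMAIL] or call [PHONE]
```

Note that the name is left untouched: fixed patterns cannot recognize free-form PII such as names or addresses, which is the gap LLM-based detection is meant to cover.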
Architecture Overview
Quick Start
Try It Online
The easiest way to try AnonDocs is through our web interface at anondocs.org/anonymize. Simply upload a document or paste text, and get instant anonymization results.
SDK Quick Example (Recommended)
import { AnonDocsClient } from '@aismarttalk/anondocs-sdk';

const client = new AnonDocsClient({
  baseUrl: 'http://localhost:3000'
});

const result = await client.anonymizeText(
  'Contact John Doe at john@example.com or call 555-1234'
);

console.log(result.anonymizedText);
// Output: Contact [NAME] at [EMAIL] or call [PHONE]
API Quick Example
import requests

# Anonymize text
response = requests.post('http://localhost:3000/api/anonymize', json={
    'text': 'Contact John Doe at john@example.com or call 555-1234',
    'provider': 'ollama'
})

print(response.json()['data']['anonymizedText'])
# Output: Contact [NAME] at [EMAIL] or call [PHONE]
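The example above indexes straight into the response body, which raises a bare `KeyError` if the server returns anything other than the success shape. A small helper can make that failure easier to diagnose. This is a sketch that assumes only the `data.anonymizedText` shape shown above; the helper name is hypothetical and the shape of error responses is not covered here.

```python
def extract_anonymized_text(payload):
    """Pull data.anonymizedText out of a decoded JSON response body."""
    data = payload.get("data") if isinstance(payload, dict) else None
    if not isinstance(data, dict) or "anonymizedText" not in data:
        raise ValueError(f"unexpected response shape: {payload!r}")
    return data["anonymizedText"]


# Sample body shaped like the success response in the example above.
sample = {"data": {"anonymizedText": "Contact [NAME] at [EMAIL] or call [PHONE]"}}
print(extract_anonymized_text(sample))
# Output: Contact [NAME] at [EMAIL] or call [PHONE]
```

With `requests`, calling `response.raise_for_status()` before `extract_anonymized_text(response.json())` also surfaces HTTP-level errors.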
Self-Host (5 Minutes)
# 1. Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
# 2. Pull a model
ollama pull mistral-nemo
# 3. Clone AnonDocs and install dependencies
git clone https://github.com/AI-SmartTalk/AnonDocs.git
cd AnonDocs
npm install
# 4. Configure (create .env)
echo "DEFAULT_LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=mistral-nemo" > .env
# 5. Start
npm start
For detailed self-hosting instructions, see our Self-Hosting Guide.
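Before starting the service, it can help to confirm that the `.env` file contains the settings written in step 4. The following Python sketch parses `KEY=VALUE` lines and reports any that are missing. Treating all three keys as required is an assumption based on the steps above, and the helper names are hypothetical.

```python
# Keys written to .env in the steps above; treating all three as required
# is an assumption, not something AnonDocs is known to enforce.
REQUIRED_KEYS = ("DEFAULT_LLM_PROVIDER", "OLLAMA_BASE_URL", "OLLAMA_MODEL")


def parse_env(text):
    """Parse KEY=VALUE lines, ignoring blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


env_text = """DEFAULT_LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=mistral-nemo"""

print(missing_keys(parse_env(env_text)))
# Output: []
```

To check a real file, pass the result of `open(".env").read()` to `parse_env` instead of the inline string.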
Use Cases
- Healthcare: Anonymize patient records before sharing with researchers
- Legal: Redact sensitive information from legal documents for public disclosure
- HR: Process employee data while maintaining privacy
- Finance: Sanitize financial documents for analysis
- Research: Share datasets without exposing personal information
- Compliance: Meet GDPR, HIPAA, and other privacy regulations
Privacy & Security
GDPR Compliance
- Data Never Leaves Your Infrastructure - All processing happens locally on your servers
- Zero Data Retention - Files are deleted immediately after processing; nothing is stored
- Open Source & Auditable - Review every line of code yourself
For more details, see our Privacy & Security documentation.
What's Next?
- SDK & API Reference - Use the TypeScript/JavaScript SDK or REST API
- Supported Formats - Learn about file format support
- Self-Hosting Guide - Deploy your own instance
- Privacy & Security - Understand our privacy guarantees
Getting Help
- GitHub Discussions - Ask questions and share ideas
- GitHub Issues - Report bugs or request features
- GitHub Repository - View source code and contribute
AnonDocs - Protect Privacy, Share Knowledge. Open-source document anonymization by AI SmartTalk.