Supported File Formats
AnonDocs anonymizes PDF, Microsoft Word, and plain text documents, so you can protect sensitive information across common file types.
Document Formats
PDF Files
- Extension: .pdf
- Features: Full text extraction and anonymization
- Max Size: 25 MB
- Output Format: Plain text
- Note: Document formatting is not preserved - PDFs are converted to plain text
Microsoft Word Documents
- Extensions: .doc, .docx
- Features: Text extraction with formatting preservation
- Max Size: 25 MB
- Output Format: Plain text with preserved structure
- Note: Basic formatting is preserved - paragraph breaks, line spacing, and structure are maintained, though complex formatting (tables, images, special styles) may be simplified
Plain Text Files
- Extension: .txt
- Features: Direct text processing
- Max Size: 25 MB
- Best For: Quick anonymization of text content
Processing Notes
- All document formats are converted to text during processing
- The anonymized output is provided as plain text
- DOCX files: Original formatting (paragraph breaks, line spacing, structure) is preserved
- PDF files: Formatting is not preserved - text is extracted without layout structure
- TXT files: No formatting to preserve
- Multiple files can be processed in batch
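The notes above can be restated as a small lookup table, which may be handy when sorting a batch and deciding what to expect from each output. This is only an illustrative sketch: `OUTPUT_NOTES` and `describe_output` are hypothetical names, not part of AnonDocs, and the table simply mirrors the behavior described on this page.

```python
from pathlib import Path

# Hypothetical table restating the processing notes above.
# It is not an AnonDocs API; it only mirrors this page.
OUTPUT_NOTES = {
    ".pdf": "plain text; layout not preserved",
    ".doc": "plain text; paragraph structure preserved",
    ".docx": "plain text; paragraph structure preserved",
    ".txt": "plain text; no formatting to preserve",
}


def describe_output(filename):
    """Return a short note on what the anonymized output should look like."""
    ext = Path(filename).suffix.lower()  # compare extensions case-insensitively
    return OUTPUT_NOTES.get(ext, "unsupported format")


if __name__ == "__main__":
    for name in ["contract.docx", "Report.PDF", "notes.txt", "photo.png"]:
        print(f"{name}: {describe_output(name)}")
```

Matching on the lowercased extension is a design choice of this sketch; whether AnonDocs itself treats extensions case-insensitively is not stated here.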
Limitations
- Maximum file size: 25 MB per file
- Images and embedded content are not processed
- PDF files: Formatting and layout are lost during text extraction
- DOCX files: Complex formatting (tables, images, special styles) may be simplified
- Scanned (image-only) PDFs are not supported, since they would require OCR - only text-based PDFs work
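A pre-flight check on the client side can catch the two limits that are easy to test before submitting a file: the extension and the 25 MB size cap. The sketch below is a minimal example under stated assumptions: `check_file` and `check_path` are hypothetical helpers, not AnonDocs functions, and "25 MB" is read as 25 × 1024 × 1024 bytes, which may differ from how the service counts. It does not detect scanned PDFs, which would require inspecting the file contents.

```python
import os
from pathlib import Path
from typing import List

# Extensions listed on this page.
SUPPORTED_EXTENSIONS = {".pdf", ".doc", ".docx", ".txt"}

# Assumption: "25 MB" means 25 * 1024 * 1024 bytes. The service may
# count megabytes differently (for example as 25,000,000 bytes).
MAX_SIZE_BYTES = 25 * 1024 * 1024


def check_file(filename: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append(f"file too large: {size_bytes} bytes (limit {MAX_SIZE_BYTES})")
    return problems


def check_path(path: str) -> List[str]:
    """Convenience wrapper that reads the size of a file on disk."""
    return check_file(os.path.basename(path), os.path.getsize(path))


if __name__ == "__main__":
    print(check_file("notes.txt", 1024))              # no problems expected
    print(check_file("scan.png", 1024))               # unsupported extension
    print(check_file("big.pdf", 26 * 1024 * 1024))    # over the size limit
```

Keeping `check_file` free of disk access makes it easy to reuse for uploads where only a name and a byte count are available.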