Precise PII Detection at Scale
Runs air-gapped on your infrastructure with better accuracy than the hyperscaler alternatives.
Core Operations
PII Eraser offers four modes of operation to detect and anonymize PII, PCI and other sensitive data.
Identify selected entities, returning character offsets and confidence scores. Supports overlapping entities and nested PII.
Detect and replace entities with entity-specific placeholders like <NAME>. Ideal for preparing datasets for ML training and sanitizing LLM prompts.
Obfuscate characters while preserving format length (e.g., John -> ####). Useful for UX logs and where partial visibility is required.
Replace entities with consistent SHA-256 or SHA-512 hashes. Enables analytics on unique users without revealing identity.
import requests
# Detect sensitive data instances
response = requests.post(
"<base url>/text/detect",
json={
"text": ["Hallo Matthias"],
}
)import requests
# Replace sensitive data with entity types
response = requests.post(
"<base url>/text/transform",
json={
"text": ["Contact marcus@pii-eraser.com"],
"operator": "redact",
}
)import requests
# Mask characters with asterisks
response = requests.post(
"<base url>/text/transform",
json={
"text": ["My ID is 452-992-112"],
"operator": "mask",
}
)import requests
# Cryptographic hashing for analytics
response = requests.post(
"<base url>/text/transform",
json={
"text": ["User: John Smith"],
"operator": "hash",
}
){
"entities": [
[
{
"entity_type": "NAME",
"start": 6,
"end": 14,
"score": 0.995
}
]
],
"stats": { "total_tokens": 4, "tps": 5420 }
}{
"text": [
"Contact [EMAIL_ADDRESS]"
],
"entities": [
[
{
"entity_type": "EMAIL_ADDRESS",
"output_start": 8,
"output_end": 23
}
]
],
"stats": { "total_tokens": 8, "tps": 5420 }
}{
"text": [
"My ID is ***-***-112"
],
"entities": [
[
{
"entity_type": "US_SSN",
"output_start": 9,
"output_end": 20
}
]
],
"stats": { "total_tokens": 10, "tps": 5100 }
}{
"text": [
"User: <PERSON_8f4a2c>"
],
"entities": [
[
{
"entity_type": "PERSON",
"output_start": 6,
"output_end": 21
}
]
],
"stats": { "total_tokens": 5, "tps": 4950 }
}For the full range of options, including how to configure entity types via YAML, please visit the documentation
Designed for the Agentic AI Era
Built for LLM guardrail use, PII Eraser natively supports OpenAI-format chats to ensure seamless, context-aware privacy.
Process All Messages
Scans entire history every turn. High accuracy, but scales poorly.
Hi, I need to update the beneficiary details for the 'Project Alpha' contract payouts.
I can help with that. Which specific banking detail do you need to amend?
The bank account number has changed for our UK entity.
Understood. Please provide the new 8-digit account number.
Process Last Message
Scans only the latest prompt. Fast, but misses PII requiring context.
Hi, I need to update the beneficiary details for the 'Project Alpha' contract payouts.
I can help with that. Which specific banking detail do you need to amend?
The bank account number has changed for our UK entity.
Understood. Please provide the new 8-digit account number.
Smart Context
Automatically includes relevant context. The optimal balance.
Hi, I need to update the beneficiary details for the 'Project Alpha' contract payouts.
I can help with that. Which specific banking detail do you need to amend?
The bank account number has changed for our UK entity.
Understood. Please provide the new 8-digit account number.
Pools context between messages for higher accuracy without processing history needlessly.
Configure scanning for just completions or the last user prompt with intelligent context handling.
Optimized to handle RAG artifacts like chunking errors, scraping noise, and mixed-language content.
Great Accuracy, Globally
Accurate identification of 150+ entity types across Western Europe, North America and Australia.
150+ Localized Entity Types
Most systems are built for the US and retrofitted for the world with US-centric entity names like "SSN". PII Eraser is built from the ground up for global coverage with a focus on Western Europe, as well as Canada, US and Australia.
Regular Model Updates
The world changes fast. Older models fail on terms like "COVID" and aren't familiar with MCP tool calls. We continuously update our models to recognize contemporary entities and the shifting GenAI landscape.
No Regex Maintenance
PII Eraser relies on large encoder transformer models, freeing your team from maintaining fragile regex-based solutions. We also offer model updates free of charge in case we do miss something.
System Architecture
Customer Infrastructure (VPC / On-Prem)
Self-hosted, air-gapped deployment within your secure environment.
Data Pipeline
Storage
High Throughput
Capable of processing >5000 tokens/sec on a single instance via ARM SVX or x86 AMX instructions, such as AWS Graviton 4.
Flexible Deployment
Multiple deployment options including AWS Fargate, ECS and Azure Container Instances. CPU-only operation makes security scans, startup and autoscaling easy.
Seamless Migration
PII Eraser features Microsoft Presidio Analyzer compatibility endpoints, allowing for easy drop-in replacement.
Why PII Eraser?
Compare us against cloud APIs, open source libraries, and LLM-based approaches.
| Deployment Method | Self-hosted Container | Cloud API | Local Library | LLM (Generative) |
|---|---|---|---|---|
| Data Sovereignty | 100% Local / Air-gapped | Cloud Only | Local | Cloud (Mostly) |
| Cost Model | Hourly (Unlimited Tokens) | Per Character (Expensive) | Free (Maintenance Heavy) | Per Token (Very Expensive) |
| EU Localization | Native (DACH, FR, IT, etc.) | Limited (EN, ES, FR) | Requires Custom Regex | Varies (Training Bias) |
| Latency (p99) | < 20ms | ~100ms | ~200ms (Python overhead) | > 1000ms |
| Context Window | 1M+ Tokens | Short (Chunks required) | Short | Variable |
| Hallucinations | Zero (Deterministic) | Zero | Zero | Possible |
Get started in minutes with our free trial
Per-hour pricing on AWS and Azure Marketplaces. No sales calls and no lock-in.