Worka PII

Worka PII is a Rust-first library for detecting and anonymizing personally identifiable information (PII). It is designed for deterministic output, capability-aware NLP, and audit-friendly redaction so teams can safely run AI workflows in CPU-only environments.

This documentation covers the concepts, architecture, and APIs that make Worka PII predictable and composable in production systems. Use it when you need stable byte offsets, explicit policies, and a pipeline that degrades gracefully when language features are unavailable.

What it provides

Deterministic detection with stable byte offsets.
A modular pipeline of recognizers, validators, and optional NER.
Policy-driven anonymization with explicit operators per entity type.
An audit-friendly output model that preserves the original spans.
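
To illustrate why stable byte offsets matter, here is a minimal sketch that uses only the Rust standard library. The `Span` struct and the `redact` function are hypothetical names invented for this example and are not part of the Worka PII API. The sketch assumes spans do not overlap and that each offset falls on a UTF-8 character boundary.

```rust
/// Hypothetical span type for illustration only; not the Worka PII entity model.
#[derive(Debug, Clone, PartialEq)]
struct Span {
    start: usize, // byte offset of the first byte (inclusive)
    end: usize,   // byte offset one past the last byte (exclusive)
    label: &'static str,
}

/// Replace every span with `<LABEL>`.
/// Spans are processed from right to left so that the byte offsets of the
/// spans still waiting to be processed remain valid after each replacement.
/// Assumes non-overlapping spans that lie on UTF-8 character boundaries.
fn redact(text: &str, spans: &[Span]) -> String {
    let mut ordered: Vec<&Span> = spans.iter().collect();
    ordered.sort_by(|a, b| b.start.cmp(&a.start));

    let mut out = text.to_string();
    for span in ordered {
        out.replace_range(span.start..span.end, &format!("<{}>", span.label));
    }
    out
}

fn main() {
    // 'ë' (U+00EB) takes two bytes in UTF-8, so byte offsets and character
    // indices diverge after it: the email starts at byte 6, character 5.
    let text = "Zo\u{eb}: zoe@example.com";
    let start = text.find("zoe@").expect("email is present");
    let spans = [Span { start, end: text.len(), label: "EMAIL" }];

    println!("{}", redact(text, &spans)); // prints: Zoë: <EMAIL>
}
```

Because the offsets index the original bytes, the same spans can be replayed against the same input later and yield the same redacted text, which is the property an audit trail relies on.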

How it fits into Worka

Worka uses Worka PII to sanitize prompts, tool inputs, and stored artifacts before they reach external systems. The same deterministic spans also feed event logs and audit trails, so redaction is reproducible.

Quick start

use pii::anonymize::{AnonymizeConfig, Anonymizer, Operator};
use pii::nlp::SimpleNlpEngine;
use pii::presets::default_recognizers;
use pii::{Analyzer, PolicyConfig};
use pii::types::Language;
use std::collections::HashMap;

fn main() {
    // Build an analyzer with the simple NLP engine, the preset recognizers,
    // and the default policy.
    let analyzer = Analyzer::new(
        Box::new(SimpleNlpEngine::default()),
        default_recognizers(),
        Vec::new(),
        PolicyConfig::default(),
    );

    // Detect entities in English text.
    let text = "Email jane@example.com or call +1 415-555-1212.";
    let result = analyzer.analyze(text, &Language::from("en")).unwrap();

    // Choose one operator per entity type: replace emails, mask phone numbers.
    let mut config = AnonymizeConfig::default();
    let mut per_entity = HashMap::new();
    per_entity.insert("Email".to_string(), Operator::Replace { with: "<EMAIL>".into() });
    per_entity.insert("Phone".to_string(), Operator::Mask { ch: '*', from_end: 4 });
    config.per_entity = per_entity;

    // Apply the operators to the detected entities and print the redacted text.
    let redacted = Anonymizer::anonymize(text, &result.entities, &config).unwrap();
    println!("{}", redacted.text);
}

Where to go next

Learn the pipeline and entity model in Fundamentals.
Review the deterministic offset and audit rules in Architecture.
Use the API reference to build custom recognizers or policies.
See real-world patterns in Scenarios.
