Compliance Redaction: What Works and What Doesn’t

This guide explains how to use compliance presets effectively and avoid common pitfalls that lead to data exposure.

Key insight: Fapilog redacts based on field names, not field content. Understanding this is critical for compliance.

The Golden Rule

Redaction works on structured data with predictable field names.

from fapilog import LoggerBuilder

logger = LoggerBuilder().with_redaction(preset="GDPR_PII").build()

# ✅ WORKS: Named fields are redacted
logger.info("User signup", email="john@example.com", phone="+1-555-1234")
# Output: {"data": {"email": "***", "phone": "***"}}

# ❌ FAILS: PII buried in a string is NOT redacted
logger.info(f"User signed up: john@example.com, phone: +1-555-1234")
# Output: {"message": "User signed up: john@example.com, phone: +1-555-1234"}

What Gets Redacted

Redaction matches field names (and paths) against preset definitions:

Scenario

Redacted?

Why

email="john@example.com"

✅ Yes

Field name email matches GDPR_PII

user_email="john@example.com"

✅ Yes

Pattern .*email.* matches

contact={"email": "john@example.com"}

✅ Yes

Nested path contact.email matches

message="Email: john@example.com"

❌ No

Content scanning not performed

notes="Call john@example.com"

❌ No

Field name notes doesn’t match

data="SSN: 123-45-6789"

❌ No

data is a generic field name

What Does NOT Get Redacted

1. PII in Message Strings

# ❌ BAD: PII in the message string
logger.info(f"Processing order for {user.email}")
# Output: {"message": "Processing order for john@example.com"}

# ✅ GOOD: PII in named fields
logger.info("Processing order", email=user.email)
# Output: {"message": "Processing order", "data": {"email": "***"}}

2. PII in Arbitrarily-Named Fields

# ❌ BAD: Field name doesn't match any preset pattern
logger.info("Support ticket", customer_contact="john@example.com")
# Output: {"data": {"customer_contact": "john@example.com"}}

# ✅ GOOD: Use recognized field names
logger.info("Support ticket", email="john@example.com")
# Output: {"data": {"email": "***"}}

# ✅ ALSO GOOD: Add custom fields to cover your domain
logger = (
    LoggerBuilder()
    .with_redaction(preset="GDPR_PII")
    .with_redaction(fields=["customer_contact"])
    .build()
)

3. PII in Serialized Objects

# ❌ BAD: Serialized JSON string
user_json = '{"email": "john@example.com", "ssn": "123-45-6789"}'
logger.info("User data", payload=user_json)
# Output: {"data": {"payload": "{\"email\": \"john@example.com\", ...}"}}

# ✅ GOOD: Pass as dict, not string
user_data = {"email": "john@example.com", "ssn": "123-45-6789"}
logger.info("User data", **user_data)
# Output: {"data": {"email": "***", "ssn": "***"}}

4. PII in Exception Messages

# ❌ BAD: PII in exception message
try:
    process_user(email)
except Exception as e:
    logger.error(f"Failed for user {email}: {e}")
# Output: {"message": "Failed for user john@example.com: ..."}

# ✅ GOOD: PII in structured field
try:
    process_user(email)
except Exception as e:
    logger.error("User processing failed", email=email, error=str(e))
# Output: {"data": {"email": "***", "error": "..."}}

Structuring Logs for Compliance

Use Named Fields for All PII

# Instead of this:
logger.info(f"User {name} ({email}) logged in from {ip}")

# Do this:
logger.info("User logged in", name=name, email=email, ip_address=ip)

Use Context for Request-Scoped PII

from fapilog import LoggerBuilder

logger = LoggerBuilder().with_redaction(preset="GDPR_PII").build()

# Bind user context once
request_logger = logger.bind(
    email=request.user.email,
    ip_address=request.client.host,
)

# All subsequent logs have PII in named fields (and redacted)
request_logger.info("Viewing dashboard")
request_logger.info("Updated settings", setting="theme")
request_logger.warning("Rate limit approaching")

Pass Objects, Not Strings

# ❌ Avoid string interpolation
logger.info(f"Order {order.id} for {order.customer_email}")

# ✅ Pass structured data
logger.info("Order placed", order_id=order.id, email=order.customer_email)

# ✅ Or unpack relevant fields
logger.info("Order placed", **order.to_log_dict())

Testing Your Redaction

Before deploying, verify PII is actually redacted:

import pytest
from fapilog import LoggerBuilder
from fapilog.testing import capture_logs


@pytest.mark.asyncio
async def test_email_redacted_in_named_field():
    """Email in named field should be redacted."""
    async with capture_logs() as logs:
        logger = await (
            LoggerBuilder()
            .with_redaction(preset="GDPR_PII")
            .build_async()
        )
        await logger.info("signup", email="test@example.com")

    assert "test@example.com" not in logs.text
    assert "***" in logs.text


@pytest.mark.asyncio
async def test_email_in_message_NOT_redacted():
    """Email in message string is NOT redacted - this is expected behavior."""
    async with capture_logs() as logs:
        logger = await (
            LoggerBuilder()
            .with_redaction(preset="GDPR_PII")
            .build_async()
        )
        # This is the WRONG way to log PII
        await logger.info("User email: test@example.com")

    # PII IS exposed - this test documents the limitation
    assert "test@example.com" in logs.text


@pytest.mark.asyncio
async def test_custom_field_requires_explicit_config():
    """Custom field names need explicit configuration."""
    async with capture_logs() as logs:
        logger = await (
            LoggerBuilder()
            .with_redaction(preset="GDPR_PII")
            .with_redaction(fields=["customer_contact"])  # Add custom field
            .build_async()
        )
        await logger.info("ticket", customer_contact="test@example.com")

    assert "test@example.com" not in logs.text

Compliance Checklist

Before going to production with compliance redaction:

  • Audit your logging calls - Search codebase for f-strings and .format() in log calls

  • Use structured fields - All PII should be in named fields, not message strings

  • Add domain-specific fields - Extend presets with your custom field names

  • Test redaction - Write tests that verify PII is masked

  • Review preset coverage - Check Presets Reference for what’s covered

  • Document gaps - Note any PII that can’t be redacted (e.g., user-generated content)

Common Patterns by Regulation

GDPR (EU)

logger = (
    LoggerBuilder()
    .with_redaction(preset="GDPR_PII")
    .with_redaction(fields=["customer_id", "account_ref"])  # Your domain fields
    .build()
)

# Always use named fields for Article 4 personal data
logger.info(
    "Data subject request",
    email=user.email,           # Redacted
    name=user.full_name,        # Redacted
    ip_address=request.ip,      # Redacted
    request_type="erasure",     # Not PII, preserved
)

HIPAA (US Healthcare)

logger = (
    LoggerBuilder()
    .with_redaction(preset="HIPAA_PHI")
    .with_redaction(fields=["chart_number", "room_number"])
    .build()
)

# All 18 PHI identifiers should be in named fields
logger.info(
    "Patient admission",
    mrn=patient.medical_record_number,  # Redacted
    dob=patient.date_of_birth,          # Redacted
    ssn=patient.ssn,                    # Redacted
    admission_type="emergency",         # Not PHI, preserved
)

PCI-DSS (Payment Cards)

logger = (
    LoggerBuilder()
    .with_redaction(preset="PCI_DSS")
    .build()
)

# Never log full card numbers, but if you must log payment events:
logger.info(
    "Payment processed",
    card_number=card.pan,        # Redacted (but don't log this!)
    cardholder=card.name,        # Redacted
    amount=transaction.amount,   # Not cardholder data, preserved
    last_four=card.pan[-4:],     # Consider if this is acceptable
)

Summary

Do

Don’t

Use named fields for PII

Embed PII in message strings

Pass dicts, not JSON strings

Serialize objects before logging

Bind PII to context

Interpolate PII in f-strings

Extend presets with your fields

Assume all field names are covered

Test redaction in CI

Deploy without verification