
Key Features 1
Discovery Module
- Files: Scans individual files to locate PII and other sensitive data.
- Directories: Monitors entire directories, enabling organized scanning within specified file structures.
- Web Pages (WWW): Searches for confidential data on specified URLs, allowing targeted data discovery across the web.
- Databases: Analyzes structured databases to identify and flag sensitive information within data records.
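As a rough illustration of the Directories source, the sketch below walks a folder tree and yields the text of each matching file so that a detector can inspect it. The function name `iter_text_files` and the suffix filter are assumptions made for this example; they are not Piipatrol's actual interface.

```python
from pathlib import Path
from typing import Iterator, Tuple


def iter_text_files(root, suffixes=(".txt", ".log", ".csv")) -> Iterator[Tuple[Path, str]]:
    """Yield (path, text) for every file under root with a matching suffix.

    Hypothetical helper: the name and the suffix filter are illustrative only.
    """
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in suffixes:
            # Undecodable bytes are replaced so one bad file does not stop the scan.
            yield path, path.read_text(encoding="utf-8", errors="replace")
```

A single file can be handled by the same reading logic; the directory case only adds the recursive walk.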
Key Features 2
Detection Module
Central to Piipatrol’s operation, the Detection Module processes data collected by the Discovery Module to identify PII and other confidential information.
- Pattern Matching: Identifies sensitive data by recognizing specific patterns (e.g., credit card numbers, social security numbers).
- Dictionary Matching: Uses a dictionary of keywords and terms to detect predefined confidential information.
- AI Integration: Leverages advanced AI models for contextual analysis and complex pattern recognition, enhancing Piipatrol’s ability to identify nuanced forms of sensitive data.
LLM Cluster (Kubernetes):
- A Kubernetes-managed cluster of Large Language Models (LLMs) provides scalable, AI-driven analysis. This infrastructure supports the AI component within the Detection Module, enabling sophisticated language processing for accurate data identification.
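The pattern-matching and dictionary-matching techniques described above can be sketched in a few lines. This is a minimal illustration, not Piipatrol's implementation: the regular expressions, the Luhn checksum filter for card numbers, and the function names `find_patterns` and `find_terms` are all assumptions chosen for the example.

```python
import re

# Illustrative patterns only; a production detector would need many more.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")  # 13 to 19 digits, optional separators


def luhn_valid(candidate: str) -> bool:
    """Return True if the digits in candidate pass the Luhn checksum."""
    digits = [int(ch) for ch in candidate if ch in "0123456789"]
    checksum = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:  # double every second digit, counting from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        checksum += digit
    return bool(digits) and checksum % 10 == 0


def find_patterns(text: str):
    """Pattern matching: SSN-shaped strings and Luhn-valid card numbers."""
    findings = [("ssn", m.group(0)) for m in SSN_RE.finditer(text)]
    findings += [
        ("card", m.group(0))
        for m in CARD_RE.finditer(text)
        if luhn_valid(m.group(0))
    ]
    return findings


def find_terms(text: str, terms):
    """Dictionary matching: case-insensitive whole-word lookup of known terms."""
    if not terms:
        return []
    # Longest terms first so multi-word entries win over their prefixes.
    alternation = "|".join(re.escape(t) for t in sorted(terms, key=len, reverse=True))
    term_re = re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)
    return [("term", m.group(0).lower()) for m in term_re.finditer(text)]
```

The checksum step shows why pattern matching usually pairs a regular expression with a validation rule: a digit pattern alone would flag any long number, while the Luhn test discards most random digit runs.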
Key Features 3
Database Support
- Piipatrol supports both PostgreSQL and SQLite databases. Detection results can be stored in the database, facilitating easy access, reporting, and integration with other systems. This flexibility allows Piipatrol to meet various deployment needs, from enterprise-scale to lightweight implementations.
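To make the storage step concrete, the sketch below writes detection results to SQLite using Python's standard `sqlite3` module. The table name `findings`, its columns, and the helper functions are hypothetical; Piipatrol's real schema is not described here.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema, for illustration only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS findings (
    id          INTEGER PRIMARY KEY,
    source      TEXT NOT NULL,
    kind        TEXT NOT NULL,
    excerpt     TEXT NOT NULL,
    detected_at TEXT NOT NULL
)
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the results database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_finding(conn: sqlite3.Connection, source: str, kind: str, excerpt: str) -> int:
    """Insert one detection result and return its row id."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "INSERT INTO findings (source, kind, excerpt, detected_at) VALUES (?, ?, ?, ?)",
            (source, kind, excerpt, datetime.now(timezone.utc).isoformat()),
        )
    return cursor.lastrowid


def count_by_kind(conn: sqlite3.Connection) -> dict:
    """Summarize stored findings per kind, e.g. for a report."""
    rows = conn.execute("SELECT kind, COUNT(*) FROM findings GROUP BY kind ORDER BY kind")
    return dict(rows.fetchall())
```

In a sketch like this it is safer to store a masked excerpt (for example, only the last four digits) than the raw sensitive value, so the results database does not itself become a store of PII.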
Optional Integrations
Log Collection Systems
- Wazuh and Logpatrol: These log collection systems can feed data into Piipatrol via a REST API, enabling Piipatrol to analyze logs for sensitive information.
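A log forwarder talking to such a REST API might build its request along the following lines. The endpoint path `/api/logs`, the JSON field names, and the function `build_log_request` are placeholders invented for this sketch; the actual Piipatrol API contract is not specified in this document.

```python
import json
import urllib.request


def build_log_request(base_url: str, source: str, log_lines: list) -> urllib.request.Request:
    """Prepare (but do not send) a POST request carrying a batch of log lines.

    The "/api/logs" path and the payload shape are hypothetical.
    """
    payload = json.dumps({"source": source, "lines": log_lines}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/logs",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be inspected without a running server.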
Format Preserving Encryption (FPE)
- The Octomim FPE Engine integrates with Piipatrol to apply format-preserving encryption to sensitive data, so that encrypted values keep their original format and remain compatible with downstream systems.
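To show what "retains its original format" means in practice, here is a toy format-preserving cipher over digit strings, built as a small Feistel network with HMAC-SHA256 as the round function. It is a teaching sketch only: it is not the Octomim engine's algorithm, it is not one of the standardized modes such as FF1, and it should not be used to protect real data. All function names are assumptions made for the example.

```python
import hashlib
import hmac

DIGITS = "0123456789"


def _round_value(key: bytes, round_no: int, half: str, modulus: int) -> int:
    """Keyed pseudo-random number derived from one half of the Feistel state."""
    digest = hmac.new(key, f"{round_no}:{half}".encode(), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % modulus


def _check(digits: str) -> None:
    if len(digits) < 2 or any(ch not in DIGITS for ch in digits):
        raise ValueError("expected a string of at least two ASCII digits")


def fpe_encrypt(key: bytes, digits: str, rounds: int = 8) -> str:
    """Encrypt a digit string into another digit string of the same length."""
    _check(digits)
    split = len(digits) // 2
    a, b = digits[:split], digits[split:]
    for i in range(rounds):
        m = len(a)
        c = (int(a) + _round_value(key, i, b, 10 ** m)) % 10 ** m
        a, b = b, str(c).zfill(m)  # halves swap each round
    return a + b


def fpe_decrypt(key: bytes, digits: str, rounds: int = 8) -> str:
    """Invert fpe_encrypt by running the rounds backwards."""
    _check(digits)
    half = len(digits) // 2
    # After an odd number of rounds the two halves have swapped lengths.
    split = half if rounds % 2 == 0 else len(digits) - half
    a, b = digits[:split], digits[split:]
    for i in reversed(range(rounds)):
        m = len(b)
        prev_a = (int(b) - _round_value(key, i, a, 10 ** m)) % 10 ** m
        a, b = str(prev_a).zfill(m), a
    return a + b


def encrypt_keep_separators(key: bytes, text: str) -> str:
    """Encrypt only the digits of text, leaving dashes and spaces in place."""
    encrypted = iter(fpe_encrypt(key, "".join(ch for ch in text if ch in DIGITS)))
    return "".join(next(encrypted) if ch in DIGITS else ch for ch in text)
```

Because the output has the same length and character class as the input, a value such as a 16-digit card number still fits the same database column and passes the same layout checks after encryption, which is the downstream compatibility the text refers to.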