Security Models Designed For Runtime Protection
Unlike general-purpose foundation models, Patronus models are purpose-built for AI security, governance and runtime protection.
Our detection engine combines multiple layers of analysis to achieve fast, reliable and deterministic decisions across enterprise environments.
01
Ensemble Architecture
Multiple specialized detection layers — heuristics, gradient-boosted trees and transformer models — combine for robust, accurate security decisions.
02
Sub-5ms Latency
Deterministic heuristic rules provide near-instant classification without model inference — critical for real-time endpoint protection at scale.
03
On-Device Inference
All analysis runs locally on the endpoint. No prompts, responses or agent data leave your system — ever.
Instead of relying on a single large model, Patronus uses an ensemble architecture combining heuristics, machine learning techniques and compact transformer models. This approach delivers lower latency, reduced resource consumption and significantly improved robustness for real-world AI security workflows.
Built for endpoint deployment, Patronus models continuously analyze prompts, responses, agent activity, tool calls and MCP communication without sending sensitive data to external services.
The first layer of analysis uses deterministic heuristics and protocol-aware detection rules.
This layer provides near-instant detection for known patterns and enables extremely low-latency classification without requiring model inference.
Heuristic Rules Engine
Prompt injection signature match
BLOCK
PII leak detected
BLOCK
Standard API request
ALLOW
LightGBM Risk Scoring
Risk Score
0.82
Confidence
94%
Threat Class:
Prompt Injection
The second layer combines traditional machine learning techniques optimized for structured security signals.
Patronus primarily uses gradient-boosted decision tree models such as LightGBM to evaluate risk indicators generated from prompts, responses, tools and agent activity.
These models provide excellent performance while remaining highly efficient on endpoint devices.
The final layer uses compact transformer-based language models specifically optimized for AI security tasks.
Unlike general-purpose chat models, these models are trained exclusively for security classification and content understanding.
Patronus currently focuses on BERT-style architectures optimized for local deployment and real-time inference.
mmBERT Security Classifier
Input
"Ignore previous instructions and..."
88%
Prompt Injection
8%
Jailbreak Attempt
Multi-Task AI Architecture
Patronus uses a multi-task AI architecture inspired by mixture-of-experts systems.
Instead of relying on one large general-purpose model, Patronus routes security tasks to specialized detection layers optimized for speed, accuracy and memory efficiency.
01
Reduced Memory Footprint
Multi-task models detect multiple threat classes within a single compact model, reducing memory usage while still covering prompt injection, sensitive data exposure, policy violations and agent risk.
02
Ensemble Decisions
Patronus combines signals from heuristics, gradient-boosted trees and transformer models to improve threat detection, reduce false positives and provide more robust risk decisions.
03
Intelligent Model Routing
Inspired by mixture-of-experts architectures, Patronus dynamically routes each interaction to the most suitable detection layer: heuristics, machine learning classifiers or compact transformer models.
HuggingFace
Wolf Defender
Wolf Defender is our open AI security model for prompt injection detection — built to identify jailbreaks, instruction overrides and agent manipulation attempts before they impact AI systems. Lightweight, security-focused and optimized for real-world deployment.
BERT-based
Architecture
On-Device
Inference
Prompt Injection
Primary Use Case
European
Built & Maintained
Patronus develops specialized AI security models focused on privacy, governance and runtime protection.
Designed and developed in Europe, our models are optimized for endpoint deployment, enterprise environments and regulatory requirements such as GDPR, NIS2 and the EU AI Act.
European AI Compliance
GDPR
Data residency and privacy by design
NIS2
Incident reporting and risk management
EU AI Act
High-risk AI system compliance
On-Device
No cloud routing, zero data exposure
Frequently Asked Questions
What is Wolf Defender?
Wolf Defender is an open-source AI security model developed by Patronus, designed specifically for prompt injection detection. It identifies jailbreaks, instruction overrides and agent manipulation attempts before they reach or impact AI systems. Wolf Defender is available on HuggingFace and optimized for local, on-device deployment.
How does Patronus detect prompt injection and AI threats?
Patronus uses a three-layer ensemble architecture. The first layer applies deterministic heuristic rules for sub-millisecond detection of known attack patterns. The second layer uses gradient-boosted machine learning models (LightGBM) to evaluate structured risk signals. The third layer runs compact BERT-style transformer models for deep semantic security classification. All three layers operate locally on the endpoint — no data is sent to external services.
Does Patronus send my data to the cloud?
No. Patronus is built for on-device inference, meaning all analysis — including prompt scanning, response monitoring, agent activity and tool call inspection — runs entirely on your local endpoint. Your prompts, responses and sensitive data never leave your system.
Is Patronus GDPR and EU AI Act compliant?
Yes. Patronus is designed and developed in Europe with EU regulatory requirements at its core. On-device inference ensures data residency compliance under GDPR. The platform's governance and runtime protection capabilities are also aligned with NIS2 and EU AI Act requirements for high-risk AI systems.
What makes Patronus different from other AI security tools?
Unlike general-purpose cloud-based security platforms that route your data through external servers, Patronus uses purpose-built European AI security models that run entirely on your device. The ensemble architecture — combining heuristics, LightGBM classifiers and compact transformer models — delivers lower latency, reduced resource use and higher accuracy than single-model approaches. Patronus covers prompts, responses, agent activity, tool calls and MCP communication in real time.

