Orca-Sonar: Our Multilingual Document Classifier for AI Security

Dominik Hommer

Jun 12, 2026

8 min

AI Research

The Problem

Anyone securing AI systems in the enterprise runs into one deceptively simple question: what is this text actually about? A contract, a résumé, a quarterly report, source code, or just a harmless bit of small talk: the answer decides whether content gets logged, redacted, blocked, or simply passed through.

In practice this is surprisingly hard. The same information shows up in wildly different shapes: as a clean document, an email thread, a Slack message — and increasingly as an instruction to an AI ("Summarize this contract for me: …"). Naïve classifiers latch onto the format instead of the content and get it wrong.

What Orca-Sonar Does

Orca-Sonar assigns every text to exactly one of seven classes:

Class	Examples
legal	contracts, NDAs, ToS, privacy policies, judgments, compliance
hr	résumés, job ads, employment contracts, performance reviews
finance	balance sheets, reports, invoices, cash flow, filings
internal_and_tech	ADRs, RFCs, postmortems, specs, wikis, architecture
source_code	raw code & configuration (Python, Go, SQL, Terraform …)
marketing	press releases, newsletters, sales & landing-page copy
other	conversational / non-business: small talk, recipes, learning

The guiding principle: the topic determines the class, not the format. Whether it arrives as a plain PDF or wrapped in "Hey, can you quickly check this: …", the content is what counts. On ambiguity, the more sensitive class wins (legal > hr > finance > internal_and_tech > source_code > marketing > other), so borderline cases are protected rather than leaked.
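To make the tie-break concrete, here is a minimal sketch of the priority rule in Python. It only illustrates the ordering quoted above; whether the rule is applied as a labeling guideline or as a post-processing step is not stated here, and the helper name resolve_ambiguity is hypothetical.

```python
# Sensitivity order as given in the text, most sensitive first.
PRIORITY = [
    "legal", "hr", "finance", "internal_and_tech",
    "source_code", "marketing", "other",
]
RANK = {label: position for position, label in enumerate(PRIORITY)}


def resolve_ambiguity(candidates):
    """Return the most sensitive label among several plausible ones.

    Hypothetical helper for illustration only; not part of the model's API.
    """
    if not candidates:
        raise ValueError("need at least one candidate label")
    unknown = [label for label in candidates if label not in RANK]
    if unknown:
        raise ValueError(f"unknown labels: {unknown}")
    # A lower rank means a more sensitive class, so the minimum wins.
    return min(candidates, key=RANK.__getitem__)


# A performance review that also discusses salary figures could be read
# as either hr or finance; the rule resolves it to hr.
print(resolve_ambiguity(["finance", "hr"]))
```

Keeping the order in a single list means a policy change (say, moving source_code above finance) is a one-line edit.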

Under the Hood

Orca-Sonar is built on mmBERT (ModernBERT family), a compact, multilingual encoder. That makes the model:

fast & lightweight: small enough for cheap, low-latency inference, including near the edge;
genuinely multilingual: German and English trained as equals;
deployment-friendly: available as safetensors and as an FP16 ONNX variant (half the size, identical predictions).

Performance

On our own internal held-out test set (real data only), Orca-Sonar reaches a macro-F1 of ~0.98. It generalizes especially well on source_code, legal, and hr, holding up even on text from sources quite different from the training data.

One honest caveat on interpreting that score: there is no established, general benchmark for this exact 7-class document-topic task. Nobody has a standard test set covering legal / hr / finance / internal_and_tech / source_code / marketing / other across German and English. We're currently building a general benchmark for this task and evaluating Orca-Sonar against external, model-unseen datasets to get a realistic, cross-distribution picture.
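For readers less familiar with the metric: macro-F1 is the unweighted mean of the per-class F1 scores, so a small class such as hr counts as much as a large one. The sketch below computes it from scratch on made-up toy labels; it is not our evaluation code, and the toy data has nothing to do with the reported result.

```python
def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data (invented for illustration): one legal document is misread as hr.
y_true = ["legal", "legal", "hr", "other"]
y_pred = ["legal", "hr", "hr", "other"]
print(round(macro_f1(y_true, y_pred, ["legal", "hr", "other"]), 4))  # 0.7778
```

Here legal and hr each score 2/3 and other scores 1, giving (2/3 + 2/3 + 1) / 3 = 7/9.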

How to Use It

from transformers import pipeline

clf = pipeline("text-classification", model="patronus-studio/orca-sonar-document-classifier")
clf("Summarize this service agreement for me: 24-month term, jurisdiction Munich …")
# -> [{'label': 'legal', 'score': 0.99}]

For production, low-latency deployments, the ONNX variant is available under onnx/onnx_fp16/.

The Dataset

Orca-Sonar was trained on our own in-house dataset (German + English, 7 classes), curated specifically for real-world security scenarios. We'll publish it shortly.

Part of Patronus Protect

Orca-Sonar complements our security stack around Wolf Defender (prompt-injection detection): where Wolf Defender asks "is this input an attack?", Orca-Sonar answers "what is it actually about?" Together they form the basis for intelligent, policy-driven routing in AI pipelines.
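As a rough illustration of what policy-driven routing could look like, the sketch below maps a classifier prediction to one of the four actions mentioned at the start of this post (log, redact, block, pass). The policy table, the route function, and the is_attack flag are all hypothetical; they are assumptions for this example and do not describe the actual Patronus Protect configuration or the Wolf Defender API.

```python
# Hypothetical policy table: class label -> action. Chosen for illustration
# only; a real deployment would define its own mapping.
POLICY = {
    "legal": "block",
    "hr": "redact",
    "finance": "redact",
    "internal_and_tech": "log",
    "source_code": "log",
    "marketing": "pass",
    "other": "pass",
}


def route(prediction, is_attack=False):
    """Pick an action for one input.

    prediction: a dict shaped like one text-classification pipeline result,
                e.g. {"label": "legal", "score": 0.99}.
    is_attack:  assumed boolean verdict from an upstream injection detector.
    """
    if is_attack:
        # An input flagged as an attack is blocked regardless of its topic.
        return "block"
    return POLICY[prediction["label"]]


print(route({"label": "legal", "score": 0.99}))      # block
print(route({"label": "marketing", "score": 0.97}))  # pass
```

Checking the attack verdict first keeps the two questions independent: the topic only matters once the input is considered safe to process.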