How Patronus Detects AI Traffic Across WebSockets, SSE, gRPC, Protobuf and Modern Encodings
Benedikt Veith
7 min
AI Traffic Detection

Most AI security tools still assume that AI traffic looks like a simple JSON request over HTTPS.
That assumption is becoming increasingly outdated.
Modern AI systems communicate through:
WebSockets
Server-Sent Events (SSE)
gRPC
HTTP/2 streams
Protobuf
MsgPack
binary framed protocols
compressed transport layers
As AI agents, coding assistants and MCP-based systems become more common, AI traffic is increasingly moving away from plain JSON and toward streaming and binary communication.
At Patronus, we recently spent a significant amount of time expanding our transparent AI traffic inspection layer to better understand how modern AI communication actually behaves on the wire.
The result is a protocol and encoding coverage matrix that maps:
transport support,
AI traffic detection,
semantic extraction,
and intentional fail-closed behavior.
Why Is AI Traffic Detection Becoming More Difficult?
A few years ago, most AI interactions looked relatively simple:
HTTP/1.1
REST API
JSON request
JSON response
Today, modern AI systems increasingly use:
streaming token responses,
bidirectional communication,
protobuf serialization,
binary framing,
multiplexed HTTP/2 streams,
and persistent realtime sessions.
This creates a major challenge for AI security and AI observability systems.
If a security layer only understands plain JSON traffic, it loses visibility into:
AI coding assistants,
desktop AI applications,
MCP communication,
local AI runtimes,
agent frameworks,
and realtime streaming systems.
That is one of the reasons why we believe AI security increasingly becomes a protocol and semantic visibility problem rather than just an API integration problem.
Which Protocols And Encodings Does Patronus Currently Cover?
The current matrix covers multiple transport layers and serialization formats including:
HTTP/1.1
HTTP/2
WebSockets
SSE
gRPC
And encodings such as:
JSON
JSONL
Protobuf
MsgPack
TLV binary
length-prefixed binary
gzip/zstd compressed variants
The interesting part for us was not plain JSON traffic.
The interesting part was everything around:
unknown protobuf carriers,
streaming traffic,
binary framed payloads,
and transport-aware false-positive handling.
Below is a simplified version of the current capability matrix.
Protocol | Encoding / Variant | AI Detection | AI Extraction |
|---|---|---|---|
HTTP/1.1 | JSON | yes | yes |
HTTP/1.1 | MsgPack | yes | yes |
HTTP/1.1 | Known protobuf | yes | yes |
HTTP/1.1 | Unknown protobuf | yes | no |
HTTP/2 | JSON multiplex | transport | transport |
HTTP/2 | MsgPack | yes | yes |
WebSocket | JSON text frames | yes | yes |
WebSocket | Unknown protobuf carriers | yes | no |
SSE | JSON delta stream | yes | yes |
SSE | Unknown carriers | yes | no |
gRPC | Known protobuf unary | yes | yes |
gRPC | Unknown protobuf stream | yes | no |
The full matrix additionally includes:
protobuf structural variants,
compressed payload variants,
opaque binary handling,
high-entropy payload guards,
and transport integrity coverage.
Why Do We Separate AI Detection From AI Content Extraction?
One thing we intentionally separate is:
AI traffic detection
fromsemantic extraction.
That distinction matters a lot.
For example:
an unknown protobuf carrier may still contain strong evidence that a connection represents AI interaction traffic, even if the exact protobuf schema is unavailable.
However:
detecting AI traffic is not the same thing as understanding the full semantic content.
This is especially important for binary protocols and unknown carriers.
Instead of pretending we fully understand every payload, we intentionally fail closed once semantic confidence becomes too low.
Meaning:
traffic may still be classified as AI-related, while extraction remains intentionally disabled.
We believe this is a more realistic and safer approach for transparent AI security systems.
Why Do Streaming Protocols Matter For AI Security?
Streaming fundamentally changes how AI traffic behaves.
Traditional security tooling often assumes:
request arrives,
response arrives,
analysis completes.
Modern AI systems increasingly behave differently:
responses stream incrementally,
tools execute during sessions,
connections remain persistent,
and context evolves continuously.
This is especially visible in:
SSE token streams,
WebSocket copilots,
gRPC streaming,
and agent-style systems.
As AI agents become more autonomous, streaming traffic inspection becomes increasingly important for:
AI observability,
prompt injection detection,
sensitive data protection,
tool governance,
and MCP security.
Why Does Transparent AI Traffic Visibility Matter?
Most organizations already have large amounts of unmanaged AI usage.
This includes:
browser AI tools,
AI IDE extensions,
local LLM runtimes,
desktop copilots,
and shadow AI workflows.
Many existing security solutions still depend heavily on:
browser extensions,
provider APIs,
or SaaS integrations.
The problem is that modern AI usage increasingly happens outside those controlled environments.
That is why we believe endpoint-side transparent inspection becomes increasingly important.
Not because integrations are useless —
but because integrations alone no longer provide complete visibility into how modern AI systems actually communicate.
What Does The Future Of AI Traffic Look Like?
We believe modern AI traffic will increasingly become:
streamed,
binary,
multiplexed,
persistent,
and agent-driven.
That means future AI security systems will likely require:
protocol-aware inspection,
semantic extraction pipelines,
streaming analysis,
and transport-agnostic detection mechanisms.
The era of “just inspect JSON requests” is slowly ending.
AI security increasingly becomes a protocol and semantic visibility problem.
