AuditNetworkPolicy is a kguardian CRD that lets you preview the impact of a Kubernetes NetworkPolicy before you enforce it. The spec is byte-identical to upstream networking.k8s.io/v1.NetworkPolicy. The only thing that changes is what kguardian does with it: the kguardian-evaluator watches observed pod traffic and reports flows that the policy would deny — but never drops a single packet.
Why
The hardest part of shipping NetworkPolicies in production isn’t writing them; it’s confidence. A typo in a podSelector or a missing to: block can blackhole half a service in a heartbeat. The conventional answer is “test in staging”, but staging traffic shapes never match production.
AuditNetworkPolicy lets you apply your candidate policy to live production traffic, watch for false positives in the audit stream, then promote with a one-line kind: change once you’re confident.
Prior art
This pattern is directly modelled on Calico’s StagedKubernetesNetworkPolicy. The differences:
- Calico evaluates staged policies alongside its own dataplane. kguardian is observability-only and runs over the live eBPF flow stream from the controller — you can use it with any CNI (Cilium, Calico, OVN, kindnet, etc.).
- Calico’s promotion path renames the resource kind. kguardian uses the same approach: copy the spec, change kind: AuditNetworkPolicy to kind: NetworkPolicy, apply with kubectl. Your CNI then enforces it.
- Cilium has policyAuditMode, but it’s a cluster-wide agent flag and disables enforcement entirely; the request for per-policy audit was closed as not planned. The CRD-per-policy model that Calico (and now kguardian) use is what people actually want.
How it works
- The controller observes every TCP/UDP connection on each node (no change from before).
- The broker forwards each flow to the evaluator’s /evaluate endpoint.
- The evaluator looks up which AuditNetworkPolicy resources select either side of the flow, runs the standard NetworkPolicy semantics over the rule set, and returns a verdict per (policy, direction).
- WouldDeny verdicts are persisted in audit_verdicts and surface as logs, Kubernetes Events on the policy, and rolling counts in .status.evaluation.
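The evaluation step can be sketched roughly as follows. This is a simplified illustration and not kguardian’s actual evaluator: it covers ingress only, supports matchLabels selectors and numeric ports only, and assumes peers carry nothing but a podSelector. The Flow class, the function names, and the WouldAllow/NotSelected labels are hypothetical; only WouldDeny comes from the text above.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    """Hypothetical flow record; field names are illustrative only."""
    src_labels: dict
    dst_labels: dict
    dst_port: int


def selects(selector: dict, labels: dict) -> bool:
    """matchLabels-only match; an empty selector matches every pod."""
    wanted = selector.get("matchLabels", {})
    return all(labels.get(k) == v for k, v in wanted.items())


def rule_allows(rule: dict, flow: Flow) -> bool:
    """An ingress rule allows a flow when both its peers and its ports match."""
    peers = rule.get("from", [])
    ports = rule.get("ports", [])
    # A missing or empty `from` matches all peers; same for `ports`.
    peer_ok = not peers or any(
        selects(p.get("podSelector", {}), flow.src_labels) for p in peers
    )
    port_ok = not ports or any(p.get("port") == flow.dst_port for p in ports)
    return peer_ok and port_ok


def ingress_verdict(policy: dict, flow: Flow) -> str:
    """Verdict of one policy for the ingress direction of one flow."""
    spec = policy["spec"]
    if not selects(spec.get("podSelector", {}), flow.dst_labels):
        return "NotSelected"  # policy does not apply to the destination pod
    if "Ingress" not in spec.get("policyTypes", ["Ingress"]):
        return "NotSelected"  # policy says nothing about ingress
    rules = spec.get("ingress", [])
    # No rule allowing the flow (including an empty rule list) means deny.
    return "WouldAllow" if any(rule_allows(r, flow) for r in rules) else "WouldDeny"
```

A real evaluator would run the same check for egress and return one verdict per (policy, direction), as described above.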
Example
Apply an AuditNetworkPolicy to a namespace and watch the evaluator’s logs. For a policy named payments-isolation in the prod namespace, kubectl get auditnetworkpolicy payments-isolation -n prod shows a WOULD-DENY column populated by the rolling-window count. kubectl describe shows the most frequent offenders.
Promoting to enforced
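When you’re satisfied the would-deny set is empty (or the false positives have all been triaged), promote the policy: change kind: AuditNetworkPolicy to kind: NetworkPolicy and apply the result with kubectl.
Tracking evaluator progress — status.observedGeneration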
After you edit an AuditNetworkPolicy (e.g. broaden a to: selector), the evaluator has to pick up the new spec, re-evaluate the rolling window of observed traffic, and re-publish the WOULD-DENY counts. The CRD exposes .status.observedGeneration so you know when that loop has caught up.
The evaluator sets observedGeneration once it has finished a full reconcile of the current .metadata.generation. Don’t trust the WOULD-DENY count for tuning decisions until the two numbers match.
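The comparison is simple enough to script. The sketch below assumes the object has been fetched as JSON (for example with kubectl get auditnetworkpolicy <name> -o json) and parsed into a dict; the function name is illustrative.

```python
def evaluator_caught_up(obj: dict) -> bool:
    """True when status.observedGeneration equals metadata.generation."""
    generation = obj.get("metadata", {}).get("generation")
    observed = obj.get("status", {}).get("observedGeneration")
    # A missing status (evaluator has not reconciled yet) counts as stale.
    return generation is not None and observed == generation
```

A tuning script could poll this check and read the WOULD-DENY count only once it returns True.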
Querying verdicts directly
The frontend’s Would-Deny view consumes the broker’s GET /audit/verdicts
endpoint. You can hit the same endpoint from your own tooling
(scripted reports, periodic export, etc.).
?namespace= (empty
value) is the legitimate selector for cluster-scoped policy
verdicts. The result is ordered (observed_at DESC, id DESC) so
external paginators can cursor on the BIGSERIAL id. See the
endpoint reference for the full contract.
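As a rough sketch of what client tooling might do with this endpoint, the helpers below build the request URL (including the empty namespace= form) and mirror the documented ordering locally. The broker address is a placeholder, and the observed_at and id field names in the response rows are an assumption based on the ordering columns named above. How a cursor is passed back to the endpoint is not shown here; the endpoint reference is the authority on that.

```python
from urllib.parse import urlencode


def verdicts_url(broker_base: str, namespace: str) -> str:
    """Build the GET /audit/verdicts URL; an empty namespace yields `namespace=`."""
    query = urlencode({"namespace": namespace})
    return f"{broker_base.rstrip('/')}/audit/verdicts?{query}"


def newest_first(rows: list) -> list:
    """Sort rows the way the endpoint documents: (observed_at DESC, id DESC)."""
    return sorted(rows, key=lambda r: (r["observed_at"], r["id"]), reverse=True)


def last_seen_id(page: list):
    """Id of the final row in an ordered page, usable as a client-side cursor."""
    return page[-1]["id"] if page else None
```

For example, verdicts_url("http://broker.example:9090", "") selects cluster-scoped policy verdicts, while passing "prod" scopes the query to that namespace.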
Cluster-scoped policies — AuditClusterNetworkPolicy
For cross-namespace audits (e.g. “what would happen if I default-denied ingress on every workload across the cluster?”) use the cluster-scoped sibling. The spec is identical to AuditNetworkPolicy but adds a top-level namespaceSelector (nil/empty matches all namespaces) and is itself cluster-scoped. Within each matching namespace the rule evaluation is identical.
Promote an AuditClusterNetworkPolicy to namespaced NetworkPolicy resources (one per matched namespace) for actual enforcement, since upstream NetworkPolicy is namespaced. Calico’s GlobalNetworkPolicy and upstream’s AdminNetworkPolicy are the cluster-wide enforcement counterparts if your CNI supports them.
CLI helper — kguardian audit promote
The promoted manifest drops the kubectl.kubernetes.io/last-applied-configuration annotation (it would be wrong for the new kind). Existing labels and other annotations are preserved.
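The transformation behind promotion can be illustrated with a short sketch. It is not the CLI’s implementation: it works on a manifest already parsed into a dict, the function name is made up for illustration, and it does only what this page describes (switch the kind, drop the last-applied annotation, keep everything else).

```python
import copy

LAST_APPLIED = "kubectl.kubernetes.io/last-applied-configuration"


def promote(audit_policy: dict) -> dict:
    """Return a NetworkPolicy manifest derived from an AuditNetworkPolicy."""
    promoted = copy.deepcopy(audit_policy)  # leave the caller's object untouched
    # Native NetworkPolicy lives in networking.k8s.io/v1; this is a no-op
    # if the audit manifest already carries that apiVersion.
    promoted["apiVersion"] = "networking.k8s.io/v1"
    promoted["kind"] = "NetworkPolicy"
    annotations = promoted.get("metadata", {}).get("annotations")
    if annotations is not None:
        # The recorded last-applied config describes the old kind.
        annotations.pop(LAST_APPLIED, None)
    return promoted
```

Labels, the remaining annotations, and the spec pass through unchanged, which matches the byte-identical spec described at the top of this page.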
Cluster-scoped — kguardian audit promote-cluster
AuditClusterNetworkPolicy promotes to one networking.k8s.io/v1.NetworkPolicy per matching namespace (since native NetworkPolicy is namespaced; the cluster-scope namespaceSelector is dropped from each emitted spec).
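The per-namespace fan-out might look like the sketch below. It is illustrative only: namespaces are passed in as a plain name-to-labels mapping, only matchLabels is handled in the namespaceSelector, and reusing the cluster policy’s name for each emitted NetworkPolicy is an assumption rather than documented behaviour.

```python
import copy


def promote_cluster(cluster_policy: dict, namespaces: dict) -> list:
    """Emit one NetworkPolicy manifest per namespace the selector matches.

    `namespaces` maps namespace name -> that namespace's labels.
    """
    spec = copy.deepcopy(cluster_policy["spec"])
    # The cluster-scope selector is removed from every emitted spec.
    ns_selector = spec.pop("namespaceSelector", None) or {}
    wanted = ns_selector.get("matchLabels", {})  # nil/empty matches all
    manifests = []
    for ns_name, ns_labels in sorted(namespaces.items()):
        if all(ns_labels.get(k) == v for k, v in wanted.items()):
            manifests.append({
                "apiVersion": "networking.k8s.io/v1",
                "kind": "NetworkPolicy",
                "metadata": {
                    "name": cluster_policy["metadata"]["name"],  # assumed naming
                    "namespace": ns_name,
                },
                "spec": copy.deepcopy(spec),
            })
    return manifests
```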
What kguardian does not do
- It does not enforce anything. Even with an AuditNetworkPolicy in place, all traffic flows. If you want enforcement, promote the policy as above and rely on your CNI.
- It does not aggregate verdicts across multiple evaluator replicas: the evaluator’s status updater is single-replica by design (a Helm guardrail blocks replicaCount > 1). Multi-replica HA is a future story.
Limits + caveats
The matcher implements podSelector + namespaceSelector + numeric port + endPort range + named-port + ipBlock (CIDR + except). Edge cases that warrant care:
- Empty from:/to: matches all peers, the same trap as upstream NetworkPolicy.
- Empty ingress: []/egress: [] is a default-deny in the relevant direction. If a policy’s policyTypes includes a direction with no rules, every flow in that direction is “would-deny”.
- Unknown peer pods (deleted between flow capture and evaluation) are not matched against any selector; this can cause an apparent under-count. The audit_verdicts.reason column records when this happens.
- Named ports are resolved against the destination pod’s spec.containers[].ports[] declarations. A named port matches only when both the name and the observed containerPort line up.
- ipBlock matches against the peer’s L3 address as observed by the eBPF controller. Flows with unknown IPs (e.g. malformed eBPF events) are non-matches, not silent allows.
- Ingress UDP is systematically under-counted. The controller’s eBPF probes have no kind for inbound UDP, so traffic like CoreDNS receiving queries, syslog, or NTP responses arriving at a pod is not seen by the evaluator. Auditing a UDP-fronted namespace will show zero ingress flows where there may be millions. Egress UDP and all TCP directions are tracked normally.
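Three of the matching rules above (ipBlock with except, the port/endPort range, and named ports) can be illustrated with a short sketch. It follows the upstream NetworkPolicy field names but is not kguardian’s matcher; the function names are made up for illustration.

```python
import ipaddress


def ip_block_matches(ip_block: dict, peer_ip: str) -> bool:
    """True if peer_ip is inside `cidr` and outside every `except` range."""
    try:
        addr = ipaddress.ip_address(peer_ip)
    except ValueError:
        return False  # unknown or malformed IP: a non-match, not an allow
    if addr not in ipaddress.ip_network(ip_block["cidr"], strict=False):
        return False
    return not any(
        addr in ipaddress.ip_network(exc, strict=False)
        for exc in ip_block.get("except", [])
    )


def port_matches(rule_port: dict, observed_port: int) -> bool:
    """Numeric port match; `endPort` widens it to an inclusive range."""
    start = rule_port["port"]
    end = rule_port.get("endPort", start)
    return start <= observed_port <= end


def named_port_matches(name: str, observed_port: int, declared_ports: list) -> bool:
    """A named port matches only if the destination pod declares that name
    with a containerPort equal to the observed port."""
    return any(
        p.get("name") == name and p.get("containerPort") == observed_port
        for p in declared_ports
    )
```

For instance, with cidr 10.0.0.0/8 and except 10.1.0.0/16, a peer at 10.2.3.4 matches while 10.1.2.3 does not.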