Runabot Network Architecture & Firewall#

This page describes the complete network topology of a Runabot tenant environment: how traffic flows between every component, and where authentication/authorisation and network firewall controls are enforced.

Legend#

Symbol	Meaning
🔥	Cilium / Kubernetes `NetworkPolicy` or `CiliumNetworkPolicy` firewall boundary
🔐	Authentication / Authorisation check (Ory Kratos + OpenFGA)
➡️	Allowed flow
✋	Blocked by default (explicit allowlist required)

Component Diagram#

The diagram below shows one user’s namespace (dev-alice) inside the cluster. All egress from bot and addon pods is default-deny at the eBPF level; only explicitly whitelisted FQDNs or cluster endpoints are reachable.

flowchart TD
    %% ─── External actors ──────────────────────────────────────────
    Browser["🌐 User Browser<br />(HTTPS)"]
    Internet["☁️ Internet<br />(LLM APIs, web search)"]

    %% ─── Ingress layer ────────────────────────────────────────────
    subgraph cluster["Kubernetes Cluster  (Cilium CNI)"]

        subgraph ingress_ns["Namespace: traefik (ingress)"]
            Traefik["Traefik Ingress<br />IngressRoute"]
        end

        %% ─── Auth layer ───────────────────────────────────────────
        subgraph auth_ns["Namespace: runabot (API + Auth)"]
            KratosUI["Ory Kratos<br />(login, registration)"]
            APIServer["Runabot API Server<br />ConnectRPC<br />🔐 Kratos session check<br />🔐 OpenFGA role check"]
            OpenFGA["OpenFGA<br />(relationship store)"]
        end

        %% ─── User namespace ───────────────────────────────────────
        subgraph user_ns["Namespace: dev-alice  (label: owner=alice)"]
            direction TB

            subgraph bot_pod["Bot Pod"]
                OpenClaw["OpenClaw Bot<br />(openclaw binary)"]
            end

            subgraph metamcp_pod["MetaMCP Addon Pod"]
                MetaMCP["MetaMCP Proxy<br />(MCP gateway)"]
            end

            subgraph olla_pod["Olla LLM Addon Pod  (optional)"]
                Olla["Olla<br />(local LLM proxy)"]
            end

            subgraph fetch_pod["secure-fetch Addon Pod"]
                SecureFetch["secure-fetch<br />(MCP server)<br />DSPy content validator<br />MCP Elicitation"]
            end
        end

        %% ─── Cluster DNS ──────────────────────────────────────────
        KubeDNS["kube-dns<br />(port 53)"]

    end

    %% ═══════════════════════════════════════════════════════════════
    %% FLOWS
    %% ═══════════════════════════════════════════════════════════════

    %% User browser → ingress
    Browser -->|"HTTPS :443"| Traefik

    %% Traefik → Kratos (login page)
    Traefik -->|"/_auth/*"| KratosUI

    %% Traefik → API (all /api/* traffic)
    Traefik -->|"/api/*<br />🔐 session cookie validated<br />by Kratos middleware"| APIServer

    %% API → OpenFGA (role check on every request)
    APIServer <-->|"gRPC<br />(cluster-internal)"| OpenFGA

    %% Traefik → OpenClaw (WebSocket + REST, proxied by API)
    Traefik -->|"/bots/alice/*<br />🔥 NetworkPolicy:<br />only from traefik ns<br />🔐 X-Forwarded-User injected<br />by API proxy"| OpenClaw

    %% Bot → MetaMCP (MCP over stdio/HTTP)
    OpenClaw -->|"MCP<br />🔥 CiliumNP: intra-user only<br />(same owner label)"| MetaMCP

    %% MetaMCP → Addons (only enabled tools)
    MetaMCP -->|"MCP<br />(selected tools only)"| SecureFetch
    MetaMCP -->|"MCP<br />(if enabled)"| Olla

    %% secure-fetch → Internet (fetching external URLs)
    SecureFetch -->|"HTTPS<br />🔥 CiliumCCNP toFQDNs whitelist<br />(admin-approved domains only)<br />✋ all others blocked"| Internet

    %% Bot → Olla (LLM inference, alternative to cloud APIs)
    OpenClaw -.->|"HTTP (local LLM)<br />🔥 CiliumNP: intra-user only"| Olla

    %% Bot → Internet (cloud LLM APIs, if no local Olla)
    OpenClaw -->|"HTTPS to api.openai.com etc.<br />🔥 CiliumCCNP toFQDNs whitelist<br />(admin-approved + user-approved FQDNs)<br />✋ all IP-direct attempts blocked"| Internet

    %% DNS inspection (mandatory for every workload with egress)
    OpenClaw -->|"DNS :53 UDP/TCP<br />🔥 Cilium DNS proxy<br />(maps IPs → FQDNs)"| KubeDNS
    SecureFetch -->|"DNS :53"| KubeDNS
    Olla -->|"DNS :53"| KubeDNS

    %% API manages Cilium policies dynamically
    APIServer -->|"K8s API: CRUD<br />CiliumNetworkPolicy<br />(user whitelist updates)"| user_ns

    %% ─── Styling ───────────────────────────────────────────────────
    classDef firewall stroke:#f38ba8,stroke-width:2px,stroke-dasharray:5 5
    classDef auth    stroke:#a6e3a1,stroke-width:2px
    classDef pod     stroke:#89b4fa,stroke-width:1px
    classDef ext     stroke:#fab387,stroke-width:2px,color:#1e1e2e,fill:#fab387

    class Traefik firewall
    class APIServer auth
    class KratosUI auth
    class OpenFGA auth
    class OpenClaw,MetaMCP,SecureFetch,Olla pod
    class Browser,Internet ext

Firewall Rules Reference#

Global Default-Deny (CiliumClusterwideNetworkPolicy)#

Applied to all pods carrying the runabot.de/workload-type label:

# Applied automatically to every bot and addon pod
egressDeny:
  - toEntities:
      - world   # blocks all internet egress by default

Any outbound connection not covered by an explicit toFQDNs or toCIDRSet allow rule is silently dropped at the eBPF layer — before the TCP handshake.

DNS Inspection (mandatory for every workload)#

egress:
  - toEndpoints:
      - matchLabels:
          k8s-app: kube-dns
    toPorts:
      - ports:
          - port: "53"
            protocol: UDP
          - port: "53"
            protocol: TCP
        rules:
          dns:
            - matchPattern: "*"   # Cilium intercepts all DNS; maps IPs → FQDNs

Why this matters: Without DNS inspection, Cilium cannot correlate the IP address a bot dials to the FQDN it resolved. The DNS interception step is what makes toFQDNs whitelisting enforceable.

Admin-Approved FQDN Whitelist (CiliumClusterwideNetworkPolicy)#

Managed by site administrators via the Runabot Admin UI or directly as Kubernetes manifests in Git:

# Example: runabot-default-whitelist-bots
egress:
  - toFQDNs:
      - matchName: api.openai.com
      - matchName: auth0.openai.com
      - matchName: api.anthropic.com
      - matchName: api.mistral.ai

Changes to these rules are pull-request-gated. Full Git history is available for SOC 2 audit purposes.

User-Defined Whitelist (CiliumNetworkPolicy, namespace-scoped)#

Bot owners can add their own FQDNs via the Settings UI. The Runabot API:

Validates the requested FQDN against the admin blacklist
Rejects entries that match a globally denied pattern
Writes a CiliumNetworkPolicy to the user’s namespace

# Example: user alice adds api.github.com for her coding assistant
# PUT /api/v1/bots/alice-bot/egress  { "allowed_fqdns": ["api.github.com"] }
# → Runabot API creates:
egress:
  - toFQDNs:
      - matchName: api.github.com
    description: "GitHub API — added by alice 2026-03-22"

Intra-User Isolation (CiliumClusterwideNetworkPolicy)#

# Applied per bot instance at deploy time
endpointSelector:
  matchLabels:
    app.kubernetes.io/instance: "alice-openclaw"
egress:
  - toEndpoints:
      - matchLabels:
          dev.runabot.de/owner: "alice"   # only alice's own addons

This ensures Alice’s bot can reach Alice’s MetaMCP, Olla, and secure-fetch pods, but cannot reach Bob’s addons or any other cluster service.

Authentication & Authorisation Reference#

Where Identities Are Checked#

Boundary	Mechanism	What is checked
Browser → Traefik	TLS (cert-manager)	Certificate validity
Traefik → API	Ory Kratos session cookie	Valid session + CSRF
API → Bot WebSocket	`X-Forwarded-User` header	Injected by API after Kratos validation; bot trusts trusted-proxy mode only
API → Any resource	OpenFGA	`user:alice can:access bot:alice-bot` tuple must exist
API → Admin endpoints	OpenFGA `admin` role	Additional role check per operation
MetaMCP → Addon	Internal API key	`BOOTSTRAP_API_KEYS` generated per-namespace by the Addon controller

OpenFGA Relationship Model#

user:alice  →  owner  →  bot:alice-openclaw
user:alice  →  owner  →  addon:alice-metamcp
user:alice  →  owner  →  addon:alice-secure-fetch
admin:bob   →  admin  →  tenant:dev

The Runabot API enforces these relationships synchronously on every request via the ConnectRPC interceptor chain. A missing tuple returns 403 Forbidden before any business logic executes.

Prompt Injection Defence (secure-fetch)#

The secure-fetch addon adds a second line of defence for any content fetched from the internet:

Input sanitisation — URLs and queries are matched against an allowlist; control characters are stripped.
DSPy content validation — fetched content is analysed by a secondary LLM for hidden instructions (prompt injection patterns, exfiltration payloads).
MCP Elicitation — if risk is detected, the server pauses and presents the user with an explicit Allow/Block decision. The decision is logged.
GEPA feedback loop — user decisions are collected to continuously improve the detection prompt.

This protection does not exist in vanilla GitHub Copilot’s Bing web search integration.

OIDC Configuration — configuring enterprise SSO providers
Filesystem Strategy — secure file access patterns for bots