Technical Architecture

Aegis is designed around one go-to-market (GTM) constraint: Self-hosted First → Multi-tenant SaaS. Large enterprise clients require data sovereignty and in-VPC container workloads before they will expose internal resources to LLMs. Consequently, the architecture separates core logic from cloud infrastructure.

Core Principles

  • Stateless Data Plane: The routing hot-path keeps no state of its own; anything it holds in memory is an in-process cache of data owned elsewhere, so instances can be added freely to scale horizontally.
  • Hexagonal Portability: No AWS, Azure, or GCP SDK classes are referenced in core engine assemblies. Interfaces (Ports) abstract all state, storage, and eventing.
  • Fail Closed: If security checks, authentication queries, or policy rules fail to compile or evaluate, the request is denied by default.
  • Tenant-Awareness: Every entity, database table row, trace element, and cache key includes a tenant_id from day one, making multi-tenancy a config choice rather than a rewrite.
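As an illustration of the Tenant-Awareness principle, the sketch below shows one way a cache key type could make tenant_id mandatory. It is written in Python for brevity and is not the gateway's actual code; TenantKey, its fields, and DEFAULT_TENANT are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantKey:
    """Cache key that cannot be built without a tenant_id (hypothetical helper)."""
    tenant_id: str
    kind: str   # e.g. "policy" or "ratelimit" (illustrative values)
    name: str

    def __post_init__(self) -> None:
        # Reject keys with a missing tenant instead of silently sharing them.
        if not self.tenant_id:
            raise ValueError("tenant_id is required on every key")

    def render(self) -> str:
        # tenant_id is always the first segment, so identical names in
        # different tenants never collide.
        return f"{self.tenant_id}:{self.kind}:{self.name}"


# Assumption: a single-tenant (self-hosted) install uses one fixed tenant_id.
DEFAULT_TENANT = "default"
```

With keys shaped like this, moving from single-tenant to multi-tenant changes which tenant_id values exist, not the shape of any key.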

Control Plane vs. Data Plane

Aegis divides responsibilities between two logical systems:

  • The Data Plane (Stateless Gateway Proxy): Built on top of Microsoft's YARP (Yet Another Reverse Proxy) engine. It terminates the transport (stdio, WebSocket, SSE, HTTP), performs version and capability negotiations, executes authentication, and routes requests to the policy pipeline.
  • The Control Plane (Administration & Registry): Exposes REST/gRPC endpoints to manage server registrations, track policy revisions, run policy simulations, monitor server health, check drift patterns, and process audit chains.

Configuration Rollouts

When an admin publishes a new policy or registers a server, the Control Plane compiles the bundle, signs it with a cryptographic key, and uploads it to the Object Store (e.g. S3 or MinIO). It then triggers a notification via the IEventBus. The Data Plane instances pull the bundle, verify the signature, and hot-swap their in-memory snapshots in under 30 seconds across the fleet.
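The verify-then-swap step on the Data Plane side can be sketched as follows. This is a minimal Python illustration under stated assumptions: it uses a shared-secret HMAC from the standard library to stand in for whatever signature scheme Aegis actually uses (an asymmetric signature is more likely in practice), and sign_bundle and PolicySnapshotHolder are hypothetical names.

```python
import hashlib
import hmac
import json
import threading


def sign_bundle(bundle: bytes, key: bytes) -> str:
    """Control Plane side: sign the compiled bundle (HMAC-SHA256 as a stand-in)."""
    return hmac.new(key, bundle, hashlib.sha256).hexdigest()


class PolicySnapshotHolder:
    """Data Plane side: holds the active snapshot and swaps it atomically."""

    def __init__(self, key: bytes, initial: dict) -> None:
        self._key = key
        self._lock = threading.Lock()
        self._snapshot = initial

    @property
    def current(self) -> dict:
        # Readers always see either the old or the new snapshot, never a mix.
        return self._snapshot

    def try_swap(self, bundle: bytes, signature: str) -> bool:
        expected = sign_bundle(bundle, self._key)
        # Constant-time comparison; on mismatch keep the current snapshot.
        if not hmac.compare_digest(expected, signature):
            return False
        new_snapshot = json.loads(bundle)  # parse fully before swapping
        with self._lock:
            self._snapshot = new_snapshot
        return True
```

A bundle with a bad signature leaves the previous snapshot in place, which matches the fail-closed principle: an unverifiable policy is never activated.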

API Definitions

Aegis separates concerns by exposing distinct APIs for the runtime data plane and the administrative control plane:

Data Plane (MCP Core)

The Data Plane exposes native Model Context Protocol (MCP) routing endpoints. It serves as a transparent, governed gateway for standard MCP-compatible clients:

  • Native Transports: Exposes standardized connection entrypoints across all supported transports (SSE, WebSockets, Streamable HTTP, and stdio managed subprocess sidecars).
  • Zero-Trust Routing: Dynamically extracts routing context from request headers, paths, or hostnames to resolve the correct tenant_id and registered downstream server.
  • Fail-Closed Error Handlers: When the enforcement pipeline denies a request, the Data Plane short-circuits execution and returns a JSON-RPC 2.0 error response using codes from the implementation-defined server-error range (e.g., -32001 to -32004 and -32010), so denied requests never reach the downstream server.
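A denial response of the kind described in the Fail-Closed Error Handlers item could be built as in the sketch below. The envelope follows the JSON-RPC 2.0 error object (code, message, optional data), and -32000 to -32099 is the range that specification reserves for implementation-defined server errors. The function name deny_response is hypothetical, and since the source does not say what each Aegis code means, the sketch takes the code as a parameter rather than inventing a mapping.

```python
from typing import Any, Optional, Union


def deny_response(
    request_id: Union[int, str, None],
    code: int,
    message: str,
    data: Optional[Any] = None,
) -> dict:
    """Build a JSON-RPC 2.0 error response for a denied request."""
    # JSON-RPC 2.0 reserves -32000..-32099 for implementation-defined errors.
    if not -32099 <= code <= -32000:
        raise ValueError(f"code {code} is outside the server-error range")
    error = {"code": code, "message": message}
    if data is not None:
        error["data"] = data
    # The response echoes the request id so the client can correlate it.
    return {"jsonrpc": "2.0", "id": request_id, "error": error}
```

For example, deny_response(7, -32001, "denied by policy") yields an envelope with id 7 and no result member, which a standard MCP client treats as a failed call.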

Control Plane (Admin API)

The Control Plane exposes an administrative REST and gRPC API layer, typically running on port :8081 under the /api/v1/ path prefix:

  • Registry (/api/v1/servers): Complete CRUD operations for registering downstream MCP servers, tracking server metadata, target transport configurations, and quarantine states.
  • Policy (/api/v1/policies): Endpoints to author, version, and cryptographically sign policy bundles. It includes a simulation endpoint to dry-run candidate policies against historic telemetry to assess impacts before rolling out.
  • RBAC Rules (/api/v1/roles & /api/v1/bindings): Configures coarse-grained permissions mapping user identities to platform roles.
  • Audit Trail (/api/v1/audit): Allows searching and exporting the per-tenant audit chain, with endpoints to verify signature integrity and cryptographically prove the logs have not been tampered with.
  • Secret References (/api/v1/secrets/refs): Configures pointers to credentials stored in secret managers (e.g., Vault, AWS Secrets Manager) without exposing raw credentials.
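The tamper-evidence behind the Audit Trail endpoints rests on hash chaining: each record commits to the hash of the one before it. The Python sketch below shows the general technique only; the record layout, the all-zero genesis value, and the choice of SHA-256 are assumptions, and it omits the signatures the Audit Trail item also mentions.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # assumed starting value for an empty chain


def record_hash(prev_hash: str, payload: dict) -> str:
    # Canonical JSON (sorted keys, fixed separators) keeps the hash stable.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain: List[Dict], payload: dict) -> None:
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev": prev, "hash": record_hash(prev, payload)})


def verify(chain: List[Dict]) -> bool:
    """Recompute every link; any edited, removed, or reordered record breaks it."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev:
            return False
        if entry["hash"] != record_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True
```

Because each hash depends on all earlier records, changing one payload invalidates that record and every record after it, which is what a verification endpoint can report.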

Admin Dashboard & Analytics

Aegis co-deploys a self-contained administrative Single Page Application (SPA) on the Control Plane port (e.g., :8081) for real-time monitoring and analytics without external dependencies.

Core Dashboard Capabilities

  • Traffic & Latency Analytics: Real-time visualizations showing request volume, error rates, and p95/p99 latency trends across all managed servers.
  • Resilience Monitoring: Displays the current state (Open, Closed, or Half-Open) of active circuit-breakers managed by the Polly engine, segmented by tenant and server.
  • Policy Decisions: Tracks enforcement metrics, highlighting the ratio of allowed requests to policy-driven blocks and redacted/transformed arguments.
  • Policy Rollout Tracker: Monitors active Data Plane instances and displays the propagation status of compiled policy hot-swaps.
  • Drift & Quarantine Alerts: Surfaces active alarms when downstream MCP server schemas or capabilities drift from their registered baselines, prompting administrators to review quarantined servers.
  • Tamper-Evident Logs: Includes an interactive log explorer displaying the hash-chain audit trail and presenting instant cryptographic verification status.

The Enforcement Pipeline

Every MCP request flows through an ordered series of middleware stages, each implementing IEnforcementStage. If any stage fails, the pipeline short-circuits, returns an MCP protocol error, and writes the denial to the audit log:

  1. Authentication Stage: Intercepts OIDC, SAML, or service principal credentials, validates the signature, and populates the Principal context.
  2. Coarse Authorization (RBAC): Evaluates if the user's role permits calling the target MCP server or primitive (e.g., denying an auditor from triggering tools).
  3. Fine-grained Policy (Cedar Engine): Runs compiled Cedar policies against the request's arguments (e.g. checking SQL statements against allowed patterns).
  4. Rate Limiting: Queries a local cache and falls back to Redis to evaluate sliding-window quotas per user or tenant.
  5. Resilience Gate: Integrates with Polly to enforce timeouts, retry with backoff, and trip circuit breakers when downstream failures or latency climb.
  6. Request Transformation: Sanitizes payloads and applies argument constraints before dispatching the request.
  7. Response Filtering: Performs output validation (e.g. verifying the response conforms to the registered JSON schema).
  8. Audit Trail: Emits an asynchronous, hash-chained record containing transaction metadata to the audit queue.
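The short-circuit behaviour of the stages above can be sketched as a simple loop. This Python illustration is not the gateway's C# implementation: the Request fields, the two sample stages, and the auditor rule are hypothetical, and the audit trail is reduced to an in-memory list.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Request:
    tenant_id: str
    principal: Optional[str]
    method: str
    audit: List[str] = field(default_factory=list)


# A stage returns None to continue, or a denial reason to stop the pipeline.
Stage = Callable[[Request], Optional[str]]


def authenticate(req: Request) -> Optional[str]:
    return None if req.principal else "unauthenticated"


def authorize(req: Request) -> Optional[str]:
    # Hypothetical RBAC rule: auditors may not trigger tools.
    if req.principal == "auditor" and req.method == "tools/call":
        return "role not permitted"
    return None


def run_pipeline(req: Request, stages: List[Stage]) -> Optional[str]:
    for stage in stages:
        try:
            denial = stage(req)
        except Exception as exc:
            # Fail closed: a stage that errors counts as a denial.
            denial = f"stage error: {exc}"
        if denial is not None:
            req.audit.append(f"deny:{stage.__name__}:{denial}")
            return denial  # later stages never run
    req.audit.append("allow")
    return None
```

Treating an exception as a denial is what keeps the pipeline fail-closed: a policy that fails to evaluate blocks the request instead of letting it through.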

Ports & Adapters (Hexagonal Abstraction)

Portability is enforced at compile-time boundaries. The gateway core is kept free of cloud-provider dependencies by defining generic interfaces (Ports) and selecting the adapter for each one from environment configuration:

| Port Interface    | Responsibility                                   | Compose Adapter (P1) | Helm / K8s Adapter (P2) | AWS SaaS Adapter (P3)    |
|-------------------|--------------------------------------------------|----------------------|-------------------------|--------------------------|
| IStateStore       | Configuration metadata, registry lists, policies | PostgreSQL (EF Core) | PostgreSQL-HA           | Amazon Aurora PostgreSQL |
| IDistributedCache | Session states, rate-limit buckets               | In-memory / Redis    | Redis Cluster HA        | Amazon ElastiCache Redis |
| IEventBus         | Config updates, async audits, health events      | In-process Channel   | NATS / Redis Streams    | Amazon SNS & SQS         |
| IObjectStore      | Policy bundles, raw audit payloads               | Local Filesystem     | MinIO / S3 compat       | Amazon S3 (Object Lock)  |
| ISecretProvider   | Downstream credentials, rotation                 | Local Env / Vault    | HashiCorp Vault         | AWS Secrets Manager      |
| IAuditSink        | Audit index & search tracking                    | PostgreSQL FTS       | OpenSearch + MinIO      | OpenSearch + S3          |
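To make the port/adapter split concrete, the Python sketch below models one port (an object store) with two interchangeable adapters chosen from configuration. It is only an analogy for the C# interfaces named in the table: the method names put and get, the config keys, and the key layout in publish_bundle are assumptions.

```python
from pathlib import Path
from typing import Dict, Optional, Protocol


class ObjectStore(Protocol):
    """Port: the only surface core code is allowed to depend on."""
    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> Optional[bytes]: ...


class InMemoryObjectStore:
    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> Optional[bytes]:
        return self._blobs.get(key)


class LocalFilesystemObjectStore:
    def __init__(self, root: str) -> None:
        self._root = Path(root)
        self._root.mkdir(parents=True, exist_ok=True)

    def _path(self, key: str) -> Path:
        # Flatten the key so it always maps to a file directly under root.
        return self._root / key.replace("/", "_")

    def put(self, key: str, data: bytes) -> None:
        self._path(key).write_bytes(data)

    def get(self, key: str) -> Optional[bytes]:
        path = self._path(key)
        return path.read_bytes() if path.exists() else None


def build_object_store(config: dict) -> ObjectStore:
    """Pick the adapter from configuration (hypothetical config keys)."""
    kind = config.get("object_store", "memory")
    if kind == "memory":
        return InMemoryObjectStore()
    if kind == "filesystem":
        return LocalFilesystemObjectStore(config["root"])
    raise ValueError(f"unknown object store adapter: {kind}")


def publish_bundle(store: ObjectStore, tenant_id: str, version: int, bundle: bytes) -> str:
    # Core logic sees only the port; it never names a concrete backend.
    key = f"{tenant_id}/policies/{version}.bundle"
    store.put(key, bundle)
    return key
```

An S3 or MinIO adapter would be a third class with the same two methods, added without touching publish_bundle.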

Transport & Protocol Layer

The Mcp.Gateway.Protocol module handles Model Context Protocol (MCP) JSON-RPC 2.0 message semantics and transport-agnostic routing.

  • JSON-RPC 2.0 Engine: Fully conforms to the JSON-RPC 2.0 specification, parsing single and batched message envelopes.
  • Version Negotiation: Brokers the handshake between client and server during the initialize phase, gracefully handling or rejecting unsupported versions.
  • Capability Negotiation: Dynamically adjusts advertised client/server capabilities based on security policies (e.g., stripping sampling capabilities for untrusted workloads).
  • Primitive Router: Classifies and dispatches incoming messages to appropriate primitive handlers (tools/*, resources/*, prompts/*).

Importantly, this layer enforces bidirectional governance. Besides client-to-server operations (such as tool execution), it also intercepts server-to-client callbacks (such as sampling requests) on the return path, so both directions pass through the gateway's controls.
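Capability stripping, primitive routing, and the return-path check can be sketched together in a few lines. In this Python illustration the capability objects follow the shape MCP uses during initialize (a map from capability name to settings), and sampling/createMessage is the MCP method a server uses to request sampling; the function names and the idea of a per-workload denied set are assumptions.

```python
import copy
from typing import Any, Dict, Set

PRIMITIVES = {"tools", "resources", "prompts"}


def filter_capabilities(advertised: Dict[str, Any], denied: Set[str]) -> Dict[str, Any]:
    """Return a copy of the advertised capabilities without the denied ones."""
    return {
        name: copy.deepcopy(value)
        for name, value in advertised.items()
        if name not in denied
    }


def classify(method: str) -> str:
    """Map a JSON-RPC method such as "tools/call" to its primitive family."""
    family = method.split("/", 1)[0]
    return family if family in PRIMITIVES else "other"


def allow_callback(method: str, denied: Set[str]) -> bool:
    """Return-path check for server-to-client requests."""
    # A callback is blocked when its family was stripped during negotiation.
    return method.split("/", 1)[0] not in denied
```

Checking callbacks as well as stripping the capability covers a server that sends a sampling request even though the capability was never advertised to it.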

Proxy & Data Plane

The Mcp.Gateway.DataPlane module manages the request pipeline, transport session state, and downstream connector lifecycles. It keeps no durable state of its own (shared state lives behind the ports), so instances can be added or removed to scale horizontally.

  • Stateless Reverse Proxy: Powered by Microsoft’s high-performance YARP (Yet Another Reverse Proxy) engine to bridge and route traffic.
  • Transport Normalization: Unifies multiple incoming transport protocols—including Streamable HTTP, Server-Sent Events (SSE), WebSockets, and stdio managed subprocesses—into a standardized, internal McpSession context.
  • Resilience & Gatekeeping: Integrates with Polly to apply request timeouts, backoff retries, and active circuit-breaker limits.
  • Downstream Management: Manages lifecycle connections to registered servers, applies credential mapping (such as OAuth On-Behalf-Of or secret provider credential injection), and filters outgoing payloads.
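The circuit-breaker behaviour in the Resilience & Gatekeeping item follows the usual Closed / Open / Half-Open state machine. The Python sketch below illustrates that state machine in general; it is not Polly's API, the thresholds are arbitrary, and for brevity it lets any call through while half-open rather than limiting probes to one.

```python
import time
from typing import Callable


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        reset_after: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._threshold = failure_threshold
        self._reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at = 0.0
        self.state = "closed"

    def allow(self) -> bool:
        """Should the next downstream call be attempted?"""
        if self.state == "open":
            if self._clock() - self._opened_at >= self._reset_after:
                self.state = "half-open"  # cool-down elapsed: allow a probe
                return True
            return False
        return True

    def record_success(self) -> None:
        self._failures = 0
        self.state = "closed"

    def record_failure(self) -> None:
        self._failures += 1
        # A failed probe, or too many consecutive failures, opens the circuit.
        if self.state == "half-open" or self._failures >= self._threshold:
            self.state = "open"
            self._opened_at = self._clock()
```

The clock is injected so the transitions can be exercised in tests without sleeping.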

Policy Evaluator

The Mcp.Gateway.Policy module provides a fast, in-process authorization engine capable of fine-grained, argument-level validation.

  • Cedar Policy Engine: Embeds Cedar, the open-source authorization language developed at AWS. Cedar's forbid-overrides-permit model (any matching forbid wins over every permit) suits platform-level security invariants that tenant policies must not be able to override.
  • Argument-Level Enforcement: Inspects JSON-RPC request payloads to restrict actions based on runtime arguments (e.g., restricting database queries to specific schemas or blocking writes).
  • DSL Compilation: Compiles a human-readable YAML/JSON policy DSL into optimized Cedar policy structures.
  • Signed Hot-Swapping: Supports publishing cryptographically signed policy bundles from the control plane, which the data plane verifies and hot-swaps in-memory in under 30 seconds with zero request downtime.
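The forbid-overrides-permit semantics can be shown with a toy evaluator: a request is allowed only if at least one permit matches and no forbid matches, and everything else is denied by default. The Python sketch below is not Cedar syntax and not the Cedar engine; Rule, decide, and the sample rules (an allowed schema set and a write flag) are hypothetical stand-ins for the argument-level checks described above.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Rule:
    effect: str                      # "permit" or "forbid"
    matches: Callable[[dict], bool]  # predicate over the request


def decide(rules: List[Rule], request: dict) -> str:
    permitted = False
    for rule in rules:
        if not rule.matches(request):
            continue
        if rule.effect == "forbid":
            return "deny"  # a single matching forbid overrides every permit
        permitted = True
    # Default deny: no matching permit means the request is refused.
    return "allow" if permitted else "deny"


ALLOWED_SCHEMAS = {"analytics"}  # hypothetical platform invariant

RULES = [
    Rule("permit", lambda r: r["method"] == "tools/call"),
    Rule("forbid", lambda r: r["args"].get("schema") not in ALLOWED_SCHEMAS),
    Rule("forbid", lambda r: bool(r["args"].get("write", False))),
]
```

Because forbids win regardless of order, a platform team can ship invariants as forbid rules that no tenant-authored permit can weaken.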

Key Architectural Decisions

Several trade-offs were evaluated during the design of Aegis MCP Gateway:

  • Why Cedar instead of OPA (Rego)? Cedar was selected for its native forbid-overrides-permit precedence (a good fit for platform invariants), its fast embedded evaluation (under 5ms), and its support for formal verification and automated policy analysis.
  • Why YARP instead of Envoy? YARP was chosen because it allows deep, in-process C# integration with the protocol parser and Cedar engine, avoiding high-overhead sidecar serialization hops.
  • Why stdio support in a network proxy? Local developer loops and Kubernetes sidecars run MCP servers as local subprocesses. Aegis hosts a subprocess manager that bridges stdio pipes to remote TCP connections.