Technical Architecture
Aegis is designed around one primary go-to-market (GTM) constraint: self-hosted first, multi-tenant SaaS second. Large enterprise clients require data sovereignty and in-VPC container deployments before they will expose internal resources to LLMs. Consequently, the architecture separates core logic from cloud infrastructure.
Core Principles
- Stateless Data Plane: The routing hot path holds no in-memory state other than what it reads from an in-process cache, which lets the gateway scale horizontally by adding instances.
- Hexagonal Portability: No AWS, Azure, or GCP SDK classes are referenced in core engine assemblies. Interfaces (Ports) abstract all state, storage, and eventing.
- Fail Closed: If security checks, authentication queries, or policy rules fail to compile or evaluate, the request is denied by default.
- Tenant-Awareness: Every entity, database table row, trace element, and cache key includes a `tenant_id` from day one, making multi-tenancy a config choice rather than a rewrite.
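The fail-closed and tenant-awareness principles can be illustrated with a minimal sketch. It is written in Python for brevity rather than the gateway's own language, and the names `Decision`, `cache_key`, and `evaluate_fail_closed`, as well as the key layout, are assumptions made for illustration, not part of Aegis.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str


def cache_key(tenant_id: str, kind: str, name: str) -> str:
    """Build a cache key that always carries the tenant (hypothetical layout)."""
    if not tenant_id:
        raise ValueError("tenant_id is required")
    return f"{tenant_id}:{kind}:{name}"


def evaluate_fail_closed(check: Callable[[], bool]) -> Decision:
    """Run a security check; anything other than a clean True is a denial."""
    try:
        ok = check() is True
    except Exception as exc:  # evaluation error: deny, never allow
        return Decision(False, f"check failed: {type(exc).__name__}")
    if ok:
        return Decision(True, "allowed")
    return Decision(False, "denied by check")
```

The important property is that the exception path and the "returned something odd" path both end in a denial, so a broken check can never widen access.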
Control Plane vs. Data Plane
Aegis divides responsibilities between two logical systems:
- The Data Plane (Stateless Gateway Proxy): Built on top of Microsoft's YARP (Yet Another Reverse Proxy) engine. It terminates the transport (stdio, WebSocket, SSE, HTTP), performs version and capability negotiations, executes authentication, and routes requests to the policy pipeline.
- The Control Plane (Administration & Registry): Exposes REST/gRPC endpoints to manage server registrations, track policy revisions, run policy simulations, monitor server health, check drift patterns, and process audit chains.
Configuration Rollouts
When an admin publishes a new policy or registers a server, the Control Plane compiles the bundle, signs it with a cryptographic key, and uploads it to the Object Store (e.g. S3 or MinIO). It then triggers a notification via the IEventBus. The Data Plane instances pull the bundle, verify the signature, and hot-swap their in-memory snapshots in under 30 seconds across the fleet.
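The verify-then-swap step might look roughly like the following Python sketch. It uses HMAC-SHA256 from the standard library as a stand-in for the signature scheme, which is an assumption: the text only says the bundle is signed with a cryptographic key, and an asymmetric signature would let Data Plane instances hold only a public key. `sign_bundle` and `PolicySnapshot` are hypothetical names.

```python
import hashlib
import hmac
import json
import threading


def sign_bundle(bundle: dict, key: bytes) -> dict:
    """Serialize a bundle canonically and attach an HMAC-SHA256 signature."""
    payload = json.dumps(bundle, sort_keys=True, separators=(",", ":"))
    signature = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": signature}


class PolicySnapshot:
    """Holds the active bundle; a failed verification leaves it untouched."""

    def __init__(self, initial: dict):
        self._lock = threading.Lock()
        self._active = initial

    def current(self) -> dict:
        # Readers take a single reference, so they never see a partial bundle.
        return self._active

    def try_swap(self, envelope: dict, key: bytes) -> bool:
        payload = envelope["payload"].encode()
        expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, envelope.get("signature", "")):
            return False  # reject and keep serving the previous snapshot
        new_bundle = json.loads(payload)
        with self._lock:  # serialize concurrent writers
            self._active = new_bundle
        return True
```

Parsing the payload only after the signature check means an unverified bundle is never interpreted, and replacing one reference keeps in-flight requests on a consistent snapshot.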
API Definitions
Aegis separates concerns by exposing distinct APIs for the runtime data plane and the administrative control plane:
Data Plane (MCP Core)
The Data Plane exposes native Model Context Protocol (MCP) routing endpoints. It serves as a transparent, governed gateway for standard MCP-compatible clients:
- Native Transports: Exposes standardized connection entrypoints across all supported transports (SSE, WebSockets, Streamable HTTP, and stdio managed subprocess sidecars).
- Zero-Trust Routing: Dynamically extracts routing context from request headers, paths, or hostnames to resolve the correct `tenant_id` and registered downstream server.
- Fail-Closed Error Handlers: When the enforcement pipeline denies a request, the Data Plane short-circuits execution and returns standard MCP JSON-RPC 2.0 error codes (e.g., codes `-32001` to `-32004` and `-32010`), avoiding unneeded downstream traffic.
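A denial response of this kind can be sketched as below, in Python for illustration. The codes quoted above fall inside the range that JSON-RPC 2.0 reserves for implementation-defined server errors (`-32000` to `-32099`), so the sketch validates against that range. The helper name `deny_response` and the example message are assumptions; the text does not say which denial reason each code represents, so no mapping is shown.

```python
import json
from typing import Any, Optional, Union

# JSON-RPC 2.0 reserves -32000..-32099 for implementation-defined server errors.
SERVER_ERROR_RANGE = range(-32099, -31999)


def deny_response(
    request_id: Union[int, str, None],
    code: int,
    message: str,
    data: Optional[Any] = None,
) -> str:
    """Build a JSON-RPC 2.0 error response without contacting the downstream server."""
    if code not in SERVER_ERROR_RANGE:
        raise ValueError(f"{code} is outside the server-error range")
    error = {"code": code, "message": message}
    if data is not None:
        error["data"] = data
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "error": error})
```

For example, `deny_response(7, -32001, "request denied by gateway")` yields an error envelope that echoes the request `id`, so the client can correlate the denial with its original call.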
Control Plane (Admin API)
The Control Plane exposes an administrative REST and gRPC API layer, typically running on port :8081 under the /api/v1/ path prefix:
- Registry (`/api/v1/servers`): Complete CRUD operations for registering downstream MCP servers, tracking server metadata, target transport configurations, and quarantine states.
- Policy (`/api/v1/policies`): Endpoints to author, version, and cryptographically sign policy bundles. It includes a simulation endpoint to dry-run candidate policies against historic telemetry to assess impacts before rolling out.
- RBAC Rules (`/api/v1/roles` & `/api/v1/bindings`): Configures coarse-grained permissions mapping user identities to platform roles.
- Audit Trail (`/api/v1/audit`): Allows searching and exporting the per-tenant audit chain, with endpoints to verify signature integrity and cryptographically prove the logs have not been tampered with.
- Secret References (`/api/v1/secrets/refs`): Configures pointers to credentials stored in secret managers (e.g., Vault, AWS Secrets Manager) without exposing raw credentials.
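The tamper-evidence behind the audit endpoints rests on hash chaining, which can be sketched as follows in Python. This is an illustration under assumptions: the record layout, the SHA-256 choice, and the names `append` and `verify` are hypothetical, and the sketch covers only the chain, not the signatures the Audit Trail endpoints also verify.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # hypothetical hash used before the first record


def record_hash(prev_hash: str, entry: Dict) -> str:
    """Hash an entry together with the hash of the record before it."""
    body = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode()).hexdigest()


def append(chain: List[Dict], entry: Dict) -> None:
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"entry": entry, "prev": prev, "hash": record_hash(prev, entry)})


def verify(chain: List[Dict]) -> int:
    """Return the index of the first broken record, or -1 if the chain is intact."""
    prev = GENESIS
    for index, record in enumerate(chain):
        if record["prev"] != prev or record["hash"] != record_hash(prev, record["entry"]):
            return index
        prev = record["hash"]
    return -1
```

Because each hash covers the previous one, editing or deleting any record invalidates every record after it, which is what makes the log tamper-evident rather than merely append-only.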
Admin Dashboard & Analytics
Aegis co-deploys a self-contained administrative Single Page Application (SPA) on the Control Plane port (e.g., :8081) for real-time monitoring and analytics without external dependencies.
Core Dashboard Capabilities
- Traffic & Latency Analytics: Real-time visualizations showing request volume, error rates, and p95/p99 latency trends across all managed servers.
- Resilience Monitoring: Displays the current state (Open, Closed, or Half-Open) of active circuit-breakers managed by the Polly engine, segmented by tenant and server.
- Policy Decisions: Tracks enforcement metrics, highlighting the ratio of allowed requests to policy-driven blocks and redacted/transformed arguments.
- Policy Rollout Tracker: Monitors active Data Plane instances and displays the propagation status of compiled policy hot-swaps.
- Drift & Quarantine Alerts: Surfaces active alarms when downstream MCP server schemas or capabilities drift from their registered baselines, prompting administrators to review quarantined servers.
- Tamper-Evident Logs: Includes an interactive log explorer displaying the hash-chain audit trail and presenting instant cryptographic verification status.
The Enforcement Pipeline
Every MCP request flows through a series of middleware stages, each implementing `IEnforcementStage`. If any stage fails, the pipeline short-circuits, returns an MCP protocol error, and writes the request to the audit log:
- Authentication Stage: Intercepts OIDC, SAML, or service principal credentials, validates the signature, and populates the `Principal` context.
- Coarse Authorization (RBAC): Evaluates whether the user's role permits calling the target MCP server or primitive (e.g., preventing an auditor from triggering tools).
- Fine-grained Policy (Cedar Engine): Runs compiled Cedar policies against request argument parameters (e.g. validating SQL statements against regex patterns).
- Rate Limiting: Queries a local cache and falls back to Redis to evaluate sliding-window quotas per user or tenant.
- Resilience Gate: Integrates with Polly to enforce timeouts, retry with backoff, and trip circuit breakers if downstream latency climbs.
- Request Transformation: Sanitizes payloads and applies argument constraints before dispatching the request.
- Response Filtering: Performs output validation (e.g. verifying the response conforms to the registered JSON schema).
- Audit Trail: Emits an asynchronous, hash-chained record containing transaction metadata to the audit queue.
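The short-circuit behaviour of the stages above can be sketched in a few lines of Python. This is a simplified model, not the `IEnforcementStage` contract itself: the `Request` fields, the stage functions, and the in-memory `audit` list are assumptions, and only the first two stages are modelled.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Request:
    tenant_id: str
    method: str
    role: str
    audit: List[str] = field(default_factory=list)


# A stage returns None to continue, or a denial reason to stop the pipeline.
Stage = Callable[[Request], Optional[str]]


def authenticate(req: Request) -> Optional[str]:
    return None if req.role else "unauthenticated"


def rbac(req: Request) -> Optional[str]:
    if req.role == "auditor" and req.method == "tools/call":
        return "role may not call tools"
    return None


def run_pipeline(req: Request, stages: List[Stage]) -> Optional[str]:
    """Run stages in order; the first denial (or stage error) ends the request."""
    for stage in stages:
        try:
            reason = stage(req)
        except Exception:
            reason = "stage error"  # fail closed on unexpected errors
        if reason is not None:
            req.audit.append(f"deny:{stage.__name__}:{reason}")
            return reason
    req.audit.append("allow")
    return None
```

Running `run_pipeline(Request("acme", "tools/call", "auditor"), [authenticate, rbac])` stops at the RBAC stage and records the denial, so later stages and the downstream server never see the request.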
Ports & Adapters (Hexagonal Abstraction)
Portability is enforced at compile time. The gateway core is kept free of cloud-provider dependencies by defining generic interfaces (Ports) and selecting the adapter for each one from the environment configuration:
| Port Interface | Responsibility | Compose Adapter (P1) | Helm / K8s Adapter (P2) | AWS SaaS Adapter (P3) |
|---|---|---|---|---|
| `IStateStore` | Configuration metadata, registry lists, policies | PostgreSQL (EF Core) | PostgreSQL-HA | Amazon Aurora PostgreSQL |
| `IDistributedCache` | Session states, rate-limit buckets | In-memory / Redis | Redis Cluster HA | Amazon ElastiCache Redis |
| `IEventBus` | Config updates, async audits, health events | In-process Channel | NATS / Redis Streams | Amazon SNS & SQS |
| `IObjectStore` | Policy bundles, raw audit payloads | Local Filesystem | MinIO / S3 compat | Amazon S3 (Object Lock) |
| `ISecretProvider` | Downstream credentials, rotation | Local Env / Vault | HashiCorp Vault | AWS Secrets Manager |
| `IAuditSink` | Audit index & search tracking | PostgreSQL FTS | OpenSearch + MinIO | OpenSearch + S3 |
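The port-and-adapter split can be illustrated with a Python sketch of an object-store port. It is an analogy for the `IObjectStore` row only: the method names, the `object_store` and `root` configuration keys, and both adapters are assumptions, and no cloud adapter is shown.

```python
import os
from typing import Callable, Dict, Optional, Protocol


class ObjectStore(Protocol):
    """The port: core code depends only on this shape."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> Optional[bytes]: ...


class InMemoryObjectStore:
    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> Optional[bytes]:
        return self._blobs.get(key)


class LocalFileObjectStore:
    def __init__(self, root: str) -> None:
        self._root = root

    def _path(self, key: str) -> str:
        # Flatten the key so every object lives directly under the root.
        return os.path.join(self._root, key.replace("/", "_"))

    def put(self, key: str, data: bytes) -> None:
        with open(self._path(key), "wb") as handle:
            handle.write(data)

    def get(self, key: str) -> Optional[bytes]:
        try:
            with open(self._path(key), "rb") as handle:
                return handle.read()
        except FileNotFoundError:
            return None


ADAPTERS: Dict[str, Callable[[dict], ObjectStore]] = {
    "memory": lambda cfg: InMemoryObjectStore(),
    "filesystem": lambda cfg: LocalFileObjectStore(cfg["root"]),
}


def build_object_store(cfg: dict) -> ObjectStore:
    """Pick the adapter from configuration; unknown names are rejected."""
    kind = cfg.get("object_store")
    if kind not in ADAPTERS:
        raise ValueError(f"unknown object store adapter: {kind!r}")
    return ADAPTERS[kind](cfg)


def publish_bundle(store: ObjectStore, tenant_id: str, revision: int, bundle: bytes) -> str:
    """Core logic: written once, unaware of which adapter it is using."""
    key = f"{tenant_id}/policies/{revision}"
    store.put(key, bundle)
    return key
```

`publish_bundle` never changes when the deployment target does; only the configuration passed to `build_object_store` differs between environments.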
Transport & Protocol Layer
The Mcp.Gateway.Protocol module handles Model Context Protocol (MCP) JSON-RPC 2.0 message semantics and transport-agnostic routing.
- JSON-RPC 2.0 Engine: Fully conforms to the JSON-RPC 2.0 specification, parsing single and batched message envelopes.
- Version Negotiation: Brokers the handshake between client and server during the `initialize` phase, gracefully handling or rejecting unsupported versions.
- Capability Negotiation: Dynamically adjusts advertised client/server capabilities based on security policies (e.g., stripping `sampling` capabilities for untrusted workloads).
- Primitive Router: Classifies and dispatches incoming messages to appropriate primitive handlers (`tools/*`, `resources/*`, `prompts/*`).
Importantly, this layer enforces bidirectional governance. While it processes client-to-server operations (such as tool execution), it also intercepts server-to-client callbacks (such as sampling requests) on the return path, so both directions stay under runtime control.
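Two of the behaviours described in this section, primitive classification and capability stripping, are small enough to sketch directly. The Python below is illustrative only: `classify` and `filter_capabilities` are hypothetical helpers, and the boolean `trusted` flag stands in for whatever security policy the gateway actually consults.

```python
from typing import Optional

PRIMITIVE_PREFIXES = ("tools", "resources", "prompts")


def classify(method: str) -> Optional[str]:
    """Map a method such as "tools/call" to its primitive family, if any."""
    prefix, separator, _rest = method.partition("/")
    if separator and prefix in PRIMITIVE_PREFIXES:
        return prefix
    return None  # lifecycle or unknown methods are handled elsewhere


def filter_capabilities(advertised: dict, trusted: bool) -> dict:
    """Return the capabilities to forward, dropping sampling when untrusted."""
    allowed = dict(advertised)  # copy so the caller's dict is not mutated
    if not trusted:
        allowed.pop("sampling", None)
    return allowed
```

With these helpers, `classify("resources/read")` selects the resources handler, while `filter_capabilities({"sampling": {}, "roots": {}}, trusted=False)` forwards only `roots`, so the downstream server never learns that sampling was on offer.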
Proxy & Data Plane
The Mcp.Gateway.DataPlane module manages the request pipeline, transport session states, and downstream connector lifecycles. It is designed to be stateless so that it scales horizontally by adding instances.
- Stateless Reverse Proxy: Powered by Microsoft’s high-performance YARP (Yet Another Reverse Proxy) engine to bridge and route traffic.
- Transport Normalization: Unifies multiple incoming transport protocols (including Streamable HTTP, Server-Sent Events (SSE), WebSockets, and stdio managed subprocesses) into a standardized, internal `McpSession` context.
- Resilience & Gatekeeping: Integrates with Polly to apply request timeouts, backoff retries, and active circuit-breaker limits.
- Downstream Management: Manages lifecycle connections to registered servers, applies credential mapping (such as OAuth On-Behalf-Of or secret provider credential injection), and filters outgoing payloads.
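The circuit-breaker behaviour mentioned under Resilience & Gatekeeping follows the usual Closed, Open, Half-Open state machine. The Python sketch below shows that state machine in its simplest form; it is not Polly's API, and the thresholds, the injected clock, and the method names are assumptions chosen for illustration.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        break_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._threshold = failure_threshold
        self._break_seconds = break_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None
        self.state = "closed"

    def allow_request(self) -> bool:
        if self.state == "open":
            if self._clock() - self._opened_at >= self._break_seconds:
                # Break period elapsed: start probing the downstream server.
                # Simplification: every caller is let through while half-open.
                self.state = "half-open"
                return True
            return False
        return True

    def record_success(self) -> None:
        self._failures = 0
        self.state = "closed"

    def record_failure(self) -> None:
        self._failures += 1
        # A failed probe reopens immediately; otherwise wait for the threshold.
        if self.state == "half-open" or self._failures >= self._threshold:
            self.state = "open"
            self._opened_at = self._clock()
```

Injecting the clock keeps the breaker deterministic under test; a per-tenant, per-server dictionary of such breakers would match the segmentation the dashboard section describes.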
Policy Evaluator
The Mcp.Gateway.Policy module provides a fast, in-process authorization engine capable of fine-grained, argument-level validation.
- Cedar Policy Engine: Implements Amazon's Cedar authorization language. Cedar’s `forbid-overrides-permit` model is ideal for enforcing unbreakable platform-level security invariants.
- Argument-Level Enforcement: Inspects JSON-RPC request payloads to restrict actions based on runtime arguments (e.g., restricting database queries to specific schemas or blocking writes).
- DSL Compilation: Compiles a human-readable YAML/JSON policy DSL into optimized Cedar policy structures.
- Signed Hot-Swapping: Supports publishing cryptographically signed policy bundles from the control plane, which the data plane verifies and hot-swaps in-memory in under 30 seconds with zero request downtime.
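The forbid-overrides-permit rule, combined with default deny, can be modelled in a few lines. The Python below is a simplified model of the decision rule only; it is not Cedar syntax or the Cedar SDK, and the `Policy` type, the schema allow-list, and the example policies are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Policy:
    effect: str  # "permit" or "forbid"
    matches: Callable[[dict], bool]


def decide(policies: List[Policy], request: dict) -> str:
    """Deny if any forbid matches; allow only if a permit matches; else deny."""
    matched = [p.effect for p in policies if p.matches(request)]
    if "forbid" in matched:
        return "deny"
    if "permit" in matched:
        return "allow"
    return "deny"  # nothing permitted the request


ALLOWED_SCHEMAS = {"analytics"}  # hypothetical argument-level constraint

policies = [
    Policy("permit", lambda r: r["method"] == "tools/call"),
    Policy("forbid", lambda r: r["arguments"].get("schema") not in ALLOWED_SCHEMAS),
]
```

A `tools/call` against the `analytics` schema is allowed, the same call against any other schema is denied even though the permit matches, and a method with no matching permit is denied by default. That ordering is what lets a platform-level forbid act as an invariant that tenant-level permits cannot override.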
Key Architectural Decisions
Several trade-offs were evaluated during the design of Aegis MCP Gateway:
- Why Cedar instead of OPA (Rego)? Cedar was selected for its native `forbid`-overrides-`permit` precedence rules (well suited to platform invariants), its fast embedded evaluation (under 5ms), and its support for formal verification.
- Why YARP instead of Envoy? YARP was chosen because it allows deep, in-process C# integration with the protocol parser and Cedar engine, avoiding high-overhead sidecar serialization hops.
- Why stdio support in a network proxy? Local developer loops and Kubernetes sidecars run MCP servers as local subprocesses. Aegis hosts a subprocess manager that bridges stdio pipes to remote TCP connections.
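The subprocess half of that bridge can be sketched in Python. This shows only the stdio side, assuming newline-delimited JSON-RPC messages on the child's stdin and stdout; the network side of the bridge is omitted, and `StdioServer` plus the echoing child script are hypothetical stand-ins for a real MCP server.

```python
import json
import subprocess
import sys

# Stand-in child process: echoes each request's method back as the result.
CHILD = r'''
import json, sys
for line in iter(sys.stdin.readline, ""):
    msg = json.loads(line)
    reply = {"jsonrpc": "2.0", "id": msg["id"], "result": {"echo": msg["method"]}}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
'''


class StdioServer:
    """Owns a subprocess and exchanges one JSON message per line with it."""

    def __init__(self, argv):
        self._proc = subprocess.Popen(
            argv,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line-buffered pipes
        )

    def request(self, message: dict) -> dict:
        self._proc.stdin.write(json.dumps(message) + "\n")
        self._proc.stdin.flush()
        return json.loads(self._proc.stdout.readline())

    def close(self) -> None:
        self._proc.stdin.close()  # EOF tells the child to exit
        self._proc.wait(timeout=5)
        self._proc.stdout.close()
```

A gateway-side handler would call `request` with each message it receives from the network transport and relay the reply, which is how a local subprocess can be exposed to remote clients without the subprocess itself opening a socket.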