Architecture

Centter is a complete infrastructure layer for multi-agent systems. This page explains how the pieces fit together.

System Overview

┌─────────────┐     ┌──────────────────┐     ┌─────────────┐
│  Dashboard   │────▶│   Fastify API     │────▶│ PostgreSQL  │
│  (Astro)     │     │  (60+ endpoints)  │     │ (16 tables) │
└─────────────┘     └────────┬─────────┘     └─────────────┘
                             │
                      ┌──────┼───────┐
                      │      │       │
                ┌─────▼──┐ ┌─▼────┐ ┌▼────────┐
                │Route53 │ │EC2/  │ │  NATS    │
                │ (DNS)  │ │VPC   │ │  Hub     │
                └────────┘ └──────┘ └────┬────┘
                                         │
              Agents connect via ↕ WebSocket / TCP

      ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐
      │Linda │ │ Sol  │ │Andr. │ │ ...  │
      └──────┘ └──────┘ └──────┘ └──────┘

Core Stack

Layer	Technology	Details
Frontend	Astro 5.18 (SSG)	20+ pages, dark theme, vanilla JS
API	Fastify 5	60+ endpoints across 13 route files
Messaging	NATS 2.10	JetStream persistent messaging, WebSocket support
Database	PostgreSQL 16	16 tables, Docker-managed
SDK	@agentmesh/sdk	6 modules, single dependency (jose)
CLI	@agentmesh/cli	12 commands, zero dependencies
Infra	AWS (EC2, Route53, VPC)	ARM64 t4g.small, us-west-2

Agent Communication

Agents communicate using the A2A Protocol. Each agent has a unique identity within a network and communicates via its endpoint URL. The platform handles:

Identity — JWT tokens issued by the network authority (EC P-256 signing)
Discovery — agents find each other via HTTP API (GET /networks/:id/discover) or NATS request/reply (mesh.registry.discover)
Heartbeat — agents report status, location, skills, and tools every 5 minutes
Smart Routing — automatic best-path selection between agent pairs

Discovery

Agents discover each other through two channels:

HTTP API

GET /api/v1/networks/:id/discover — query agents by capability, status, transport, role, or skill name.

// Example: find online agents with "deploy" capability
GET /api/v1/networks/:id/discover?capability=deploy&status=online

// Response
{
  "agents": [
    { "id": "f254bbfa", "name": "Sol", "status": "online", "skills": ["coinsenda-devops"], "transport": "nats" }
  ],
  "query": { "capability": "deploy", "status": "online" },
  "total": 1
}
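As a rough illustration of how a client might assemble the query shown above, the helper below builds the discover URL from a filter object. The function name `buildDiscoverUrl`, the base URL, and the network id are placeholders; only the `capability` and `status` parameter names appear in the example above, so any other filter key is an assumption.

```javascript
// Hypothetical helper: builds a discover URL for one network.
// `baseUrl` and `networkId` values below are placeholders.
function buildDiscoverUrl(baseUrl, networkId, filters = {}) {
  const path = `/api/v1/networks/${encodeURIComponent(networkId)}/discover`;
  const url = new URL(path, baseUrl);
  for (const [key, value] of Object.entries(filters)) {
    // Skip unset filters so the query string stays minimal.
    if (value !== undefined && value !== null && value !== '') {
      url.searchParams.set(key, String(value));
    }
  }
  return url.toString();
}

const discoverUrl = buildDiscoverUrl('https://api.example.com', 'net-1', {
  capability: 'deploy',
  status: 'online',
  role: undefined, // unset filters are dropped
});
console.log(discoverUrl);
// https://api.example.com/api/v1/networks/net-1/discover?capability=deploy&status=online
```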

NATS-Native Discovery

Agents can discover each other directly via NATS request/reply on mesh.registry.discover, without going through HTTP:

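// Agent-side discovery (from mesh-agent.mjs)
// Connection setup is assumed here: the `nats` npm package, the hub address
// used by VPC agents, and authentication options omitted.
import { connect, StringCodec } from 'nats';

const nc = await connect({ servers: 'nats://hub:4222' });
const sc = StringCodec();

const response = await nc.request('mesh.registry.discover',
  sc.encode(JSON.stringify({ capability: 'deploy', status: 'online' })),
  { timeout: 5000 } // milliseconds
);
const result = JSON.parse(sc.decode(response.data));
// result.agents holds the matching agents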

The registry is held in memory and is rebuilt from incoming heartbeats after a server restart. Discovery results are scoped to the requesting agent's network.
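A minimal sketch of such a registry is shown below, assuming a nested map keyed by network and then by agent id. The class name, the record shape, and the `capabilities` field are all hypothetical; the real server may store and match agents differently.

```javascript
// Hypothetical in-memory registry: networkId -> Map(agentId -> record).
class AgentRegistry {
  constructor() {
    this.networks = new Map();
  }

  // Called for every heartbeat; after a restart the map simply refills.
  recordHeartbeat(networkId, agent) {
    if (!this.networks.has(networkId)) this.networks.set(networkId, new Map());
    this.networks.get(networkId).set(agent.id, { ...agent, lastSeen: Date.now() });
  }

  // Only agents in the requester's own network are ever considered.
  discover(networkId, { capability, status } = {}) {
    const agents = [...(this.networks.get(networkId)?.values() ?? [])];
    return agents.filter((a) =>
      (!status || a.status === status) &&
      (!capability || (a.capabilities ?? []).includes(capability)));
  }
}

const registry = new AgentRegistry();
registry.recordHeartbeat('net-a', { id: 'f254bbfa', name: 'Sol', status: 'online', capabilities: ['deploy'] });
registry.recordHeartbeat('net-b', { id: 'agent-2', name: 'Other', status: 'online', capabilities: ['deploy'] });

const found = registry.discover('net-a', { capability: 'deploy', status: 'online' });
console.log(found.map((a) => a.name)); // [ 'Sol' ]
```

Because nothing is persisted, an agent only reappears in discovery results once its next heartbeat arrives after a restart.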

Transport

As of v2.0.0, NATS is the primary transport for agent communication, and all 7 current agents use it exclusively. The legacy VPC, Tailscale, and public HTTPS modes are preserved only as deprecated fallbacks for non-NATS agents.

Transport	Priority	Status	Auth
NATS (JetStream)	-1 (highest)	Primary — all 7 agents	NATS token
VPC Internal	0	Deprecated fallback	IP trust (no JWT)
Tailscale	1	Deprecated fallback	IP trust (no JWT)
Public HTTPS	2	Deprecated fallback	Bearer JWT required

Smart Routing

The route calculator determines the best path between every pair of agents in a network. Routes are prioritized by speed and security:

NATS (priority -1) — both agents on NATS. Server-side relay via JetStream. No JWT needed. Highest priority.
VPC Internal (priority 0) — same VPC, private IP, no JWT needed. Fastest direct connection.
Tailscale (priority 1) — both agents on the same Tailnet. Encrypted WireGuard tunnel, no JWT needed.
Public HTTPS (priority 2) — public endpoint with JWT authentication. Fallback for agents without private connectivity.

Routes auto-recalculate when agents update their location via heartbeat or when transport mode changes. The Connections view shows an ego-centric graph of each agent's routes to its peers.
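The priority rules above can be sketched as a small selection function. This is an illustration under assumptions, not the actual route calculator: the agent fields (`transport`, `vpcId`, `privateIp`, `tailnet`, `publicUrl`) and the route labels are hypothetical.

```javascript
// Priorities as listed above; the lowest number wins.
const TRANSPORT_PRIORITY = { nats: -1, vpc: 0, tailscale: 1, https: 2 };

// Collects every route type that is possible from agent `a` to agent `b`.
function candidateRoutes(a, b) {
  const routes = [];
  if (a.transport === 'nats' && b.transport === 'nats') routes.push('nats');
  if (a.vpcId && a.vpcId === b.vpcId && b.privateIp) routes.push('vpc');
  if (a.tailnet && a.tailnet === b.tailnet) routes.push('tailscale');
  if (b.publicUrl) routes.push('https');
  return routes;
}

// Picks the candidate with the lowest priority number, or null if none exist.
function bestRoute(a, b) {
  const routes = candidateRoutes(a, b);
  if (routes.length === 0) return null;
  return routes.reduce((best, r) =>
    (TRANSPORT_PRIORITY[r] < TRANSPORT_PRIORITY[best] ? r : best));
}

const linda = { transport: 'nats', vpcId: 'vpc-1', privateIp: '10.10.1.5' };
const sol = { transport: 'nats', vpcId: 'vpc-1', privateIp: '10.10.1.6' };
const legacyA = { transport: 'http', tailnet: 'tailnet-1' };
const legacyB = { transport: 'http', tailnet: 'tailnet-1', publicUrl: 'https://b.example.com' };

console.log(bestRoute(linda, sol));       // nats
console.log(bestRoute(legacyA, legacyB)); // tailscale
```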

Dynamic DNS

Every agent gets a stable subdomain: {agent-name}.{network}.mesh.coinsenda.ai. DNS is managed via AWS Route53:

Auto-update — DNS records update on heartbeat when the agent's IP changes
TTL 60s — fast propagation for dynamic IPs
Private IP exclusion — 10.x, 172.16-31.x, 192.168.x, 100.x IPs are not published to DNS
Manual override — set IP via API for agents behind NAT
Graceful errors — Route53 failures never crash heartbeat
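The exclusion and naming rules above might look roughly like the following. The function names and the returned record shape are hypothetical, the check handles IPv4 only, and the actual Route53 call is left out.

```javascript
// Mirrors the exclusion list above: 10.x, 172.16-31.x, 192.168.x, 100.x (IPv4 only).
function isPrivateIp(ip) {
  const [a, b] = ip.split('.').map(Number);
  return a === 10 || a === 100 ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168);
}

// Hostname pattern from the text: {agent-name}.{network}.mesh.coinsenda.ai
function agentHostname(agentName, network) {
  return `${agentName}.${network}.mesh.coinsenda.ai`;
}

// Returns a hypothetical record description, or null when nothing should be published.
function dnsUpdateFor(agentName, network, newIp, currentIp) {
  if (isPrivateIp(newIp) || newIp === currentIp) return null;
  return { name: agentHostname(agentName, network), type: 'A', ttl: 60, value: newIp };
}

console.log(dnsUpdateFor('sol', 'acme', '54.12.3.4', '54.12.3.9'));
console.log(dnsUpdateFor('sol', 'acme', '10.10.0.5', '54.12.3.9')); // null (private IP)
```

To match the "graceful errors" rule, a caller would wrap the Route53 update in a try/catch so that a DNS failure is logged rather than propagated into heartbeat handling.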

NATS Hub

NATS provides the real-time messaging backbone for agent-to-agent communication. It runs alongside Fastify in the same Docker container.

Agent (behind NAT)                    Agent (VPC)
    │                                     │
    │ wss://nats.mesh...                  │ nats://hub:4222
    ▼                                     ▼
┌──────────────────────────────────────────┐
│              NATS Hub                    │
│  ┌────────────┐  ┌────────────────────┐  │
│  │ NATS Core  │  │    JetStream       │  │
│  │ (pub/sub)  │  │ (persistent msgs)  │  │
│  └────────────┘  └────────────────────┘  │
│  Streams:                                │
│  • AGENT_MESSAGES — workqueue, 24h TTL   │
│  • AGENT_EVENTS  — limits, 7-day TTL    │
└──────────────────────────────────────────┘

Transport

TCP (port 4222) — direct connection for agents in VPC or on Tailscale
WebSocket (port 4443) — for agents behind NAT, proxied via Nginx with TLS

Messaging Patterns

NATS Core — heartbeat, presence, ephemeral pub/sub
JetStream — persistent message delivery with acknowledgment
AGENT_MESSAGES — per-agent inbox (mesh.agent.*.inbox), workqueue retention ensures each message is consumed once
AGENT_EVENTS — broadcast events (mesh.event.>), retained 7 days for replay
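For reference, the two streams can be described as plain configuration objects in the shape of JetStream's stream-config JSON, where `retention` is a string and `max_age` is expressed in nanoseconds. This is a sketch built from the values above, not the project's actual setup code; the helper names and the event-name suffix are hypothetical.

```javascript
const NANOS_PER_SECOND = 1_000_000_000;
const hours = (h) => h * 3600 * NANOS_PER_SECOND;

// Stream definitions using the names, subjects, and TTLs listed above.
const STREAMS = [
  { name: 'AGENT_MESSAGES', subjects: ['mesh.agent.*.inbox'], retention: 'workqueue', max_age: hours(24) },
  { name: 'AGENT_EVENTS', subjects: ['mesh.event.>'], retention: 'limits', max_age: hours(24 * 7) },
];

// Subject helpers; the agent id and event name are caller-supplied tokens.
const inboxSubject = (agentId) => `mesh.agent.${agentId}.inbox`;
const eventSubject = (name) => `mesh.event.${name}`;

console.log(inboxSubject('f254bbfa')); // mesh.agent.f254bbfa.inbox
console.log(STREAMS.map((s) => `${s.name}: ${s.retention}`));
```

Objects of this shape could then be handed to a JetStream management client when the hub starts; with workqueue retention, a message is removed from the stream once a consumer acknowledges it.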

Migration Complete (v2.0.0)

All 7 agents now run NATS-only. HTTP a2a-server.mjs has been renamed to .legacy on all agents. The API server acts as a message relay: POST /agents/:id/message accepts messages via HTTP and delivers them to the target agent's NATS inbox.

NATS — real-time, low-latency, persistent delivery via JetStream, discovery via registry
HTTP heartbeat — deprecated, kept for backward compatibility with non-NATS agents
Tailscale/VPC routing — deprecated fallback, retained in route calculator
DNS — optional, only for vanity domains (not needed with NATS)

Presence Tracking

Real-time presence is tracked in-memory from NATS heartbeats. The GET /networks/:id/presence endpoint returns live status without hitting the database. Presence is rebuilt automatically from heartbeats after a server restart.

Online — heartbeat received within 10 minutes
Degraded — last heartbeat 10-30 minutes ago
Offline — no heartbeat for 30+ minutes, or disconnect event received
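The thresholds above translate into a small classifier. The function name and signature are hypothetical, and the handling of the exact 10-minute and 30-minute boundaries is an assumption, since the text does not say which side they fall on.

```javascript
const MINUTE_MS = 60_000;

// Classifies an agent from its last heartbeat timestamp (milliseconds).
// Assumption: exactly 10 minutes counts as degraded, exactly 30 as offline.
function classifyPresence(lastHeartbeatMs, nowMs, disconnected = false) {
  if (disconnected || lastHeartbeatMs == null) return 'offline';
  const age = nowMs - lastHeartbeatMs;
  if (age < 10 * MINUTE_MS) return 'online';
  if (age < 30 * MINUTE_MS) return 'degraded';
  return 'offline';
}

const now = Date.now();
console.log(classifyPresence(now - 4 * MINUTE_MS, now));       // online
console.log(classifyPresence(now - 15 * MINUTE_MS, now));      // degraded
console.log(classifyPresence(now - 45 * MINUTE_MS, now));      // offline
console.log(classifyPresence(now - 1 * MINUTE_MS, now, true)); // offline (disconnect event)
```

With a 5-minute heartbeat interval, an agent can miss one heartbeat and still be reported as online.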

JWT + Accounts (Multi-Tenant Auth)

NATS JWT + Accounts provides per-agent credentials with role-based permissions (v2.1.0). Centter is the Operator. Each Network becomes a NATS Account. Each Agent becomes a NATS User with embedded pub/sub permissions.

Operator Key (Centter — one master key, server-side)
├── Account (Network "Acme Corp")
│   ├── User JWT (Linda — coordinator: pub mesh.event.>, mesh.registry.>)
│   ├── User JWT (Sol — assistant: pub mesh.agent.*.inbox only)
│   └── User JWT (Atlas — developer: pub mesh.event.infra.>, deploy.>)
└── Account (Network "Beta Inc")
    └── ...  (fully isolated — cannot see Acme's messages)

Permissions are role-based: coordinator, developer, analyst, support, assistant. Each role maps to specific NATS subject patterns. Accounts and users are managed via the nsc CLI (installed in the Docker image). Credentials are .creds files containing both the JWT and the NKey seed; like API keys, they are returned only once via the API.
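To reason about which subjects a role may publish to, it helps to see how NATS wildcards match: `*` matches exactly one token and `>` matches one or more trailing tokens. The sketch below applies that rule to the publish patterns shown in the tree above. The NATS server enforces permissions itself, so this is only an illustration; the function names are hypothetical, and roles whose patterns are not listed in this document are left out.

```javascript
// NATS subject matching: `*` = exactly one token, `>` = one or more trailing tokens.
function subjectMatches(pattern, subject) {
  const p = pattern.split('.');
  const s = subject.split('.');
  for (let i = 0; i < p.length; i++) {
    if (p[i] === '>') return i === p.length - 1 && s.length > i;
    if (i >= s.length) return false;
    if (p[i] !== '*' && p[i] !== s[i]) return false;
  }
  return p.length === s.length;
}

// Publish patterns copied from the tree above.
const ROLE_PUBLISH = {
  coordinator: ['mesh.event.>', 'mesh.registry.>'],
  assistant: ['mesh.agent.*.inbox'],
  developer: ['mesh.event.infra.>', 'deploy.>'],
};

const canPublish = (role, subject) =>
  (ROLE_PUBLISH[role] ?? []).some((pattern) => subjectMatches(pattern, subject));

console.log(canPublish('assistant', 'mesh.agent.f254bbfa.inbox')); // true
console.log(canPublish('assistant', 'mesh.event.infra.restart'));  // false
console.log(canPublish('developer', 'mesh.event.infra.restart'));  // true
```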

Non-Fatal Init

NATS initialization is non-fatal — if NATS is unavailable, the API continues serving all HTTP endpoints. This allows gradual migration from HTTP-only to NATS-backed communication. Health check at GET /api/v1/nats/health reports connection status.
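The non-fatal pattern can be sketched as a wrapper that catches connection errors instead of letting them abort startup. Here `connectFn` stands in for whatever NATS connect call the server actually uses, and the shape of the health payload is an assumption.

```javascript
// Attempts to connect; on failure, logs and reports a disconnected state
// rather than throwing, so HTTP endpoints can still be served.
async function initNats(connectFn, log = console) {
  try {
    const connection = await connectFn();
    return { connected: true, connection };
  } catch (err) {
    log.warn(`NATS unavailable, continuing in HTTP-only mode: ${err.message}`);
    return { connected: false, connection: null };
  }
}

// Hypothetical payload for a health endpoint such as GET /api/v1/nats/health.
const healthPayload = (state) => ({ nats: state.connected ? 'connected' : 'disconnected' });

// Simulate an unreachable hub with a connect function that always rejects.
const failingConnect = async () => { throw new Error('connection refused'); };

initNats(failingConnect).then((state) => {
  console.log(healthPayload(state)); // { nats: 'disconnected' }
});
```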

Permissions Model

Centter uses OAuth-style scopes with role-based auto-grant:

Roles & Auto-Grant

Six role templates define default permissions. When an agent is assigned to a team with a role, scopes are automatically granted:

Role	Default Scopes
assistant	`skill:execute:*, skill:read:*`
sales	`skill:execute:*, skill:read:*, newsletter:send`
support	`skill:execute:*, skill:read:*`
developer	`skill:execute:*, skill:read:*, skill:write:*, infra:*`
analyst	`skill:read:*`
coordinator	`skill:execute:*, skill:read:*, skill:admin:*`

Permission Flow

Agent is assigned to a team with a role
Role's default scopes are auto-granted as permissions
Manual overrides can add or revoke individual scopes
Token issuance filters requested scopes against effective permissions
Authority validates target agent has the requested skill before issuing JWT
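The first four steps of this flow can be sketched as follows. The code assumes that the role defaults in the table end in a `*` wildcard, that a wildcard scope covers any scope sharing its prefix, and that a revoke removes an exact scope string; the function names and the skill name in the example are illustrative only.

```javascript
// Role defaults from the table above, assuming trailing `*` wildcards.
const ROLE_SCOPES = {
  assistant: ['skill:execute:*', 'skill:read:*'],
  sales: ['skill:execute:*', 'skill:read:*', 'newsletter:send'],
  support: ['skill:execute:*', 'skill:read:*'],
  developer: ['skill:execute:*', 'skill:read:*', 'skill:write:*', 'infra:*'],
  analyst: ['skill:read:*'],
  coordinator: ['skill:execute:*', 'skill:read:*', 'skill:admin:*'],
};

// A granted scope covers a request if equal, or if it ends in `*` and the prefix matches.
function scopeCovers(granted, requested) {
  if (granted === requested) return true;
  return granted.endsWith('*') && requested.startsWith(granted.slice(0, -1));
}

// Role defaults plus manual grants, minus manual revokes (exact match).
function effectiveScopes(role, { added = [], revoked = [] } = {}) {
  const scopes = new Set([...(ROLE_SCOPES[role] ?? []), ...added]);
  for (const scope of revoked) scopes.delete(scope);
  return [...scopes];
}

// Token issuance keeps only the requested scopes that are actually covered.
function filterRequested(requested, effective) {
  return requested.filter((r) => effective.some((g) => scopeCovers(g, r)));
}

const requested = ['skill:read:coinsenda-devops', 'skill:execute:coinsenda-devops'];
console.log(filterRequested(requested, effectiveScopes('analyst')));
// [ 'skill:read:coinsenda-devops' ]
console.log(filterRequested(requested, effectiveScopes('analyst', { added: ['skill:execute:coinsenda-devops'] })));
// [ 'skill:read:coinsenda-devops', 'skill:execute:coinsenda-devops' ]
```

The final step, checking that the target agent actually has the requested skill, would run after this filtering and is not shown.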

Current limitation: Permissions are per-skill, not per-command. Future releases will add per-command granularity via capabilities.yaml.

Skills System

Skills are capabilities that agents can offer and consume:

Auto-discovery — agents report installed skills via heartbeat (builtin_skills and custom_skills)
Marketplace — browse, install, review, and publish skills
Auto-grant — installing a skill auto-grants the required permission scopes
Builtin vs Custom — builtin skills are framework-provided; custom skills are user-defined
Versioning — publishers can release new versions with changelogs

Future: per-command permissions via capabilities.yaml, allowing skills to declare individual commands with separate permission requirements.

Provisioning

Team agents can be deployed to AWS EC2 instances:

Instance type — t4g.small (ARM64) in us-west-2
Golden AMI — Ubuntu 24.04 + Node 22 + pre-configured agent runtime
UserData — initialization script configures the agent on first boot
VPC — shared VPC (10.10.0.0/16) with 2 subnets and Internet Gateway
Plan limits — Starter: 2 agents, Team: 5, Business: 15, Enterprise: unlimited
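A provisioning request would presumably be checked against these plan limits before an instance is launched. The sketch below shows one way to express that check; the lowercase plan keys and the function name are assumptions.

```javascript
// Agent limits per plan, as listed above.
const PLAN_LIMITS = { starter: 2, team: 5, business: 15, enterprise: Infinity };

// Returns true if the team may provision one more agent under its plan.
function canProvision(plan, currentAgentCount) {
  const limit = PLAN_LIMITS[plan];
  if (limit === undefined) throw new Error(`Unknown plan: ${plan}`);
  return currentAgentCount < limit;
}

console.log(canProvision('starter', 1));      // true
console.log(canProvision('starter', 2));      // false
console.log(canProvision('enterprise', 500)); // true
```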

Security

Mechanism	Details
Token signing	ES256 (EC P-256) for inter-agent tokens (via `jose`), HS256 for dashboard auth
Key discovery	JWKS endpoint at `/.well-known/jwks.json`
Key rotation	API endpoint to rotate authority keys
User auth	bcrypt password hashing, 7-day JWT sessions
Agent auth	API key (`amk_` prefix, SHA-256 hashed) for token issuance; `X-Agent-Key` header for heartbeat
Identity binding	Tokens bound to subject, audience, and scopes
Audit trail	All permission grants/revokes logged with IP and timestamp
Scope filtering	Token issuance filters scopes against granted permissions

API Key & Token Flow

Agent Created → api_key returned (amk_..., shown ONCE)
                ↓
Agent stores api_key → SHA-256 hash stored in DB
                ↓
Agent requests token → POST /networks/:id/auth/token { agent_id, api_key, scopes }
                ↓
Authority verifies hash → issues ES256 JWT (1h TTL)
                ↓
Agent sends request → Authorization: Bearer <JWT>
                ↓
Receiving agent verifies JWT → JWKS endpoint → validates iss, aud, scopes
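The last step of the flow, validating claims on the receiving side, might look like the sketch below. It assumes the signature has already been verified against the JWKS endpoint (for example with `jose`) and only inspects the decoded payload. The `scopes` claim name, the issuer value, and the agent ids are assumptions; `iss`, `aud`, and `exp` are the standard JWT claims.

```javascript
// Validates an already signature-verified JWT payload.
function validateClaims(payload, { issuer, audience, requiredScope, nowSeconds }) {
  if (payload.iss !== issuer) return { ok: false, reason: 'bad issuer' };
  const audiences = Array.isArray(payload.aud) ? payload.aud : [payload.aud];
  if (!audiences.includes(audience)) return { ok: false, reason: 'bad audience' };
  if (typeof payload.exp !== 'number' || payload.exp <= nowSeconds) {
    return { ok: false, reason: 'expired' };
  }
  if (!(payload.scopes ?? []).includes(requiredScope)) {
    return { ok: false, reason: 'missing scope' };
  }
  return { ok: true };
}

const issuedAt = 1_700_000_000;
const payload = {
  iss: 'https://authority.example.com', // placeholder issuer
  sub: 'agent-linda',
  aud: 'agent-sol',
  scopes: ['skill:execute:coinsenda-devops'],
  exp: issuedAt + 3600, // 1h TTL, as in the flow above
};
const expected = {
  issuer: 'https://authority.example.com',
  audience: 'agent-sol',
  requiredScope: 'skill:execute:coinsenda-devops',
};

console.log(validateClaims(payload, { ...expected, nowSeconds: issuedAt + 60 }));   // { ok: true }
console.log(validateClaims(payload, { ...expected, nowSeconds: issuedAt + 7200 })); // { ok: false, reason: 'expired' }
```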

Database Schema

16 tables organized by domain:

Domain	Tables
Identity	`users`, `authority_keys`
Network	`networks`, `agents`, `agent_routes`
Teams	`teams`, `team_agents`, `usage_daily`
Permissions	`permissions`, `role_scopes`
Marketplace	`publishers`, `skills`, `skill_versions`, `agent_skills`, `skill_reviews`
Audit	`audit_log`