Architecture

Centter is a complete infrastructure layer for multi-agent systems. This page explains how the pieces fit together.

System Overview

┌─────────────┐     ┌──────────────────┐     ┌─────────────┐
│  Dashboard   │────▶│   Fastify API     │────▶│ PostgreSQL  │
│  (Astro)     │     │  (60+ endpoints)  │     │ (16 tables) │
└─────────────┘     └────────┬─────────┘     └─────────────┘
                             │
                      ┌──────┼───────┐
                      │      │       │
                ┌─────▼──┐ ┌─▼────┐ ┌▼────────┐
                │Route53 │ │EC2/  │ │  NATS    │
                │ (DNS)  │ │VPC   │ │  Hub     │
                └────────┘ └──────┘ └────┬────┘
                                         │
              Agents connect via ↕ WebSocket / TCP

      ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐
      │Linda │ │ Sol  │ │Andr. │ │ ...  │
      └──────┘ └──────┘ └──────┘ └──────┘

Core Stack

Layer     | Technology              | Details
Frontend  | Astro 5.18 (SSG)        | 20+ pages, dark theme, vanilla JS
API       | Fastify 5               | 60+ endpoints across 13 route files
Messaging | NATS 2.10               | JetStream persistent messaging, WebSocket support
Database  | PostgreSQL 16           | 16 tables, Docker-managed
SDK       | @agentmesh/sdk          | 6 modules, single dependency (jose)
CLI       | @agentmesh/cli          | 12 commands, zero dependencies
Infra     | AWS (EC2, Route53, VPC) | ARM64 t4g.small, us-west-2

Agent Communication

Agents communicate using the A2A Protocol. Each agent has a unique identity within a network and communicates via its endpoint URL. The platform handles:

  • Identity — JWT tokens issued by the network authority (EC P-256 signing)
  • Discovery — agents find each other via HTTP API (GET /networks/:id/discover) or NATS request/reply (mesh.registry.discover)
  • Heartbeat — agents report status, location, skills, and tools every 5 minutes
  • Smart Routing — automatic best-path selection between agent pairs

Discovery

Agents discover each other through two channels:

HTTP API

GET /api/v1/networks/:id/discover — query agents by capability, status, transport, role, or skill name.

// Example: find online agents with "deploy" capability
GET /api/v1/networks/:id/discover?capability=deploy&status=online

// Response
{
  "agents": [
    { "id": "f254bbfa", "name": "Sol", "status": "online", "skills": ["coinsenda-devops"], "transport": "nats" }
  ],
  "query": { "capability": "deploy", "status": "online" },
  "total": 1
}
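
As a rough illustration of how a client might assemble this request, the sketch below builds the discovery URL from a filter object. The helper name buildDiscoverUrl, the base URL, and the network id are assumptions made for the example; only the capability and status parameter names appear in the request above.

```javascript
// Hypothetical helper: builds a discovery URL from a filter object.
// The function name, base URL, and network id are illustrative assumptions.
function buildDiscoverUrl(baseUrl, networkId, filters = {}) {
  const url = new URL(`/api/v1/networks/${encodeURIComponent(networkId)}/discover`, baseUrl);
  for (const [key, value] of Object.entries(filters)) {
    // Skip unset filters so they do not show up as empty query parameters
    if (value !== undefined && value !== null && value !== '') {
      url.searchParams.set(key, String(value));
    }
  }
  return url.toString();
}

const discoverUrl = buildDiscoverUrl('https://api.example.com', 'net-123', {
  capability: 'deploy',
  status: 'online',
  role: undefined, // unset filters are dropped
});
console.log(discoverUrl);
// https://api.example.com/api/v1/networks/net-123/discover?capability=deploy&status=online
```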

NATS-Native Discovery

Agents can discover each other directly via NATS request/reply on mesh.registry.discover, without going through HTTP:

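// Agent-side discovery (from mesh-agent.mjs)
// Connection setup added for completeness; the server URL and NATS_TOKEN variable are assumptions.
import { connect, StringCodec } from 'nats';

const nc = await connect({ servers: 'nats://hub:4222', token: process.env.NATS_TOKEN });
const sc = StringCodec();

// Request/reply: the registry answers with the agents matching the filter
const response = await nc.request('mesh.registry.discover',
  sc.encode(JSON.stringify({ capability: 'deploy', status: 'online' })),
  { timeout: 5000 }
);
const result = JSON.parse(sc.decode(response.data));
// result.agents holds the matching agents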

The registry is in-memory, rebuilt from heartbeats on server restart. Discovery results are scoped to the requesting agent's network.
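
A minimal sketch of such a registry, assuming a map keyed by network and then by agent id. The class name, method names, and the capabilities field are hypothetical and not taken from the Centter source; the point is only to show how heartbeats refill the structure and how lookups stay inside one network.

```javascript
// Sketch of an in-memory registry keyed by network, then by agent id.
// Class, method, and field names are hypothetical.
class AgentRegistry {
  constructor() {
    this.networks = new Map(); // networkId -> Map(agentId -> agent record)
  }

  // Called for every heartbeat; after a restart the same calls refill the registry.
  recordHeartbeat(networkId, agent) {
    if (!this.networks.has(networkId)) this.networks.set(networkId, new Map());
    this.networks.get(networkId).set(agent.id, { ...agent, lastSeen: Date.now() });
  }

  // Only agents in the requester's own network are considered.
  discover(networkId, { capability, status } = {}) {
    const agents = [...(this.networks.get(networkId)?.values() ?? [])];
    return agents.filter((a) =>
      (!capability || (a.capabilities ?? []).includes(capability)) &&
      (!status || a.status === status));
  }
}

const registry = new AgentRegistry();
registry.recordHeartbeat('net-a', { id: 'f254bbfa', name: 'Sol', status: 'online', capabilities: ['deploy'] });
registry.recordHeartbeat('net-b', { id: '9c1d', name: 'Other', status: 'online', capabilities: ['deploy'] });

const found = registry.discover('net-a', { capability: 'deploy', status: 'online' });
console.log(found.map((a) => a.name)); // [ 'Sol' ]
```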

Transport

As of v2.0.0, NATS is the primary transport for agent communication, and all 7 agents use it exclusively. Legacy HTTP and Tailscale modes are retained only as deprecated fallbacks for non-NATS agents.

Transport        | Priority     | Status                 | Auth
NATS (JetStream) | -1 (highest) | Primary (all 7 agents) | NATS token
VPC Internal     | 0            | Deprecated fallback    | IP trust (no JWT)
Tailscale        | 1            | Deprecated fallback    | IP trust (no JWT)
Public HTTPS     | 2            | Deprecated fallback    | Bearer JWT required
Smart Routing

The route calculator determines the best path between every pair of agents in a network. Routes are prioritized by speed and security:

  1. NATS (priority -1) — both agents on NATS. Server-side relay via JetStream. No JWT needed. Highest priority.
  2. VPC Internal (priority 0) — same VPC, private IP, no JWT needed. Fastest direct connection.
  3. Tailscale (priority 1) — both agents on the same Tailnet. Encrypted WireGuard tunnel, no JWT needed.
  4. Public HTTPS (priority 2) — public endpoint with JWT authentication. Fallback for agents without private connectivity.

Routes auto-recalculate when agents update their location via heartbeat or when transport mode changes. The Connections view shows an ego-centric graph of each agent's routes to its peers.
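
The priority order above can be sketched as a small selection function. This is an illustration under stated assumptions, not the actual route calculator: the agent fields (nats, vpcId, tailnet, publicUrl) and the function names are hypothetical.

```javascript
// Priorities as listed above; the lowest number wins.
const TRANSPORT_PRIORITY = { nats: -1, vpc: 0, tailscale: 1, https: 2 };

// Returns the transports agent a could use to reach agent b.
// The agent fields used here are hypothetical.
function sharedTransports(a, b) {
  const shared = [];
  if (a.nats && b.nats) shared.push('nats');
  if (a.vpcId && a.vpcId === b.vpcId) shared.push('vpc');
  if (a.tailnet && a.tailnet === b.tailnet) shared.push('tailscale');
  if (b.publicUrl) shared.push('https'); // a reaches b through b's public endpoint
  return shared;
}

function bestRoute(a, b) {
  const candidates = sharedTransports(a, b);
  if (candidates.length === 0) return null; // no usable path
  candidates.sort((x, y) => TRANSPORT_PRIORITY[x] - TRANSPORT_PRIORITY[y]);
  const transport = candidates[0];
  return {
    from: a.id,
    to: b.id,
    transport,
    priority: TRANSPORT_PRIORITY[transport],
    requiresJwt: transport === 'https', // only the public HTTPS path needs a JWT
  };
}

const linda = { id: 'linda', nats: true, vpcId: 'vpc-1' };
const sol = { id: 'sol', nats: true, vpcId: 'vpc-1', publicUrl: 'https://sol.example.com' };
console.log(bestRoute(linda, sol).transport); // nats
```

Recalculating on heartbeat would then amount to calling bestRoute again for each pair that involves the agent whose location or transport changed.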

Dynamic DNS

Every agent gets a stable subdomain: {agent-name}.{network}.mesh.coinsenda.ai. DNS is managed via AWS Route53:

  • Auto-update — DNS records update on heartbeat when the agent's IP changes
  • TTL 60s — fast propagation for dynamic IPs
  • Private IP exclusion — 10.x, 172.16-31.x, 192.168.x, 100.x IPs are not published to DNS
  • Manual override — set IP via API for agents behind NAT
  • Graceful errors — Route53 failures never crash heartbeat
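
The exclusion and naming rules above can be sketched as follows. The function names and the shape of the returned record are assumptions for illustration; the actual Route53 call is omitted, and the sketch treats anything that is not a valid IPv4 address as non-publishable.

```javascript
// Returns true when an IPv4 address falls in one of the excluded ranges listed above.
function isExcludedFromDns(ip) {
  const parts = ip.split('.').map(Number);
  if (parts.length !== 4 || parts.some((n) => !Number.isInteger(n) || n < 0 || n > 255)) {
    return true; // not a valid IPv4 address: never publish (assumption)
  }
  const [a, b] = parts;
  return a === 10 || (a === 172 && b >= 16 && b <= 31) || (a === 192 && b === 168) || a === 100;
}

// Stable subdomain pattern: {agent-name}.{network}.mesh.coinsenda.ai
function agentHostname(agentName, network) {
  return `${agentName}.${network}.mesh.coinsenda.ai`;
}

// Decides whether a heartbeat should trigger a DNS update.
// Returns null when nothing should be published; the record shape is hypothetical.
function planDnsUpdate(agent, reportedIp) {
  if (isExcludedFromDns(reportedIp) || reportedIp === agent.currentIp) return null;
  return { name: agentHostname(agent.name, agent.network), type: 'A', ttl: 60, value: reportedIp };
}

const solAgent = { name: 'sol', network: 'acme', currentIp: '203.0.113.1' };
console.log(planDnsUpdate(solAgent, '203.0.113.7'));
console.log(planDnsUpdate(solAgent, '10.10.1.5')); // null (private address)
```

To match the "graceful errors" rule, the caller would wrap the Route53 update in a try/catch and log the failure instead of rejecting the heartbeat.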

NATS Hub

NATS provides the real-time messaging backbone for agent-to-agent communication. It runs alongside Fastify in the same Docker container.

Agent (behind NAT)                    Agent (VPC)
    │                                     │
    │ wss://nats.mesh...                  │ nats://hub:4222
    ▼                                     ▼
┌──────────────────────────────────────────┐
│              NATS Hub                    │
│  ┌────────────┐  ┌────────────────────┐  │
│  │ NATS Core  │  │    JetStream       │  │
│  │ (pub/sub)  │  │ (persistent msgs)  │  │
│  └────────────┘  └────────────────────┘  │
│  Streams:                                │
│  • AGENT_MESSAGES — workqueue, 24h TTL   │
│  • AGENT_EVENTS  — limits, 7-day TTL    │
└──────────────────────────────────────────┘

Transport

  • TCP (port 4222) — direct connection for agents in VPC or on Tailscale
  • WebSocket (port 4443) — for agents behind NAT, proxied via Nginx with TLS

Messaging Patterns

  • NATS Core — heartbeat, presence, ephemeral pub/sub
  • JetStream — persistent message delivery with acknowledgment
  • AGENT_MESSAGES — per-agent inbox (mesh.agent.*.inbox), workqueue retention ensures each message is consumed once
  • AGENT_EVENTS — broadcast events (mesh.event.>), retained 7 days for replay
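
For orientation, the two streams could be described with configuration objects like the ones below. The field names (name, subjects, retention, max_age) follow the JetStream stream configuration, where max_age is expressed in nanoseconds; the exact settings used by Centter beyond retention policy and TTL are not documented here, and the helper functions are hypothetical.

```javascript
const HOUR_NS = 60 * 60 * 1e9; // JetStream expresses max_age in nanoseconds

// Stream definitions mirroring the retention and TTL values described above
const STREAMS = [
  { name: 'AGENT_MESSAGES', subjects: ['mesh.agent.*.inbox'], retention: 'workqueue', max_age: 24 * HOUR_NS },
  { name: 'AGENT_EVENTS', subjects: ['mesh.event.>'], retention: 'limits', max_age: 7 * 24 * HOUR_NS },
];

// Hypothetical helpers for building subjects that land in each stream
function inboxSubject(agentId) {
  // A subject token must not contain whitespace, dots, or wildcards
  if (/[\s.*>]/.test(agentId)) throw new Error(`invalid agent id for a subject token: ${agentId}`);
  return `mesh.agent.${agentId}.inbox`;
}

function eventSubject(topic) {
  return `mesh.event.${topic}`;
}

console.log(inboxSubject('f254bbfa'));       // mesh.agent.f254bbfa.inbox
console.log(eventSubject('infra.deployed')); // mesh.event.infra.deployed
```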

Migration Complete (v2.0.0)

All 7 agents now run NATS-only. The HTTP server script a2a-server.mjs has been renamed to .legacy on all agents. The server acts as a message relay: POST /agents/:id/message accepts messages via HTTP and delivers them to the target agent's NATS inbox.

  • NATS — real-time, low-latency, persistent delivery via JetStream, discovery via registry
  • HTTP heartbeat — deprecated, kept for backward compatibility with non-NATS agents
  • Tailscale/VPC routing — deprecated fallback, retained in route calculator
  • DNS — optional, only for vanity domains (not needed with NATS)

Presence Tracking

Real-time presence is tracked in-memory from NATS heartbeats. The GET /networks/:id/presence endpoint returns live status without hitting the database. Presence is rebuilt automatically from heartbeats after a server restart.

  • Online — heartbeat received within 10 minutes
  • Degraded — last heartbeat 10-30 minutes ago
  • Offline — no heartbeat for 30+ minutes, or disconnect event received
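
The thresholds above translate into a small classification function. This is a sketch: the function name and parameters are hypothetical, and treating a heartbeat that is exactly 10 minutes old as degraded is an assumption, since the boundaries are not spelled out.

```javascript
const MINUTE_MS = 60 * 1000;

// Classifies an agent from the time of its last heartbeat (epoch milliseconds).
function presenceStatus(lastHeartbeatAt, now = Date.now(), disconnected = false) {
  // A disconnect event or a missing heartbeat always means offline
  if (disconnected || lastHeartbeatAt == null) return 'offline';
  const age = now - lastHeartbeatAt;
  if (age < 10 * MINUTE_MS) return 'online';
  if (age < 30 * MINUTE_MS) return 'degraded';
  return 'offline';
}

const nowMs = Date.now();
console.log(presenceStatus(nowMs - 2 * MINUTE_MS, nowMs));       // online
console.log(presenceStatus(nowMs - 15 * MINUTE_MS, nowMs));      // degraded
console.log(presenceStatus(nowMs - 45 * MINUTE_MS, nowMs));      // offline
console.log(presenceStatus(nowMs - 2 * MINUTE_MS, nowMs, true)); // offline
```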

JWT + Accounts (Multi-Tenant Auth)

NATS JWT + Accounts provides per-agent credentials with role-based permissions (v2.1.0). Centter is the Operator. Each Network becomes a NATS Account. Each Agent becomes a NATS User with embedded pub/sub permissions.

Operator Key (Centter — one master key, server-side)
├── Account (Network "Acme Corp")
│   ├── User JWT (Linda — coordinator: pub mesh.event.>, mesh.registry.>)
│   ├── User JWT (Sol — assistant: pub mesh.agent.*.inbox only)
│   └── User JWT (Atlas — developer: pub mesh.event.infra.>, deploy.>)
└── Account (Network "Beta Inc")
    └── ...  (fully isolated — cannot see Acme's messages)

Permissions are role-based: coordinator, developer, analyst, support, assistant. Each role maps to specific NATS subject patterns. Accounts and users are managed via the nsc CLI (installed in the Docker image). Credentials are .creds files containing both the JWT and the NKey seed; like API keys, they are returned only once via the API.
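
To make the subject patterns concrete, the sketch below checks a subject against NATS wildcard rules: * matches exactly one token and > matches one or more trailing tokens. In a real deployment the NATS server enforces the permissions embedded in each user JWT; this code only illustrates what the patterns allow. The role map repeats the three examples from the tree above, and the function names are hypothetical.

```javascript
// NATS subject matching: '*' matches one token, '>' matches one or more trailing tokens.
function subjectMatches(pattern, subject) {
  const p = pattern.split('.');
  const s = subject.split('.');
  for (let i = 0; i < p.length; i++) {
    if (p[i] === '>') return i === p.length - 1 && s.length > i; // '>' must be the last token
    if (i >= s.length) return false;
    if (p[i] !== '*' && p[i] !== s[i]) return false;
  }
  return p.length === s.length;
}

// Publish permissions for the three roles shown in the tree above
const ROLE_PUBLISH = {
  coordinator: ['mesh.event.>', 'mesh.registry.>'],
  assistant: ['mesh.agent.*.inbox'],
  developer: ['mesh.event.infra.>', 'deploy.>'],
};

function canPublish(role, subject) {
  return (ROLE_PUBLISH[role] ?? []).some((pattern) => subjectMatches(pattern, subject));
}

console.log(canPublish('assistant', 'mesh.agent.sol.inbox'));     // true
console.log(canPublish('assistant', 'mesh.event.deploy'));        // false
console.log(canPublish('developer', 'mesh.event.infra.restart')); // true
```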

Non-Fatal Init

NATS initialization is non-fatal — if NATS is unavailable, the API continues serving all HTTP endpoints. This allows gradual migration from HTTP-only to NATS-backed communication. Health check at GET /api/v1/nats/health reports connection status.
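
One way to express this pattern is to wrap the connection attempt so that a failure is recorded instead of thrown. The sketch below assumes a connectFn parameter standing in for the real NATS connect call, and the shape of the health payload is hypothetical.

```javascript
// connectFn is any async function that resolves to a connection (hypothetical parameter).
async function initNats(connectFn, log = console) {
  try {
    const nc = await connectFn();
    return { connected: true, nc, error: null };
  } catch (err) {
    // Record the failure and keep going: HTTP endpoints stay available
    log.warn(`NATS unavailable, continuing with HTTP only: ${err.message}`);
    return { connected: false, nc: null, error: err.message };
  }
}

// Payload a health endpoint could return; the field names are assumptions.
function natsHealth(state) {
  return { status: state.connected ? 'connected' : 'disconnected', error: state.error };
}

// Simulate an unreachable NATS server
initNats(async () => { throw new Error('connection refused'); })
  .then((state) => console.log(natsHealth(state)));
```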

Permissions Model

Centter uses OAuth-style scopes with role-based auto-grant:

Roles & Auto-Grant

Six role templates define default permissions. When an agent is assigned to a team with a role, scopes are automatically granted:

Role        | Default Scopes
assistant   | skill:execute:*, skill:read:*
sales       | skill:execute:*, skill:read:*, newsletter:send
support     | skill:execute:*, skill:read:*
developer   | skill:execute:*, skill:read:*, skill:write:*, infra:*
analyst     | skill:read:*
coordinator | skill:execute:*, skill:read:*, skill:admin:*

Permission Flow

  1. Agent is assigned to a team with a role
  2. Role's default scopes are auto-granted as permissions
  3. Manual overrides can add or revoke individual scopes
  4. Token issuance filters requested scopes against effective permissions
  5. Authority validates target agent has the requested skill before issuing JWT
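
Steps 2 to 4 of this flow can be sketched as follows. The role table is the one listed above; the wildcard semantics (a trailing * covers any suffix), the exact-match revocation, and all function names are assumptions for illustration.

```javascript
// Default scopes per role, as listed in the table above
const ROLE_SCOPES = {
  assistant: ['skill:execute:*', 'skill:read:*'],
  sales: ['skill:execute:*', 'skill:read:*', 'newsletter:send'],
  support: ['skill:execute:*', 'skill:read:*'],
  developer: ['skill:execute:*', 'skill:read:*', 'skill:write:*', 'infra:*'],
  analyst: ['skill:read:*'],
  coordinator: ['skill:execute:*', 'skill:read:*', 'skill:admin:*'],
};

// Steps 2 and 3: role defaults plus manual grants, minus manual revocations
function effectiveScopes(role, { granted = [], revoked = [] } = {}) {
  const scopes = new Set([...(ROLE_SCOPES[role] ?? []), ...granted]);
  for (const scope of revoked) scopes.delete(scope);
  return [...scopes];
}

// Assumed wildcard rule: 'skill:read:*' covers 'skill:read:<anything>'
function scopeAllows(grantedScope, requestedScope) {
  if (grantedScope === requestedScope) return true;
  return grantedScope.endsWith(':*') && requestedScope.startsWith(grantedScope.slice(0, -1));
}

// Step 4: keep only the requested scopes that the effective permissions cover
function filterRequestedScopes(requested, effective) {
  return requested.filter((r) => effective.some((g) => scopeAllows(g, r)));
}

const analystScopes = effectiveScopes('analyst', { granted: ['newsletter:send'] });
const issued = filterRequestedScopes(
  ['skill:read:coinsenda-devops', 'skill:execute:coinsenda-devops', 'newsletter:send'],
  analystScopes
);
console.log(issued); // [ 'skill:read:coinsenda-devops', 'newsletter:send' ]
```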

Current limitation: Permissions are per-skill, not per-command. Future releases will add per-command granularity via capabilities.yaml.

Skills System

Skills are capabilities that agents can offer and consume:

  • Auto-discovery — agents report installed skills via heartbeat (builtin_skills and custom_skills)
  • Marketplace — browse, install, review, and publish skills
  • Auto-grant — installing a skill auto-grants the required permission scopes
  • Builtin vs Custom — builtin skills are framework-provided; custom skills are user-defined
  • Versioning — publishers can release new versions with changelogs

Future: per-command permissions via capabilities.yaml, allowing skills to declare individual commands with separate permission requirements.

Provisioning

Team agents can be deployed to AWS EC2 instances:

  • Instance type — t4g.small (ARM64) in us-west-2
  • Golden AMI — Ubuntu 24.04 + Node 22 + pre-configured agent runtime
  • UserData — initialization script configures the agent on first boot
  • VPC — shared VPC (10.10.0.0/16) with 2 subnets and Internet Gateway
  • Plan limits — Starter: 2 agents, Team: 5, Business: 15, Enterprise: unlimited
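
A provisioning request would typically be checked against the plan limit before an instance is launched. The sketch below encodes the limits listed above; the function name and lower-case plan keys are assumptions.

```javascript
// Agent limits per plan, as listed above
const PLAN_LIMITS = { starter: 2, team: 5, business: 15, enterprise: Infinity };

// Returns true when the team may provision one more agent.
function canProvisionAgent(plan, currentAgentCount) {
  const limit = PLAN_LIMITS[plan];
  if (limit === undefined) throw new Error(`unknown plan: ${plan}`);
  return currentAgentCount < limit;
}

console.log(canProvisionAgent('starter', 1));      // true
console.log(canProvisionAgent('starter', 2));      // false
console.log(canProvisionAgent('enterprise', 500)); // true
```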

Security

Mechanism        | Details
Token signing    | ES256 (EC P-256) for inter-agent tokens (via jose), HS256 for dashboard auth
Key discovery    | JWKS endpoint at /.well-known/jwks.json
Key rotation     | API endpoint to rotate authority keys
User auth        | bcrypt password hashing, 7-day JWT sessions
Agent auth       | API key (amk_ prefix, SHA-256 hashed) for token issuance; X-Agent-Key header for heartbeat
Identity binding | Tokens bound to subject, audience, and scopes
Audit trail      | All permission grants/revokes logged with IP and timestamp
Scope filtering  | Token issuance filters scopes against granted permissions

API Key & Token Flow

Agent Created → api_key returned (amk_..., shown ONCE)
                ↓
Agent stores api_key → SHA-256 hash stored in DB
                ↓
Agent requests token → POST /networks/:id/auth/token { agent_id, api_key, scopes }
                ↓
Authority verifies hash → issues ES256 JWT (1h TTL)
                ↓
Agent sends request → Authorization: Bearer <JWT>
                ↓
Receiving agent verifies JWT → JWKS endpoint → validates iss, aud, scopes
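
The first half of this flow (key creation, hashed storage, and verification at token request time) can be sketched with Node's built-in crypto module. The key length, function names, and the constant-time comparison are assumptions for illustration; JWT issuance and JWKS verification are left out.

```javascript
const { createHash, randomBytes, timingSafeEqual } = require('node:crypto');

// Generates a key with the amk_ prefix; the random length is an assumption.
function generateApiKey() {
  return `amk_${randomBytes(24).toString('hex')}`;
}

// Only this SHA-256 hash is stored in the database, never the key itself.
function hashApiKey(apiKey) {
  return createHash('sha256').update(apiKey).digest('hex');
}

// Compares the hash of the presented key with the stored hash in constant time.
function verifyApiKey(presentedKey, storedHash) {
  const a = Buffer.from(hashApiKey(presentedKey), 'hex');
  const b = Buffer.from(storedHash, 'hex');
  return a.length === b.length && timingSafeEqual(a, b);
}

const apiKey = generateApiKey();      // shown to the agent once
const storedHash = hashApiKey(apiKey); // persisted server-side

console.log(verifyApiKey(apiKey, storedHash));      // true
console.log(verifyApiKey('amk_wrong', storedHash)); // false
```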

Database Schema

16 tables organized by domain:

Domain      | Tables
Identity    | users, authority_keys
Network     | networks, agents, agent_routes
Teams       | teams, team_agents, usage_daily
Permissions | permissions, role_scopes
Marketplace | publishers, skills, skill_versions, agent_skills, skill_reviews
Audit       | audit_log