Security Whitepaper

A complete description of how Claresia is built to meet enterprise security expectations: deployment topologies, encryption posture, network architecture, identity flow, and incident response.

Security Whitepaper (PDF)

The full whitepaper as a printable PDF including all diagrams, control mappings, and references — useful as an attachment to procurement reviews.

Architecture overview

Claresia is built around a strict control plane / data plane separation. The control plane (admin console, identity, billing, distribution, skill catalog) always lives in Claresia Cloud. The data plane (Hub records — outputs, decisions, governance events, employee profiles, telemetry events) can live in Claresia Cloud or in the customer's cloud, depending on deployment mode.

Every Claresia release ships through a versioned Skill IR contract (cc-065/schema/skill-ir-v0.json) and a canonical Hub schema (cc-050). Cryptographic provenance hashes (SHA-256 over canonical JSON) are computed in the data plane and co-signed in the control plane to bind the two without leaking content.
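
As an illustration only, the sketch below shows one way such a provenance hash and co-signature could be computed in Python. The canonicalization rules (sorted keys, compact separators) and the HMAC co-signature are assumptions for exposition, not the contract defined by cc-065/cc-050.

```python
import hashlib
import hmac
import json

def canonical_json(record: dict) -> bytes:
    # Deterministic serialization: sorted keys, no insignificant whitespace.
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")

def provenance_hash(record: dict) -> str:
    # SHA-256 over canonical JSON, computed inside the data plane.
    return hashlib.sha256(canonical_json(record)).hexdigest()

def control_plane_cosign(prov_hash: str, signing_key: bytes) -> str:
    # The control plane signs only the hash, so record content never leaves the data plane.
    return hmac.new(signing_key, prov_hash.encode("utf-8"), hashlib.sha256).hexdigest()

record = {"type": "output", "tenant_id": "t-example", "body": "redacted example"}
digest = provenance_hash(record)
cosignature = control_plane_cosign(digest, signing_key=b"demo-only-key")
```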

Architectural principles

  • Customer data never leaves the customer's cloud in Mode C. The distribution plane runs in Claresia; LLM invocation happens in the customer's LLM tenant; the Hub record persists in customer-managed Postgres / SharePoint / Snowflake; only telemetry envelopes flow back to Claresia.
  • No code installed in the customer environment on day 1. Modes A and B require only an API key paste, an Azure service principal grant, or a Slack OAuth install. No agents, no daemons, no sidecars.
  • Per-tenant isolation by default. Postgres uses Row-Level Security keyed to app.tenant_id (see the sketch after this list); object storage uses per-tenant prefixes; each tenant in Modes B and C has its own customer-rotatable, KMS-managed key.
  • Identity is delegated. Claresia never stores customer passwords. WorkOS sits in front of every login; SAML and OIDC are supported on day 1; SCIM 2.0 handles user lifecycle.
  • Auditability is total. Every privileged action emits a governance_event Hub record. Every skill invocation emits an output plus a telemetry_event. The Hub provenance chain can reconstruct "what did the AI do for whom on what date?" going back seven years.
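
A minimal sketch of the tenant-isolation pattern, assuming psycopg2 and a hypothetical hub_records table; the actual table, column, and policy names are not specified here.

```python
# Per-tenant isolation via Postgres Row-Level Security keyed to app.tenant_id.
import psycopg2

SETUP_SQL = """
ALTER TABLE hub_records ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON hub_records
    USING (tenant_id = current_setting('app.tenant_id'));
"""

def fetch_tenant_records(conn, tenant_id: str):
    with conn.cursor() as cur:
        # Scope the setting to this transaction; the RLS policy then filters
        # every query, so cross-tenant reads return no rows.
        cur.execute("SELECT set_config('app.tenant_id', %s, true)", (tenant_id,))
        cur.execute("SELECT id, type, created_at FROM hub_records")
        return cur.fetchall()
```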

Mode A — Claresia Cloud (Shared SaaS)

Topology
Mode A topology — customer LLM, Claresia control plane, shared Hub Postgres with RLS isolation.
Multi-tenant control plane and data plane. RLS isolation, AES-256 at rest with Claresia-managed keys, 24-hour go-live.

Mode B — Claresia Cloud Dedicated

Topology
Mode B topology — dedicated tenant Postgres with CMEK, regional pin, dedicated subnet.
Single-tenant Postgres with customer-managed encryption key, regional pinning, dedicated subnet, Customer Lockbox.

Mode C — Customer Cloud (BYOC)

Topology
Mode C topology — control plane in Claresia Cloud, data plane in customer cloud, telemetry envelope-only flow back.
Hub data plane lives entirely in the customer's cloud. Telemetry envelopes flow back over mTLS; payloads never leave the customer's trust boundary.
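
To make the envelope-only flow concrete, here is a hypothetical telemetry envelope in Python. The field names are illustrative, not the actual wire format; only metadata and a hash leave the customer's cloud, while the payload stays with the Hub record.

```python
import hashlib
import json
import time
import uuid

def build_envelope(tenant_id: str, skill_id: str, hub_record: dict) -> dict:
    # Hash of the full record, which stays in the customer-managed store.
    payload_hash = hashlib.sha256(
        json.dumps(hub_record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    ).hexdigest()
    return {
        "envelope_id": str(uuid.uuid4()),
        "tenant_id": tenant_id,
        "skill_id": skill_id,
        "emitted_at": int(time.time()),
        "payload_hash": payload_hash,
        # Deliberately no payload field: content never crosses the trust boundary.
    }
```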

Encryption

At rest. All customer data persisted in any Claresia-managed store is encrypted with AES-256 using AWS KMS (or Azure Key Vault in Microsoft-resident deployments). In Mode B and Mode C, the encryption key is customer-rotatable and stored in a customer-named KMS key ring. Claresia operates the key ring under a Customer Lockbox contract — privileged operator access requires a documented approval workflow with customer notification.
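
For illustration, a minimal envelope-encryption sketch against a tenant KMS key, assuming boto3 and the cryptography package; the key alias and storage layout are hypothetical, not Claresia's implementation.

```python
import os

import boto3
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

kms = boto3.client("kms")

def encrypt_record(plaintext: bytes, key_alias: str = "alias/tenant-example") -> dict:
    # Ask KMS for a fresh AES-256 data key wrapped by the tenant's key.
    data_key = kms.generate_data_key(KeyId=key_alias, KeySpec="AES_256")
    nonce = os.urandom(12)
    ciphertext = AESGCM(data_key["Plaintext"]).encrypt(nonce, plaintext, None)
    # Persist ciphertext, nonce, and the wrapped key; the plaintext data key is discarded.
    return {
        "ciphertext": ciphertext,
        "nonce": nonce,
        "wrapped_key": data_key["CiphertextBlob"],
    }
```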

In transit. All Claresia public endpoints terminate TLS 1.3 (TLS 1.2 minimum, perfect forward secrecy enforced). All inter-service traffic inside Claresia Cloud is mTLS. The Mode C link from customer cloud back to Claresia control plane is mTLS with customer-issued certs.
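
A sketch of what the customer-cloud side of that Mode C link could look like using Python's requests library; the endpoint and file paths are placeholders, not the real client.

```python
import requests

envelope = {"payload_hash": "example", "tenant_id": "t-example"}  # as in the Mode C sketch

resp = requests.post(
    "https://ingest.claresia.example/v1/envelopes",                 # hypothetical endpoint
    json=envelope,
    cert=("/etc/claresia/client.crt", "/etc/claresia/client.key"),  # customer-issued client cert
    verify="/etc/claresia/claresia-ca.pem",                         # pin the Claresia CA bundle
    timeout=10,
)
resp.raise_for_status()
```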

Sensitive data detection. Outputs flowing through the Hub are scanned for credentials (AWS access keys, Stripe keys, GitHub tokens) and PCI / PHI markers — matches are redacted in the persisted record and surfaced as governance events.
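
As an example of the kind of detector involved (the production pattern set is broader and not reproduced here), a minimal Python scan for a few well-known token shapes:

```python
import re

PATTERNS = {
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "stripe_secret_key": re.compile(r"\bsk_live_[0-9a-zA-Z]{24,}\b"),
    "github_token": re.compile(r"\bghp_[0-9a-zA-Z]{36}\b"),
}

def redact(text: str):
    findings = []
    for name, pattern in PATTERNS.items():
        if pattern.search(text):
            findings.append(name)                        # each finding becomes a governance event
            text = pattern.sub(f"[REDACTED:{name}]", text)
    return text, findings

clean, hits = redact("key=AKIAABCDEFGHIJKLMNOP")
```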

Identity & Permission Flow

Flow
Identity flow — user, WorkOS, customer IdP, SAML/OIDC token, Claresia JWT, with SCIM lifecycle.
Every Claresia request is bound to a customer-controlled SSO claim. SCIM 2.0 lifecycle propagates user changes in under 30 seconds end-to-end.
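
For illustration, a sketch of binding a request to the SSO-derived token, assuming PyJWT, RS256, and hypothetical claim names (tenant_id, roles); the actual claim set is defined by the customer's IdP mapping.

```python
import jwt  # PyJWT

def authenticate(token: str, public_key: str) -> dict:
    claims = jwt.decode(
        token,
        public_key,
        algorithms=["RS256"],
        audience="claresia",                    # hypothetical audience value
        options={"require": ["exp", "sub"]},
    )
    # tenant_id and roles are assumed to be mapped from the customer's IdP via WorkOS.
    return {
        "user": claims["sub"],
        "tenant_id": claims.get("tenant_id"),
        "roles": claims.get("roles", []),
    }
```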

Network architecture

Claresia Cloud runs across two active-active regions today: eu-south-1 (Milan) and eu-central-1 (Frankfurt). Each region is a complete copy of the control plane, with cross-region replication for non-customer data only. Customer data is region-pinned per tenant; once chosen, the region is immutable for that tenant.

All public ingress traverses Cloudflare (WAF, DDoS protection, bot mitigation, geo-routing). Cloudflare terminates TLS at the edge and re-encrypts to the origin. Cloudflare does not see decrypted customer payloads; application-layer TLS termination at the edge applies only to trust.claresia.com, app.claresia.com, and hub.claresia.com.

Between Claresia Cloud and the LLM providers (Anthropic, OpenAI, Vertex AI, Azure OpenAI), egress traverses Claresia-managed NAT gateways with stable egress IP ranges, which can be allowlisted in customer firewalls when desired.

Incident response

Claresia operates a 24/7 on-call rotation owned by the Engineering team. Incidents are classified at detection time as Severity 1 (customer-impacting outage), Severity 2 (degraded service), or Severity 3 (cosmetic). The response playbook follows a five-step pattern:

  1. Detect — SLO burn-rate alert or Datadog synthetic check fires; on-call paged (see the burn-rate sketch after this list).
  2. Triage — incident commander assigned, Statuspage updated within 15 minutes for Sev 1.
  3. Mitigate — restore service via documented runbook. Communicate progress every 30 minutes.
  4. Resolve — root cause confirmed, fix deployed, post-mortem scheduled.
  5. Post-mortem — published within 5 business days for any Sev 1 or Sev 2; remediation actions tracked to closure.
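
The Detect step relies on SLO burn-rate alerting; the sketch below illustrates the arithmetic, assuming a 99.9% availability SLO and a 14x fast-burn threshold, both of which are illustrative rather than Claresia's actual alert policy.

```python
def burn_rate(errors: int, requests: int, slo: float = 0.999) -> float:
    # Observed error rate divided by the error budget the SLO allows.
    error_budget = 1.0 - slo
    observed = errors / max(requests, 1)
    return observed / error_budget

# A short window burning budget ~14x faster than sustainable pages the on-call.
if burn_rate(errors=42, requests=10_000) > 14:
    print("page on-call: fast burn detected")
```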

In the event of a confirmed security incident affecting customer data, Claresia's Security Lead is the incident commander and customers are notified within 72 hours per GDPR Article 33 obligations — typically much sooner. Notifications include the affected data category, customer scope, root cause, and remediation status.

Backups & disaster recovery

All Postgres clusters use point-in-time recovery (PITR) with a 7-day window for Mode A and a 35-day window for Mode B. Automated daily snapshots are retained for 90 days. Snapshots are encrypted with the same key as the live cluster (the customer-managed key in Modes B and C).

RPO (Recovery Point Objective): 5 minutes (PITR granularity for production). RTO (Recovery Time Objective): 1 hour for Mode A, 30 minutes for Mode B. Quarterly DR drills validate both metrics and are evidenced for SOC 2.