Design Principles

The ideas that shape Cremind: it is self-hosted, built on open protocols, isolated by profile, cost-aware, event-driven, and backed by storage that scales with you.

Cremind is a personal AI assistant you run yourself. A handful of principles drive nearly every design decision in it. Knowing them makes the rest of the architecture predictable — once you know why, the how tends to follow.

Cremind is at version 0.0.1: open source, early, and community-driven. The principles below describe where the project is heading, and they will be refined as it grows.

You run it yourself

Cremind ships as a server, a desktop app, and a CLI you install on your own machine or server. Your conversations, your keys, and your data stay where you put them. The default install needs no external services at all — SQLite on disk is enough to get going. When external change notifications are involved, the relay sends only content-free nudges and your own listener fetches the data with your own token (see Event-Driven Architecture).
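
The nudge-then-fetch pattern described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Cremind's actual code: the `Nudge` fields, `handle_nudge`, and the `fetch` callable are hypothetical names. The point it shows is that the relay message only identifies what changed, while the listener uses a locally held token to pull the content.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Nudge:
    """Hypothetical content-free notification: says what changed, never the data."""
    source: str       # e.g. "calendar"
    resource_id: str  # opaque identifier of the changed item


def handle_nudge(nudge: Nudge, token: str, fetch: Callable[[str, str, str], dict]) -> dict:
    """Fetch the changed resource locally, using the user's own token.

    `fetch` stands in for whatever client talks to the upstream service;
    in this sketch the relay never sees `token` or the returned payload.
    """
    return fetch(nudge.source, nudge.resource_id, token)


# Usage with a fake fetcher, so the sketch runs without any network access.
def fake_fetch(source: str, resource_id: str, token: str) -> dict:
    return {"source": source, "id": resource_id, "authorized": token == "my-token"}


result = handle_nudge(Nudge("calendar", "evt-1"), "my-token", fake_fetch)
```

Because the nudge carries no payload, anything that intercepts it learns only that something changed.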

Open protocols, not lock-in

Rather than invent proprietary plumbing, Cremind speaks two open standards:

  • A2A (Agent-to-Agent) for pluggable agents.
  • MCP (Model Context Protocol) for pluggable tools.

Bring your own LLM — Anthropic, OpenAI, or Groq — and plug in any MCP server. Because these are open protocols, the ecosystem you connect to isn't ours to gatekeep. See A2A and MCP.
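
To make the MCP side concrete: MCP messages are JSON-RPC 2.0, and a tool invocation uses the `tools/call` method with a tool name and an arguments object. The sketch below only builds such a request. The helper name `build_tool_call` and the `get_weather` tool are made up for illustration, and how Cremind constructs these messages internally is not described here.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# `get_weather` is a hypothetical tool name used only for illustration.
request = build_tool_call(1, "get_weather", {"location": "Berlin"})
```

Any MCP server that exposes a tool with that name would accept a request of this shape, which is what makes tools pluggable across vendors.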

Everything is a tool

Capabilities don't live in scattered special cases — they live in one unified tool registry spanning five kinds: intrinsic, builtin, A2A, MCP, and skill tools. The reasoning loop treats them uniformly, so adding a capability means adding a tool, not rewiring the agent. See The Tool Plane.
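
A unified registry of this kind might look like the sketch below. The five kinds come from the text; everything else (`ToolKind`, `Tool`, `ToolRegistry`, and the two example tools) is a hypothetical shape chosen for illustration, not Cremind's real API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict


class ToolKind(Enum):
    INTRINSIC = "intrinsic"
    BUILTIN = "builtin"
    A2A = "a2a"
    MCP = "mcp"
    SKILL = "skill"


@dataclass(frozen=True)
class Tool:
    name: str
    kind: ToolKind
    run: Callable[..., object]


class ToolRegistry:
    def __init__(self) -> None:
        self._tools: Dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        if tool.name in self._tools:
            raise ValueError(f"duplicate tool name: {tool.name}")
        self._tools[tool.name] = tool

    def call(self, name: str, **kwargs: object) -> object:
        # The caller never branches on tool.kind: every tool is invoked the same way.
        return self._tools[name].run(**kwargs)


registry = ToolRegistry()
registry.register(Tool("add", ToolKind.BUILTIN, lambda a, b: a + b))
registry.register(Tool("shout", ToolKind.SKILL, lambda text: text.upper()))

total = registry.call("add", a=2, b=3)
loud = registry.call("shout", text="hello")
```

Adding a capability here is a single `register` call; `call` itself never changes.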

Isolation by profile

One install can host several assistants without their contexts bleeding together. Each profile keeps its own skills directory, embeddings, tool visibility, and conversation history. A work assistant, a coding assistant, and a home assistant can coexist on the same machine, each unaware of the others. See Profiles.
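
One simple way to get this kind of isolation is to give every profile its own directory subtree. The layout below is an assumption made for illustration: the directory names, the `history.sqlite` file name, and the `profile_paths` helper are hypothetical, and Cremind's real on-disk layout may differ.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ProfilePaths:
    root: Path

    @property
    def skills_dir(self) -> Path:
        return self.root / "skills"

    @property
    def embeddings_dir(self) -> Path:
        return self.root / "embeddings"

    @property
    def history_db(self) -> Path:
        return self.root / "history.sqlite"


def profile_paths(base: Path, profile: str) -> ProfilePaths:
    # Reject names that could escape the base directory.
    if not profile or "/" in profile or "\\" in profile or profile in {".", ".."}:
        raise ValueError(f"invalid profile name: {profile!r}")
    return ProfilePaths(base / "profiles" / profile)


base = Path("cremind-data")  # hypothetical data directory
work = profile_paths(base, "work")
home = profile_paths(base, "home")
```

Since no path is shared between `work` and `home`, nothing one profile writes can be read by the other unless the code deliberately crosses the boundary.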

Spend tokens where they matter

Reasoning is expensive; routine tool calls are not. Cremind sorts models into a high group for planning and a low group for individual tool calls, so you pay frontier prices only for the thinking. The result is lower token cost without giving up plan quality. See LLM Model Groups.
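
The routing rule can be sketched in a few lines. The `high` and `low` group names come from the text; the `Step` enum, the `pick_model` function, and the model identifiers are placeholders invented for this example.

```python
from enum import Enum


class Step(Enum):
    PLAN = "plan"
    TOOL_CALL = "tool_call"


# Placeholder model identifiers, not real model names.
MODEL_GROUPS = {
    "high": "frontier-model",
    "low": "small-fast-model",
}


def pick_model(step: Step) -> str:
    """Send planning to the high group and individual tool calls to the low group."""
    group = "high" if step is Step.PLAN else "low"
    return MODEL_GROUPS[group]


planner = pick_model(Step.PLAN)
executor = pick_model(Step.TOOL_CALL)
```

A run with one planning step and many tool calls then bills only the planning step at the expensive rate.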

React to the world, don't poll it

An assistant that only responds to typing is half an assistant. Cremind reacts to external changes through a relay WebSocket and a filesystem-watched event log, and runs the agent the moment something happens. Latency is sub-second, with no polling, cron jobs, or heartbeat loops. See Event-Driven Architecture.
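
The difference from polling can be shown with a small `asyncio` sketch. The worker below does nothing until an event is pushed to it, then handles that event immediately. The event strings and function names are hypothetical stand-ins; a real setup would feed the queue from the relay WebSocket and the file watcher rather than from hand-written `put` calls.

```python
import asyncio


async def agent_worker(events: asyncio.Queue, handled: list) -> None:
    """Sleep until an event arrives, then handle it at once (no polling interval)."""
    while True:
        event = await events.get()  # suspends without busy-waiting
        if event is None:           # sentinel: shut down
            break
        handled.append(f"agent ran for {event}")


async def main() -> list:
    events: asyncio.Queue = asyncio.Queue()
    handled: list = []
    worker = asyncio.create_task(agent_worker(events, handled))

    # Stand-ins for a relay WebSocket message and a new line in a watched event log.
    await events.put("relay:nudge")
    await events.put("log:file-changed")
    await events.put(None)

    await worker
    return handled


handled_events = asyncio.run(main())
```

There is no sleep interval to tune: the delay between an event and the agent run is just the time it takes to wake the waiting task.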

Storage that scales with you

Start simple, grow when you need to. SQLite with Alembic migrations is the default and needs nothing extra. Any individual service can later be switched to Postgres, Qdrant, or ChromaDB — as a Docker sidecar or an external endpoint — from the Setup Wizard. You don't pay the operational cost of heavy infrastructure until you actually want it. See Storage.
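
Per-service backend selection with a SQLite default can be sketched as a small resolver. The backend names come from the text. The service names, the `resolve_backends` helper, and the idea that this is expressed as a dictionary are assumptions for illustration, since the real switch happens in the Setup Wizard.

```python
from typing import Dict, List, Optional

DEFAULT_BACKEND = "sqlite"
# Backends named in the text; which service may use which backend is not modeled here.
KNOWN_BACKENDS = {"sqlite", "postgres", "qdrant", "chromadb"}


def resolve_backends(
    services: List[str], overrides: Optional[Dict[str, str]] = None
) -> Dict[str, str]:
    """Return the backend for each service, falling back to SQLite."""
    overrides = overrides or {}
    unknown = set(overrides.values()) - KNOWN_BACKENDS
    if unknown:
        raise ValueError(f"unknown backend(s): {sorted(unknown)}")
    return {name: overrides.get(name, DEFAULT_BACKEND) for name in services}


# Hypothetical service names.
services = ["conversations", "embeddings"]
fresh_install = resolve_backends(services)
scaled_up = resolve_backends(services, {"embeddings": "qdrant"})
```

Only the overridden service moves; everything else stays on the zero-setup default.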

Many surfaces, one runtime

The web UI, the Cremind App desktop client, the cremind CLI, and chat channels all talk to the same agent runtime and share the same state. The surface is a choice of convenience, never a fork of your assistant.
