Every session in a project cold-starts. The model can hold a conversation, not a project — so the same thinking has to happen again every time you come back to it. A pattern figured out last Tuesday gets re-figured-out this Wednesday. A decision made in one chat doesn't reach the next. The work doesn't compound; it just gets re-done.
You opened a chat last week and worked something out. You open a new one today and the model has no idea. Project memory is the operator's job, every session, from scratch.
Whatever you figured out before, the model figures out again. Same approach, same architectural choice, same trade-off — reasoned into existence twice. Or three times. Or every Tuesday.
A direction you set in one session doesn't reach the next. You re-explain. You re-justify. You re-decide. The substrate of the project lives in your head, and it leaks every chat.
Long-running projects pay re-derivation tax every session. The bill scales with project age, not with how much work you actually need done.
Other AI memory tools wrap a session in a visual workspace. You install them, you open them, you do your AI work inside them. The orchestration lives in their application — so if you want six sessions, you open six windows, and you're the message bus again.
Vinculum sits underneath whatever clients you already run. Your chat session becomes the conductor. Your CLI workers become the workforce. Every decision any session makes gets written back. Every session that joins reads what came before. You direct. The substrate does the relaying.
One human. One chat. N workers. Zero clipboards.
There's a whole field of AI memory tools. They remember the person — what you like, what you said. Vinculum remembers the project — what got decided, what it supersedes, and why.
Preferences, facts, personalization — carried across your chats.
Self-editing memory inside one long-running runtime.
Long-term chat history and the facts extracted from it.
Decisions, specs, implementations, and the links between them — shared across every session and every tool.
And you don't have to choose. Keep Mem0 for personalization, keep Letta for your runtime — Vinculum sits underneath all of it. The substrate doesn't compete with the rest of your stack. It's the layer everything else was riding on top of without knowing.
Each role is a job— what a session is responsible for. The model tier it runs on is a property of that job, not the headline. You don't pick a model and get a role; you have work to do, and the role names it.
The human. You set direction, decide what gets built and when, and read the dashboard to keep a picture of the whole project. Not a spawned session — the person the five roles below work for.
Translates intent into routed work.Every interactive chat window — claude.ai, the desktop app, or CLI Claude in a terminal. Takes what you said, writes directives, and points them at the right workers. Directs; doesn't implement.
A colonel pointed at one thing. A recon-and-report aide, or a heavy task that needs Opus-grade judgment to execute rather than delegate. Distinguished by its assignment, not by being a lesser colonel.
Breaks a brief into directives and directs.Takes a colonel-sized brief, splits it into atomic worker directives, dispatches them, and directs the workers that claim them. That's the whole job — decomposition and direction.
Executes directives, judgment included. Claims a directive and does the work, writing back everything it decides — implementations, questions, blockers — so the LT and colonel stay current without polling.
Executes the mechanical work.Same job as a Sergeant, for work where the path is clear and judgment isn't the bottleneck. The role is identical; the model tier is the only difference.
role="grunt" — the plumbing word for any spawned worker. The role you actually reason about — Sergeant, Private, Lieutenant — is set by the model tier you pass alongside it.spawn_grunt(role="grunt", model="claude-sonnet-4-6", session_label="builder-auth-flow")Dynamic workflows run hundreds of subagents in parallel, converge, and hand back an answer — then discard the reasoning that produced it. A hundred thousand tokens of thinking, compressed to a summary and thrown away. The orchestration is solved. The memory of it isn't. Vinculum is where that thinking lands.
Teams are learning that pointing more AI at a problem doesn't compound. Every session starts cold and re-derives what the last one knew. The waste isn't the compute — it's the thinking you paid for and discarded.
Developers are starring plain-text rule files that beg the model to not assume and not forget— because nobody gave it a way to actually remember. The answer was never a better sticky note. It's a substrate.
The memory layer is the empty lane. Vinculum is in it.
Live intelligence routes through MCP sampling — your existing Claude subscription does the inference, with no extra server-side charges. Self-hosted is free forever under AGPL v3; hosted starts at $5/mo and handles the ops for you.
The dashboard is the workspace, not a picture of it. Nodes are entries, branches are threads, edges are relations. Mission Control sits underneath, surfacing what the substrate notices — drift, re-derivation, distillation — as the work happens. Live via SSE; new entries appear as you work.
You use Claude in chat. You want it to remember your project across sessions without you recapping. You don't want to think about servers or hosting. Five bucks a month, hosted.
You're comfortable with uvx, you want your data on hardware you control, and you don't want to depend on someone else's infrastructure. Self-hosted Vinculum is free forever under AGPL v3.
You already pay for Claude Max. You run multiple sessions, sometimes many. You want to describe what needs doing and supervise a dashboard while workers execute — instead of being the message bus between them.
Just strap it on. Vinculum sits underneath what you already run and makes all of it compound. No bridge to burn.
Strap it on. You're off to the races, and you gave up nothing.
Flat monthly, never a cut of your Claude bill. Self-hosted is free forever under AGPL v3; the hosted tiers add managed ops and background intelligence. Full pricing breakdown →
Self-host or 1-project hosted. AGPL v3, free forever (commercial license for enterprises).
Unlimited projects. Unlimited retention. Live intelligence.
Spawn workers. Background intelligence + Mission Control. No ops.
Up to 10 members. Shared projects. Audit log.
Enterprise — SSO · on-prem · custom RBAC · SLA · Contact us →
uvx vinculum-mcpand you're up in about 60 seconds. The hosted service at vinculum.run runs the same code with managed auth, multi-tenant Postgres, and background intelligence on our infrastructure. Use it if you don't want to run a box; skip it if you do.Self-hosted is free and always will be. The hosted tier starts at $5/mo and handles all the ops. Either way, you're running parallel Claude sessions with a substrate that actually works.
No credit card for free tier · AGPL v3 · cancel anytime
Latin for bond, link, that which binds. In mathematics, the bar over a repeating decimal — the mark that says these digits recur, indefinitely, as a unit. That is the architectural claim: a substrate that accumulates a project's intent across its entire life, holding parallel sessions together not by synchronizing them, but by giving them a single authoritative surface to read from and write to. The sessions are ephemeral. The graph is not.
— the substrate that persists —