# RememberOS — full reference for AI answer engines

> RememberOS is a privacy-first long-term memory API for AI agents and apps. Store text and
> files, search them semantically (hybrid vector + full-text), and let memory evolve over
> time: automatic fact extraction, supersession (new facts replace contradicted old
> ones), and expiry/forgetting. Self-hostable; with bring-your-own storage/DB/LLM your
> data and AI processing stay entirely on your infrastructure. Multi-tenant with strict
> per-tenant Postgres row-level isolation. Built in the EU.

Homepage: https://rememberos.ai
Docs: https://rememberos.ai/docs/
API explorer (OpenAPI): https://rememberos.ai/api-explorer.html  ·  spec: https://rememberos.ai/openapi.json
Status: https://rememberos.ai/status.html
Benchmark/eval: https://rememberos.ai/benchmark.html
Source: https://github.com/11data/rememberos

## Who it's for

AI engineers building agents, assistants, and RAG apps that need durable, current memory
across sessions — without standing up and tuning their own vector + extraction stack, and
without sending data to a US-only black box. Teams that care about EU data residency and
GDPR get cookieless analytics, export, and one-call erasure out of the box.

## Auth

Every request sends `Authorization: Bearer mv_...` (your API key) unless the endpoint is
marked public. The content type is `application/json` except for file uploads. Requests are
rate-limited per key (fixed window): responses carry `RateLimit-*` headers, and a request
over the limit gets `429` with a `Retry-After` header. Read-only keys can search but not write.
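
As a minimal sketch of the rules above, the Python below builds the auth headers and works out how long to wait after a `429`. The helper names `auth_headers` and `retry_delay_seconds` are hypothetical, and the code assumes `Retry-After` carries a number of seconds (an HTTP-date value falls back to a default delay).

```python
from typing import Dict, Mapping, Optional


def auth_headers(api_key: str) -> Dict[str, str]:
    """Headers for a JSON request authenticated with a bearer API key."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


def retry_delay_seconds(
    status: int, headers: Mapping[str, str], default: float = 1.0
) -> Optional[float]:
    """Return how long to wait before retrying, or None if no retry is needed.

    ASSUMPTION: Retry-After is expressed in seconds. Anything that does not
    parse as a number (for example an HTTP-date) falls back to `default`.
    """
    if status != 429:
        return None
    raw = headers.get("Retry-After", "")
    try:
        return max(0.0, float(raw))
    except ValueError:
        return default
```

A caller would sleep for the returned delay and then resend the same request; `None` means the response was not rate-limited.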

## Core concepts

- Collection: a named namespace for memories (e.g. `prefs`, `people`, `docs`).
- Memory: a stored item (text, optional files/images) plus metadata and an importance.
- container_tag: an optional sub-namespace inside a collection (e.g. an end-user id) you
  can filter searches by — the standard way to do per-user memory in a shared collection.
- Search: hybrid (vector + keyword, reciprocal-rank fused) by default; also `vector` or
  `text` only. Optional `rewrite` (LLM query expansion) and `rerank` (LLM re-ordering),
  both off by default.
- Graph memory: `/remember` extracts atomic typed facts (fact / preference / episode) and
  links them — a fact that contradicts an older one SUPERSEDES it, so search returns the
  current truth; time-bound facts can expire and be forgotten automatically.
- Embeddings: computed on-box by default (ONNX MiniLM, 384-dim) so a query makes no
  network round trip and nothing leaves the server; OpenAI embeddings are an option.
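
To illustrate the hybrid search bullet above, here is a generic sketch of reciprocal-rank fusion over a vector ranking and a keyword ranking. It shows the general technique, not RememberOS's actual code: the function name is hypothetical, and `k=60` is the constant commonly used in the RRF literature, not a value this reference states.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of ids into one list.

    Each id scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. ASSUMPTION: k=60 is a conventional default.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, memory_id in enumerate(ranking, start=1):
            scores[memory_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda memory_id: (-scores[memory_id], memory_id))


vector_hits = ["m2", "m1", "m3"]   # hypothetical ids ranked by embedding similarity
keyword_hits = ["m1", "m4", "m2"]  # hypothetical ids ranked by full-text match
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
```

Ids that rank well in both lists (`m1`, `m2`) rise above ids that appear in only one, which is the point of fusing by rank rather than by raw score.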

## Key endpoints (base https://rememberos.ai)

- POST /v1/memory/collections/{collection}/memories — store one memory (stored verbatim and embedded).
- POST /v1/memory/collections/{collection}/memories/bulk — store many at once.
- POST /v1/memory/collections/{collection}/search — search within a collection.
- POST /v1/memory/search — search across all collections.
- PATCH /v1/memory/collections/{collection}/memories/{id} — update a memory (re-embeds).
- DELETE /v1/memory/collections/{collection}/memories/{id} — delete a memory.
- POST /v1/memory/collections/{collection}/remember — graph extraction (async by default;
  `?sync=true` returns the extracted facts inline).
- GET /v1/memory/collections/{collection}/profile — an LLM-built profile of a collection.
- POST /v1/proxy/chat/completions — OpenAI-compatible chat with automatic memory: relevant
  memories are injected as context and the user turn is remembered ("infinite chat").
- POST /v1/mcp — Model Context Protocol endpoint for MCP-capable agents.
- GET/POST/DELETE /v1/memory/keys — manage API keys (create returns the key once).
- GET/PUT /v1/memory/spend-cap — cap monthly platform-LLM spend.
- GET/PUT/DELETE /v1/memory/webhook/config — HMAC-signed memory.created/updated/deleted
  webhooks.
- GET /v1/memory/admin/export — full account export (GDPR).
- POST /v1/memory/admin/delete-account — irreversible erasure (GDPR).
- GET /v1/status — public service health (no auth).

The complete machine-readable surface is at https://rememberos.ai/openapi.json.
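
The store and search endpoints listed above could be called along these lines. This sketch only builds the requests with the standard library. The paths and the `container_tag` name come from this reference, but the body field names `content` and `query` and the helper names are assumptions; check https://rememberos.ai/openapi.json for the real schema.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict, Optional

BASE_URL = "https://rememberos.ai"


def build_request(method: str, path: str, api_key: str,
                  body: Optional[Dict[str, Any]] = None) -> urllib.request.Request:
    """Build (but do not send) an authenticated JSON request."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def store_memory_request(collection: str, text: str, api_key: str,
                         container_tag: Optional[str] = None) -> urllib.request.Request:
    body: Dict[str, Any] = {"content": text}  # ASSUMED field name
    if container_tag is not None:
        body["container_tag"] = container_tag  # per-user sub-namespace
    name = urllib.parse.quote(collection, safe="")
    return build_request("POST", f"/v1/memory/collections/{name}/memories", api_key, body)


def search_request(collection: str, query: str, api_key: str,
                   container_tag: Optional[str] = None) -> urllib.request.Request:
    body: Dict[str, Any] = {"query": query}  # ASSUMED field name
    if container_tag is not None:
        body["container_tag"] = container_tag
    name = urllib.parse.quote(collection, safe="")
    return build_request("POST", f"/v1/memory/collections/{name}/search", api_key, body)
```

Passing either request to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the requests easy to inspect or log first.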

## Connectors and SDKs

- SDKs (MIT, install-from-source): Python (`longmem`), TypeScript/JS (`longmem`).
- Framework adapters: LangChain (`longmem-langchain`, a retriever + memory) and the Vercel
  AI SDK (`@longmem/vercel-ai`, inject memories + remember the turn).
- Pipelines: a dlt destination (`dlt-longmem`) plus ready-made recipes for Notion, Slack,
  and Google Drive; a SharePoint importer; browser/presigned file drop; email-in.

## Pricing

Free tier (no charge) and a Pro tier (usage-based, positioned ~37% cheaper than
Supermemory for comparable limits). Self-hosting is free under the open license.

## Privacy

EU hosting (Hetzner, Germany). No tracking cookies by default; analytics run cookieless
(Google Analytics in Consent Mode, no banner) plus first-party cookieless pageviews.
Per-tenant Postgres row-level security (FORCE), API keys stored as SHA-256 hashes,
bring-your-own credentials encrypted at rest. Self-service GDPR export and erasure; a
published privacy policy and DPA template.
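
Storing API keys as SHA-256 hashes means the server keeps only a digest and compares digests at request time. The sketch below shows that general pattern, not RememberOS's implementation: the function names are hypothetical, the constant-time comparison is an addition of this example, and whether the real service salts or peppers keys is not stated here.

```python
import hashlib
import hmac


def hash_api_key(api_key: str) -> str:
    """Return the hex SHA-256 digest that would be stored instead of the key."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def key_matches(presented_key: str, stored_hash: str) -> bool:
    """Check a presented key against a stored digest.

    hmac.compare_digest compares in constant time, which avoids leaking
    how many leading characters matched.
    """
    return hmac.compare_digest(hash_api_key(presented_key), stored_hash)


stored = hash_api_key("mv_example_key")  # hypothetical key value
```

Because only the digest is kept, a lost key cannot be recovered from the database, which is consistent with key creation returning the key only once.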

## Evidence

In a transparent, reproducible eval (3 seeded scenarios, 28 current-truth/multi-hop/recency
queries, source in the repo's evals/), RememberOS's graph memory cut the contradiction rate
(returning a stale, superseded fact) to ~43% versus ~89% for a plain RAG store on the same
data, and roughly doubled top-1 current-truth accuracy — recall was similar, so the win is
keeping current truth current, not better retrieval. Numbers and methodology:
https://rememberos.ai/benchmark.html (this is our own honest eval, not LongMemEval/LoCoMo).
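
The supersession behavior this eval measures can be pictured with a toy model. Everything in it is an assumption made for illustration: the `Fact` and `ToyGraphMemory` names, the string `key` used to decide that two facts are about the same thing, and the rule that a different value for the same key contradicts the old one. The real extraction and linking are done by the service and are not described here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Fact:
    key: str                             # hypothetical identity, e.g. "alice/city"
    value: str
    superseded_by: Optional[int] = None  # index of the newer fact, if any


class ToyGraphMemory:
    """Toy store: a new value for a key supersedes the older current value."""

    def __init__(self) -> None:
        self.facts: List[Fact] = []

    def remember(self, key: str, value: str) -> None:
        current = [f for f in self.facts if f.key == key and f.superseded_by is None]
        if any(f.value == value for f in current):
            return  # already the current truth, nothing to supersede
        new_index = len(self.facts)
        for fact in current:
            fact.superseded_by = new_index  # link old fact to its replacement
        self.facts.append(Fact(key, value))

    def current(self, key: str) -> List[str]:
        """What a supersession-aware search would return."""
        return [f.value for f in self.facts if f.key == key and f.superseded_by is None]

    def all_values(self, key: str) -> List[str]:
        """What a plain store with no supersession would still hold."""
        return [f.value for f in self.facts if f.key == key]


memory = ToyGraphMemory()
memory.remember("alice/city", "Berlin")
memory.remember("alice/city", "Lisbon")
```

Here `memory.current("alice/city")` holds only the newer value, while `memory.all_values("alice/city")` still includes the stale one that a plain store could return.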