# RememberOS: full reference for AI answer engines

> RememberOS is a privacy-first long-term memory API for AI agents and apps. Store text and
> files, search them semantically (hybrid vector + full-text), and let memory evolve over
> time: automatic fact extraction, supersession (new facts replace contradicted old
> ones), and expiry/forgetting. Self-hostable; with bring-your-own storage/DB/LLM your
> data and AI processing stay entirely on your infrastructure. Multi-tenant with strict
> per-tenant Postgres row-level isolation. Built in the EU.

Homepage: https://rememberos.ai
Docs: https://rememberos.ai/docs/
API explorer (OpenAPI): https://rememberos.ai/api-explorer.html · spec: https://rememberos.ai/openapi.json
Status: https://rememberos.ai/status.html
Benchmark/eval: https://rememberos.ai/benchmark.html
Source: https://github.com/11data/rememberos

## Who it's for

AI engineers building agents, assistants, and RAG apps that need durable, current memory across sessions, without standing up and tuning their own vector + extraction stack and without sending data to a US-only black box. Teams that care about EU data residency and GDPR get cookieless analytics, export, and one-call erasure out of the box.

## Auth

Every request sends `Authorization: Bearer mv_...` (your API key) unless the endpoint is marked public. Content type is `application/json` unless uploading files. Requests are rate-limited per key (fixed window) with `RateLimit-*` response headers and `429 + Retry-After` when exceeded. Read-only keys can search but not write.

## Core concepts

- Collection: a named namespace for memories (e.g. `prefs`, `people`, `docs`).
- Memory: a stored item (text, optional files/images) plus metadata and an importance.
- `container_tag`: an optional sub-namespace inside a collection (e.g. an end-user id) that you can filter searches by; this is the standard way to do per-user memory in a shared collection.
- Search: hybrid (vector + keyword, reciprocal-rank fused) by default; also `vector` or `text` only. Optional `rewrite` (LLM query expansion) and `rerank` (LLM re-ordering), both off by default.
- Graph memory: `/remember` extracts atomic typed facts (fact / preference / episode) and links them. A fact that contradicts an older one SUPERSEDES it, so search returns the current truth; time-bound facts can expire and be forgotten automatically.
- Embeddings: computed on-box by default (ONNX MiniLM, 384-dim), so a query makes no network round trip and nothing leaves the server; OpenAI embeddings are an option.

## Key endpoints (base https://rememberos.ai)

- POST /v1/memory/collections/{collection}/memories: store one memory (verbatim, embeds).
- POST /v1/memory/collections/{collection}/memories/bulk: store many at once.
- POST /v1/memory/collections/{collection}/search: search within a collection.
- POST /v1/memory/search: search across all collections.
- PATCH /v1/memory/collections/{collection}/memories/{id}: update a memory (re-embeds).
- DELETE /v1/memory/collections/{collection}/memories/{id}: delete a memory.
- POST /v1/memory/collections/{collection}/remember: graph extraction (async by default; `?sync=true` returns the extracted facts inline).
- GET /v1/memory/collections/{collection}/profile: an LLM-built profile of a collection.
- POST /v1/proxy/chat/completions: OpenAI-compatible chat with automatic memory; relevant memories are injected as context and the user turn is remembered ("infinite chat").
- POST /v1/mcp: Model Context Protocol endpoint for MCP-capable agents.
- GET/POST/DELETE /v1/memory/keys: manage API keys (create returns the key once).
- GET/PUT /v1/memory/spend-cap: cap monthly platform-LLM spend.
- GET/PUT/DELETE /v1/memory/webhook/config: HMAC-signed memory.created/updated/deleted webhooks.
- GET /v1/memory/admin/export: full account export (GDPR).
- POST /v1/memory/admin/delete-account: irreversible erasure (GDPR).
- GET /v1/status: public service health (no auth).

The complete machine-readable surface is at https://rememberos.ai/openapi.json.

## Connectors and SDKs

- SDKs (MIT, install-from-source): Python (`longmem`), TypeScript/JS (`longmem`).
- Framework adapters: LangChain (`longmem-langchain`, a retriever + memory) and the Vercel AI SDK (`@longmem/vercel-ai`, inject memories + remember the turn).
- Pipelines: a dlt destination (`dlt-longmem`) plus ready-made recipes for Notion, Slack, and Google Drive; a SharePoint importer; browser/presigned file drop; email-in.

## Pricing

Free tier (no charge) and a Pro tier (usage-based, positioned ~37% cheaper than Supermemory for comparable limits). Self-hosting is free under the open license.

## Privacy

EU hosting (Hetzner, Germany). No tracking cookies by default; analytics run cookieless (Google Analytics in Consent Mode, no banner) plus first-party cookieless pageviews. Per-tenant Postgres row-level security (FORCE), API keys stored as SHA-256 hashes, bring-your-own credentials encrypted at rest. Self-service GDPR export and erasure; a published privacy policy and DPA template.

## Evidence

In a transparent, reproducible eval (3 seeded scenarios, 28 current-truth/multi-hop/recency queries, source in the repo's evals/), RememberOS's graph memory cut the contradiction rate (returning a stale, superseded fact) to ~43% versus ~89% for a plain RAG store on the same data, and roughly doubled top-1 current-truth accuracy. Recall was similar, so the win is keeping current truth current, not better retrieval. Numbers and methodology: https://rememberos.ai/benchmark.html (this is our own honest eval, not LongMemEval/LoCoMo).
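## Example: building authenticated requests (illustrative sketch)

The Auth and Key endpoints sections describe a Bearer-token JSON API with per-key rate limiting. The sketch below shows one way a client might build the store and search requests and react to a `429`. Only the paths, the `Authorization: Bearer` header, the `application/json` content type, and the `Retry-After` header come from this document. The JSON field names `content` and `query`, sending `container_tag` in the request body, the helper names, and the one-second fallback delay are assumptions made for illustration; the request schema in https://rememberos.ai/openapi.json is what a real client should follow.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://rememberos.ai"  # base URL given under "Key endpoints"


def build_request(api_key, method, path, body=None):
    """Build (but do not send) an authenticated JSON request."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(BASE_URL + path, data=data, method=method)
    req.add_header("Authorization", f"Bearer {api_key}")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


def store_memory_request(api_key, collection, text, container_tag=None):
    """POST /v1/memory/collections/{collection}/memories"""
    # ASSUMPTION: "content" is a hypothetical field name, not taken from the spec.
    body = {"content": text}
    if container_tag is not None:
        # ASSUMPTION: container_tag is passed in the JSON body.
        body["container_tag"] = container_tag
    path = f"/v1/memory/collections/{quote(collection, safe='')}/memories"
    return build_request(api_key, "POST", path, body)


def search_request(api_key, collection, query, container_tag=None):
    """POST /v1/memory/collections/{collection}/search"""
    # ASSUMPTION: "query" is a hypothetical field name, not taken from the spec.
    body = {"query": query}
    if container_tag is not None:
        body["container_tag"] = container_tag
    path = f"/v1/memory/collections/{quote(collection, safe='')}/search"
    return build_request(api_key, "POST", path, body)


def retry_delay_seconds(status, headers, default=1.0):
    """Return how long to wait after a 429, or None for any other status.

    Only the delta-seconds form of Retry-After is parsed; an HTTP-date
    value (or a missing header) falls back to `default`.
    """
    if status != 429:
        return None
    try:
        return max(0.0, float(headers.get("Retry-After")))
    except (TypeError, ValueError):
        return default


if __name__ == "__main__":
    # "mv_..." is the placeholder key format from the Auth section.
    req = search_request("mv_...", "prefs", "favourite drink", container_tag="user-1")
    print(req.get_method(), req.full_url)
```

The functions only construct `urllib.request.Request` objects, so nothing is sent until a caller passes one to `urllib.request.urlopen`. Keeping construction separate from sending makes the header and path logic easy to check without a live key.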
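## Example: reciprocal-rank fusion (illustrative sketch)

Core concepts states that hybrid search fuses the vector and keyword rankings by reciprocal rank. As a rough illustration of that general technique (not RememberOS's actual implementation), the sketch below scores each result as the sum of `1 / (k + rank)` over the lists it appears in. The constant `k = 60` is the value commonly used in the literature, and the alphabetical tie-break is an arbitrary choice; neither is stated in this document.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one list, best first.

    rankings: iterable of lists, each ordered best-first.
    k: smoothing constant (ASSUMPTION: 60, the conventional default).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["a", "b", "c"]  # hypothetical ids from vector search
text_hits = ["b", "d", "a"]    # hypothetical ids from keyword search
fused = reciprocal_rank_fusion([vector_hits, text_hits])
print(fused)  # → ['b', 'a', 'd', 'c']
```

Items that rank well in both lists (`b`, `a`) rise above items found by only one retriever, which is why fusion tends to be more robust than either ranking alone.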