API Reference
Base URL https://rememberos.ai (or your own host). All
endpoints take Authorization: Bearer mv_… unless marked public. Errors
return {"detail": "…"} with 400/401/403/404 status codes.
Interactive explorer: browse every endpoint in the API Explorer (generated from the OpenAPI spec; raw spec at /openapi.json).
Rate limits. Requests are limited per API key, counted in a fixed one-minute
window with a generous quota. Every response carries RateLimit-Limit,
RateLimit-Remaining, and RateLimit-Reset (unix seconds); over the limit you get
429 with Retry-After. The limiter fails open, so a limiter outage never blocks traffic.
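The header contract above can be turned into a small client-side pause calculation. This is a minimal sketch, not an official client: the function name `seconds_to_wait` is hypothetical, and it assumes Retry-After carries a number of seconds rather than an HTTP date, which the reference does not specify.

```python
import time
from typing import Mapping, Optional


def seconds_to_wait(status: int, headers: Mapping[str, str],
                    now: Optional[float] = None) -> float:
    """Return how long to pause before the next request, using the documented headers."""
    now = time.time() if now is None else now
    # HTTP header names are case-insensitive, so normalise them first.
    h = {k.lower(): v for k, v in headers.items()}
    if status == 429 and "retry-after" in h:
        # Assumption: Retry-After is delta-seconds, not an HTTP date.
        return max(0.0, float(h["retry-after"]))
    if h.get("ratelimit-remaining") == "0" and "ratelimit-reset" in h:
        # RateLimit-Reset is documented as unix seconds.
        return max(0.0, float(h["ratelimit-reset"]) - now)
    return 0.0
```

A caller would sleep for the returned number of seconds and then retry; a result of 0.0 means there is no reason to wait.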
Collections · Memories · Search · Graph memory · Files & drop · Account & keys · Email · BYO config · Managed connectors · Agents & provenance · Intelligent memory · Admin · Misc
Collections
| Endpoint | Notes |
|---|---|
| GET /v1/memory/collections | list your collections with memory counts |
| POST /v1/memory/collections | create; body {"name", "description?", "metadata?"}. Collections are also auto-created on first write |
| GET /v1/memory/collections/{name} | one collection's details |
| PUT /v1/memory/collections/{name} | update description/metadata |
| DELETE /v1/memory/collections/{name} | delete the collection and all its memories + files |
Memories
Create
POST /v1/memory/collections/{collection}/memories
{"text": "…", // required, 1–10000 chars
"importance": 0.7, // 0–1
"category": "other",
"source": "api",
"metadata": {},
"container_tag": null} // optional sub-namespace
→ 201 {"id", "text", "importance", "category", "source", "metadata",
"images", "created_at", "updated_at"}
Everything else
| Endpoint | Notes |
|---|---|
| GET …/memories | list (newest first), ?limit= (≤200) and ?offset= for paging |
| GET …/memories/{id} | one memory (file URLs presigned) |
| GET …/memories/{id}/related | semantically related memories in the same collection: nearest neighbours of this memory by its own embedding, excluding itself, archived, and superseded rows (?limit=5&min_score=0.0) |
| PATCH …/memories/{id} | partial update of any of text (re-embeds), importance, category, metadata; write key required |
| POST …/memories/{id}/move | {"to_collection": "…"}; creates the target if needed |
| DELETE …/memories/{id} | removes the memory and its stored files |
| POST …/memories/bulk | array of up to 100 create-bodies, one embed batch |
| POST …/memories/upload | multipart: one memory with attached files |
| POST …/memories/folder | multipart: many files → one memory each (≤50) |
| POST …/memories/{id}/images | attach a file to an existing memory |
| DELETE …/memories/{id}/images/{index} | detach one file |
| POST …/forget | {"query", "min_score": 0.5, "limit": 10}; deletes semantic matches (destructive) |
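As an illustration of the two most common calls in this section, the sketch below builds a create request and walks the list endpoint with ?limit= and ?offset=. The helper names, the `fetch_page` callable, and the assumption that a list call yields a plain list of memories are all hypothetical; the reference does not document the list response shape.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator, List

BASE_URL = "https://rememberos.ai"  # from this reference; replace with your own host


def build_create_request(collection: str, text: str, api_key: str,
                         **fields) -> urllib.request.Request:
    """Build (but do not send) POST /v1/memory/collections/{collection}/memories."""
    if not 1 <= len(text) <= 10000:
        raise ValueError("text must be 1-10000 characters")
    name = urllib.parse.quote(collection, safe="")
    return urllib.request.Request(
        f"{BASE_URL}/v1/memory/collections/{name}/memories",
        data=json.dumps({"text": text, **fields}).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )


def iter_memories(fetch_page: Callable[[int, int], List[dict]],
                  page_size: int = 200) -> Iterator[dict]:
    """Yield every memory; fetch_page(limit, offset) is a caller-supplied GET wrapper."""
    if not 1 <= page_size <= 200:
        raise ValueError("limit is capped at 200")
    offset = 0
    while True:
        page = fetch_page(page_size, offset)
        yield from page
        if len(page) < page_size:  # a short page means the listing is exhausted
            return
        offset += len(page)
```

Passing the built request to `urllib.request.urlopen` sends it. Keeping the paging loop independent of the transport makes it easy to test against an in-memory list.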
Search
POST /v1/memory/collections/{collection}/search # one collection
POST /v1/memory/search # all collections
{"query": "…", // required
"limit": 5, // 1–50
"min_score": 0.3, // vector threshold
"mode": "hybrid", // hybrid | vector | text
"category": null,
"metadata_filter": null, // JSONB containment
"container_tag": null,
"rewrite": false, // optional LLM query expansion (+1 LLM call)
"rerank": false} // optional LLM re-ranking of top results (+1 LLM call)
→ {"results": [{"id","text","category","importance","score","source",
"metadata","images","created_at","collection?"}],
"query_embedding_ms": 12.3, "search_ms": 4.5, "cached": false}
Results exclude superseded and expired memories. Exact-repeat queries return from the cache (~1 ms) until the collection changes.
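The search body above has a few documented bounds that are cheap to check before sending. The following is a minimal sketch with a hypothetical helper name, `build_search_body`; it only mirrors the defaults and ranges listed in this section and omits optional fields left as None.

```python
from typing import Any, Dict

MODES = {"hybrid", "vector", "text"}


def build_search_body(query: str, limit: int = 5, min_score: float = 0.3,
                      mode: str = "hybrid", **optional: Any) -> Dict[str, Any]:
    """Assemble a search request body, enforcing the documented limits."""
    if not query:
        raise ValueError("query is required")
    if not 1 <= limit <= 50:
        raise ValueError("limit must be between 1 and 50")
    if mode not in MODES:
        raise ValueError("mode must be hybrid, vector, or text")
    body: Dict[str, Any] = {"query": query, "limit": limit,
                            "min_score": min_score, "mode": mode}
    # Optional fields (category, metadata_filter, container_tag, rewrite, rerank)
    # are only included when the caller sets them.
    body.update({k: v for k, v in optional.items() if v is not None})
    return body
```

The same body works for both the single-collection and the all-collections endpoint, since they differ only in the URL.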
Graph memory
| Endpoint | Notes |
|---|---|
| POST …/remember | {"content", "importance?"} → 202 {"job_id"}; add ?sync=true for inline {"created", "memories":[{id,text,type,relation}]} |
| GET /v1/memory/extract/job/{job_id} | {"state": "queued\|processing\|done\|failed", "facts", "complete"} |
| GET …/profile | {"profile": "…", "based_on": N}, an LLM summary of current facts + preferences |
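Because the default remember call returns 202 with a job_id, a client usually polls the job endpoint until the state is terminal. This is a sketch under stated assumptions: `get_job` is a hypothetical callable that performs GET /v1/memory/extract/job/{job_id} and returns the parsed JSON, and the interval and timeout values are arbitrary.

```python
import time
from typing import Callable


def wait_for_job(get_job: Callable[[], dict], interval: float = 1.0,
                 timeout: float = 60.0,
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until the job state is "done" or "failed", or raise on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        job = get_job()
        if job.get("state") in ("done", "failed"):
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError("extraction job did not finish in time")
        sleep(interval)
```

Injecting `sleep` keeps the loop testable without real delays. For short inputs, ?sync=true avoids polling altogether.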
Files & drop
| Endpoint | Notes |
|---|---|
| POST …/drop | multipart files → extract/transcribe/caption, chunk, embed. async_ingest=true → 202 {"batch_id","queued","failed"} |
| POST …/drop/presign | array of {"filename","content_type","size"} → presigned PUT URL per file (oversize flagged) |
| POST …/drop/complete | array of {"key","filename","mime","size"} after the PUTs → 202 enqueue; keys validated against your tenant namespace |
| GET /v1/memory/ingest/batch/{batch_id} | batch progress {"total","done","failed","pending","complete"} |
Per-file limit 10 MB.
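Before calling the presign endpoint it can help to build the request array locally and set aside files that exceed the per-file limit. This is a rough sketch: `build_presign_body` is a hypothetical helper, and it assumes "10 MB" means 10 × 1024 × 1024 bytes, which the reference does not state.

```python
import mimetypes
from typing import Dict, List, Tuple

MAX_BYTES = 10 * 1024 * 1024  # assumption: 10 MB interpreted as 10 MiB


def build_presign_body(files: Dict[str, int]) -> Tuple[List[dict], List[str]]:
    """Split {filename: size_in_bytes} into presign entries and oversize filenames."""
    accepted: List[dict] = []
    oversize: List[str] = []
    for filename, size in files.items():
        if size > MAX_BYTES:
            oversize.append(filename)
            continue
        # Fall back to a generic type when the extension is unknown.
        content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
        accepted.append({"filename": filename, "content_type": content_type, "size": size})
    return accepted, oversize
```

The first list is the body for POST …/drop/presign; the server still flags oversize files itself, so this check only saves a round trip.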
Account & keys
| Endpoint | Notes |
|---|---|
| GET /v1/memory/admin/me | your tenant: name, email, tier, limits |
| GET /v1/memory/admin/stats | your usage: collections, memories, storage size, added_7d/added_30d growth counts |
| POST /v1/memory/admin/rotate-key | new key returned once; old key invalid immediately |
| GET /v1/memory/admin/export | GDPR export: your full account (tenant, collections, memories) as a JSON download. Text-only (no vendor-specific vectors), so it re-imports anywhere |
| POST /v1/memory/admin/import | portability: re-ingest an export dump, {memories:[{text, collection?, …}], collections?, into_collection?}. Text is re-embedded with the active model (knowledge survives a model/vendor change), collections are created as needed, and ids are reassigned. Returns {imported, skipped, collections} |
| POST /v1/memory/admin/delete-account | GDPR erasure: {"confirm": "<tenant name>"}; irreversibly deletes the account, all memories, and stored files |
| POST /v1/memory/recover-key | public; {"email"} sends a one-time reveal link |
| GET /v1/memory/reveal/{token} | public; one-time key reveal (used by the email link) |
Email drop
| Endpoint | Notes |
|---|---|
| GET /v1/memory/email/address | your private drop address |
| POST /v1/memory/email/address/rotate | new address; old stops working |
| GET / PUT /v1/memory/email/senders | sender allowlist |
BYO configuration
Each follows the same shape: GET current (secrets never returned),
PUT validates with a live self-test before saving (credentials encrypted at
rest), DELETE reverts to the platform default.
| Config | Endpoints | PUT body |
|---|---|---|
| Object storage | /v1/memory/storage/config | {"backend":"s3","endpoint_url","bucket","region","access_key","secret_key"} |
| Database | /v1/memory/database/config | your Postgres DSN |
| Extraction LLM | /v1/memory/extraction/config | {"provider":"custom","base_url","model","api_key"} — any OpenAI-compatible endpoint |
Managed connectors
A managed connector is a connection RememberOS syncs for you server-side — you store the source's credentials once and RememberOS keeps a collection in sync (no code to run). This is distinct from the self-host connector recipes, which you run yourself via dlt. Credentials are encrypted at rest (Fernet) and never returned by the API.
| Endpoint | Does |
|---|---|
| POST /v1/memory/connections | create a connection: {provider, name?, collection, container_tag?, config, credentials}; provider ∈ notion/slack/gdrive/postgres/gmail/linear/confluence/hubspot; returns the connection (no credentials) |
| GET /v1/memory/connections | list your connections (no credentials) |
| GET /v1/memory/connections/{id} | one connection |
| DELETE /v1/memory/connections/{id} | remove a connection |
| POST /v1/memory/connections/{id}/sync | sync now; returns {synced, cursor, status} |
| GET /v1/memory/connections/{id}/runs | recent sync runs (newest first): each {status, synced_count, error, created_at}, giving observability into what synced and any failures (?limit=20) |
How sync works. RememberOS fetches records newer than the connection's stored cursor
from the provider (using the encrypted credentials), embeds and stores them in the
connection's collection (scoped by its container_tag), and advances
the cursor — so re-syncs are incremental. A failed sync sets status="error" with
last_error and never affects other connections.
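The incremental behaviour described above can be modelled in a few lines. This is a toy simulation, not RememberOS's implementation: the `Connection` class, the integer `updated_at` cursor, and the `fetch_since` callable are hypothetical stand-ins for the provider fetch, and embedding is reduced to appending text to a list.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Connection:
    cursor: int = 0                      # assumption: cursor is a timestamp-like integer
    status: str = "active"
    last_error: Optional[str] = None
    stored: List[str] = field(default_factory=list)


def sync_once(conn: Connection, fetch_since: Callable[[int], List[dict]]) -> int:
    """Fetch records newer than the cursor, store them, and advance the cursor."""
    try:
        records = fetch_since(conn.cursor)
    except Exception as exc:
        # A failed sync only marks this connection; the cursor stays where it was.
        conn.status, conn.last_error = "error", str(exc)
        return 0
    for record in sorted(records, key=lambda r: r["updated_at"]):
        conn.stored.append(record["text"])
        conn.cursor = max(conn.cursor, record["updated_at"])
    conn.status, conn.last_error = "active", None
    return len(records)
```

Because the cursor only moves after a successful fetch, running the sync again with no new records stores nothing, which is what makes re-syncs incremental.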
Automatic sync. Active connections also sync on their own: a background scheduler
re-syncs each due connection (default every 60 min; set config.sync_interval_minutes
to override). POST …/sync triggers one immediately.
Auto-pause. After 5 consecutive failed syncs a connection is set to
status="paused" so the scheduler stops retrying a broken source (bad credentials,
revoked token). It still appears in your connections; trigger POST …/sync manually
once the cause is fixed — a successful run flips it back to active.
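The scheduling and auto-pause rules above amount to a small state machine. The sketch below is an illustration only: the function names and the `failures` counter field are hypothetical, while the threshold of 5 and the 60-minute default come from the text.

```python
PAUSE_AFTER = 5               # consecutive failures before a connection is paused
DEFAULT_INTERVAL_MIN = 60     # default sync interval in minutes


def record_result(state: dict, ok: bool) -> dict:
    """Return the next connection state after one sync attempt."""
    if ok:
        # Any successful run resets the counter and reactivates the connection.
        return {**state, "status": "active", "failures": 0}
    failures = state.get("failures", 0) + 1
    status = "paused" if failures >= PAUSE_AFTER else "error"
    return {**state, "status": status, "failures": failures}


def is_due(state: dict, minutes_since_last_sync: float, config: dict) -> bool:
    """Decide whether the background scheduler should sync this connection now."""
    if state["status"] == "paused":
        return False  # paused connections only run via a manual POST …/sync
    interval = config.get("sync_interval_minutes", DEFAULT_INTERVAL_MIN)
    return minutes_since_last_sync >= interval
```

Four failures leave the connection in "error" and still scheduled; the fifth pauses it until a manual sync succeeds.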
Server-syncable providers today:
| Provider | Setup and sync behaviour |
|---|---|
| Slack | credentials.token + config.channel_id |
| Notion | credentials.token; optional config.database_id to sync one database instead of every shared page. Pages are walked newest-edited first using the Notion API, version 2022-06-28 |
| Google Drive | credentials.token = a Google OAuth access token; optional config.folder_id or config.query. Files are walked newest-modified first and native Google Docs are exported to text |
| Linear | credentials.token = a Linear API key. Issues are pulled newest-updated first via the GraphQL API, with title, description, state, assignee, team, and labels |
| Confluence | Cloud: credentials.email + credentials.api_token, config.base_url, optional config.space_key. Pages are walked newest-modified first and storage-format bodies are rendered to text |
| Gmail | credentials.token = a Google OAuth access token with the gmail.readonly scope; optional config.query. Messages are walked newest-first, subject/from + plain-text body |
| HubSpot | credentials.token = a private-app token. CRM contacts are walked newest-modified first with name, title, company, and lifecycle stage |
Postgres remains a self-host recipe while its managed sync rolls out.
In the dashboard. The dashboard has a Connections panel — add a connection, trigger a sync, watch its status and last error, and remove it, without writing any code.
Connect via OAuth (when configured). Instead of pasting a token, a tenant can
authorize an OAuth app: GET /v1/memory/connections/oauth/{provider}/start?collection=&name=
returns an authorize_url to send the user to; after they approve, the provider
redirects back to GET /v1/memory/connections/oauth/{provider}/callback, which
exchanges the code, stores the encrypted token as a connection, and bounces to the dashboard.
Available for notion, slack, Google Drive, and
Gmail once the operator sets that provider's OAuth app credentials; otherwise
start returns 503 and pasting a token still works. Google Drive and Gmail use
offline access, so RememberOS stores a refresh token and renews the access token automatically
before each sync. The
dashboard Connections panel shows a Connect with OAuth
button for those providers that drives this flow for you.
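For clients that drive the OAuth flow themselves, the start URL can be assembled as below. This is a hedged sketch: `oauth_start_url` is a hypothetical helper, and the provider identifiers `gdrive` and `gmail` are assumed to match the values listed for the connections endpoint.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://rememberos.ai"  # from this reference; replace with your own host
# Assumption: path identifiers match the provider values used by POST /v1/memory/connections.
OAUTH_PROVIDERS = {"notion", "slack", "gdrive", "gmail"}


def oauth_start_url(provider: str, collection: str, name: str = "") -> str:
    """Build the GET URL whose response contains authorize_url."""
    if provider not in OAUTH_PROVIDERS:
        raise ValueError(f"OAuth is not documented for provider {provider!r}")
    query = urlencode({"collection": collection, "name": name})
    return f"{BASE_URL}/v1/memory/connections/oauth/{quote(provider)}/start?{query}"
```

A 503 from this URL means the operator has not configured that provider's OAuth app, in which case pasting a token remains the fallback.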
Agent identities & provenance
Each API key is an agent identity (give it a label when you create it under
keys). Multiple agents can share one tenant's collections — and
every memory records which agent wrote it: writes (add, bulk,
connectors) stamp created_by = {id, label}, returned on the memory. This is
the foundation for shared, multi-agent memory — see who contributed what in a common
collection.
| Endpoint | Does |
|---|---|
| GET /v1/memory/agents | list this tenant's agent identities (id, label, read_only, created_at); no key material |
Access control (shared vs private collections)
By default a collection is open: every agent of the tenant can read and write it.
Add a grant and the collection becomes restricted — only granted agents may
access it, with per-agent read/write. So "Customer Support" can be shared across several
agents while "Personal Memory" stays private to one. Writes/reads by a disallowed agent get
403; restricted collections are also excluded from that agent's tenant-wide
search.
| Endpoint | Does |
|---|---|
| PUT /v1/memory/collections/{name}/grants | {agent_key_id, can_read, can_write}; grant/update an agent (first grant restricts the collection) |
| GET /v1/memory/collections/{name}/grants | list grants (restricted flag + per-agent perms); empty = open |
| DELETE /v1/memory/collections/{name}/grants/{agent_key_id} | revoke; removing the last grant reopens the collection |
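The open versus restricted rule above reduces to a short decision function. This is a model for reasoning about the behaviour, not the server's code: `can_access` and the in-memory `grants` mapping are hypothetical, and read-only keys are not modelled here.

```python
from typing import Dict


def can_access(grants: Dict[str, dict], agent_key_id: str, write: bool = False) -> bool:
    """grants maps agent_key_id to {"can_read": bool, "can_write": bool}."""
    if not grants:
        return True   # no grants: the collection is open to every agent of the tenant
    perms = grants.get(agent_key_id)
    if perms is None:
        return False  # restricted collection and this agent has no grant (403)
    return perms["can_write"] if write else perms["can_read"]
```

For example, with a single grant of `{"can_read": True, "can_write": False}` for one agent, that agent can read but not write, and every other agent is denied both.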
Intelligent memory
RememberOS doesn't just store memory — it keeps it clean. Dedup finds near-identical memories in a collection (cosine similarity over the most recent 500) and consolidates each cluster to one keeper (highest importance, then newest).
| Endpoint | Does |
|---|---|
| POST /v1/memory/collections/{name}/dedup | {threshold=0.95, dry_run=true}. A dry run returns the duplicate clusters (keeper + duplicates) without deleting; dry_run=false consolidates and returns deleted |
| GET /v1/memory/collections/{name}/contradictions | where the current truth changed: each {current, superseded} pair from graph supersession (remember 'updates'), newest first |
| POST /v1/memory/collections/{name}/memories/{id}/pin | {pinned: true\|false} pins or unpins a memory; pinned memories are always kept by dedup (never consolidated away) and rank first in listings. pinned appears on every memory |
| POST /v1/memory/collections/{name}/memories/{id}/archive | {archived: true\|false} soft-archives (or restores) a memory. Archived memories drop out of default search and listing but are kept and restorable; archived appears on every memory |
| POST /v1/memory/collections/{name}/archive-stale | {older_than_days=90, dry_run=true}. Bulk-archives memories not updated in N days, skipping pinned and already-archived ones. Dry run returns would_archive; dry_run=false archives and returns archived. Staleness uses updated_at (no per-read tracking) |
| GET /v1/memory/collections/{name}/health | {stale_days=90}. A one-shot health snapshot for the collection: total, current, archived, pinned, stale (unpinned, not-archived, older than stale_days), expired, supersessions, a categories breakdown, and needs_attention = stale + expired + supersessions. Cheap COUNT aggregates; {exists:false} for an unknown collection |
| GET /v1/memory/collections/{name}/summary | {limit=50}. A board-report briefing: {summary, themes, based_on, generated_at} from the top limit memories (by importance + recency) via one bounded, cached LLM call (cached by collection version; writes invalidate). Notes notable supersession changes. Degrades to an extractive summary when no LLM is configured and never errors |
Archived memories are excluded from recall by default; pass
include_archived: true on search or ?include_archived=true when listing
to surface them. More intelligent-memory ops (collection summaries) are rolling out.
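To make the dedup rule in this section concrete, here is a minimal sketch of a dry run: cosine similarity against a threshold, then one keeper per cluster chosen by highest importance and then newest. The greedy clustering against each cluster's first member is an assumption made for illustration; the reference does not say how clusters are formed, and the field names are hypothetical.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def dedup_plan(memories: List[dict], threshold: float = 0.95) -> List[dict]:
    """Return [{"keeper": id, "duplicates": [ids]}] without deleting anything."""
    clusters: List[List[dict]] = []
    for m in memories:
        for cluster in clusters:
            # Assumption: compare against the first member of each cluster.
            if cosine(m["embedding"], cluster[0]["embedding"]) >= threshold:
                cluster.append(m)
                break
        else:
            clusters.append([m])
    plan = []
    for cluster in clusters:
        if len(cluster) < 2:
            continue
        keeper = max(cluster, key=lambda m: (m["importance"], m["created_at"]))
        # Pinned memories are never consolidated away, so they are not listed as duplicates.
        duplicates = [m["id"] for m in cluster if m is not keeper and not m.get("pinned")]
        plan.append({"keeper": keeper["id"], "duplicates": duplicates})
    return plan
```

This mirrors the shape of a dry_run=true response (keeper plus duplicates); the real endpoint limits itself to the most recent 500 memories.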
Operator endpoints
Require an admin-flagged tenant; aggregates only, tenant isolation intact.
| Endpoint | Notes |
|---|---|
| GET /v1/memory/admin/metrics | per-route request counts, 4xx/5xx, p50/p95 latency, plus per-stage latency (embed, cache_get, search, llm_extract) under stages and event counters (search_cache_hit/search_cache_miss) under counters (?reset=true) |
| GET /v1/memory/admin/audit | your tenant's audit log (writes + sensitive actions), newest first (?days=30&limit=200) |
| GET /v1/memory/admin/usage | per-tenant + total storage (db/files), LLM token usage by purpose, and cost estimates (?days=30) |
| GET /v1/memory/admin/connectors | connector-sync health across all tenants: per-provider {runs, errors, last_run} over a window (?hours=24) |
| GET /v1/memory/admin/connectors/status | connection counts by status across all tenants, [{status, n}], so paused/erroring connectors surface at a glance |
| GET /v1/memory/admin/memory-health | fleet-wide memory hygiene across all collections: {total, current, archived, stale, expired, supersessions, needs_attention}. SECURITY DEFINER cross-tenant aggregate (cheap COUNTs); the operator companion to the per-collection /health |
| GET /v1/memory/admin/analytics | signup→activation funnel, signups-by-day, cookieless pageviews by day/path/referrer (?days=7) |
Misc
| Endpoint | Notes |
|---|---|
| GET /v1/memory/health | public liveness: {"status":"healthy"} |
| POST /v1/proxy/chat/completions | OpenAI-compatible chat with automatic memory: relevant memories injected as a system message, response returned with a longmem extension ({memories_used, stored}); store (default true) queues the user message for graph extraction. Body: messages, model?, collection?, container_tag?, memory_limit (5), store, temperature?, max_tokens?. Non-streaming v1. |
| POST /v1/mcp | MCP JSON-RPC; see MCP |
| GET /v1/status | public service status (no auth): current state, 24h uptime %, recent health checks; no tenant data |
| POST /v1/pulse | public, cookieless pageview beacon (no PII); /v1/track is a legacy alias |
| POST /v1/memory/signup | public, card-free free-tier signup: {email} creates a free tenant and emails a one-time reveal link with the key (rate-limited; one per email) |
| GET / POST / DELETE /v1/memory/keys | manage your API keys: list (masked, no secret), create (returns the new key once; {label, read_only}), revoke (soft; can't revoke your last active key). Read-only keys can search but not write. |
| GET / PUT /v1/memory/spend-cap | per-tenant monthly USD cap on platform LLM spend; PUT {monthly_usd} (null clears). Over the cap, /remember stores the raw memory but skips graph extraction (BYOK tenants are never capped) |
| GET / PUT / DELETE /v1/memory/webhook/config | outbound webhooks: PUT {url} generates the signing secret (shown once, stored encrypted); events memory.created/updated/deleted are POSTed with X-RememberOS-Signature: hmac-sha256(secret, body); best-effort, 5s timeout |
| POST /v1/memory/stripe/checkout/create, POST /v1/memory/stripe/portal | billing: create a checkout session / open the Stripe portal |
Endpoints not listed here (/_internal/embed, /stripe/webhook,
/email/inbound) are internal: secured by shared secrets or signatures, not
for client use.
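Receivers of the outbound webhooks listed in the table above will want to check X-RememberOS-Signature before trusting a payload. The sketch below assumes the header carries the lowercase hex digest of HMAC-SHA256 over the raw request body with no prefix; the reference only gives hmac-sha256(secret, body), so confirm the encoding against a real delivery. The name `verify_signature` is hypothetical.

```python
import hashlib
import hmac


def verify_signature(secret: str, body: bytes, header_value: str) -> bool:
    """Check a webhook delivery against the signing secret shown once at PUT time."""
    # Assumption: the signature is a hex digest of the raw, unparsed body.
    expected = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(expected.encode("utf-8"), header_value.encode("utf-8"))
```

Verify against the exact bytes received, before any JSON parsing or re-serialisation, since even whitespace changes alter the digest.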