API Reference

Base URL https://rememberos.ai (or your own host). All endpoints take Authorization: Bearer mv_… unless marked public. Errors return {"detail": "…"} with 400/401/403/404 status codes.

Interactive explorer: browse every endpoint in the API Explorer (generated from the OpenAPI spec; raw spec at /openapi.json).

Rate limits. Requests are limited per API key (a generous fixed window per minute). Every response carries RateLimit-Limit, RateLimit-Remaining, and RateLimit-Reset (unix seconds); over the limit you get 429 with Retry-After. The limiter fails open, so an outage never blocks traffic.
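A client can use these headers to pace itself. The following is a minimal sketch, not an official client: the helper name wait_seconds is hypothetical, and it assumes Retry-After carries a number of seconds (the HTTP-date form is not handled).

```python
import time


def wait_seconds(status, headers, now=None):
    """Return how long to pause before the next request (0.0 if no pause is needed)."""
    now = time.time() if now is None else now
    # Header names are case-insensitive, so normalise them first.
    h = {k.lower(): v for k, v in headers.items()}
    # Over the limit: the server says how long to wait.
    if status == 429 and "retry-after" in h:
        return max(0.0, float(h["retry-after"]))
    # Window exhausted: wait until the reset time (unix seconds).
    if h.get("ratelimit-remaining") == "0" and "ratelimit-reset" in h:
        return max(0.0, float(h["ratelimit-reset"]) - now)
    return 0.0
```

Calling time.sleep(wait_seconds(status, headers)) before a retry is usually enough for a single-threaded client.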

Collections · Memories · Search · Graph memory · Files & drop · Account & keys · Email · BYO config · Managed connectors · Agents & provenance · Intelligent memory · Admin · Misc

Collections

Endpoint | Notes
GET /v1/memory/collections | list your collections with memory counts
POST /v1/memory/collections | create; body {"name", "description?", "metadata?"}. Collections are also auto-created on first write
GET /v1/memory/collections/{name} | one collection's details
PUT /v1/memory/collections/{name} | update description/metadata
DELETE /v1/memory/collections/{name} | delete the collection and all its memories + files

Memories

Create

POST /v1/memory/collections/{collection}/memories
{"text": "…",                  // required, 1–10000 chars
 "importance": 0.7,            // 0–1
 "category": "other",
 "source": "api",
 "metadata": {},
 "container_tag": null}        // optional sub-namespace
→ 201 {"id", "text", "importance", "category", "source", "metadata",
        "images", "created_at", "updated_at"}

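As an illustration, the create call could be built with the Python standard library as below. This is a sketch under stated assumptions: build_create_memory is a hypothetical helper, the local validation only mirrors the limits listed above, and the collection name is URL-quoted as a precaution.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://rememberos.ai"  # or your own host


def build_create_memory(api_key, collection, text, importance=0.7, category="other"):
    """Build (but do not send) a POST …/memories request."""
    if not 1 <= len(text) <= 10000:
        raise ValueError("text must be 1-10000 characters")
    if not 0 <= importance <= 1:
        raise ValueError("importance must be between 0 and 1")
    body = {"text": text, "importance": importance,
            "category": category, "source": "api", "metadata": {}}
    name = urllib.parse.quote(collection, safe="")
    return urllib.request.Request(
        f"{BASE_URL}/v1/memory/collections/{name}/memories",
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )
```

To send it, pass the request to urllib.request.urlopen and parse the 201 response body with json.load.
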
Everything else

Endpoint | Notes
GET …/memories | list (newest first); ?limit= (≤200) and ?offset= for paging
GET …/memories/{id} | one memory (file URLs presigned)
GET …/memories/{id}/related | semantically related memories in the same collection: nearest neighbours of this memory by its own embedding, excluding itself, archived, and superseded rows (?limit=5&min_score=0.0)
PATCH …/memories/{id} | partial update of any of text (re-embeds), importance, category, metadata; write key required
POST …/memories/{id}/move | {"to_collection": "…"}; creates the target if needed
DELETE …/memories/{id} | removes the memory and its stored files
POST …/memories/bulk | array of up to 100 create-bodies, one embed batch
POST …/memories/upload | multipart: one memory with attached files
POST …/memories/folder | multipart: many files → one memory each (≤50)
POST …/memories/{id}/images | attach a file to an existing memory
DELETE …/memories/{id}/images/{index} | detach one file
POST …/forget | {"query", "min_score": 0.5, "limit": 10}; deletes semantic matches; destructive
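
Paging through a listing with ?limit= and ?offset= might look like the sketch below. The helper iter_memories and the fetch_page callable are hypothetical; the reference does not spell out the list response envelope, so fetch_page is assumed to return a plain list of memories for one page.

```python
def iter_memories(fetch_page, page_size=200):
    """Yield every memory by walking ?limit=/?offset= pages.

    fetch_page(limit=..., offset=...) is supplied by the caller and is assumed
    to return a list of memories for that page (an assumption, see above).
    """
    if not 1 <= page_size <= 200:
        raise ValueError("page_size must be 1-200")
    offset = 0
    while True:
        page = fetch_page(limit=page_size, offset=offset)
        yield from page
        if len(page) < page_size:  # a short page means we reached the end
            return
        offset += len(page)
```

Because listings are newest first, memories written while you page can shift offsets; treat the result as a snapshot that may contain repeats.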

Search

POST /v1/memory/collections/{collection}/search     # one collection
POST /v1/memory/search                              # all collections
{"query": "…",                // required
 "limit": 5,                  // 1–50
 "min_score": 0.3,            // vector threshold
 "mode": "hybrid",            // hybrid | vector | text
 "category": null,
 "metadata_filter": null,     // JSONB containment
 "container_tag": null,
 "rewrite": false,            // optional LLM query expansion (+1 LLM call)
 "rerank": false}             // optional LLM re-ranking of top results (+1 LLM call)
→ {"results": [{"id","text","category","importance","score","source",
                "metadata","images","created_at","collection?"}],
   "query_embedding_ms": 12.3, "search_ms": 4.5, "cached": false}

Results exclude superseded and expired memories. Exact-repeat queries return from the cache (~1 ms) until the collection changes.
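
A search request could be assembled as follows. This is a sketch only: build_search is a hypothetical helper, it sends just a subset of the documented fields, and the range checks simply mirror the body shown above.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://rememberos.ai"  # or your own host


def build_search(api_key, query, collection=None, limit=5, min_score=0.3, mode="hybrid"):
    """Build (but do not send) a search request for one or all collections."""
    if not 1 <= limit <= 50:
        raise ValueError("limit must be 1-50")
    if mode not in ("hybrid", "vector", "text"):
        raise ValueError("mode must be hybrid, vector, or text")
    path = "/v1/memory/search"  # all collections
    if collection is not None:
        name = urllib.parse.quote(collection, safe="")
        path = f"/v1/memory/collections/{name}/search"  # one collection
    body = {"query": query, "limit": limit, "min_score": min_score, "mode": mode}
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )
```

Leaving collection unset targets the tenant-wide endpoint, whose results carry the optional collection field.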

Graph memory

Endpoint | Notes
POST …/remember | {"content", "importance?"} → 202 {"job_id"}; add ?sync=true for inline {"created", "memories":[{id,text,type,relation}]}
GET /v1/memory/extract/job/{job_id} | {"state": "queued|processing|done|failed", "facts", "complete"}
GET …/profile | {"profile": "…", "based_on": N}: LLM summary of current facts + preferences
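
When /remember returns 202, the job endpoint can be polled until it reaches a terminal state. The sketch below is illustrative: wait_for_job and the fetch_job callable are hypothetical, and fetch_job is assumed to return the parsed JSON of GET /v1/memory/extract/job/{job_id}.

```python
import time


def wait_for_job(fetch_job, job_id, interval=1.0, timeout=60.0, sleep=time.sleep):
    """Poll an extraction job until its state is "done" or "failed"."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job(job_id)
        if job["state"] in ("done", "failed"):  # terminal states
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {job['state']} after {timeout}s")
        sleep(interval)  # "queued" or "processing": try again shortly
```

For short inputs, ?sync=true avoids the polling loop entirely at the cost of a slower request.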

Files & drop

Endpoint | Notes
POST …/drop | multipart files → extract/transcribe/caption, chunk, embed. async_ingest=true → 202 {"batch_id","queued","failed"}
POST …/drop/presign | array of {"filename","content_type","size"} → presigned PUT URL per file (oversize flagged)
POST …/drop/complete | array of {"key","filename","mime","size"} after the PUTs → 202 enqueue; keys validated against your tenant namespace
GET /v1/memory/ingest/batch/{batch_id} | batch progress {"total","done","failed","pending","complete"}

Per-file limit 10 MB.
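
Before calling …/drop/presign, a client can prepare the request array and set aside files that would exceed the limit. This is a rough sketch: build_presign_body is a hypothetical helper, and reading "10 MB" as 10 × 1024 × 1024 bytes is an assumption (the server remains the authority and flags oversize files itself).

```python
import mimetypes

MAX_BYTES = 10 * 1024 * 1024  # assumption: "10 MB" read as 10 MiB


def build_presign_body(files):
    """Split (filename, size) pairs into a presign request body and oversize names."""
    body, oversize = [], []
    for filename, size in files:
        if size > MAX_BYTES:
            oversize.append(filename)
            continue
        content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
        body.append({"filename": filename, "content_type": content_type, "size": size})
    return body, oversize
```

After uploading each file to its presigned URL, the keys are reported to …/drop/complete and progress is read from the batch endpoint.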

Account & keys

Endpoint | Notes
GET /v1/memory/admin/me | your tenant: name, email, tier, limits
GET /v1/memory/admin/stats | your usage: collections, memories, storage size, added_7d/added_30d growth counts
POST /v1/memory/admin/rotate-key | new key returned once; old key invalid immediately
GET /v1/memory/admin/export | GDPR export: your full account (tenant, collections, memories) as a JSON download. Text-only (no vendor-specific vectors), so it re-imports anywhere
POST /v1/memory/admin/import | portability: re-ingest an export dump, {memories:[{text, collection?, …}], collections?, into_collection?}. Text is re-embedded with the active model (knowledge survives a model/vendor change), collections created as needed, ids reassigned. Returns {imported, skipped, collections}
POST /v1/memory/admin/delete-account | GDPR erasure: {"confirm": "<tenant name>"}; irreversibly deletes the account, all memories, and stored files
POST /v1/memory/recover-key | public; {"email"}; sends a one-time reveal link
GET /v1/memory/reveal/{token} | public; one-time key reveal (used by the email link)

Email drop

Endpoint | Notes
GET /v1/memory/email/address | your private drop address
POST /v1/memory/email/address/rotate | new address; old stops working
GET / PUT /v1/memory/email/senders | sender allowlist

BYO configuration

Each follows the same shape: GET current (secrets never returned), PUT validates with a live self-test before saving (credentials encrypted at rest), DELETE reverts to the platform default.

Config | Endpoints | PUT body
Object storage | /v1/memory/storage/config | {"backend":"s3","endpoint_url","bucket","region","access_key","secret_key"}
Database | /v1/memory/database/config | your Postgres DSN
Extraction LLM | /v1/memory/extraction/config | {"provider":"custom","base_url","model","api_key"}; any OpenAI-compatible endpoint

Managed connectors

A managed connector is a connection RememberOS syncs for you server-side — you store the source's credentials once and RememberOS keeps a collection in sync (no code to run). This is distinct from the self-host connector recipes, which you run yourself via dlt. Credentials are encrypted at rest (Fernet) and never returned by the API.

Endpoint | Does
POST /v1/memory/connections | create a connection: {provider, name?, collection, container_tag?, config, credentials}; provider ∈ notion/slack/gdrive/postgres/gmail/linear/confluence/hubspot; returns the connection (no credentials)
GET /v1/memory/connections | list your connections (no credentials)
GET /v1/memory/connections/{id} | one connection
DELETE /v1/memory/connections/{id} | remove a connection
POST /v1/memory/connections/{id}/sync | sync now; returns {synced, cursor, status}
GET /v1/memory/connections/{id}/runs | recent sync runs (newest first): each {status, synced_count, error, created_at}, giving observability into what synced and any failures (?limit=20)

How sync works. RememberOS fetches records newer than the connection's stored cursor from the provider (using the encrypted credentials), embeds and stores them in the connection's collection (scoped by its container_tag), and advances the cursor — so re-syncs are incremental. A failed sync sets status="error" with last_error and never affects other connections.

Automatic sync. Active connections also sync on their own: a background scheduler re-syncs each due connection (default every 60 min; set config.sync_interval_minutes to override). POST …/sync triggers one immediately.

Auto-pause. After 5 consecutive failed syncs a connection is set to status="paused" so the scheduler stops retrying a broken source (bad credentials, revoked token). It still appears in your connections; trigger POST …/sync manually once the cause is fixed — a successful run flips it back to active.

Server-syncable providers today:
Slack: credentials.token + config.channel_id.
Notion: credentials.token; optional config.database_id to sync one database instead of every shared page. Pages are walked newest-edited first using the Notion API, version 2022-06-28.
Google Drive: credentials.token = a Google OAuth access token; optional config.folder_id or config.query. Files are walked newest-modified first and native Google Docs are exported to text.
Linear: credentials.token = a Linear API key. Issues are pulled newest-updated first via the GraphQL API, with title, description, state, assignee, team, and labels.
Confluence (Cloud): credentials.email + credentials.api_token, config.base_url, optional config.space_key. Pages are walked newest-modified first and storage-format bodies are rendered to text.
Gmail: credentials.token = a Google OAuth access token with the gmail.readonly scope; optional config.query. Messages are walked newest-first, with subject/from + plain-text body.
HubSpot: credentials.token = a private-app token. CRM contacts are walked newest-modified first with name, title, company, and lifecycle stage.

Postgres remains a self-host recipe while its managed sync rolls out.
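
For example, the body for a Slack connection could be assembled like this. It is a sketch rather than an official client: slack_connection_body is a hypothetical helper, and the field names are the ones listed for POST /v1/memory/connections and the Slack provider above.

```python
def slack_connection_body(token, channel_id, collection, name=None,
                          container_tag=None, sync_interval_minutes=None):
    """Build the JSON body for POST /v1/memory/connections with provider "slack"."""
    config = {"channel_id": channel_id}
    if sync_interval_minutes is not None:
        # Overrides the scheduler's default interval for this connection.
        config["sync_interval_minutes"] = sync_interval_minutes
    body = {
        "provider": "slack",
        "collection": collection,
        "config": config,
        "credentials": {"token": token},  # stored encrypted, never returned
    }
    if name is not None:
        body["name"] = name
    if container_tag is not None:
        body["container_tag"] = container_tag
    return body
```

POST the result as JSON with your usual Authorization header, then call POST …/sync on the returned connection to run the first sync immediately.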

In the dashboard. The dashboard has a Connections panel — add a connection, trigger a sync, watch its status and last error, and remove it, without writing any code.

Connect via OAuth (when configured). Instead of pasting a token, a tenant can authorize an OAuth app: GET /v1/memory/connections/oauth/{provider}/start?collection=&name= returns an authorize_url to send the user to; after they approve, the provider redirects back to GET /v1/memory/connections/oauth/{provider}/callback, which exchanges the code, stores the encrypted token as a connection, and bounces to the dashboard. Available for notion, slack, Google Drive, and Gmail once the operator sets that provider's OAuth app credentials; otherwise start returns 503 and pasting a token still works. Google Drive and Gmail use offline access, so RememberOS stores a refresh token and renews the access token automatically before each sync. The dashboard Connections panel shows a Connect with OAuth button for those providers that drives this flow for you.

Agent identities & provenance

Each API key is an agent identity (give it a label when you create it under keys). Multiple agents can share one tenant's collections — and every memory records which agent wrote it: writes (add, bulk, connectors) stamp created_by = {id, label}, returned on the memory. This is the foundation for shared, multi-agent memory — see who contributed what in a common collection.

Endpoint | Does
GET /v1/memory/agents | list this tenant's agent identities (id, label, read_only, created_at); no key material

Access control (shared vs private collections)

By default a collection is open: every agent of the tenant can read and write it. Add a grant and the collection becomes restricted — only granted agents may access it, with per-agent read/write. So "Customer Support" can be shared across several agents while "Personal Memory" stays private to one. Writes/reads by a disallowed agent get 403; restricted collections are also excluded from that agent's tenant-wide search.

Endpoint | Does
PUT /v1/memory/collections/{name}/grants | {agent_key_id, can_read, can_write}; grant/update an agent (first grant restricts the collection)
GET /v1/memory/collections/{name}/grants | list grants (restricted flag + per-agent perms); empty = open
DELETE /v1/memory/collections/{name}/grants/{agent_key_id} | revoke; removing the last grant reopens the collection
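
The open-versus-restricted rule can be modelled in a few lines. This is only an illustration of the behaviour described above, not the server's implementation; can_access is a hypothetical name and grants is assumed to be a list of {agent_key_id, can_read, can_write} entries.

```python
def can_access(grants, agent_key_id, write=False):
    """Model whether an agent may read (or write) a collection."""
    if not grants:
        # No grants at all: the collection is open to every agent of the tenant.
        return True
    for grant in grants:
        if grant["agent_key_id"] == agent_key_id:
            return grant["can_write"] if write else grant["can_read"]
    # Restricted collection and no grant for this agent: the API answers 403.
    return False
```

Note how removing the last grant empties the list and so reopens the collection, matching the DELETE row above.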

Intelligent memory

RememberOS doesn't just store memory — it keeps it clean. Dedup finds near-identical memories in a collection (cosine similarity over the most recent 500) and consolidates each cluster to one keeper (highest importance, then newest).

Endpoint | Does
POST /v1/memory/collections/{name}/dedup | {threshold=0.95, dry_run=true}; dry run returns the duplicate clusters (keeper + duplicates) without deleting; dry_run=false consolidates and returns deleted
GET /v1/memory/collections/{name}/contradictions | where the current truth changed: each {current, superseded} pair from graph supersession (remember 'updates'), newest first
POST /v1/memory/collections/{name}/memories/{id}/pin | {pinned: true|false}; pin/unpin a memory. Pinned memories are always kept by dedup (never consolidated away) and rank first in listings. pinned appears on every memory
POST /v1/memory/collections/{name}/memories/{id}/archive | {archived: true|false}; soft-archive (or restore) a memory. Archived memories drop out of default search and listing but are kept and restorable; archived appears on every memory
POST /v1/memory/collections/{name}/archive-stale | {older_than_days=90, dry_run=true}; bulk-archive memories not updated in N days, skipping pinned and already-archived ones. Dry run returns would_archive; dry_run=false archives and returns archived. Staleness uses updated_at (no per-read tracking)
GET /v1/memory/collections/{name}/health | {stale_days=90}; a one-shot health snapshot for the collection: total, current, archived, pinned, stale (unpinned, not-archived, older than stale_days), expired, supersessions, a categories breakdown, and needs_attention = stale + expired + supersessions. Cheap COUNT aggregates; {exists:false} for an unknown collection
GET /v1/memory/collections/{name}/summary | {limit=50}; a board-report briefing: {summary, themes, based_on, generated_at} from the top limit memories (by importance + recency) via one bounded, cached LLM call (cached by collection version; writes invalidate). Notes notable supersession changes. Degrades to an extractive summary when no LLM is configured; never errors

Archived memories are excluded from recall by default; pass include_archived: true on search or ?include_archived=true when listing to surface them. More intelligent-memory ops (collection summaries) are rolling out.
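
To make the dedup keeper rule concrete, here is a small model of how one duplicate cluster might be consolidated. It is an illustration under assumptions, not the service's code: consolidate is a hypothetical name, "newest" is taken to mean the latest created_at, and timestamps are assumed to be ISO-8601 strings in one format so that they sort as text.

```python
def consolidate(cluster):
    """Pick the keeper of a duplicate cluster and list what would be deleted."""
    # Keeper: highest importance first, then newest.
    keeper = max(cluster, key=lambda m: (m["importance"], m["created_at"]))
    # Pinned memories are never consolidated away, even when they are not the keeper.
    deleted = [m for m in cluster if m is not keeper and not m.get("pinned")]
    return keeper, deleted
```

Running the real endpoint with dry_run=true first shows the actual clusters and keepers before anything is deleted.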

Operator endpoints

Require an admin-flagged tenant; aggregates only, tenant isolation intact.

Endpoint | Notes
GET /v1/memory/admin/metrics | per-route request counts, 4xx/5xx, p50/p95 latency, plus per-stage latency (embed, cache_get, search, llm_extract) under stages and event counters (search_cache_hit/search_cache_miss) under counters (?reset=true)
GET /v1/memory/admin/audit | your tenant's audit log (writes + sensitive actions), newest first (?days=30&limit=200)
GET /v1/memory/admin/usage | per-tenant + total storage (db/files), LLM token usage by purpose, and cost estimates (?days=30)
GET /v1/memory/admin/connectors | connector-sync health across all tenants: per-provider {runs, errors, last_run} over a window (?hours=24)
GET /v1/memory/admin/connectors/status | connection counts by status across all tenants, [{status, n}], so paused/erroring connectors surface at a glance
GET /v1/memory/admin/memory-health | fleet-wide memory hygiene across all collections: {total, current, archived, stale, expired, supersessions, needs_attention}. SECURITY DEFINER cross-tenant aggregate (cheap COUNTs); the operator companion to the per-collection /health
GET /v1/memory/admin/analytics | signup→activation funnel, signups-by-day, cookieless pageviews by day/path/referrer (?days=7)

Misc

Endpoint | Notes
GET /v1/memory/health | public liveness: {"status":"healthy"}
POST /v1/proxy/chat/completions | OpenAI-compatible chat with automatic memory: relevant memories injected as a system message, response returned with a longmem extension ({memories_used, stored}); store (default true) queues the user message for graph extraction. Body: messages, model?, collection?, container_tag?, memory_limit (5), store, temperature?, max_tokens?. Non-streaming v1.
POST /v1/mcp | MCP JSON-RPC; see MCP
GET /v1/status | public service status (no auth): current state, 24h uptime %, recent health checks; no tenant data
POST /v1/pulse | public, cookieless pageview beacon (no PII); /v1/track is a legacy alias
POST /v1/memory/signup | public, card-free free-tier signup: {email} creates a free tenant and emails a one-time reveal link with the key (rate-limited; one per email)
GET / POST / DELETE /v1/memory/keys | manage your API keys: list (masked, no secret), create (returns the new key once; {label, read_only}), revoke (soft; can't revoke your last active key). Read-only keys can search but not write.
GET / PUT /v1/memory/spend-cap | per-tenant monthly USD cap on platform LLM spend; PUT {monthly_usd} (null clears). Over the cap, /remember stores the raw memory but skips graph extraction (BYOK tenants are never capped)
GET / PUT / DELETE /v1/memory/webhook/config | outbound webhooks: PUT {url} generates the signing secret (shown once, stored encrypted); events memory.created/updated/deleted are POSTed with X-RememberOS-Signature: hmac-sha256(secret, body); best-effort, 5s timeout
POST /v1/memory/stripe/checkout/create, POST /v1/memory/stripe/portal | billing: create a checkout session / open the Stripe portal
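
A webhook receiver would typically recompute the signature over the raw request body and compare it in constant time. The sketch below assumes the X-RememberOS-Signature value is the lowercase hex digest of HMAC-SHA256(secret, body); the reference does not state the encoding, so check a real delivery before relying on it. verify_signature is a hypothetical helper.

```python
import hashlib
import hmac


def verify_signature(secret, body, header_value):
    """Return True if header_value matches HMAC-SHA256(secret, body).

    Assumption: the header carries a hex-encoded digest of the raw body bytes.
    """
    expected = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), header_value.encode("utf-8"))
```

Always verify against the bytes exactly as received; re-serialising parsed JSON can change whitespace or key order and break the match.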

Endpoints not listed here (/_internal/embed, /stripe/webhook, /email/inbound) are internal: secured by shared secrets or signatures, not for client use.