API Reference

Base URL: https://rememberos.ai (or your own host). All endpoints require an Authorization: Bearer mv_… header unless marked public. Errors return {"detail": "…"} with a 400, 401, 403, or 404 status code.

Interactive explorer: browse every endpoint in the API Explorer (generated from the OpenAPI spec; raw spec at /openapi.json).

Rate limits. Requests are limited per API key, using a generous fixed one-minute window. Every response carries RateLimit-Limit, RateLimit-Remaining, and RateLimit-Reset (unix seconds); once over the limit you get 429 with Retry-After. The limiter fails open, so an outage of the limiter itself never blocks traffic.
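
A client can use these headers to decide how long to back off. The sketch below is one minimal way to do that, not part of any official SDK. It assumes the headers arrive as a plain dict and that Retry-After holds a number of seconds (this reference does not say whether it can also be an HTTP date); `seconds_to_wait` is a hypothetical helper name.

```python
import time


def seconds_to_wait(status, headers, now=None):
    """Return how many seconds to pause before sending the next request.

    Assumes Retry-After is a number of seconds and RateLimit-Reset is a
    unix timestamp, as described in the rate-limit paragraph.
    """
    now = time.time() if now is None else now
    # HTTP header names are case-insensitive, so normalise before lookup.
    h = {k.lower(): v for k, v in headers.items()}

    if status == 429:
        if "retry-after" in h:
            return max(0.0, float(h["retry-after"]))
        if "ratelimit-reset" in h:
            return max(0.0, float(h["ratelimit-reset"]) - now)
        return 1.0  # conservative fallback when the response gives no hint

    # Not limited yet, but the window is used up: wait for the reset.
    if h.get("ratelimit-remaining") == "0" and "ratelimit-reset" in h:
        return max(0.0, float(h["ratelimit-reset"]) - now)
    return 0.0
```

Call it after every response and sleep for the returned value before the next request; a result of 0.0 means there is no need to wait.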

Collections · Memories · Search · Graph memory · Files & drop · Account & keys · Email · BYO config · Managed connectors · Agents & provenance · Intelligent memory · Admin · Misc

Collections

Endpoint	Notes
`GET /v1/memory/collections`	list your collections with memory counts
`POST /v1/memory/collections`	create — body `{"name", "description?", "metadata?"}`; collections are also auto-created on first write
`GET /v1/memory/collections/{name}`	one collection's details
`PUT /v1/memory/collections/{name}`	update description/metadata
`DELETE /v1/memory/collections/{name}`	delete the collection and all its memories + files

Memories

Create

POST /v1/memory/collections/{collection}/memories
{"text": "…",                  // required, 1–10000 chars
 "importance": 0.7,            // 0–1
 "category": "other",
 "source": "api",
 "metadata": {},
 "container_tag": null}        // optional sub-namespace
→ 201 {"id", "text", "importance", "category", "source", "metadata",
        "images", "created_at", "updated_at"}

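As an illustration of the create call, here is a hedged Python sketch that builds the request with the standard library and checks the documented limits before anything is sent. `build_create_request` is a hypothetical helper, the `Content-Type: application/json` header is an assumption, and percent-encoding the collection name is an assumption about how names with spaces are addressed.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://rememberos.ai"  # or your own host


def build_create_request(api_key, collection, text, importance=0.7,
                         category="other", metadata=None, container_tag=None):
    """Build (but do not send) the POST that creates one memory."""
    if not 1 <= len(text) <= 10000:
        raise ValueError("text must be 1-10000 characters")
    if not 0 <= importance <= 1:
        raise ValueError("importance must be between 0 and 1")

    body = {
        "text": text,
        "importance": importance,
        "category": category,
        "source": "api",
        "metadata": metadata or {},
        "container_tag": container_tag,
    }
    # Assumption: collection names are percent-encoded in the path.
    url = f"{BASE_URL}/v1/memory/collections/{quote(collection, safe='')}/memories"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumed for the JSON body
        },
    )
```

Pass the result to `urllib.request.urlopen` to send it; a successful create answers 201 with the memory object shown above.
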
Everything else

Endpoint	Notes
`GET …/memories`	list (newest first), `?limit=` (≤200) and `?offset=` for paging
`GET …/memories/{id}`	one memory (file URLs presigned)
`GET …/memories/{id}/related`	semantically related memories in the same collection — nearest neighbours of this memory by its own embedding, excluding itself, archived, and superseded rows (`?limit=5&min_score=0.0`)
`PATCH …/memories/{id}`	partial update — any of `text` (re-embeds), `importance`, `category`, `metadata`; write key required
`POST …/memories/{id}/move`	`{"to_collection": "…"}` — creates target if needed
`DELETE …/memories/{id}`	removes the memory and its stored files
`POST …/memories/bulk`	array of up to 100 create-bodies, one embed batch
`POST …/memories/upload`	multipart: one memory with attached files
`POST …/memories/folder`	multipart: many files → one memory each (≤50)
`POST …/memories/{id}/images`	attach a file to an existing memory
`DELETE …/memories/{id}/images/{index}`	detach one file
`POST …/forget`	`{"query", "min_score": 0.5, "limit": 10}` — deletes semantic matches; destructive
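
Because `…/memories/bulk` accepts at most 100 create-bodies per call, a larger import has to be split on the client. A minimal sketch of that splitting, where `chunk_for_bulk` is a hypothetical helper name:

```python
def chunk_for_bulk(bodies, max_per_request=100):
    """Split a list of create-bodies into lists that fit one bulk call each."""
    if max_per_request < 1:
        raise ValueError("max_per_request must be at least 1")
    return [bodies[i:i + max_per_request]
            for i in range(0, len(bodies), max_per_request)]
```

Each returned list can be sent as the JSON array body of one `POST …/memories/bulk` request.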

Search

POST /v1/memory/collections/{collection}/search     # one collection
POST /v1/memory/search                              # all collections
{"query": "…",                // required
 "limit": 5,                  // 1–50
 "min_score": 0.3,            // vector threshold
 "mode": "hybrid",            // hybrid | vector | text
 "category": null,
 "metadata_filter": null,     // JSONB containment
 "container_tag": null,
 "rewrite": false,            // optional LLM query expansion (+1 LLM call)
 "rerank": false}             // optional LLM re-ranking of top results (+1 LLM call)
→ {"results": [{"id","text","category","importance","score","source",
                "metadata","images","created_at","collection?"}],
   "query_embedding_ms": 12.3, "search_ms": 4.5, "cached": false}

Results exclude superseded and expired memories. Exact-repeat queries return from the cache (~1 ms) until the collection changes.
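
The search body can be assembled and checked on the client before it is sent. The following is a sketch under the limits listed above, not an official client; `build_search_body` and `extra_llm_calls` are hypothetical names.

```python
VALID_MODES = {"hybrid", "vector", "text"}


def build_search_body(query, limit=5, min_score=0.3, mode="hybrid",
                      category=None, metadata_filter=None,
                      container_tag=None, rewrite=False, rerank=False):
    """Return a search request body, rejecting values outside the documented ranges."""
    if not query:
        raise ValueError("query is required")
    if not 1 <= limit <= 50:
        raise ValueError("limit must be between 1 and 50")
    if mode not in VALID_MODES:
        raise ValueError("mode must be hybrid, vector, or text")
    return {
        "query": query,
        "limit": limit,
        "min_score": min_score,
        "mode": mode,
        "category": category,
        "metadata_filter": metadata_filter,
        "container_tag": container_tag,
        "rewrite": rewrite,
        "rerank": rerank,
    }


def extra_llm_calls(body):
    """Count the optional LLM calls a search triggers (one each for rewrite and rerank)."""
    return int(bool(body["rewrite"])) + int(bool(body["rerank"]))
```

The same body works for both the single-collection and the all-collections endpoint; only the URL differs.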

Graph memory

Endpoint	Notes
`POST …/remember`	`{"content", "importance?"}` → 202 `{"job_id"}`; add `?sync=true` for inline `{"created", "memories":[{id,text,type,relation}]}`
`GET /v1/memory/extract/job/{job_id}`	`{"state": "queued|processing|done|failed", "facts", "complete"}`
`GET …/profile`	`{"profile": "…", "based_on": N}` — LLM summary of current facts + preferences
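
When `…/remember` is called without `?sync=true`, the client has to poll the job endpoint until the state is `done` or `failed`. The sketch below shows one way to structure that loop. The HTTP call is injected as `fetch_job` (a hypothetical callable that returns the parsed job JSON), and the interval and timeout values are arbitrary assumptions.

```python
import time

TERMINAL_STATES = {"done", "failed"}


def wait_for_job(fetch_job, job_id, interval=1.0, timeout=60.0, sleep=time.sleep):
    """Poll a job until it reaches a terminal state, then return the job dict.

    fetch_job(job_id) is supplied by the caller and should perform
    GET /v1/memory/extract/job/{job_id} and return the decoded JSON.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job(job_id)
        if job.get("state") in TERMINAL_STATES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {job.get('state')!r} after {timeout}s")
        sleep(interval)
```

Injecting `fetch_job` and `sleep` keeps the loop independent of any HTTP library and easy to exercise without a network.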

Files & drop

Endpoint	Notes
`POST …/drop`	multipart files → extract/transcribe/caption, chunk, embed. `async_ingest=true` → 202 `{"batch_id","queued","failed"}`
`POST …/drop/presign`	array of `{"filename","content_type","size"}` → presigned PUT URL per file (oversize flagged)
`POST …/drop/complete`	array of `{"key","filename","mime","size"}` after the PUTs → 202 enqueue; keys validated against your tenant namespace
`GET /v1/memory/ingest/batch/{batch_id}`	batch progress `{"total","done","failed","pending","complete"}`

Per-file limit 10 MB.
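
Before calling `…/drop/presign`, a client can build the entry list and set aside files that would exceed the per-file limit. This is a sketch only: `presign_entries` is a hypothetical helper, the content type is guessed with the standard `mimetypes` module, and reading "10 MB" as 10 × 1024 × 1024 bytes is an assumption.

```python
import mimetypes

MAX_FILE_BYTES = 10 * 1024 * 1024  # assumption: "10 MB" interpreted as 10 MiB


def presign_entries(files):
    """Split (filename, size_in_bytes) pairs into presign entries and oversize ones.

    Returns (accepted, too_large); both are lists of
    {"filename", "content_type", "size"} dicts.
    """
    accepted, too_large = [], []
    for filename, size in files:
        content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
        entry = {"filename": filename, "content_type": content_type, "size": size}
        (too_large if size > MAX_FILE_BYTES else accepted).append(entry)
    return accepted, too_large
```

The `accepted` list is the array body for `…/drop/presign`. After uploading each file to its presigned URL, the keys are reported to `…/drop/complete`.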

Account & keys

Endpoint	Notes
`GET /v1/memory/admin/me`	your tenant: name, email, tier, limits
`GET /v1/memory/admin/stats`	your usage: collections, memories, storage size, `added_7d`/`added_30d` growth counts
`POST /v1/memory/admin/rotate-key`	new key returned once; old key invalid immediately
`GET /v1/memory/admin/export`	GDPR export — your full account (tenant, collections, memories) as a JSON download. Text-only (no vendor-specific vectors), so it re-imports anywhere
`POST /v1/memory/admin/import`	portability — re-ingest an export dump: `{memories:[{text, collection?, …}], collections?, into_collection?}`. Text is re-embedded with the active model (knowledge survives a model/vendor change), collections created as needed, ids reassigned. Returns `{imported, skipped, collections}`
`POST /v1/memory/admin/delete-account`	GDPR erasure — `{"confirm": "<tenant name>"}`; irreversibly deletes the account, all memories, and stored files
`POST /v1/memory/recover-key`	public — `{"email"}`; sends a one-time reveal link
`GET /v1/memory/reveal/{token}`	public — one-time key reveal (used by the email link)

Email drop

Endpoint	Notes
`GET /v1/memory/email/address`	your private drop address
`POST /v1/memory/email/address/rotate`	new address; old stops working
`GET / PUT /v1/memory/email/senders`	sender allowlist

BYO configuration

Each follows the same shape: GET current (secrets never returned), PUT validates with a live self-test before saving (credentials encrypted at rest), DELETE reverts to the platform default.

Config	Endpoints	PUT body
Object storage	`/v1/memory/storage/config`	`{"backend":"s3","endpoint_url","bucket","region","access_key","secret_key"}`
Database	`/v1/memory/database/config`	your Postgres DSN
Extraction LLM	`/v1/memory/extraction/config`	`{"provider":"custom","base_url","model","api_key"}` — any OpenAI-compatible endpoint

Managed connectors

A managed connector is a connection RememberOS syncs for you server-side — you store the source's credentials once and RememberOS keeps a collection in sync (no code to run). This is distinct from the self-host connector recipes, which you run yourself via dlt. Credentials are encrypted at rest (Fernet) and never returned by the API.

Endpoint	Does
`POST /v1/memory/connections`	create a connection: `{provider, name?, collection, container_tag?, config, credentials}` — provider ∈ notion/slack/gdrive/postgres/gmail/linear/confluence/hubspot; returns the connection (no credentials)
`GET /v1/memory/connections`	list your connections (no credentials)
`GET /v1/memory/connections/{id}`	one connection
`DELETE /v1/memory/connections/{id}`	remove a connection
`POST /v1/memory/connections/{id}/sync`	sync now — returns `{synced, cursor, status}`
`GET /v1/memory/connections/{id}/runs`	recent sync runs (newest first): each `{status, synced_count, error, created_at}` — observability into what synced and any failures (`?limit=20`)

How sync works. RememberOS fetches records newer than the connection's stored cursor from the provider (using the encrypted credentials), embeds and stores them in the connection's collection (scoped by its container_tag), and advances the cursor — so re-syncs are incremental. A failed sync sets status="error" with last_error and never affects other connections.

Automatic sync. Active connections also sync on their own: a background scheduler re-syncs each due connection (default every 60 min; set config.sync_interval_minutes to override). POST …/sync triggers one immediately.

Auto-pause. After 5 consecutive failed syncs a connection is set to status="paused" so the scheduler stops retrying a broken source (bad credentials, revoked token). It still appears in your connections; trigger POST …/sync manually once the cause is fixed — a successful run flips it back to active.
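
The status rules described above can be modelled as a small state machine. The code below is an illustrative model of the documented behaviour, not the server's implementation. The `Connection` class and both function names are hypothetical, and it assumes that connections in the "error" state are still retried by the scheduler until they are paused.

```python
from dataclasses import dataclass
from typing import Optional

MAX_CONSECUTIVE_FAILURES = 5  # after this many failures in a row the connection pauses


@dataclass
class Connection:
    status: str = "active"
    consecutive_failures: int = 0
    last_error: Optional[str] = None


def record_sync(conn, ok, error=None):
    """Update a connection after one sync attempt."""
    if ok:
        # A successful run (scheduled or manual) makes the connection active again.
        conn.status = "active"
        conn.consecutive_failures = 0
    else:
        conn.consecutive_failures += 1
        conn.last_error = error
        paused = conn.consecutive_failures >= MAX_CONSECUTIVE_FAILURES
        conn.status = "paused" if paused else "error"
    return conn


def scheduler_should_sync(conn, minutes_since_last_sync, interval_minutes=60):
    """Paused connections are skipped; others sync once the interval has elapsed."""
    return conn.status != "paused" and minutes_since_last_sync >= interval_minutes
```

A manual `POST …/sync` corresponds to calling `record_sync` directly, which is how a paused connection returns to active once the cause is fixed.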

Server-syncable providers today:

Provider	Setup and sync behaviour
Slack	`credentials.token` + `config.channel_id`
Notion	`credentials.token`; optional `config.database_id` to sync one database instead of every shared page. Pages are walked newest-edited first using the Notion API, version 2022-06-28
Google Drive	`credentials.token` = a Google OAuth access token; optional `config.folder_id` or `config.query`. Files are walked newest-modified first and native Google Docs are exported to text
Linear	`credentials.token` = a Linear API key. Issues are pulled newest-updated first via the GraphQL API, with title, description, state, assignee, team, and labels
Confluence	Cloud: `credentials.email` + `credentials.api_token`, `config.base_url`, optional `config.space_key`. Pages are walked newest-modified first and storage-format bodies are rendered to text
Gmail	`credentials.token` = a Google OAuth access token with the `gmail.readonly` scope; optional `config.query`. Messages are walked newest-first, with subject/from and plain-text body
HubSpot	`credentials.token` = a private-app token. CRM contacts are walked newest-modified first with name, title, company, and lifecycle stage

Postgres remains a self-host recipe while its managed sync rolls out.

In the dashboard. The dashboard has a Connections panel — add a connection, trigger a sync, watch its status and last error, and remove it, without writing any code.

Connect via OAuth (when configured). Instead of pasting a token, a tenant can authorize an OAuth app: GET /v1/memory/connections/oauth/{provider}/start?collection=&name= returns an authorize_url to send the user to. After they approve, the provider redirects back to GET /v1/memory/connections/oauth/{provider}/callback, which exchanges the code, stores the encrypted token as a connection, and redirects to the dashboard. This is available for Notion, Slack, Google Drive, and Gmail once the operator sets that provider's OAuth app credentials; otherwise start returns 503, and pasting a token still works. Google Drive and Gmail use offline access, so RememberOS stores a refresh token and renews the access token automatically before each sync. The dashboard Connections panel shows a Connect with OAuth button for those providers that drives this flow for you.
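
For a custom integration that starts the flow itself, the start URL can be built as follows. This is a sketch with assumptions: `oauth_start_url` is a hypothetical helper, and it assumes the `{provider}` path segment uses the same identifiers as the `provider` field on `POST /v1/memory/connections` (for example `gdrive` for Google Drive).

```python
from urllib.parse import urlencode

BASE_URL = "https://rememberos.ai"  # or your own host

# Assumption: OAuth paths reuse the provider identifiers from the connections API.
OAUTH_PROVIDERS = {"notion", "slack", "gdrive", "gmail"}


def oauth_start_url(provider, collection, name=None):
    """Return the URL of the OAuth start endpoint for one provider."""
    if provider not in OAUTH_PROVIDERS:
        raise ValueError(f"OAuth is not offered for provider {provider!r}")
    params = {"collection": collection}
    if name:
        params["name"] = name
    return f"{BASE_URL}/v1/memory/connections/oauth/{provider}/start?{urlencode(params)}"
```

A GET on that URL with the usual Authorization header returns the `authorize_url` to send the user to, or 503 when the operator has not configured an OAuth app for that provider.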

Agent identities & provenance

Each API key is an agent identity (give it a label when you create it under keys). Multiple agents can share one tenant's collections — and every memory records which agent wrote it: writes (add, bulk, connectors) stamp created_by = {id, label}, returned on the memory. This is the foundation for shared, multi-agent memory — see who contributed what in a common collection.

Endpoint	Does
`GET /v1/memory/agents`	list this tenant's agent identities (id, label, read_only, created_at) — no key material

Access control (shared vs private collections)

By default a collection is open: every agent of the tenant can read and write it. Add a grant and the collection becomes restricted — only granted agents may access it, with per-agent read/write. So "Customer Support" can be shared across several agents while "Personal Memory" stays private to one. Writes/reads by a disallowed agent get 403; restricted collections are also excluded from that agent's tenant-wide search.

Endpoint	Does
`PUT /v1/memory/collections/{name}/grants`	`{agent_key_id, can_read, can_write}` — grant/update an agent (first grant restricts the collection)
`GET /v1/memory/collections/{name}/grants`	list grants (`restricted` flag + per-agent perms); empty = open
`DELETE /v1/memory/collections/{name}/grants/{agent_key_id}`	revoke; removing the last grant reopens the collection
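
The open-versus-restricted rule can be expressed as a short function. It is a model of the documented behaviour for reasoning about grants, not server code; `is_allowed` is a hypothetical name, and the grant dicts mirror the `{agent_key_id, can_read, can_write}` body of the PUT endpoint.

```python
def is_allowed(grants, agent_key_id, write=False):
    """Decide whether an agent may read (or write) a collection.

    grants is the list of grant dicts for that one collection.
    """
    if not grants:
        return True  # no grants: the collection is open to every agent of the tenant
    for grant in grants:
        if grant["agent_key_id"] == agent_key_id:
            return bool(grant["can_write"] if write else grant["can_read"])
    return False  # restricted collection, and this agent has no grant
```

A `False` result corresponds to the 403 described above. Read-only API keys are a separate restriction that this sketch does not model.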

Intelligent memory

RememberOS doesn't just store memory — it keeps it clean. Dedup finds near-identical memories in a collection (cosine similarity over the most recent 500) and consolidates each cluster to one keeper (highest importance, then newest).
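
To make the keeper rule concrete, here is a simplified sketch of threshold-based deduplication. The clustering strategy is an assumption (a greedy pass in keeper order), because this reference does not describe the server's algorithm. Pinned memories and the 500-memory window are left out, and `dedup_clusters` is a hypothetical name.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def dedup_clusters(memories, threshold=0.95):
    """Group near-identical memories; each cluster keeps its best-ranked member.

    memories: dicts with "id", "embedding", "importance", "created_at".
    """
    # Rank by the keeper rule: highest importance first, then newest.
    ranked = sorted(memories, key=lambda m: (m["importance"], m["created_at"]),
                    reverse=True)
    claimed = set()
    clusters = []
    for i, keeper in enumerate(ranked):
        if keeper["id"] in claimed:
            continue
        duplicates = []
        for other in ranked[i + 1:]:
            if other["id"] in claimed:
                continue
            if cosine(keeper["embedding"], other["embedding"]) >= threshold:
                duplicates.append(other["id"])
                claimed.add(other["id"])
        if duplicates:
            clusters.append({"keeper": keeper["id"], "duplicates": duplicates})
    return clusters
```

The return value has the same keeper-plus-duplicates outline as a dry run; the exact fields of the real `clusters` response may differ.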

Endpoint	Does
`POST /v1/memory/collections/{name}/dedup`	`{threshold=0.95, dry_run=true}` — dry run returns the duplicate `clusters` (keeper + duplicates) without deleting; `dry_run=false` consolidates and returns `deleted`
`GET /v1/memory/collections/{name}/contradictions`	where the current truth changed — each `{current, superseded}` pair from graph supersession (`remember` 'updates'), newest first
`POST /v1/memory/collections/{name}/memories/{id}/pin`	`{pinned: true|false}`: pin or unpin a memory. Pinned memories are always kept by dedup (never consolidated away) and rank first in listings. `pinned` appears on every memory
`POST /v1/memory/collections/{name}/memories/{id}/archive`	`{archived: true|false}`: soft-archive (or restore) a memory. Archived memories drop out of default search and listing but are kept and restorable; `archived` appears on every memory
`POST /v1/memory/collections/{name}/archive-stale`	`{older_than_days=90, dry_run=true}` — bulk-archive memories not updated in N days, skipping pinned and already-archived ones. Dry run returns `would_archive`; `dry_run=false` archives and returns `archived`. Staleness uses `updated_at` (no per-read tracking)
`GET /v1/memory/collections/{name}/health`	`{stale_days=90}` — a one-shot health snapshot for the collection: `total`, `current`, `archived`, `pinned`, `stale` (unpinned, not-archived, older than `stale_days`), `expired`, `supersessions`, a `categories` breakdown, and `needs_attention = stale + expired + supersessions`. Cheap COUNT aggregates; `{exists:false}` for an unknown collection
`GET /v1/memory/collections/{name}/summary`	`{limit=50}` — a board-report briefing: `{summary, themes, based_on, generated_at}` from the top `limit` memories (by importance + recency) via one bounded, cached LLM call (cached by collection version; writes invalidate). Notes notable supersession changes. Degrades to an extractive summary when no LLM is configured — never errors

Archived memories are excluded from recall by default; pass include_archived: true on search or ?include_archived=true when listing to surface them. More intelligent-memory ops (collection summaries) are rolling out.

Operator endpoints

Require an admin-flagged tenant; aggregates only, tenant isolation intact.

Endpoint	Notes
`GET /v1/memory/admin/metrics`	per-route request counts, 4xx/5xx, p50/p95 latency, plus per-stage latency (embed, cache_get, search, llm_extract) under `stages` and event counters (`search_cache_hit`/`search_cache_miss`) under `counters` (`?reset=true`)
`GET /v1/memory/admin/audit`	your tenant's audit log (writes + sensitive actions), newest first (`?days=30&limit=200`)
`GET /v1/memory/admin/usage`	per-tenant + total storage (db/files), LLM token usage by purpose, and cost estimates (`?days=30`)
`GET /v1/memory/admin/connectors`	connector-sync health across all tenants: per-provider `{runs, errors, last_run}` over a window (`?hours=24`)
`GET /v1/memory/admin/connectors/status`	connection counts by status across all tenants — `[{status, n}]` — so paused/erroring connectors surface at a glance
`GET /v1/memory/admin/memory-health`	fleet-wide memory hygiene across all collections — `{total, current, archived, stale, expired, supersessions, needs_attention}`. SECURITY DEFINER cross-tenant aggregate (cheap COUNTs); the operator companion to the per-collection `/health`
`GET /v1/memory/admin/analytics`	signup→activation funnel, signups-by-day, cookieless pageviews by day/path/referrer (`?days=7`)

Misc

Endpoint	Notes
`GET /v1/memory/health`	public liveness — `{"status":"healthy"}`
`POST /v1/proxy/chat/completions`	OpenAI-compatible chat with automatic memory: relevant memories are injected as a system message, and the response is returned with a `rememberos` extension (`{memories_used, stored}`); `store` (default true) queues the user message for graph extraction. Body: `messages`, `model?`, `collection?`, `container_tag?`, `memory_limit` (default 5), `store`, `temperature?`, `max_tokens?`. Non-streaming in v1.
`POST /v1/mcp`	MCP JSON-RPC — see MCP
`GET /v1/status`	public service status (no auth): current state, 24h uptime %, recent health checks — no tenant data
`POST /v1/pulse`	public, cookieless pageview beacon (no PII); `/v1/track` is a legacy alias
`POST /v1/memory/signup`	public, card-free free-tier signup: `{email}` creates a free tenant and emails a one-time reveal link with the key (rate-limited; one per email)
`GET / POST / DELETE /v1/memory/keys`	manage your API keys: list (masked, no secret), create (returns the new key once; `{label, read_only}`), revoke (soft; can't revoke your last active key). Read-only keys can search but not write.
`GET / PUT /v1/memory/spend-cap`	per-tenant monthly USD cap on platform LLM spend; PUT `{monthly_usd}` (null clears). Over the cap, `/remember` stores the raw memory but skips graph extraction (BYOK tenants are never capped)
`GET / PUT / DELETE /v1/memory/webhook/config`	outbound webhooks: PUT `{url}` generates the signing secret (shown once, stored encrypted); events `memory.created`, `memory.updated`, and `memory.deleted` are POSTed with `X-RememberOS-Signature: hmac-sha256(secret, body)`; delivery is best-effort with a 5s timeout
`POST /v1/memory/stripe/checkout/create`, `POST /v1/memory/stripe/portal`	billing: create a checkout session / open the Stripe portal
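
A webhook receiver should check `X-RememberOS-Signature` before trusting a delivery. The sketch below recomputes the HMAC-SHA256 of the raw request body with the signing secret. It assumes the header carries the digest as lowercase hex, which this reference does not specify, so confirm the encoding against a real delivery. `verify_signature` is a hypothetical helper name.

```python
import hashlib
import hmac


def verify_signature(secret, body, signature_header):
    """Return True if the header matches HMAC-SHA256(secret, body).

    body must be the raw request bytes, exactly as received.
    Assumption: the signature is sent as a hex digest.
    """
    expected = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    received = signature_header.strip().lower()
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of the signature were correct.
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))
```

Verify against the bytes as they arrived, before any JSON parsing, since re-serialising the payload can change the bytes and break the comparison.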

Endpoints not listed here (/_internal/embed, /stripe/webhook, /email/inbound) are internal: secured by shared secrets or signatures, not for client use.
