Search

One endpoint per collection, plus a tenant-wide variant. Hybrid by default; filter as narrowly as you like.

Basic#

hits = mem.search("what theme does Alex like?", collection="prefs",
                  limit=5, min_score=0.3, mode="hybrid")

const hits = await mem.search("what theme does Alex like?",
  { collection: "prefs", limit: 5, minScore: 0.3, mode: "hybrid" });

curl -s https://rememberos.ai/v1/memory/collections/prefs/search \
  -H "Authorization: Bearer $REMEMBEROS_API_KEY" -H "Content-Type: application/json" \
  -d '{"query": "what theme does Alex like?", "limit": 5, "mode": "hybrid"}'

Modes#

Mode	How	Use when
`hybrid` (default)	vector + full-text fused with Reciprocal Rank Fusion (k=60)	almost always
`vector`	pure embedding similarity	paraphrase-heavy queries, no keyword overlap
`text`	Postgres full-text rank	exact terms, ids, names; no embed cost

The two arms are fused symmetrically: a strong keyword-only match — a rare token, an error code, an id, an unusual name — surfaces in hybrid mode even when it falls below min_score for vector similarity. min_score gates the vector arm only; it never hides an exact lexical hit.

Filters#

min_score (default 0.3) — drop weak vector matches. Gates the vector arm only (not exact keyword hits).
category — exact match on the memory's category.
metadata_filter — JSONB containment, e.g. {"source": "runbook"}.
container_tag — scope to a sub-namespace (see Container Tags).

Boosters: rewrite & rerank (optional)#

{"query": "…", "rewrite": true, "rerank": true}

rewrite — one LLM call expands/disambiguates the query before retrieval (synonyms, resolved references). Better recall on vague queries.
rerank — retrieves 3x the limit, then one LLM call re-orders by meaning and trims. Better precision on subtle queries.

Both are off by default and each adds an LLM round-trip (~300–800 ms), billed as usage purposes rewrite/rerank. They use your BYOK extraction endpoint if configured. Failures degrade gracefully to the plain search.

Across all collections#

POST /v1/memory/search
{"query": "…", "container_tag": "user-alice"}

Each result carries a collection field so you know where it came from.

Current-truth semantics#

Search excludes superseded (is_latest = false) and expired memories — you get what's true now, automatically.

The result cache#

Exact-repeat queries are served from Redis in ~1 ms. The cache is keyed by a per-collection version that every write bumps, so results are never stale: any add/update/delete in the collection invalidates instantly. The response carries "cached": true when it was a cache hit.

PreviousAdd Context NextManage Content