Memory Engine overview

The Memory Engine stores and retrieves semantic context for your app. Content is embedded, indexed with pgvector, and queried by meaning, so you can inject relevant past context into prompts without relying on keyword matching.

Concepts

Concept    Description
Namespace  Isolates memories per app or tenant (e.g. my-app, user-123). Required on every call.
Entry      One unit of memory: content (text), optional metadata (JSON). Stored and embedded.
Search     Semantic search over a namespace. Returns entries ranked by similarity to the query.
Context    Pre-built context string from search results, ready to paste into an LLM prompt.

Flow

  1. Write — Send entries (and optional metadata) via the API or SDK. Foundry embeds and indexes them.
  2. Search — Query by natural language; get back the most relevant entries.
  3. Context — Call the context endpoint to get a single string you can inject into a system or user message.
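
The three steps above can be illustrated with a small in-memory sketch. This is not the Memory Engine API or SDK: ToyMemory, Entry, embed, and every method name and parameter here are hypothetical, and the bag-of-words vector is only a stand-in for a real embedding model and pgvector index. It shows the shape of the flow: entries are written into a namespace, search ranks them by similarity to a query, and context joins the top results into one string.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One unit of memory: text content plus optional JSON-like metadata."""
    content: str
    metadata: dict = field(default_factory=dict)


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyMemory:
    """Hypothetical in-memory illustration; not the real client."""

    def __init__(self):
        self._store = {}  # namespace -> list of (Entry, vector)

    def write(self, namespace, content, metadata=None):
        # Step 1: store the entry and its vector under the namespace.
        entry = Entry(content, metadata or {})
        self._store.setdefault(namespace, []).append((entry, embed(content)))
        return entry

    def search(self, namespace, query, limit=3):
        # Step 2: rank entries in this namespace by similarity to the query.
        q = embed(query)
        scored = [(cosine(q, vec), entry)
                  for entry, vec in self._store.get(namespace, [])]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [entry for score, entry in scored[:limit] if score > 0]

    def context(self, namespace, query, limit=3):
        # Step 3: join the top results into one prompt-ready string.
        hits = self.search(namespace, query, limit)
        return "\n".join(f"- {e.content}" for e in hits)


memory = ToyMemory()
memory.write("user-123", "The user prefers dark mode in the editor.",
             {"source": "settings"})
memory.write("user-123", "The user's billing plan renews in March.")
memory.write("user-456", "This user prefers light mode.")

question = "Which editor mode does the user prefer?"
system_message = ("Relevant context:\n"
                  + memory.context("user-123", question, limit=1))
```

Because every call takes a namespace, the entry written for user-456 never appears in results for user-123, which is the isolation property the Concepts table describes.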

Limits and quotas

  • Payload size: The maximum request body size and per-entry content length depend on your plan (see dashboard).
  • Namespaces: No hard limit; use one per app or per user for isolation.
  • Rate limits: Applied per API key; 429 responses include Retry-After.
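
One way a client might handle the 429 responses mentioned above is to wait for the interval given in Retry-After before trying again. The sketch below uses only the Python standard library. The send callable, its (status, headers, body) return shape, and the default values are assumptions made for illustration, not part of the documented API. The parser accepts both forms that HTTP allows for Retry-After (a number of seconds or an HTTP date), although this page does not say which form the service sends.

```python
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, default=1.0):
    """Convert a Retry-After header value into a wait time in seconds.

    Accepts delta-seconds ("7") or an HTTP-date; falls back to `default`
    when the header is missing or cannot be parsed.
    """
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    # A date in the past means no wait is needed.
    return max(0.0, (when - datetime.now(timezone.utc)).total_seconds())


def call_with_retry(send, max_attempts=3, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    `send` is a hypothetical callable returning (status, headers, body),
    where headers is a dict keyed by the exact name "Retry-After".
    Returns the final (status, body) pair.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts:
            return status, body
        sleep(retry_after_seconds(headers.get("Retry-After")))
```

Passing sleep as a parameter keeps the helper easy to test without real delays. Because rate limits are applied per API key, retries from several workers sharing one key all count against the same limit.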

Next steps