Memory Engine overview

The Memory Engine stores and retrieves semantic context for your app. Content is embedded, indexed with pgvector, and queried by meaning, so you can inject relevant past context into prompts without relying on keyword matching.

Concepts

| Concept | Description |
| --- | --- |
| Namespace | Isolates memories per app or tenant (e.g. `my-app`, `user-123`). Required on every call. |
| Entry | One unit of memory: `content` (text), optional `metadata` (JSON). Stored and embedded. |
| Search | Semantic search over a namespace. Returns entries ranked by similarity to the query. |
| Context | Pre-built context string from search results, ready to paste into an LLM prompt. |
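
To make the Entry and Namespace concepts concrete, here is a minimal sketch of how an entry might be modeled and serialized on the client side. The `content` and `metadata` field names come from the table above; the `Entry` class, the `to_payload` helper, and the `namespace` / `entries` keys in the request body are assumptions for illustration, not the documented wire format.

```python
import json
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Entry:
    """One unit of memory: text content plus optional JSON metadata."""
    content: str
    metadata: dict[str, Any] = field(default_factory=dict)


def to_payload(namespace: str, entries: list[Entry]) -> str:
    """Serialize entries into a JSON body (hypothetical shape, not the real API)."""
    if not namespace:
        # The namespace is required on every call, so fail early on the client.
        raise ValueError("namespace is required on every call")
    body = {
        "namespace": namespace,
        "entries": [{"content": e.content, "metadata": e.metadata} for e in entries],
    }
    return json.dumps(body)


payload = to_payload("user-123", [Entry("Prefers dark mode", {"source": "settings"})])
```

Validating the namespace before sending keeps a missing value from turning into a round trip that can only fail.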

Flow

1. Write: Send entries (and optional metadata) via the API or SDK. Foundry embeds and indexes them.
2. Search: Query by natural language; get back the most relevant entries.
3. Context: Call the context endpoint to get a single string you can inject into a system or user message.
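
The three steps above can be sketched with a toy, in-memory stand-in. This is not the Memory Engine's implementation or SDK: `ToyMemory` and its methods are hypothetical, and word-count vectors with cosine similarity replace the learned embeddings and pgvector index the real service uses. It only illustrates how write, search, and context relate to each other within a namespace.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a real engine uses a learned model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyMemory:
    """Hypothetical local stand-in for the write / search / context flow."""

    def __init__(self) -> None:
        # namespace -> list of (content, vector) pairs
        self._store: dict[str, list[tuple[str, Counter]]] = {}

    def write(self, namespace: str, content: str) -> None:
        # Step 1: embed the content and index it under its namespace.
        self._store.setdefault(namespace, []).append((content, embed(content)))

    def search(self, namespace: str, query: str, top_k: int = 3) -> list[str]:
        # Step 2: rank entries in this namespace by similarity to the query.
        q = embed(query)
        entries = self._store.get(namespace, [])
        ranked = sorted(entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [content for content, _ in ranked[:top_k]]

    def context(self, namespace: str, query: str, top_k: int = 3) -> str:
        # Step 3: join the top results into one string for a prompt.
        return "\n".join(f"- {c}" for c in self.search(namespace, query, top_k))


mem = ToyMemory()
mem.write("user-123", "User prefers dark mode in the editor")
mem.write("user-123", "Billing plan renews in March")
mem.write("other-tenant", "Unrelated data")

system_prompt = "Relevant memory:\n" + mem.context("user-123", "which editor mode", top_k=1)
```

Because every call is scoped to a namespace, the entry written under `other-tenant` can never appear in results for `user-123`.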

Limits and quotas

Payload size: The maximum request body size and per-entry content length depend on your plan (see the dashboard).
Namespaces: No hard limit; use one per app or per user for isolation.
Rate limits: Applied per API key. A 429 response includes a Retry-After header.
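
Since 429 responses carry Retry-After, a client can wait for the indicated time before retrying. The sketch below shows one way to do that. The `send` callable, the `(status, headers, body)` tuple shape, and the one-second fallback are assumptions for illustration; adapt them to whichever HTTP client or SDK you use. Per the HTTP specification, Retry-After may be either a number of seconds or an HTTP-date, so both are handled.

```python
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Callable, Mapping, Tuple

# Hypothetical response shape: (status code, headers, body).
Response = Tuple[int, Mapping[str, str], bytes]


def retry_after_seconds(value: str, default: float = 1.0) -> float:
    """Parse a Retry-After value given as delay seconds or an HTTP-date."""
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Missing or unparseable header: fall back to an assumed default.
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    return max(0.0, (when - datetime.now(timezone.utc)).total_seconds())


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send(), waiting as instructed by Retry-After on each 429."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        # Return on success, on any non-429 error, or when out of attempts.
        if status != 429 or attempt == max_attempts:
            return status, headers, body
        # Header lookup is case-sensitive here; normalize keys if your client does not.
        sleep(retry_after_seconds(headers.get("Retry-After", "")))
    raise RuntimeError("unreachable")
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays, and capping `max_attempts` avoids retrying forever when a key stays over its limit.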

Next steps

Write memories

Write single or batch entries with metadata.

Search

Semantic search and ranking.

Context

Build context strings for LLM prompts.
