Transparent Proxy API

Add memory capabilities to any LLM with zero code changes.

Overview

The Transparent Proxy sits between your application and your LLM provider. It automatically:

  1. Retrieves relevant memories based on the conversation
  2. Injects them into the request
  3. Forwards to your LLM provider
  4. Optionally stores the response as a new memory

Proxy Request

POST /v1/proxy/{target_url}

Here {target_url} is the full URL of the provider endpoint you would normally call, embedded directly in the path (for example, https://api.openai.com/v1/chat/completions).

Headers

Header                   Required  Description
X-Memory-API-Key         Yes       Your Memory API key
X-Provider-API-Key       Yes       The target provider's API key
X-Memory-Auto-Store      No        Store responses as memories (default: true)
X-Memory-Context-Budget  No        Max tokens of memory context (default: 4000)
X-Memory-Space-ID        No        Filter memories to a specific space

Example

# Instead of calling OpenAI directly, route the request through the proxy:
curl -X POST "https://api.memory.tensorheart.com/v1/proxy/https://api.openai.com/v1/chat/completions" \
  -H "X-Memory-API-Key: $MEMORY_API_KEY" \
  -H "X-Provider-API-Key: $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {"role": "user", "content": "What projects am I working on?"}
    ]
  }'

The proxy transparently adds relevant memories to the system prompt, so your LLM has context about the user.
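The exact injection format is an internal detail of the proxy, but conceptually the request forwarded to the provider might look like the sketch below; the memory contents shown are invented for illustration.

{
  "model": "gpt-4",
  "messages": [
    {
      "role": "system",
      "content": "Relevant memories about this user:\n- Leading the Atlas data-migration project\n- Drafting the Q3 search-relevance proposal"
    },
    {"role": "user", "content": "What projects am I working on?"}
  ]
}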

Supported Providers

  • OpenAI - api.openai.com
  • Anthropic - api.anthropic.com
  • Any OpenAI-compatible API
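The same pattern works for Anthropic; only the target URL, provider key, and request body change. A sketch, assuming the proxy passes extra headers such as anthropic-version through to the provider (Anthropic's Messages API requires that header along with max_tokens):

# Route an Anthropic Messages API call through the proxy:
curl -X POST "https://api.memory.tensorheart.com/v1/proxy/https://api.anthropic.com/v1/messages" \
  -H "X-Memory-API-Key: $MEMORY_API_KEY" \
  -H "X-Provider-API-Key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "What projects am I working on?"}
    ]
  }'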

How It Works

Your App → Proxy → Retrieve Memories → Inject Context → LLM Provider

Your App ← Proxy ← (Optional: Store Response) ← LLM Response

Configuration Options

Auto-Store Responses

When X-Memory-Auto-Store: true (default), assistant responses are automatically saved as memories for future context.
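A sketch of turning this off for a single request, for example when the exchange should not be remembered:

# Ask a one-off question without storing the response as a memory:
curl -X POST "https://api.memory.tensorheart.com/v1/proxy/https://api.openai.com/v1/chat/completions" \
  -H "X-Memory-API-Key: $MEMORY_API_KEY" \
  -H "X-Provider-API-Key: $OPENAI_API_KEY" \
  -H "X-Memory-Auto-Store: false" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Draft a throwaway test message."}]}'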

Context Budget

X-Memory-Context-Budget controls how many tokens of retrieved memory context the proxy may inject. Higher values give the model more background but consume more of the provider's context window and increase token cost (see the combined example after Space Filtering below).

Space Filtering

Use X-Memory-Space-ID to only include memories from a specific space, keeping context focused and relevant.
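A sketch combining both options, a tighter context budget and a space filter; the space ID shown is a placeholder for one of your own:

# Limit memory context to ~2000 tokens and a single space:
curl -X POST "https://api.memory.tensorheart.com/v1/proxy/https://api.openai.com/v1/chat/completions" \
  -H "X-Memory-API-Key: $MEMORY_API_KEY" \
  -H "X-Provider-API-Key: $OPENAI_API_KEY" \
  -H "X-Memory-Context-Budget: 2000" \
  -H "X-Memory-Space-ID: work-projects" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Summarize my current work projects."}]}'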