Transparent Proxy API
Add memory capabilities to any LLM with zero code changes.
Overview
The Transparent Proxy sits between your application and your LLM provider. It automatically:
- Retrieves relevant memories based on the conversation
- Injects them into the request
- Forwards to your LLM provider
- Optionally stores the response as a new memory
Proxy Request
```
POST /v1/proxy/{target_url}
```
Headers
| Header | Required | Description |
|---|---|---|
| X-Memory-API-Key | Yes | Your Memory API key |
| X-Provider-API-Key | Yes | The target provider's API key |
| X-Memory-Auto-Store | No | Store assistant responses as memories (default: true) |
| X-Memory-Context-Budget | No | Maximum tokens of memory context to inject (default: 4000) |
| X-Memory-Space-ID | No | Restrict memory retrieval to a specific space |
Example
```bash
# Instead of calling OpenAI directly, use the proxy:
curl -X POST "https://api.memory.tensorheart.com/v1/proxy/https://api.openai.com/v1/chat/completions" \
  -H "X-Memory-API-Key: $MEMORY_API_KEY" \
  -H "X-Provider-API-Key: $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {"role": "user", "content": "What projects am I working on?"}
    ]
  }'
```
The proxy transparently adds relevant memories to the system prompt, so your LLM has context about the user.
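To make the injection concrete, here is a sketch of roughly what the proxy might forward to OpenAI for the request above. The memory strings are invented placeholders, and the exact system-prompt wording is an internal detail of the proxy that may differ:

```bash
# Illustrative only: approximately what the proxy forwards upstream after
# retrieving memories. The memories shown are invented placeholders, and the
# real prompt format used by the proxy may differ.
curl -X POST "https://api.openai.com/v1/chat/completions" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {"role": "system", "content": "Relevant user memories:\n- Leading the billing-service migration\n- Drafting the Q3 roadmap"},
      {"role": "user", "content": "What projects am I working on?"}
    ]
  }'
```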
Supported Providers
- OpenAI - api.openai.com
- Anthropic - api.anthropic.com
- Any OpenAI-compatible API
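As a sketch, here is how a request to Anthropic's Messages API might go through the proxy. This assumes the proxy forwards the body and provider-specific headers (such as anthropic-version) unchanged apart from memory injection; the model name is just an example:

```bash
# Sketch: an Anthropic Messages API call routed through the proxy.
# Assumes the proxy passes the body and the anthropic-version header through
# untouched, aside from injecting memory context.
curl -X POST "https://api.memory.tensorheart.com/v1/proxy/https://api.anthropic.com/v1/messages" \
  -H "X-Memory-API-Key: $MEMORY_API_KEY" \
  -H "X-Provider-API-Key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "What projects am I working on?"}
    ]
  }'
```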
How It Works
```
Your App ──→ Proxy ──→ Retrieve Memories ──→ Inject Context ──→ LLM Provider
                                                                      ↓
Your App ←── Proxy ←── (Optional: Store Response) ←─────────── LLM Response
```
Configuration Options
Auto-Store Responses
When X-Memory-Auto-Store: true (default), assistant responses are automatically saved as memories for future context.
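Set X-Memory-Auto-Store: false to opt out for a given request, for example:

```bash
# Auto-store disabled: the assistant's reply is not saved as a memory.
curl -X POST "https://api.memory.tensorheart.com/v1/proxy/https://api.openai.com/v1/chat/completions" \
  -H "X-Memory-API-Key: $MEMORY_API_KEY" \
  -H "X-Provider-API-Key: $OPENAI_API_KEY" \
  -H "X-Memory-Auto-Store: false" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Summarize this conversation."}]}'
```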
Context Budget
X-Memory-Context-Budget controls how many tokens of retrieved memory context the proxy injects. Higher values give the model more background, but enlarge the prompt and increase provider cost and latency.
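For example, to cap injected memory context at roughly 1,000 tokens:

```bash
# Limit injected memory context to ~1000 tokens for this request.
curl -X POST "https://api.memory.tensorheart.com/v1/proxy/https://api.openai.com/v1/chat/completions" \
  -H "X-Memory-API-Key: $MEMORY_API_KEY" \
  -H "X-Provider-API-Key: $OPENAI_API_KEY" \
  -H "X-Memory-Context-Budget: 1000" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "What projects am I working on?"}]}'
```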
Space Filtering
Use X-Memory-Space-ID to only include memories from a specific space, keeping context focused and relevant.
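For example, with a placeholder space ID:

```bash
# Only memories from the given space are considered for injection.
# "space_work_123" is an invented placeholder ID.
curl -X POST "https://api.memory.tensorheart.com/v1/proxy/https://api.openai.com/v1/chat/completions" \
  -H "X-Memory-API-Key: $MEMORY_API_KEY" \
  -H "X-Provider-API-Key: $OPENAI_API_KEY" \
  -H "X-Memory-Space-ID: space_work_123" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "What projects am I working on?"}]}'
```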