Transparent Proxy API
Add memory capabilities to any LLM with zero code changes.
Overview
The Transparent Proxy sits between your application and your LLM provider. It automatically:
- Retrieves relevant memories based on the conversation
- Injects them into the request
- Forwards to your LLM provider
- Optionally stores the response as a new memory
Proxy Request
```
POST /v1/proxy/{target_url}
```
Headers
| Header | Required | Description |
|---|---|---|
| X-Memory-API-Key | Yes | Your Memory API key |
| X-Provider-API-Key | Yes | The target provider's API key |
| X-Memory-Auto-Store | No | Store assistant responses as memories (default: true) |
| X-Memory-Context-Budget | No | Maximum tokens of memory context to inject (default: 4000) |
| X-Memory-Space-ID | No | Restrict memory retrieval to a specific space |
Example
```bash
# Instead of calling OpenAI directly, use the proxy:
curl -X POST "https://api.memory.tensorheart.com/v1/proxy/https://api.openai.com/v1/chat/completions" \
  -H "X-Memory-API-Key: $MEMORY_API_KEY" \
  -H "X-Provider-API-Key: $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {"role": "user", "content": "What projects am I working on?"}
    ]
  }'
```
The proxy transparently adds relevant memories to the system prompt, so your LLM has context about the user.
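To make the injection concrete, here is a sketch of roughly what the proxy might forward to OpenAI for the request above. The memory strings are invented placeholders, and the exact system-prompt wording is an internal detail of the proxy that may differ:

```bash
# Illustrative only: approximately what the proxy forwards upstream after
# retrieving memories. The memories shown are invented placeholders, and the
# real prompt format used by the proxy may differ.
curl -X POST "https://api.openai.com/v1/chat/completions" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {"role": "system", "content": "Relevant user memories:\n- Leading the billing-service migration\n- Drafting the Q3 roadmap"},
      {"role": "user", "content": "What projects am I working on?"}
    ]
  }'
```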
Supported Providers
- OpenAI - api.openai.com
- Anthropic - api.anthropic.com
- Any OpenAI-compatible API
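As a sketch, here is how a request to Anthropic's Messages API might go through the proxy. This assumes the proxy forwards the body and provider-specific headers (such as anthropic-version) unchanged apart from memory injection; the model name is just an example:

```bash
# Sketch: an Anthropic Messages API call routed through the proxy.
# Assumes the proxy passes the body and the anthropic-version header through
# untouched, aside from injecting memory context.
curl -X POST "https://api.memory.tensorheart.com/v1/proxy/https://api.anthropic.com/v1/messages" \
  -H "X-Memory-API-Key: $MEMORY_API_KEY" \
  -H "X-Provider-API-Key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "What projects am I working on?"}
    ]
  }'
```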
How It Works
```
Your App ──→ Proxy ──→ Retrieve Memories ──→ Inject Context ──→ LLM Provider
                                                                      ↓
Your App ←── Proxy ←── (Optional: Store Response) ←─────────── LLM Response
```
Configuration Options
Auto-Store Responses
When X-Memory-Auto-Store: true (default), assistant responses are automatically saved as memories for future context.
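Set X-Memory-Auto-Store: false to opt out for a given request, for example:

```bash
# Auto-store disabled: the assistant's reply is not saved as a memory.
curl -X POST "https://api.memory.tensorheart.com/v1/proxy/https://api.openai.com/v1/chat/completions" \
  -H "X-Memory-API-Key: $MEMORY_API_KEY" \
  -H "X-Provider-API-Key: $OPENAI_API_KEY" \
  -H "X-Memory-Auto-Store: false" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Summarize this conversation."}]}'
```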
Context Budget
X-Memory-Context-Budget controls how many tokens of retrieved memory context the proxy injects. Higher values give the model more background, but enlarge the prompt and increase provider cost and latency.
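For example, to cap injected memory context at roughly 1,000 tokens:

```bash
# Limit injected memory context to ~1000 tokens for this request.
curl -X POST "https://api.memory.tensorheart.com/v1/proxy/https://api.openai.com/v1/chat/completions" \
  -H "X-Memory-API-Key: $MEMORY_API_KEY" \
  -H "X-Provider-API-Key: $OPENAI_API_KEY" \
  -H "X-Memory-Context-Budget: 1000" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "What projects am I working on?"}]}'
```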
Space Filtering
Use X-Memory-Space-ID to only include memories from a specific space, keeping context focused and relevant.
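For example, with a placeholder space ID:

```bash
# Only memories from the given space are considered for injection.
# "space_work_123" is an invented placeholder ID.
curl -X POST "https://api.memory.tensorheart.com/v1/proxy/https://api.openai.com/v1/chat/completions" \
  -H "X-Memory-API-Key: $MEMORY_API_KEY" \
  -H "X-Provider-API-Key: $OPENAI_API_KEY" \
  -H "X-Memory-Space-ID: space_work_123" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "What projects am I working on?"}]}'
```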