Context Extension API
Support conversations of effectively unlimited length through automatic context management.
Overview
Context Extension automatically manages long conversations by:
- Tracking all messages in a session
- Compressing old context when approaching token limits
- Retrieving relevant historical context for new messages
This allows conversations of any length while staying within LLM context limits.
Create Session
POST /v1/chat/sessions
Request Body
| Field | Type | Default | Description |
|---|---|---|---|
| name | string | null | Session name |
| system_prompt | string | null | System prompt |
| context_budget | int | 8000 | Max tokens to maintain |
| space_id | string | null | Link to memory space |
Example
```bash
curl -X POST https://api.memory.tensorheart.com/v1/chat/sessions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Project Planning",
    "context_budget": 16000
  }'
```
Response
```json
{
  "success": true,
  "data": {
    "id": "session_abc123",
    "name": "Project Planning",
    "context_budget": 16000,
    "message_count": 0,
    "created_at": "2024-01-15T10:30:00Z"
  }
}
```
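Responses wrap the payload in a `success`/`data` envelope, so a small helper can unwrap it before use. A sketch; the exact error shape on failure is an assumption:

```python
import json

def unwrap(raw: str) -> dict:
    """Return the `data` payload from an API response, or raise on failure."""
    body = json.loads(raw)
    if not body.get("success"):
        # Error shape is assumed; adjust to the API's actual error envelope.
        raise RuntimeError(f"API call failed: {body}")
    return body["data"]

raw = '''{
  "success": true,
  "data": {"id": "session_abc123", "name": "Project Planning",
           "context_budget": 16000, "message_count": 0,
           "created_at": "2024-01-15T10:30:00Z"}
}'''
session = unwrap(raw)
print(session["id"])  # session_abc123
```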
Send Message
Add a message to a session.
POST /v1/chat/sessions/{session_id}/messages
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| role | string | Yes | "user" or "assistant" |
| content | string | Yes | Message content |
Example
```bash
curl -X POST https://api.memory.tensorheart.com/v1/chat/sessions/session_abc123/messages \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "role": "user",
    "content": "Let me tell you about the project requirements..."
  }'
```
The system automatically compresses old context when approaching the token budget.
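Because compression is triggered server-side, the client simply posts every message, including the assistant's replies, so the session history stays complete. A minimal Python sketch (stdlib only; the function name and `KEY` placeholder are our own):

```python
import json
import urllib.request

API = "https://api.memory.tensorheart.com/v1"

def add_message(session_id: str, role: str, content: str, api_key: str) -> dict:
    """Append one message to a session; compression happens server-side."""
    if role not in ("user", "assistant"):
        raise ValueError("role must be 'user' or 'assistant'")
    req = urllib.request.Request(
        f"{API}/chat/sessions/{session_id}/messages",
        data=json.dumps({"role": role, "content": content}).encode(),
        method="POST",
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Record both sides of a turn (not executed here; requires a valid API key):
# add_message("session_abc123", "user", "What are the requirements?", KEY)
# add_message("session_abc123", "assistant", "The requirements are...", KEY)
```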
Get Session Context
Get optimized context for LLM calls.
GET /v1/chat/sessions/{session_id}/context
Returns recent messages plus relevant historical chunks, optimized for your context budget.
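The response shape of this endpoint is not specified above. Assuming it returns a `recent_messages` array plus `relevant_chunks` with `summary` fields (both names are our assumption), assembling an LLM prompt from it might look like:

```python
def build_llm_messages(context: dict, system_prompt: str) -> list[dict]:
    """Fold retrieved historical chunks into the system prompt, then
    append the recent verbatim messages."""
    system = system_prompt
    chunks = context.get("relevant_chunks", [])
    if chunks:
        system += "\n\nRelevant earlier context:\n" + "\n".join(
            f"- {c['summary']}" for c in chunks)
    return [{"role": "system", "content": system}] + context.get("recent_messages", [])

ctx = {"recent_messages": [{"role": "user", "content": "And the deadline?"}],
       "relevant_chunks": [{"summary": "Project ships in Q3."}]}
msgs = build_llm_messages(ctx, "You are a planning assistant.")
```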
List Sessions
GET /v1/chat/sessions
Get Session
GET /v1/chat/sessions/{session_id}
Delete Session
DELETE /v1/chat/sessions/{session_id}
How Context Management Works
```text
Messages added
      ↓
Token count tracked
      ↓
Budget exceeded?
      ↓ yes
Compress old messages
      ↓
Store as searchable chunks
      ↓
Retrieve relevant chunks for new queries
```
This ensures you always have the most relevant context, regardless of conversation length.
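The pipeline above runs server-side, but its budget logic can be sketched as a toy client-side simulation. Here tokens are approximated by whitespace-separated words, and "compression" simply moves the oldest messages into a searchable archive rather than summarizing them:

```python
def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer.
    return len(text.split())

def manage_context(messages: list[str], budget: int) -> tuple[list[str], list[str]]:
    """Move the oldest messages into an archive until the live window
    fits the token budget."""
    live = list(messages)
    archive = []
    while live and sum(estimate_tokens(m) for m in live) > budget:
        archive.append(live.pop(0))  # compress/store the oldest message
    return live, archive

live, archive = manage_context(["a b c", "d e", "f g h i"], budget=6)
# live == ["d e", "f g h i"], archive == ["a b c"]
```

The real service additionally indexes archived chunks so they can be retrieved by relevance, not just recency.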