Memory Extraction

Automatically extract and store memories from text content, files, and URLs.

Overview

The Memory API provides three approaches for extracting memories:

  1. Text Extraction - Extract memories from text content (conversations, documents, notes)
  2. File Ingestion - Upload and process files directly (PDFs, images, audio, etc.)
  3. URL Ingestion - Ingest content directly from web pages and PDF links

Text Extraction

Extract memories from text content using the extraction endpoint.

Basic Usage

curl -X POST https://memoryapi.tensorheart.com/v1/query/extract \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "content": "User: Hi, I am Sarah and I work at Netflix as a product manager.",
    "content_type": "conversation"
  }'

Extracted memories:

  • "User's name is Sarah"
  • "User works at Netflix"
  • "User is a product manager"

Request Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| content | string | Yes | The text content to extract memories from (max 500,000 characters) |
| content_type | string | No | Type of content: conversation, document, or notes (default: conversation) |
| metadata | object | No | Custom metadata to attach to all extracted memories |
| space_id | string | No | Memory space to store extracted memories in |

Content Types

| Type | Best For |
| --- | --- |
| conversation | Chat logs, transcripts, dialogue |
| document | Articles, reports, long-form text |
| notes | Meeting notes, summaries, bullet points |

Adding Metadata

Tag extracted memories with source information for better organization:

{
  "content": "Meeting with client discussed Q4 goals...",
  "content_type": "notes",
  "metadata": {
    "source": "client_meeting",
    "date": "2024-01-15",
    "session_id": "abc123"
  }
}
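The request body above can be assembled and validated client-side before sending. The following Python sketch shows one way to do that; `build_extract_payload` is a hypothetical helper (not part of the API), while the field names, content types, and the 500,000-character limit come from the tables above.

```python
# Hypothetical helper that builds and validates the JSON body for
# POST /v1/query/extract. Field names and limits come from the docs;
# the helper itself is illustrative, not part of the API.

MAX_CONTENT_CHARS = 500_000
CONTENT_TYPES = {"conversation", "document", "notes"}

def build_extract_payload(content, content_type="conversation",
                          metadata=None, space_id=None):
    """Return a request body for the extraction endpoint."""
    if len(content) > MAX_CONTENT_CHARS:
        raise ValueError("content exceeds the 500,000 character limit")
    if content_type not in CONTENT_TYPES:
        raise ValueError(f"unknown content_type: {content_type}")
    payload = {"content": content, "content_type": content_type}
    if metadata is not None:
        payload["metadata"] = metadata  # attached to all extracted memories
    if space_id is not None:
        payload["space_id"] = space_id
    return payload

payload = build_extract_payload(
    "Meeting with client discussed Q4 goals...",
    content_type="notes",
    metadata={"source": "client_meeting", "date": "2024-01-15"},
)
```

Optional fields are omitted entirely when unset, so the server's defaults (such as `content_type: conversation`) apply only when you don't override them.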

File Ingestion

Upload files directly for processing into memories. The API automatically extracts text content and creates searchable memories.

Supported File Formats

| Format | Extensions | Description |
| --- | --- | --- |
| PDF | .pdf | Text extraction with optional OCR for scanned pages |
| Images | .png, .jpg, .jpeg, .gif, .webp | Text and content extraction from images |
| Audio | .mp3, .wav, .m4a, .ogg | Speech-to-text transcription |
| Plain Text | .txt, .md, .csv | Direct text processing |

Upload a File

curl -X POST https://memoryapi.tensorheart.com/v1/ingest/file \
  -H "Authorization: Bearer $API_KEY" \
  -F "file=@document.pdf" \
  -F "chunk_size=500" \
  -F "overlap=50" \
  -F "ocr=true"

File Upload Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| file | file | Required | The file to upload and process |
| chunk_size | integer | 500 | Target tokens per chunk |
| overlap | integer | 50 | Overlap tokens between chunks |
| ocr | boolean | true | Enable OCR for scanned PDF pages |
| space_id | string | null | Memory space to store memories in |
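To see how `chunk_size` and `overlap` interact, here is a minimal sliding-window sketch. The real service tokenizes server-side with its own tokenizer; this approximates tokens with list items purely to show the windowing behavior, and `chunk_tokens` is an illustrative function, not an API call.

```python
# Illustrative sketch of chunk_size / overlap semantics. The actual
# server-side chunking may differ; this shows the sliding window only.

def chunk_tokens(tokens, chunk_size=500, overlap=50):
    """Split tokens into windows of chunk_size that share
    `overlap` tokens with the previous window."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # last window already reached the end
    return chunks

tokens = [f"tok{i}" for i in range(1200)]
chunks = chunk_tokens(tokens, chunk_size=500, overlap=50)
# 1200 tokens with step 450 -> windows starting at 0, 450, 900
```

Larger `chunk_size` keeps more context in each memory; larger `overlap` reduces the chance of a fact being split across a chunk boundary, at the cost of some duplication.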

Response

{
  "success": true,
  "data": {
    "document_id": "doc_abc123def456",
    "status": "completed",
    "memories_created": 12,
    "chunks_created": 12,
    "tokens_extracted": 5840
  }
}
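For larger files, processing may not complete within the upload request, so a common client pattern is to poll the document status until it settles. The sketch below assumes `fetch_status` is a caller-supplied function wrapping GET /v1/ingest/{document_id}; injecting it keeps the control flow testable without a live API key, and `wait_for_document` itself is hypothetical, not part of the API.

```python
import time

# Hypothetical polling loop around the document-status endpoint.
# `fetch_status` stands in for an authenticated HTTP GET.

def wait_for_document(fetch_status, document_id,
                      poll_interval=0.0, max_polls=30):
    """Poll until the document reaches a terminal status."""
    for _ in range(max_polls):
        data = fetch_status(document_id)
        if data["status"] in ("completed", "failed"):
            return data
        time.sleep(poll_interval)
    raise TimeoutError(f"document {document_id} did not finish")

# Simulated responses standing in for the live endpoint:
responses = iter([
    {"status": "processing"},
    {"status": "processing"},
    {"status": "completed", "memories_created": 12},
])
result = wait_for_document(lambda _id: next(responses), "doc_abc123def456")
```

In production, replace the lambda with a real HTTP call and use a non-zero `poll_interval` (with backoff) to avoid hammering the endpoint.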

URL Ingestion

Ingest content directly from a URL. Supports web pages and PDF links.

Ingest from URL

curl -X POST https://memoryapi.tensorheart.com/v1/ingest/url \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/article",
    "chunk_size": 500,
    "overlap": 50
  }'

URL Support

| Type | Description |
| --- | --- |
| Web pages | Extracts main article/content from HTML pages |
| PDF URLs | Downloads and processes as PDF |

Managing Documents

Get Document Status

curl https://memoryapi.tensorheart.com/v1/ingest/{document_id} \
  -H "Authorization: Bearer $API_KEY"

List Documents

curl "https://memoryapi.tensorheart.com/v1/ingest?limit=50&offset=0" \
  -H "Authorization: Bearer $API_KEY"
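The `limit` and `offset` parameters page through results, and a client typically loops until it receives a short page. In the sketch below, `fetch_page` is a caller-supplied function standing in for the authenticated GET above, so the pagination loop can be shown (and tested) without network access; `list_all_documents` is an illustrative helper, not part of the API.

```python
# Sketch of paging through GET /v1/ingest with limit/offset.

def list_all_documents(fetch_page, limit=50):
    """Collect every document by advancing offset until a short page."""
    documents, offset = [], 0
    while True:
        page = fetch_page(limit=limit, offset=offset)
        documents.extend(page)
        if len(page) < limit:  # a short page means we reached the end
            break
        offset += limit
    return documents

# Simulated backend holding 120 documents:
backend = [{"document_id": f"doc_{i}"} for i in range(120)]
docs = list_all_documents(
    lambda limit, offset: backend[offset:offset + limit]
)
```

Note this stop condition assumes the endpoint returns fewer than `limit` items only on the final page; if the response also carries a total count, comparing against that is more robust.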

Get Document Chunks

View individual chunks from a processed document:

curl https://memoryapi.tensorheart.com/v1/ingest/{document_id}/chunks \
  -H "Authorization: Bearer $API_KEY"

Best Practices

  1. Choose the right method - Use text extraction for structured text, file ingestion for documents
  2. Clean input - Remove irrelevant content before text extraction
  3. Use metadata - Tag memories with source information for better retrieval
  4. Tune chunk size - Larger chunks preserve more context, smaller chunks enable finer retrieval
  5. Enable OCR - For scanned PDFs, keep OCR enabled to extract text from images
  6. Batch wisely - Extract from complete conversations or documents, not fragments
  7. Use spaces - Organize memories into spaces for different projects or contexts
  8. Review results - Periodically audit extracted memories for quality