Context-maxxing as a service
A model-agnostic API that transforms raw conversation history into an optimized, compressed messages array — ready for your next LLM call.
Pass your messages array. Get back a distilled version. Drop it straight into your next LLM call.
// Before: 4,200 tokens
const result = await fetch('https://contextspa.com/api/distill', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${API_KEY}`,
    // Declare the JSON body explicitly; fetch defaults string bodies to text/plain.
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    messages: conversationHistory,
    strategy_id: 'technical_dense',
  }),
});

// After: 610 tokens (85% reduction)
const { messages, metadata } = await result.json();
What it does
Your conversation history accumulates noise: pleasantries, restatements, abandoned threads. contextspa strips that out and hands back only the signal.
Messages in, messages out. No storage. No memory. No side effects. Your data doesn't live here.
Choose how your context is distilled. Technical dense, aggressive summarize, decision extraction — or write your own.
Works with any downstream model. Feed the output to GPT-4, Claude, Gemini, or your local model — doesn't matter.
Append a short instruction to any strategy at call time. "Preserve all mentions of variable authToken" — and it will.
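As a rough sketch of a call-time instruction, the helper below builds the same request as the example at the top of this page and adds one extra field. The key name `inject` is an assumption (this page does not give the exact field), and `buildDistillRequest` is an illustrative helper, not part of the API; check the docs for the real schema.

```javascript
// Sketch only: `inject` is an assumed field name, not confirmed by this page.
function buildDistillRequest(apiKey, messages, strategyId, instruction) {
  const body = { messages, strategy_id: strategyId };

  // Only include the one-off instruction when the caller supplies one.
  if (instruction) {
    body.inject = instruction; // hypothetical key
  }

  return {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(body),
  };
}

// Usage (endpoint taken from the example at the top of this page):
// const res = await fetch(
//   'https://contextspa.com/api/distill',
//   buildDistillRequest(API_KEY, conversationHistory, 'technical_dense',
//     'Preserve all mentions of variable authToken'),
// );
```

Keeping the instruction out of the body when it is empty means the default strategy behavior is untouched for ordinary calls.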
Built-in strategies
Pick the right distillation mode for your use case. Full schema and author guide at docs → strategies.
Aggressive summarize: Reduces conversation to essential outcomes. Preserves conclusions, decisions, and open questions. Everything else: summarized to one sentence.
Technical dense: Preserves all code blocks, error messages, variable names, and architectural decisions verbatim. Strips conversational filler aggressively.
Decision extraction: Extracts only explicit decisions, commitments, and their rationale. Everything else is dropped. Best before a planning or review session.
Community strategies coming soon. See docs for the strategy schema and author guide.
How it works
contextspa sits between your conversation history and your next LLM call. No agent rewiring. No new memory layer.
/distill: POST your messages array with a strategy ID and your API key. Optionally append a one-off inject instruction.
The strategy's system prompt runs against your messages using Gemini Flash. Output is a clean messages array.
Drop the returned messages array directly into your next call. Same format as the input — no adapter needed.
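The three steps above can be sketched end to end as follows. This is a minimal illustration, not the official client: `distillThenCall`, `callModel`, and the injectable `fetchFn` are hypothetical names, and the endpoint and response shape are taken from the example at the top of this page.

```javascript
// Sketch only: function and parameter names are illustrative.
async function distillThenCall({ apiKey, history, strategyId, callModel, fetchFn = fetch }) {
  // Step 1: POST the raw history with a strategy ID and API key.
  const res = await fetchFn('https://contextspa.com/api/distill', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ messages: history, strategy_id: strategyId }),
  });
  if (!res.ok) {
    throw new Error(`distill request failed with status ${res.status}`);
  }

  // Step 2 runs server-side; the response carries the distilled array.
  const { messages } = await res.json();

  // Step 3: same format as the input, so it goes straight to the next call.
  return callModel(messages);
}
```

Here `callModel` stands in for whatever function already sends a messages array to your downstream model, so the existing call site does not need to change.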
Deposit credits, use them. You'll typically spend $0.01–$0.05 to save 80–95% of your downstream context cost.
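To check whether a distill call pays for itself in your setup, a back-of-the-envelope calculation like the one below may help. Every input is an assumption you supply (context size, reduction rate, downstream price, distill cost, and how many later calls reuse the distilled history); none of the numbers are contextspa or model-vendor pricing, and `netSavingsUsd` is an illustrative helper.

```javascript
// Sketch only: all prices are hypothetical inputs supplied by the caller.
function netSavingsUsd({
  contextTokens,            // tokens in the undistilled history
  reduction,                // fraction removed, e.g. 0.85 for 85%
  pricePerMillionTokensUsd, // downstream input price per 1M tokens
  distillCostUsd,           // what the distill call cost
  downstreamCalls = 1,      // later calls that reuse the distilled history
}) {
  // Tokens no longer sent on each downstream call.
  const tokensSavedPerCall = contextTokens * reduction;
  const savedUsd = (tokensSavedPerCall / 1e6) * pricePerMillionTokensUsd * downstreamCalls;
  return savedUsd - distillCostUsd;
}
```

For instance, with a 100,000-token history, an 85% reduction, a hypothetical downstream price of $3 per million input tokens, and a $0.03 distill cost, a single downstream call nets about $0.225. A negative result means the distill call cost more than it saved for those inputs.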