Massive Savings
Cut token costs by 50-78% without sacrificing task performance. Based on NeurIPS 2025 research.
```typescript
import { fold } from "@fold/sdk";

const ctx = fold(); // That's it!

ctx.system("You are a coding assistant...");
ctx.think("I need to search for information...");
ctx.act({ tool: "search", query: "..." }, "search");
ctx.observe("Found 3 results...", "search");

// Get optimized messages for your LLM
const messages = ctx.messages();

// Check your savings
console.log(ctx.saved());
// { tokens: 5000, percent: 45, cost: 0.05 }
```
Sub-millisecond
Zero API calls for compression. Masking happens locally in under 1ms.
Semantic Preservation
Not just truncation. Fold understands your context and preserves what matters.
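To make the distinction concrete, here is a toy TypeScript comparison (not Fold's actual algorithm) of naive truncation versus masking-style compression: truncation keeps a blind prefix of the history, while masking drops low-value entries and keeps what matters regardless of position. The `Entry` shape and the `important` flag are illustrative assumptions.

```typescript
// Toy illustration only — not Fold's internal algorithm.
type Entry = { text: string; important: boolean };

const history: Entry[] = [
  { text: "System: you are a coding assistant", important: true },
  { text: "Thought: I need to search...", important: false },
  { text: "Observation: found 3 results", important: true },
  { text: "Thought: let me re-check...", important: false },
];

// Truncation: keep the first N entries, important or not.
const truncated = history.slice(0, 2).map((e) => e.text);

// Masking: keep every important entry, wherever it sits.
const masked = history.filter((e) => e.important).map((e) => e.text);

// Truncation loses "Observation: found 3 results"; masking keeps it.
```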
Drop-in Integration
Works with OpenAI, Anthropic, Vercel AI SDK, and LangChain out of the box.
| Framework | Integration |
|---|---|
| OpenAI SDK | `foldMessages()`, `wrapOpenAI()` |
| Anthropic SDK | `foldAnthropicMessages()` |
| Vercel AI SDK | `withFold()`, `useFold()` |
| LangChain/LangGraph | `ContextManagedMemory` |
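The drop-in pattern is that a helper like `foldMessages()` sits between your message history and the provider call. The sketch below shows only the call shape; the real `foldMessages()` signature and behavior are assumptions based on the table above, and the body here is a stand-in (it keeps the system prompt plus the most recent turns) purely so the example is self-contained.

```typescript
// Stand-in for @fold/sdk's foldMessages() — signature and behavior assumed,
// kept trivial here just to demonstrate the drop-in call shape.
type ChatMessage = {
  role: "system" | "user" | "assistant";
  content: string;
};

function foldMessages(messages: ChatMessage[], maxMessages = 4): ChatMessage[] {
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");
  // Keep the system prompt(s) plus the most recent remaining turns.
  return [...system, ...rest.slice(-Math.max(0, maxMessages - system.length))];
}

const history: ChatMessage[] = [
  { role: "system", content: "You are a coding assistant." },
  { role: "user", content: "Search for X" },
  { role: "assistant", content: "Searching..." },
  { role: "user", content: "Now summarize" },
];

// Compress before handing messages to your provider client, e.g.:
// await openai.chat.completions.create({ model, messages: compact });
const compact = foldMessages(history, 3);
```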