
Fold

Intelligent context management for LLM-powered agents. Reduce token costs by 50-78% through semantic condensation.
- **50-78%** token reduction
- **<1ms** compression time
- **Zero** API calls required
```typescript
import { fold } from "@fold/sdk";

const ctx = fold(); // That's it!

ctx.system("You are a coding assistant...");
ctx.think("I need to search for information...");
ctx.act({ tool: "search", query: "..." }, "search");
ctx.observe("Found 3 results...", "search");

// Get optimized messages for your LLM
const messages = ctx.messages();

// Check your savings
console.log(ctx.saved());
// { tokens: 5000, percent: 45, cost: 0.05 }
```

Massive Savings

Cut token costs by 50-78% without sacrificing task performance. Based on NeurIPS 2025 research.
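To see what that range means in dollars, here is a back-of-the-envelope calculation. This is a standalone sketch, not part of the Fold SDK: the `savings()` helper and the $3.00-per-million-token input price are illustrative assumptions (check your provider's current pricing).

```typescript
// Hypothetical input price; real provider pricing varies by model.
const PRICE_PER_MTOK = 3.0; // dollars per million input tokens

// Compute savings in the same shape Fold's saved() reports.
function savings(originalTokens: number, foldedTokens: number) {
  const saved = originalTokens - foldedTokens;
  return {
    tokens: saved,
    percent: Math.round((saved / originalTokens) * 100),
    cost: (saved * PRICE_PER_MTOK) / 1_000_000,
  };
}

// A 100k-token agent trace condensed to 30k tokens:
const s = savings(100_000, 30_000);
// → { tokens: 70000, percent: 70, cost: 0.21 }
```

At 50-78% reduction, a long-running agent that would otherwise resend hundreds of thousands of tokens per session saves a meaningful fraction of its bill on every call.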

Sub-millisecond

Zero API calls for compression. Masking happens locally in under 1ms.

Semantic Preservation

Not just truncation. Fold understands your context and preserves what matters.
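To give intuition for why masking can be both local and semantics-aware, here is a minimal sketch of the general technique: keep the system prompt and recent turns verbatim, and replace only stale tool observations with placeholders. The `Msg` type and `maskOldObservations()` helper are hypothetical illustrations, not Fold's actual algorithm or API; they just show why no API call is needed.

```typescript
// Minimal message shape for the sketch (not the SDK's real types).
type Msg = { role: "system" | "assistant" | "tool"; content: string };

// Mask tool observations older than the last `keepRecent` messages.
// Pure local array work: zero network calls, trivially sub-millisecond.
function maskOldObservations(history: Msg[], keepRecent: number): Msg[] {
  const cutoff = history.length - keepRecent;
  return history.map((m, i) =>
    m.role === "tool" && i < cutoff
      ? { role: m.role, content: "[elided tool output]" }
      : m
  );
}

const history: Msg[] = [
  { role: "system", content: "You are a coding assistant." },
  { role: "tool", content: "Found 3 results: a.ts, b.ts, c.ts" },
  { role: "tool", content: "File a.ts contains 120 lines" },
  { role: "assistant", content: "The answer is in a.ts." },
];

const masked = maskOldObservations(history, 2);
// System prompt and the last two turns survive untouched;
// only the stale first observation is replaced.
```

A real condenser decides *what* to elide semantically rather than by position alone, but the cost profile is the same: an in-memory rewrite of the message array.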

Drop-in Integration

Works with OpenAI, Anthropic, Vercel AI SDK, and LangChain out of the box.

| Framework | Integration |
| --- | --- |
| OpenAI SDK | `foldMessages()`, `wrapOpenAI()` |
| Anthropic SDK | `foldAnthropicMessages()` |
| Vercel AI SDK | `withFold()`, `useFold()` |
| LangChain/LangGraph | `ContextManagedMemory` |

Ready to cut your LLM costs?

Get started with Fold in under 5 minutes.
