
Fold

Intelligent context management for LLM-powered agents. Reduce token costs by 50-78% through semantic condensation.
- **50-78%** token reduction
- **<1ms** compression time
- **Zero** API calls required
```typescript
import { fold } from "@fold/sdk";

const ctx = fold(); // That's it!

ctx.system("You are a coding assistant...");
ctx.think("I need to search for information...");
ctx.act({ tool: "search", query: "..." }, "search");
ctx.observe("Found 3 results...", "search");

// Get optimized messages for your LLM
const messages = ctx.messages();

// Check your savings
console.log(ctx.saved());
// { tokens: 5000, percent: 45, cost: 0.05 }
```

Massive Savings

Cut token costs by 50-78% without sacrificing task performance. Based on NeurIPS 2025 research.
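To see what that range means in dollars, here is a back-of-the-envelope calculation. This is a standalone sketch, not part of the Fold SDK: the `savings()` helper and the $3.00-per-million-token input price are illustrative assumptions (check your provider's current pricing).

```typescript
// Hypothetical input price; real provider pricing varies by model.
const PRICE_PER_MTOK = 3.0; // dollars per million input tokens

// Compute savings in the same shape Fold's saved() reports.
function savings(originalTokens: number, foldedTokens: number) {
  const saved = originalTokens - foldedTokens;
  return {
    tokens: saved,
    percent: Math.round((saved / originalTokens) * 100),
    cost: (saved * PRICE_PER_MTOK) / 1_000_000,
  };
}

// A 100k-token agent trace condensed to 30k tokens:
const s = savings(100_000, 30_000);
// → { tokens: 70000, percent: 70, cost: 0.21 }
```

At 50-78% reduction, a long-running agent that would otherwise resend hundreds of thousands of tokens per session saves a meaningful fraction of its bill on every call.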

Sub-millisecond

Zero API calls for compression. Masking happens locally in under 1ms.

Semantic Preservation

Not just truncation. Fold understands your context and preserves what matters.
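To give intuition for why masking can be both local and semantics-aware, here is a minimal sketch of the general technique: keep the system prompt and recent turns verbatim, and replace only stale tool observations with placeholders. The `Msg` type and `maskOldObservations()` helper are hypothetical illustrations, not Fold's actual algorithm or API; they just show why no API call is needed.

```typescript
// Minimal message shape for the sketch (not the SDK's real types).
type Msg = { role: "system" | "assistant" | "tool"; content: string };

// Mask tool observations older than the last `keepRecent` messages.
// Pure local array work: zero network calls, trivially sub-millisecond.
function maskOldObservations(history: Msg[], keepRecent: number): Msg[] {
  const cutoff = history.length - keepRecent;
  return history.map((m, i) =>
    m.role === "tool" && i < cutoff
      ? { role: m.role, content: "[elided tool output]" }
      : m
  );
}

const history: Msg[] = [
  { role: "system", content: "You are a coding assistant." },
  { role: "tool", content: "Found 3 results: a.ts, b.ts, c.ts" },
  { role: "tool", content: "File a.ts contains 120 lines" },
  { role: "assistant", content: "The answer is in a.ts." },
];

const masked = maskOldObservations(history, 2);
// System prompt and the last two turns survive untouched;
// only the stale first observation is replaced.
```

A real condenser decides *what* to elide semantically rather than by position alone, but the cost profile is the same: an in-memory rewrite of the message array.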

Drop-in Integration

Works with OpenAI, Anthropic, Vercel AI SDK, and LangChain out of the box.

| Framework | Integration |
| --- | --- |
| OpenAI SDK | `foldMessages()`, `wrapOpenAI()` |
| Anthropic SDK | `foldAnthropicMessages()` |
| Vercel AI SDK | `withFold()`, `useFold()` |
| LangChain/LangGraph | `ContextManagedMemory` |

Ready to cut your LLM costs?

Get started with Fold in under 5 minutes.
