One line to production‑ready agents.
Anchor is the stateful runtime layer that adds persistent sessions, exact replay, and NVIDIA acceleration to any agent — without changing your code.
from openai import OpenAI

# Before
client = OpenAI()

# After — one line change
client = OpenAI(
    base_url="https://anchor.maximlabs.co/v1"
)

Everything your agents need in production.
Zero-code infrastructure that works with any framework — LangChain, CrewAI, AutoGen, LangGraph, or your own custom agents.
Persistent Sessions
Agents that remember
Stateful agent sessions backed by Redis Streams. Full context across every step — no more mid-task context loss.
Replay Engine
Debug any failure in seconds
Re-execute any past agent run exactly as it happened. Swap models, test alternatives — all on real historical data.
Simulate Mode
Test without cost
Run shadow copies of any session with zero tool costs. A/B test prompts, models, and routing against production data.
Hybrid Routing
Right model, right time
Automatically route each step to the optimal endpoint. Cheap calls go public, complex reasoning uses your NVIDIA NIM.
Observability Dashboard
Full visibility, zero instrumentation
Execution graphs, cost forecasting, token budgets, and GPU ROI — all via OpenTelemetry. No SDK changes required.
Anomaly Detection
Catch loops before they cost you
Automatic loop detection, cost spike alerts, and lightweight NIM-powered root cause analysis for enterprise governance.
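The hybrid-routing behavior described above can be sketched as a simple predicate. A minimal illustration, assuming example endpoints and a token threshold that are not Anchor's actual policy:

```python
# Illustrative sketch of hybrid routing: send each step to the cheapest
# endpoint that can handle it. URLs and the threshold are assumptions.
PUBLIC_ENDPOINT = "https://api.openai.com/v1"
NIM_ENDPOINT = "https://nim.internal.example/v1"

def route_step(prompt_tokens: int, needs_deep_reasoning: bool) -> str:
    # Long-context or reasoning-heavy steps go to the self-hosted NIM;
    # everything else goes to the cheaper public endpoint.
    if needs_deep_reasoning or prompt_tokens > 8_000:
        return NIM_ENDPOINT
    return PUBLIC_ENDPOINT
```

In practice the routing decision happens inside the proxy, per step, with no change to the calling code.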
Watch Anchor handle a real failure.
An agent run fails mid-task. Anchor has preserved every step. One click replays it — with NVIDIA NIM — at zero extra cost.
Simulate a production agent run
State is always preserved
Every step is written to durable storage before the next begins. Failures never lose context.
Replay at zero tool cost
Stored tool responses are replayed — no re-calling external APIs. Only the LLM step re-runs.
NVIDIA NIM on complex steps
Hybrid routing upgrades to NIM automatically for steps that need longer context or stronger reasoning.
How Anchor works
Anchor sits between your agent and the LLM provider. Every call is intercepted, enriched, traced, and persisted for replay.
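Because Anchor speaks the OpenAI wire protocol, a request through it is an ordinary chat-completions call aimed at a different base URL. A minimal sketch of what such a request looks like, where the `X-Anchor-Session` header and the default model name are illustrative assumptions rather than documented Anchor fields:

```python
# Minimal sketch of a request routed through Anchor. The payload is a
# standard OpenAI chat-completions body; only the URL changes.
ANCHOR_BASE_URL = "https://anchor.maximlabs.co/v1"

def build_request(messages, model="gpt-4o-mini", session_id=None):
    # `X-Anchor-Session` is a hypothetical header used here only to
    # illustrate pinning a call to a session; it is an assumption.
    headers = {"Authorization": "Bearer <YOUR_API_KEY>"}
    if session_id is not None:
        headers["X-Anchor-Session"] = session_id
    return {
        "url": f"{ANCHOR_BASE_URL}/chat/completions",
        "headers": headers,
        "json": {"model": model, "messages": messages},
    }

req = build_request([{"role": "user", "content": "ping"}], session_id="run-7f3a")
```

Everything the proxy adds — tracing, persistence, routing — happens server-side, which is why no SDK changes are needed.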
Frequently asked questions.
Do I need to change my agent code?
No. Anchor is an OpenAI-compatible API proxy. You change one line — your base_url — and everything else stays the same. No SDKs, no wrappers, no lock-in.
Which model providers does Anchor support?
Anchor works with any OpenAI-compatible provider: OpenAI, Anthropic (via adapter), NVIDIA NIM, Azure OpenAI, Groq, Together, and more. We route through LiteLLM, so if LiteLLM supports it, we do too.
Does my data pass through Anchor's servers?
Yes. Anchor acts as a proxy — your requests pass through our infrastructure for session management, tracing, and replay. All data is encrypted in transit and at rest. Self-hosted deployments for enterprise customers who require full data sovereignty are on our roadmap.
How do persistent sessions work?
Every agent session is backed by Redis Streams. We store the full context — messages, tool calls, model responses — so your agent can resume exactly where it left off, even across process restarts.
What exactly does replay do?
Replay lets you re-execute any past agent run with the exact same inputs, but with a different model, prompt, or configuration. It's like git bisect for AI agents — find exactly where things went wrong.
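As a sketch of what a replay call could look like — the `/sessions/{id}/replay` endpoint, its fields, and the model name below are assumptions for illustration, not a published Anchor API:

```python
# Hypothetical sketch: replay a stored run with a different model.
# Endpoint path and field names are illustrative assumptions.
def build_replay_request(session_id, model_override=None, simulate=True):
    body = {"simulate": simulate}  # True → tool responses come from storage
    if model_override is not None:
        body["model"] = model_override  # e.g. swap in a NIM-hosted model
    return {
        "url": f"https://anchor.maximlabs.co/v1/sessions/{session_id}/replay",
        "json": body,
    }

replay = build_replay_request("run-7f3a", model_override="nim-llama-3.1-70b")
```

The key property is that stored tool responses are served from storage, so only the LLM step re-runs — which is what makes replays effectively free.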
How is Anchor different from agent observability platforms?
Observability platforms watch your agents. Anchor is a runtime layer — it actively manages sessions, routes models, detects loops, and optimizes inference. We're infrastructure, not monitoring.
Will there be a free tier?
Yes. We're committed to a generous free tier for individual developers and small teams. Pricing details will be announced at launch. Join the waitlist to be the first to know.
Can I self-host Anchor?
Enterprise self-hosted deployments are on our roadmap. Our initial launch will be a managed cloud service. Contact us if you need on-premise deployment.
Be the first to deploy.
Sign up for early access. We'll reach out when Anchor launches — and you'll be first in line for API keys.
No spam. Unsubscribe anytime.