The Routing Brain for AI engineering teams

The missing decision layer between developers and AI models.

FanOn continuously determines where AI work should run across local and cloud models, balancing cost, privacy, latency, and model suitability with transparent, policy-based decisions today.

Policy-based decisions today Local-first fallback Cost visibility Explainable by design
Developer Cursor / Continue / Cline same workflow
FanOn Routing Brain Execution decision policy, affinity, fallback, visibility
Local models Team capacity routine work stays close
Cloud models Provider fallback hard work can escalate

Problem

Every team adopted AI. Almost nobody got a control plane.

The first wave was obvious: give engineers better coding assistants. The second wave is messier. Usage spreads across providers, local models become good enough for real work, and every team starts asking the same decision question: where should this request run, and why?

01 Developers bring AI into the IDE.

Adoption happens because the workflow is already there.

02 Provider usage sprawls.

Costs, keys, policies, and fallback behavior spread across tools.

03 Local models become viable.

Laptops, workstations, and shared GPUs can handle more routine work.

04 The missing brain becomes obvious.

Teams need a decision layer, not another chat surface.

Why Existing Approaches Fail

The current answer is usually more tools, more keys, and more policy drift.

Teams try to solve the problem inside each client, each provider account, or each local model setup. That creates more places to configure, explain, audit, and debug.

Provider-only policies become a tax.

They are easy to start with, but every routine request still leaves the team.

Local-only rollouts break the workflow.

Engineers will not adopt a new path just because the infrastructure is cheaper.

Per-tool configuration does not scale.

Every IDE, agent, and script becomes another policy surface.

Routing Brain

Every AI request is a tradeoff. Most teams still make those tradeoffs by hand.

FanOn is designed to continuously evaluate execution tradeoffs and make the decision observable. Today that means explainable policies, local-first fallback, session affinity, cost estimation, and transparent route history. Future routing should become more adaptive without becoming opaque.

Cost Latency Privacy Quality Availability
FanOn Routing Brain explainable decision logic
Execution Decision local worker, provider fallback, or affinity target with a route reason the team can inspect

FanOn Solution

Give engineering teams a better way to make AI execution decisions.

FanOn is an OpenAI-compatible control plane for AI coding traffic. Existing tools point to FanOn. FanOn makes explainable decisions about when to use local capacity, when to fall back to a provider, when to preserve session affinity, and what aggregate signals the team should see.

Keep the developer workflow

Cursor, Continue, Cline, and other OpenAI-compatible clients can keep their shape.

Use local capacity first

Routine work can run on local or team-owned workers when available.

Escalate without drama

Provider fallback remains explicit for harder requests or unavailable local targets.

Explain why each path was chosen

Teams see route reasons, local/provider split, avoided spend, fallback activity, and topology.

How FanOn Thinks Today

The Routing Brain starts explainable before it becomes adaptive.

FanOn is not claiming sophisticated ML routing today. The current product direction is to make practical decisions visible, reliable, and safe, then evolve toward better optimization with the same explainability bar.

Today
  • Explainable policy-based decisions
  • Local-first provider fallback
  • Session affinity
  • Cost estimation
  • Transparent routing history
Future direction
  • Adaptive routing
  • Outcome-based optimization
  • Organization-specific routing policies
  • Learning from historical outcomes
  • Human-readable explanations for smarter decisions

Real Workflow

A request should get an execution decision without the engineer thinking about it.

1 An engineer asks for help in their IDE.

No new app. No new habit. The assistant still feels like the assistant.

2 The Routing Brain checks policy, affinity, and available workers.

The team can prefer local execution while keeping provider fallback available.

3 The request receives an execution decision.

Local when it is a fit. Provider when the request or availability calls for it.

4 Managers see aggregate value, not private conversations.

The useful story is routing, cost, latency, and reliability, not prompt content.

Trust Model

A Routing Brain has to be explainable to be trusted.

FanOn is intentionally shaped around aggregate infrastructure signals. The product direction avoids prompt inspection, employee scoring, and manager visibility into individual conversations. Engineers should be able to inspect what FanOn knows and why an execution decision was made.

Read the trust model

Supporting Proof

The dashboard exists to make execution decisions observable.

Example pilot dashboard Illustrative metrics from a 7-day local/dev pilot shape. The important questions are: why did this request run locally, why did it escalate, and what cost was avoided?

Value Overview 7 days
68% worker/local
32% provider
Estimated spend avoided $418
Estimated savings 41%

Value overview

Shows which decisions stayed local and the estimated provider spend avoided.

Routing Activity latest
local mock-code worker-a
local qwen2.5-coder worker-b
fallback gpt-4o-mini local unavailable
local mock-code worker-a

Routing activity

Shows selected targets, fallback reasons, and recent execution decisions.

Topology local/dev
FanOn API
worker-a
healthy
worker-b
healthy
provider
explicit fallback

Topology

Shows live workers, provider fallback, and the local/dev control plane.

Pilot Program

Help shape the layer that should already exist.

FanOn is a local/dev MVP for pilots and design partner discovery. The goal is to learn with teams already feeling AI provider sprawl, local model experimentation, and the need for a clearer AI execution decision story.

2-4 week pilot window Existing IDE workflow first Local workers plus optional provider fallback Transparent decisions and trust boundaries

Design Partners

Join the FanOn Design Partner Program

We are looking for engineering managers, staff engineers, platform teams, and AI infrastructure teams who think AI execution needs a real decision layer and want to help shape it.

Join the Design Partner Program

Takes about 2 minutes. We are looking for conversations and feedback, not a sales process.