Costs scale with every developer
IDE assistants make teams faster, but cloud-model usage can become a growing recurring infrastructure cost.
Pilot program for engineering teams
FanOn routes requests between local models and cloud models, keeping simple work local and escalating harder work when needed.
Example dashboard values for design direction. Real metrics depend on pilot traffic.
Problem
IDE assistants make teams faster, but cloud-model usage can become a growing recurring infrastructure cost.
Engineers should not need to leave Cursor, Continue, Cline, or other tools just because the organization wants better routing.
Laptops, workstations, and shared machines can handle many routine AI tasks, but teams need a transparent routing layer to use them well.
How FanOn Works
FanOn exposes an OpenAI-compatible endpoint. Existing clients point at FanOn, and FanOn routes each request through local workers first, with explicit provider fallback when configured.
Use OpenAI-compatible settings from IDE tools and local playgrounds.
Routine, private, or low-cost work can stay on local or team-owned machines.
Provider fallback remains available for harder requests or unavailable local targets.
Benefits
Track local versus provider execution and estimated avoided spend.
Start with OpenAI-compatible IDE workflows instead of a new app mandate.
Inspect route history, fallback reasons, latency, and target selection.
FanOn is designed around no prompt storage by default and local-first execution.
Privacy & Trust
FanOn focuses on aggregate routing, cost, latency, and reliability signals. The product direction explicitly avoids prompt inspection, productivity scoring, and manager visibility into individual conversations.
Pilot Dashboard
Example pilot dashboard Illustrative metrics from a 7-day local/dev pilot shape. Real numbers depend on team traffic, configured workers, models, and provider fallback policy.
Shows the local/provider split and the estimated provider spend avoided.
Shows selected targets, fallback reasons, and recent routing decisions.
Shows live workers, provider fallback, and the local/dev control plane.
Pilot Program
A FanOn pilot tests whether local-first routing can reduce provider usage while keeping latency, reliability, and developer satisfaction within acceptable bounds.
FAQ
No. FanOn is currently a local/dev MVP for pilots and design partner discovery.
No prompt storage by default is a core trust principle. Current metrics avoid prompt and message content.
That is the intended near-term wedge: OpenAI-compatible tools should point to FanOn with minimal workflow changes.
No. FanOn is a routing and optimization layer. Premium provider models remain useful for harder requests and fallback.
Join Pilot
Tell us about your team, AI tooling, and cost pressure. The current pilot focus is engineering organizations with active AI coding assistant usage.