Local AI.
Unleashed with MLX.

The open-source, native macOS app for running LLMs locally. Maximum performance, complete privacy.

Download for macOS · View Source

Open Source (MIT) • v1.0 • Apple Silicon Optimized

[Screenshot: Generative Feedback app interface]

Blazing-Fast Inference

Built directly on Apple's MLX framework to deliver industry-leading tokens-per-second on M-series chips (a quick sketch below shows the idea).

Native Feel

Built with SwiftUI for a truly Mac-like experience: smooth animations, frosted-glass materials, and deep system integration.

Open Source

Transparency is key. Inspect the code, contribute features, and run models knowing exactly what ships in the binary.
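Under the hood, MLX does the heavy lifting. As a rough illustration of the kind of inference the app builds on, here is a minimal sketch using Apple's mlx-lm Python package; the app itself is Swift/SwiftUI, and the model name is just an example from the mlx-community Hugging Face org:

```python
# Minimal MLX text-generation sketch (pip install mlx-lm).
# Illustrative only: the app is a native Swift/SwiftUI client,
# but it builds on the same MLX framework shown here.
from mlx_lm import load, generate

# Example model: any MLX-converted model from mlx-community works.
model, tokenizer = load("mlx-community/Meta-Llama-3-8B-Instruct-4bit")

prompt = "Explain unified memory on Apple Silicon in one sentence."
response = generate(model, tokenizer, prompt=prompt, max_tokens=128)
print(response)
```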

Benchmarks. Off the charts.

Leveraging Apple Silicon's GPU and unified memory through MLX to deliver token generation speeds that leave other runtimes in the dust.

Generative Feedback (MLX) 92 tok/s
Standard Runtime 41 tok/s

Llama-3-8B-Instruct on a MacBook Pro (M3 Max)
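Your numbers will vary with chip, model, and quantization, so treat the chart as indicative. One way to sanity-check tokens/sec on your own machine is mlx-lm's verbose mode, which prints prompt and generation speeds after each run (model name again illustrative):

```python
# Rough tok/s check with mlx-lm: verbose=True prints prompt and
# generation tokens-per-sec to stdout after the run completes.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Meta-Llama-3-8B-Instruct-4bit")
generate(
    model,
    tokenizer,
    prompt="Write a haiku about unified memory.",
    max_tokens=256,
    verbose=True,  # prints speed stats
)
```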

Run the best models.

🦙 Llama 3
🤖 Mistral
🧠 Gemma
🦅 Falcon
🔮 Phi-3
⭐ StarCoder