Daily Technology
01/04/2026
Ollama, a popular runtime for running large language models locally, has significantly improved its performance on Macs with Apple Silicon. The latest update adds support for Apple's open-source MLX framework, promising faster execution of AI models directly on users' devices. The development arrives as local AI models gain traction beyond niche communities, driven by demand for alternatives to costly cloud-based services.
The integration of MLX into Ollama is a major step forward for Mac users who run AI models locally. MLX is Apple's machine learning framework for Apple Silicon; it is built around the chips' unified memory and GPU, enabling faster computation and more efficient use of hardware resources. In practice, that means tasks like code generation, text summarization, and other AI-driven operations run with greater speed and responsiveness on compatible Macs.
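For a sense of what MLX looks like in practice, here is a minimal Python sketch of the framework's array API, not Ollama's internal code; it assumes the mlx package is installed on an Apple Silicon Mac.

```python
# Minimal MLX sketch (assumes `pip install mlx` on an Apple Silicon Mac).
# MLX arrays live in unified memory, so the same buffer is visible to the
# CPU and GPU without explicit copies; computation is lazy until evaluated.
import mlx.core as mx

# Two random matrices and a matrix multiply, the core operation behind
# transformer inference.
a = mx.random.normal((1024, 1024))
b = mx.random.normal((1024, 1024))

c = a @ b    # recorded lazily, not yet computed
mx.eval(c)   # forces evaluation on the default device (the GPU)

print(c.shape, c.dtype)
```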
The timing of Ollama's update is particularly relevant. The surge in popularity of open-source models, exemplified by projects like OpenClaw, has spurred widespread experimentation with running AI models on personal computers. Developers are increasingly seeking local solutions to bypass the rate limits and subscription costs associated with cloud-based AI services such as ChatGPT Codex and Claude Code. Ollama's improvements directly address this growing need for accessible and cost-effective local AI capabilities.
The new MLX support is currently available in a preview version of Ollama (0.19). While it promises significant performance gains, it comes with specific hardware requirements: a Mac with an Apple Silicon chip (M1 or later) and at least 32 GB of RAM. For now, MLX support is limited to Alibaba's Qwen3.5 model in its 35 billion-parameter variant. Beyond MLX, Ollama has also improved its caching mechanisms and added support for Nvidia's NVFP4 format, further improving memory efficiency for certain compressed models.
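For developers who want to experiment once the preview is installed, the workflow from code stays the same as with any other Ollama model. The sketch below uses the official ollama Python client; the model tag is an assumption and should be replaced with whatever name the preview build actually exposes.

```python
# Hypothetical sketch: querying a locally served model through the official
# `ollama` Python client (pip install ollama). The model tag below is an
# assumption; check `ollama list` for the exact name of the MLX-backed build.
import ollama

MODEL = "qwen3.5:35b"  # placeholder tag, adjust to the name the preview ships

response = ollama.chat(
    model=MODEL,
    messages=[
        {
            "role": "user",
            "content": "Summarize the difference between MLX and Metal in two sentences.",
        },
    ],
)

print(response["message"]["content"])
```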