Run two local AI runtimes behind a single secure reverse proxy with separate authentication
Running Ollama or Swama locally is straightforward: start the server and connect via localhost. But if…
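To make "separate authentication" concrete, here is a minimal Caddyfile sketch in the spirit of the Caddy guide further down this list: both runtimes sit behind the same host, but each is gated by its own bearer token. The domain, the tokens, and the Swama port are placeholders (11434 is Ollama's default); this is an illustration, not the post's exact config.

```
ai.example.com {
	# Requests presenting the Ollama token are proxied to Ollama (default port 11434)
	@ollama header Authorization "Bearer OLLAMA_TOKEN_PLACEHOLDER"
	handle @ollama {
		reverse_proxy localhost:11434
	}

	# Requests presenting the Swama token are proxied to Swama
	# (the port is a placeholder; use whatever port your Swama server listens on)
	@swama header Authorization "Bearer SWAMA_TOKEN_PLACEHOLDER"
	handle @swama {
		reverse_proxy localhost:28100
	}

	# Anything else is rejected
	handle {
		respond "Unauthorized" 401
	}
}
```

Because each token routes to exactly one backend, rotating or revoking one runtime's credentials never touches the other.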
Swama vs Ollama: Why Apple Silicon Macs Deserve a Faster Local AI Runtime
The reason is simple: because you can — and even faster now. If you have an Apple Silicon Mac (M1 or later) with 16GB of RAM or more, you can run powerful LLMs…
From Ollama to MLX: Achieving 2-3x Performance on Apple Silicon
This post walks through moving local models from Ollama to Apple's MLX framework, and the 2-3x speedups that unlocks on Apple Silicon for demanding applications.
How and why to run Ollama on your own Mac?
The reason is simple: because you can. M1 or later Macs with 16GB of RAM or more are powerful enough to run useful models on device without sending any data…
Securely Expose Your Ollama Server: A macOS Guide with Caddy and Bearer Tokens
Running Ollama locally is straightforward: the server listens on http://localhost:11434, and you connect to it from your Python scripts or other tools. If you plan to expose your machine to the internet, though, you should add basic protections…
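As a sketch of what the client side looks like once the server sits behind such a proxy, a Python script only needs to add an Authorization header to the usual Ollama API call. The domain, token, and model name below are placeholders; /api/generate is Ollama's standard generate endpoint.

```python
import requests

# Placeholders: substitute your own domain and the bearer token configured in the proxy.
BASE_URL = "https://ai.example.com"
TOKEN = "YOUR_BEARER_TOKEN"

resp = requests.post(
    f"{BASE_URL}/api/generate",                    # Ollama's generate endpoint
    headers={"Authorization": f"Bearer {TOKEN}"},  # checked by the reverse proxy, not by Ollama
    json={
        "model": "llama3.2",         # any model you have pulled locally
        "prompt": "Why is the sky blue?",
        "stream": False,             # return one JSON object instead of a token stream
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])
```

Against the local server the same call works with BASE_URL = "http://localhost:11434" and no Authorization header at all, which is exactly why the token check belongs at the proxy once the port is reachable from outside.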
🧠 Local RAG Chatbot with Ollama on Mac
Lightweight, private, and customizable retrieval-augmented chatbot running entirely on your Mac. Based on the excellent work by pruthvirajcyn and his Medium article. ⚙️ About This Project: This is my personal…