Run two local AI runtimes behind a single secure reverse proxy with separate authentication. Running Ollama or Swama locally is straightforward: start the server and connect via localhost. But if...
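A minimal sketch of what that split might look like, assuming nginx with HTTP basic auth in front of both runtimes, Ollama on its default port 11434, and Swama on a hypothetical port 28100 (the hostname, certificate paths, and Swama port are placeholders to adjust for your setup):

```nginx
# Each location block gets its own htpasswd file, so the two
# runtimes are protected by separate credentials.
server {
    listen 443 ssl;
    server_name ai.example.com;                        # assumed hostname
    ssl_certificate     /etc/nginx/certs/ai.pem;       # placeholder paths
    ssl_certificate_key /etc/nginx/certs/ai.key;

    location /ollama/ {
        auth_basic "Ollama";
        auth_basic_user_file /etc/nginx/htpasswd-ollama;
        proxy_pass http://127.0.0.1:11434/;            # Ollama default port
    }

    location /swama/ {
        auth_basic "Swama";
        auth_basic_user_file /etc/nginx/htpasswd-swama;
        proxy_pass http://127.0.0.1:28100/;            # hypothetical Swama port
    }
}
```

Keeping one `server` block with two `location` blocks means a single TLS certificate covers both runtimes while the credentials stay independent.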
The reason is simple: because you can, and now even faster. If you have an Apple Silicon Mac (M1 or later) with 16GB of RAM or more, you can run powerful LLMs...
Unlock 2-3x faster AI on Apple Silicon! This post explores optimizing local models with Ollama and MLX to boost inference performance for demanding applications.
The reason is simple: because you can. Macs with an M1 chip or later and 16GB of RAM or more are powerful enough to run useful models on-device without sending any data...
Running Ollama locally is straightforward: the server listens on http://localhost:11434, and you can connect to it from your Python scripts or other tools. If you plan to expose your machine to the internet, though, you should add basic protections...
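A minimal sketch of that localhost connection from Python, using only the standard library and Ollama's `/api/generate` endpoint (the model name `llama3` is an assumption; use whatever model you have pulled):

```python
import json
from urllib import request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local endpoint


def build_generate_request(model: str, prompt: str) -> request.Request:
    """Build a POST request for Ollama's /api/generate endpoint."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode()
    return request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_generate_request("llama3", "Why is the sky blue?")  # model name assumed
print(req.full_url)  # http://localhost:11434/api/generate

# To actually send it (requires a running Ollama server):
# with request.urlopen(req) as resp:
#     print(json.loads(resp.read())["response"])
```

With `"stream": False` the server returns one JSON object containing the full completion instead of a stream of partial chunks.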
Lightweight, private, and customizable retrieval-augmented chatbot running entirely on your Mac. Based on the excellent work by pruthvirajcyn and his Medium article.

⚙️ About This Project

This is my personal...