Local Works - But Not Well Enough
I keep occasionally dipping my toes into the local AI for coding pool. It gets better and better - but TLDR; - nowhere near good enough for me right now.
This last time was best yet. I have Codex at my finger tips and after discussing the ins and outs of local on Apple silicon Macs we selected the following:
Pi as the agentic coding harness. Think “Claude Code” - open source and minimal. I like it for when I’m using llm’s via api. Works just as well for local llms.
My machine is an M5 Air with 32gig RAM. I tried to find a solution that would use both the NPU and GPU, alas, that solution hasn’t come online yet. I went wit MLX as that’s Apple’s optimized platform.
mlx-community/Qwen3-Coder-30B-A3B-Instruct-4bit
Qwen Coder is one of the best local coding llm’s but 4bit just takes out too much of the intelligence.
Fun thing, with Codex’s abilities of late, after we discussed through the options and what was best I could say “install all the shizzel” and it installed all the software and models and configured Pi to use the local models.
The performance time wise wasn’t too bad. Nothing remotely fast as using Claude Code or Codex with their models, but not so slow for something you can start and walk away from and let it cook.
The result was just not good enough. It was actually bad. Rewrote my markdown files (product definition, architecture, project plan) and degraded them.
I could see myself using this to reformat files or other light intelligence work where I didn’t want to expose those files to “the cloud”.
But for coding? No, not yet.
My system BARELY ran this. The gpu’s were at maximum and it used all the ram and 20gig more in swap.
I might try one of the smaller Gemma models sometime later. For now, I simply don’t have the hardware needed to run local models for coding.
I have had better luck with local text to speech. audiblez does a decent enough job to create my audio books that I intentionally designed to have 30 minute chapters.
I tried MisoTTS where the company touted it’s the best thing in voice and ran locally. Locally if you have an Nvidia H100 server locally. I was able to install it and run….but it could not complete. Not with the M5 with 35gig ram, not with an M5 Pro with 24gig ram.
You can listen to my book here: https://nginx.leebasehome.com/rogue-ai-v2/