Bryan Ramos

Software & Systems Engineer

I work in systems programming, real-time computing, and simulation engineering. My background spans Linux, low-level development in C and Python, and debugging latency-critical systems where determinism and microseconds count. I am currently exploring applied AI and LLM integration.

Reach me at bryan@ramos.codes

Stack & Tools

C, Python, Bash, Linux, Real-Time Systems, Simulation Engineering, AI/LLM Integration, Automation, Networking, Docker, KVM/QEMU, Nix, SQL

Recent Posts

Experimenting With TurboQuant and MoE Caching

Some notes from maintaining a TurboQuant llama.cpp fork and testing whether keeping hot MoE experts on the GPU could make a huge local model practical.

Building a Local AI Rig

The machine I built for local AI work, why I built it, and why good-enough hardware changes how I use these tools.

Why I'm Moving More AI Work Off the Cloud

Cloud AI is useful, but I want more of my day-to-day AI workflow on infrastructure I control.

View all posts →