Blog

Thoughts on web development, artificial intelligence and technology

RequestScopedPipeline: Concurrent Inference in Diffusers without Race Conditions or Memory Duplication

Diffusers pipelines weren't designed for concurrency: calling pipe() from several requests at once triggers race conditions in schedulers and 'Already borrowed' errors in Rust tokenizers, and working around them by copying the pipeline duplicates entire models in memory. My contribution (#12328) introduces RequestScopedPipeline, which solves these issues by creating lightweight per-request views, cloning only the small mutable components, and adding automatic locks to tokenizers. The result: a server that handles multiple concurrent users without exhausting GPU memory.
diffusers, ai, server-async, +3
Sep 30, 2025 · 9 min

lambda-gateway: Building a Serverless Host Demo

Have you ever wondered how platforms like Vercel and AWS Lambda work under the hood? I built lambda-gateway, a serverless hosting demo using Docker, FastAPI, and Next.js to discover how serverless architectures work through hands-on experimentation.
lambda, serverless, docker, +4
15 min

Fine-tuning Asclepio-8B and Qwen2.5-VL-3B: Medical Reasoning and Screenshot-to-Code with LoRA

Training specialized models doesn't require out-of-reach GPUs or weeks of compute. Asclepio-8B learns clinical medical reasoning from 1.3M examples in 6.7 hours, reaching 76.9% accuracy. Qwen2.5-VL-3B converts UI screenshots into functional HTML/CSS with 94.6% accuracy after 5.5 hours of training. Both were trained on an L4 24GB GPU using LoRA, demonstrating that specialized fine-tuning of smaller models can outperform giant general-purpose models. This post documents the configurations, data pipelines, and real metrics, and explains why fast iteration matters more than raw parameter count.
fine-tuning, LoRA, medical-ai, +6
Oct 19, 2025 · 16 min

My First Blog Post

Welcome to my blog! This is my first article, where I'll share my experiences as a full-stack developer. In this space I'll be writing about web development, artificial intelligence...
web development, next.js, react, +1
Sep 28, 2025 · 1 min