Hi, I'm Fredy Rivera, a Full-Stack Developer and AI Developer!

Tech Stack

React
Tailwind CSS
PostgreSQL
Vercel
GitHub
Git
Python
FastAPI
Redis
OpenAI
Anthropic
Stripe
PayPal
Supabase
DigitalOcean
AWS (EC2)

My Projects

Things I've built

Web budgeting application

A budget management platform for construction companies

Python, FastAPI, Tailwind CSS, HTML, PostgreSQL, Supabase, Vercel

Aquiles-RAG: High-Performance Vector Search System

Production-ready Retrieval-Augmented Generation (RAG) solution with multiple vector database support

Python, FastAPI, Redis, Qdrant, PostgreSQL, RAG, Vector Search, HNSW, AI
GitHub

AtlasServer-Core: Self-Hosted Application Server

Fast deploy. No cloud. Just code. A self-hosted server platform with AI-powered deployment capabilities

Python, Flask, FastAPI, Django, CLI, Self-Hosted, AI, Ollama, DevOps, Ngrok
GitHub

LLaDA-from-scratch: Diffusion Language Model Implementation

Building LLaDA, a diffusion-based language model that learns text distribution through progressive masking and reconstruction

Python, PyTorch, HuggingFace, Diffusion Models, NLP, Deep Learning, A100, Research
GitHub

Latest Articles

Recent thoughts and insights

RequestScopedPipeline: Concurrent Inference in Diffusers without Race Conditions or Memory Duplication

Diffusers pipelines weren't designed for concurrency: calling pipe() from several requests at once can cause race conditions in schedulers, trigger 'Already borrowed' errors in Rust tokenizers, or end up duplicating entire models in memory. My contribution (#12328) introduces RequestScopedPipeline, which solves these issues by creating lightweight per-request views, cloning only the small mutable components, and adding automatic locks to tokenizers. The result: a server that handles multiple concurrent users without exploding GPU memory.

diffusers, ai, server-async, +3 more
September 30, 2025
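
The per-request view idea from the RequestScopedPipeline article can be illustrated with a small standard-library sketch. This is not the code from #12328 and it does not use the Diffusers API: every class and function name here (HeavyModel, Scheduler, LockedTokenizer, Pipeline, request_view) is a hypothetical stand-in. It assumes the pattern is roughly "share the large read-only parts, copy the small stateful parts, and serialize access to the tokenizer".

```python
import copy
import threading
from concurrent.futures import ThreadPoolExecutor


class HeavyModel:
    """Hypothetical stand-in for large read-only weights (shared, never copied)."""
    def __init__(self):
        self.weights = list(range(1000))


class Scheduler:
    """Hypothetical stand-in for a small component that mutates state per call."""
    def __init__(self):
        self.step_index = 0

    def step(self):
        self.step_index += 1
        return self.step_index


class LockedTokenizer:
    """Wraps a tokenizer callable so only one thread uses it at a time."""
    def __init__(self, tokenizer):
        self._tokenizer = tokenizer
        self._lock = threading.Lock()

    def __call__(self, text):
        with self._lock:
            return self._tokenizer(text)


class Pipeline:
    """Hypothetical pipeline: tokenizes a prompt, then runs scheduler steps."""
    def __init__(self, model, scheduler, tokenizer):
        self.model = model
        self.scheduler = scheduler
        self.tokenizer = tokenizer

    def __call__(self, prompt, steps=3):
        tokens = self.tokenizer(prompt)
        last_step = 0
        for _ in range(steps):
            last_step = self.scheduler.step()
        return len(tokens), last_step


def request_view(base):
    """Shallow-copy the pipeline, then deep-copy only the mutable scheduler."""
    view = copy.copy(base)                          # model and tokenizer stay shared
    view.scheduler = copy.deepcopy(base.scheduler)  # private state per request
    return view


base = Pipeline(HeavyModel(), Scheduler(), LockedTokenizer(str.split))


def handle(prompt):
    # Each request gets its own lightweight view of the shared pipeline.
    return request_view(base)(prompt)


with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(handle, ["a b", "c d e", "f", "g h i j"]))
```

Each request steps its own scheduler copy, so the base scheduler is never mutated, while the model object is the same instance in every view. A real server would also need to decide which other components count as mutable; that choice is outside this sketch.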

My First Blog Post

Welcome to my blog! This is my first article where I'm going to share my experiences as a Full-Stack developer. In this space I'll be writing about web development, artificial intelligence...

web development, next.js, react, +1 more
September 28, 2025