lambda-gateway: Building Serverless Hosting from Scratch
Before building our own serverless hosting, we first need to understand what serverless hosting (or a serverless platform) actually is: a cloud service that lets you run code, frontends, and modern web applications, as Vercel and AWS Lambda do. All of this code executes in response to specific events, such as an API request or someone accessing a SaaS frontend.
When multiple events occur simultaneously, the platform automatically creates more instances of the function to handle them (horizontal scaling). Once a function finishes executing, its environment shuts down automatically, so it consumes no resources after completing its task.
There are also platforms that support concurrency within a function and/or across events, where a single instance handles multiple simultaneous calls. This makes better use of idle time and can reduce costs by up to 50%.
Main Features
- Auto-scaling: Your code automatically scales up or down based on demand.
- Pay-per-use: You only pay for actual execution time, not for idle servers.
- No infrastructure management: Lambda manages all the infrastructure to run your code in a highly available and fault-tolerant environment, freeing you to focus on building differentiated backend services.
- Event-driven: Functions execute in response to events (HTTP requests, file uploads, database changes, etc.).
From all of this, we can define the minimum feature set for our serverless hosting:
- Event-driven activation: Execute code in response to HTTP requests.
- Event concurrency: Handle multiple simultaneous requests efficiently.
- Frontend hosting: Serve static and dynamic web applications.
- Deployment interface: A system for generating the builds needed for deployment.
To handle all the build and invocation logic, I decided to create a class called `BuildandRunLambda` that communicates with Docker internally using the Python SDK. When initializing the class, we check whether it's for build or invocation. If it's for build, we validate that a `.dockerignore` exists; if not, we create it automatically, and we get the environment variables to connect to Docker.
```python
import docker
from pathlib import Path


class BuildandRunLambda:
    def __init__(self, project_path: str | None = None):
        self.client = docker.from_env()
        self.project_path = Path(project_path) if project_path is not None else None
        self._invoke_only = project_path is None

        # Verify that a .dockerignore exists.
        # If it doesn't, we create one so builds aren't too heavy.
        if self.project_path is not None:
            self._ensure_dockerignore()

    @classmethod
    def for_invoke(cls):
        return cls(project_path=None)
```
Helper Methods
Before getting into the main methods, we need some helpers to make everything work correctly.
Stopping containers: The `stop_and_collect(...)` method handles stopping active containers and, if `remove_after` is True, removes them afterward.
```python
def stop_and_collect(self, container, timeout: int = 10, remove_after: bool = True):
    try:
        if container.status == 'running':
            container.stop(timeout=timeout)
        result = container.wait(timeout=timeout)
        logs = container.logs(stdout=True, stderr=True)
        exit_code = result.get('StatusCode', None) if isinstance(result, dict) else None
        if remove_after:
            try:
                container.remove()
            except Exception:
                pass
        return {
            "exit_code": exit_code,
            "logs": logs.decode() if isinstance(logs, bytes) else str(logs)
        }
    except Exception as e:
        try:
            container.remove(force=True)
        except Exception:
            pass
        return {"error": str(e)}
```
Handling Dockerfiles: The following methods get the correct `Dockerfile` and environment variables for the framework. If the `Dockerfile` doesn't exist, they create it automatically (we support Next.js and Vite).
```python
def _get_dockerfile(self, framework: str) -> str:
    dockerfiles = {
        "nextjs": "Dockerfile.nextjs",
        "vite": "Dockerfile.vite",
        "react": "Dockerfile.vite"
    }
    return dockerfiles.get(framework, "Dockerfile")

def _get_runtime_env(self, framework: str, env_vars: dict) -> dict:
    if framework == "nextjs":
        return {
            k: v for k, v in env_vars.items()
            if not k.startswith("NEXT_PUBLIC_")
        }
    return {}

def create_dockerfile(self, framework: str):
    if self.project_path is None:
        raise RuntimeError("You need project_path to create Dockerfiles")
    dockerfile_path = self.project_path / self._get_dockerfile(framework)
    if dockerfile_path.exists():
        print(f"{dockerfile_path.name} already exists")
        return
    content = self._get_dockerfile_content(framework)
    dockerfile_path.write_text(content)
    if framework in ["vite", "react"]:
        self._create_nginx_conf()
```
Generating .dockerignore: To keep builds from being too heavy, we create a `.dockerignore` automatically if it doesn't exist.
```python
def gen_dockerignore(self):
    # Generate a generic .dockerignore for JS and TS frameworks
    file = """
node_modules
.next
.git
.env*.local
npm-debug.log*
README.md
.dockerignore
Dockerfile
"""
    return file

def _ensure_dockerignore(self):
    # Create the .dockerignore only if it doesn't exist yet
    if self.project_path is None:
        return
    dockerignore_path = self.project_path / ".dockerignore"
    if not dockerignore_path.exists():
        dockerignore_content = self.gen_dockerignore()
        dockerignore_path.write_text(dockerignore_content)
```
I'm not going to include the code for `_get_dockerfile_content(...)` and `_create_nginx_conf(...)` here because it's long, but these methods generate the Dockerfiles and nginx configuration for each framework. You can see the complete code in the repository.
The `build(...)` method
This is where the build magic happens. We receive the `app_name`, `framework`, and `env_vars`, validate that the instance was created for builds, create the `base_path` used for internal routing, and generate the `Dockerfile`.
Then we use `client.images.build()` to build the Docker image. The important parameters are:
- `path`: where the code lives
- `dockerfile`: which Dockerfile to use
- `tag`: the image name (`{app_name}:latest`)
- `buildargs`: environment variables for the build
- `rm`: removes intermediate containers
- `pull`: pulls the latest version of the base image
While building, we display the logs in real-time to see what's happening.
```python
def build(self, app_name: str, framework: str, env_vars: dict | None = None):
    if self._invoke_only or self.project_path is None:
        raise RuntimeError("This instance does not have a 'project_path'. "
                           "Use the constructor with project_path to 'build'.")
    env_vars = env_vars or {}
    base_path = f"/app/{app_name}"
    env_vars['BASE_PATH'] = base_path

    self.create_dockerfile(framework)
    dockerfile = self._get_dockerfile(framework)

    image, logs = self.client.images.build(
        path=str(self.project_path),
        dockerfile=dockerfile,
        tag=f"{app_name}:latest",
        buildargs=env_vars,
        rm=True,
        pull=True
    )

    # Stream the build logs in real time
    try:
        for log in logs:
            if isinstance(log, dict) and 'stream' in log:
                print(log['stream'].strip())
            else:
                print(str(log))
    except Exception:
        pass

    return image
```
The `invoke_function(...)` method
This method runs containers from already-built applications. It receives the `app_name`, `framework`, `port`, and optionally `env_vars`.
First we get the runtime-specific environment variables and configure the port. Depending on the framework, the internal port changes: Next.js uses 3000, while Vite and React use nginx's port 80.
Then we run the container with limited resources (128 MB RAM and 0.5 CPU) to simulate a real serverless environment. The container runs in `detach` mode (in the background), and we add labels to identify it easily.
Finally, we wait 5 seconds for it to start, reload the container state, and print the logs to verify that everything is working.
```python
def invoke_function(self, app_name: str, framework: str, port: int,
                    env_vars: Optional[Dict] = None):
    env = self._get_runtime_env(framework, env_vars or {})
    env['PORT'] = str(port)
    env['HOSTNAME'] = '0.0.0.0'

    # We don't pass BASE_PATH at runtime - only at build time.
    # The container serves from "/" internally.
    if framework in ['vite', 'react']:
        internal_port = 80
    else:
        internal_port = port

    container = self.client.containers.run(
        image=f"{app_name}:latest",
        detach=True,
        remove=False,
        ports={f'{internal_port}/tcp': port},
        # Limited resources to simulate a serverless environment
        mem_limit="128m",
        nano_cpus=500000000,  # 0.5 CPU
        environment=env,
        labels={
            "type": "serverless",
            "invocation": str(time.time())
        }
    )

    time.sleep(5)
    container.reload()
    logs = container.logs(tail=50).decode()

    print(f"\n{'=' * 50}")
    print(f"State: {container.status}")
    print(f"Port: {port}")
    print(f"Logs:\n{logs}")
    print(f"{'=' * 50}\n")

    return container
```
With these main methods (`build` and `invoke_function`) and the helpers we saw earlier, we now have a complete class to build and run frontend applications in Docker, simulating the basic behavior of serverless hosting.
The utils module
This module contains helper functions used throughout the application to manage containers, HTTP requests and configuration.
`cleanup_idle_containers(...)`: A background task that checks every 5 seconds for idle containers. If a container hasn't received requests for more than 15 seconds (defined in `CONTAINER_IDLE_TIMEOUT`), it stops and removes it automatically. This simulates the behavior of serverless platforms, where containers shut down when not in use to save resources.
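The full version lives in the repository; as a rough sketch (assuming `running_containers` maps app names to dicts with `container` and `last_access` keys, which is how the backend below stores them), the loop could look like this:

```python
import asyncio
import time

CONTAINER_IDLE_TIMEOUT = 15  # seconds without traffic before shutdown

async def cleanup_idle_containers(running_containers: dict):
    # A minimal sketch, not the exact code from the repo
    while True:
        await asyncio.sleep(5)
        now = time.time()
        for app_name in list(running_containers.keys()):
            info = running_containers[app_name]
            if now - info['last_access'] > CONTAINER_IDLE_TIMEOUT:
                try:
                    # Docker calls block, so run them off the event loop
                    await asyncio.to_thread(info['container'].stop, timeout=3)
                    await asyncio.to_thread(info['container'].remove)
                except Exception:
                    pass
                running_containers.pop(app_name, None)
```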
`get_app_url(...)`: Builds the complete URL of an application from the app name and the HTTP request. It basically takes the server's base URL and appends `/app/{app_name}` to produce the route where the application will be available.
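A minimal sketch, matching the fallback used in the `/apps` endpoint below:

```python
from fastapi import Request

def get_app_url(app_name: str, request: Request) -> str:
    # Derive the public URL from the incoming request's base URL
    base = str(request.base_url).rstrip('/')
    return f"{base}/app/{app_name}/"
```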
`wait_for_service(...)`: Waits for a service to become available before continuing. It makes periodic HTTP requests to a URL until it responds correctly or the timeout expires. This is useful after starting a container, to make sure the application is ready before sending traffic to it.
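Sketched out (the signature matches how the proxy below calls it; the exact implementation is in the repo):

```python
import asyncio
import time

import httpx

async def wait_for_service(url: str, timeout: float = 15.0,
                           interval: float = 0.2) -> bool:
    # Poll the URL until it answers, or give up after `timeout` seconds
    deadline = time.monotonic() + timeout
    async with httpx.AsyncClient() as client:
        while time.monotonic() < deadline:
            try:
                resp = await client.get(url, timeout=2.0)
                if resp.status_code < 500:
                    return True
            except httpx.HTTPError:
                pass
            await asyncio.sleep(interval)
    return False
```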
`filter_request_headers(...)`: Filters out HTTP headers that shouldn't propagate between client and container. It removes "hop-by-hop" headers like `connection`, `host`, `transfer-encoding`, etc., which are connection-specific and shouldn't be passed along to the container.
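Something along these lines (the exact header set is an assumption; see the repo for the real list):

```python
# Connection-specific headers that must not be forwarded; host and
# content-length are dropped too, since httpx recomputes them.
HOP_BY_HOP = {
    'connection', 'keep-alive', 'proxy-authenticate', 'proxy-authorization',
    'te', 'trailers', 'transfer-encoding', 'upgrade', 'host', 'content-length',
}

def filter_request_headers(headers: dict) -> dict:
    return {k: v for k, v in headers.items() if k.lower() not in HOP_BY_HOP}
```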
`get_next_available_port(...)`: Automatically assigns an available port to each new container. It starts at port 3500 and increments to avoid conflicts; each time a function is invoked, it gets the next available port.
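A minimal sketch of that counter (probing the OS for already-bound ports is an extra safety net; the real helper may or may not do it):

```python
import socket

_next_port = 3500  # starting port, as described above

def get_next_available_port() -> int:
    global _next_port
    while True:
        port = _next_port
        _next_port += 1
        # Skip ports something is already listening on
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            if s.connect_ex(('localhost', port)) != 0:
                return port
```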
The main Backend
This is where everything comes to life. This module uses FastAPI to create the API that handles builds and deployments and acts as a reverse proxy for the applications. What's interesting is how we handle concurrency to optimize resources.
Initial configuration
We define three global dictionaries to maintain state:
```python
from fastapi import FastAPI, Request, HTTPException, Response
from fastapi.middleware.cors import CORSMiddleware
from contextlib import asynccontextmanager
from buildlambda import BuildandRunLambda
from pydantic import BaseModel, Field
from typing import Dict
import asyncio
import httpx
import time

from utils import (cleanup_idle_containers, get_app_url, wait_for_service,
                   filter_request_headers, get_next_available_port)

deployed_apps: Dict[str, dict] = {}       # Apps we've already built
app_locks: Dict[str, asyncio.Lock] = {}   # Locks to avoid race conditions
running_containers: Dict[str, dict] = {}  # Active containers with their timestamp
```
Lifecycle
When the application starts, we create a global instance of `BuildandRunLambda` and launch the automatic container-cleanup task. When the app shuts down, we make sure all containers are stopped cleanly.
```python
@asynccontextmanager
async def lifespan(app: FastAPI):
    app.state.run = BuildandRunLambda()
    cleanup_task = asyncio.create_task(cleanup_idle_containers(running_containers))
    yield
    cleanup_task.cancel()
    for app_name, info in running_containers.items():
        try:
            if info.get('container'):
                info['container'].stop()
                info['container'].remove()
        except Exception:
            pass

app = FastAPI(lifespan=lifespan)
```
Building
This endpoint receives the project, does the build and saves the app info. If you don't specify a port, we assign one automatically. We also create a lock for each app - this is key for the concurrency handling we'll see later.
```python
class JSONBuild(BaseModel):
    project_path: str
    app_name: str
    framework: str
    env_vars: dict
    port: int | None = Field(default=None)


@app.post("/build/lambda")
async def build_lambda(q: JSONBuild):
    build = BuildandRunLambda(q.project_path)
    try:
        build.build(app_name=q.app_name, framework=q.framework, env_vars=q.env_vars)
        if q.port is None:
            port = get_next_available_port()
        else:
            port = q.port
        deployed_apps[q.app_name] = {
            "framework": q.framework,
            "port": port,
            "env_vars": q.env_vars
        }
        app_locks.setdefault(q.app_name, asyncio.Lock())
        return {"success": True}
    except Exception as e:
        print(f"Error: {e}")
        return {
            "success": False,
            "error": str(e),
        }
```
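For example, with the backend running locally on port 5500 (the default at the bottom of the module), a deploy request could look like this (the paths and names here are hypothetical):

```python
import httpx

resp = httpx.post("http://127.0.0.1:5500/build/lambda", json={
    "project_path": "/home/user/my-next-app",  # hypothetical local project
    "app_name": "my-next-app",
    "framework": "nextjs",
    "env_vars": {"NEXT_PUBLIC_API_URL": "https://api.example.com"},
}, timeout=None)  # builds can take a while, so disable the client timeout
print(resp.json())  # {"success": True} if the build worked
```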
Listing apps
Simple - returns all built apps with their metadata and status (running or stopped).
@app.get("/apps") async def get_apps(request: Request): results = [] for name, info in deployed_apps.items(): port = info.get("port") framework = info.get("framework") env_vars = info.get("env_vars", {}) is_running = name in running_containers try: url = get_app_url(name, request) except Exception: url = f"{str(request.base_url).rstrip('/')}/app/{name}" results.append({ "app_name": name, "url": url, "port": port, "framework": framework, "env_vars": env_vars, "status": "running" if is_running else "stopped" }) return {"apps": results}
Basic redirect
To avoid issues with relative paths, we redirect `/app/{app_name}` to `/app/{app_name}/`.
@app.get("/app/{app_name}") async def redirect_app_root(app_name: str): return Response( status_code=307, headers={"Location": f"/app/{app_name}/"} )
The reverse proxy - where the serverless magic happens
This is the heart of the system. Here we handle concurrent invocations intelligently: if multiple requests arrive simultaneously for the same app, we start only one container, and every request is routed to it. This mirrors how real serverless platforms behave.
First we validate that the app exists and build the target URL according to the framework. For Vite/React we strip the `/app/{app_name}` prefix because nginx serves from the root, while Next.js handles the `basePath` internally.
Now comes the good part: we check if there's already a running container. If it doesn't exist, we use the app's lock (remember we created it during build) to ensure only one thread starts the container. The "double-check locking" pattern is key here - we check twice if the container exists to avoid race conditions.
Imagine this scenario: 10 simultaneous requests arrive for an app that isn't running. Without the lock, all 10 would try to start separate containers. With the lock, the first request acquires the lock, verifies there's no container, starts it and registers it. The other 9 requests wait at the lock, but when they finally acquire it, the second check detects that a container already exists and they simply reuse it. Boom - resource optimization.
Once the container is starting (or was already running), we wait up to 15 seconds for it to respond to the health check. If it doesn't respond, we stop it and return an error. If everything works, we update the last access timestamp (so the cleanup task knows it's active) and proxy the request.
```python
@app.api_route("/app/{app_name}/{path:path}",
               methods=["GET", "POST", "PUT", "DELETE", "PATCH"])
async def proxy_to_app(app_name: str, path: str, request: Request):
    if app_name not in deployed_apps:
        raise HTTPException(404, f"App '{app_name}' not found")

    app_info = deployed_apps[app_name]
    target_port = app_info['port']
    framework = app_info['framework']

    # Build the URL according to the framework
    if framework in ["vite", "react"]:
        target_url = f"http://localhost:{target_port}/{path}"
    else:
        target_url = f"http://localhost:{target_port}/app/{app_name}/{path}"
    if request.url.query:
        target_url += f"?{request.url.query}"

    lock = app_locks.setdefault(app_name, asyncio.Lock())

    # First check - is there a container?
    container_info = running_containers.get(app_name)
    if container_info is None:
        async with lock:
            # Second check for safety (double-check locking).
            # This prevents multiple concurrent requests from
            # starting the same container.
            container_info = running_containers.get(app_name)
            if container_info is None:
                try:
                    print(f"Starting container for '{app_name}'...")
                    container = await asyncio.to_thread(
                        app.state.run.invoke_function,
                        app_name,
                        app_info['framework'],
                        target_port,
                        app_info.get('env_vars', {})
                    )
                    running_containers[app_name] = {
                        'container': container,
                        'last_access': time.time()
                    }
                    container_info = running_containers[app_name]

                    # Health check
                    if framework in ["vite", "react"]:
                        health_check_url = f"http://localhost:{target_port}/"
                    else:
                        health_check_url = f"http://localhost:{target_port}/app/{app_name}/"

                    ready = await wait_for_service(health_check_url, timeout=15.0, interval=0.2)
                    if not ready:
                        try:
                            info = await asyncio.to_thread(
                                app.state.run.stop_and_collect, container, 3, True
                            )
                            running_containers.pop(app_name, None)
                        except Exception:
                            info = {"error": "The container could not be stopped after a timeout"}
                        raise HTTPException(
                            status_code=503,
                            detail=f"The service on '{app_name}' did not respond "
                                   f"in a timely manner. Info: {info}"
                        )
                except HTTPException:
                    # Let the 503 above propagate instead of wrapping it in a 500
                    raise
                except Exception as e:
                    running_containers.pop(app_name, None)
                    raise HTTPException(status_code=500, detail=f"Error starting container: {e}")

    # We already have a container (just created or already existing).
    # Update the timestamp so it doesn't shut down.
    container_info['last_access'] = time.time()

    # Proxy the request
    try:
        body = await request.body()
        headers = filter_request_headers(dict(request.headers))
        async with httpx.AsyncClient() as client:
            resp = await client.request(
                method=request.method,
                url=target_url,
                headers=headers,
                content=body,
                timeout=30.0
            )
        response_headers = {
            k: v for k, v in resp.headers.items()
            if k.lower() not in ("content-encoding", "transfer-encoding", "connection")
        }
        return Response(
            content=resp.content,
            status_code=resp.status_code,
            headers=response_headers,
            media_type=resp.headers.get("content-type")
        )
    except httpx.ConnectError:
        running_containers.pop(app_name, None)
        raise HTTPException(503, f"Could not connect to '{app_name}'")
    except httpx.TimeoutException:
        raise HTTPException(504, f"Timeout connecting with '{app_name}'")
    except Exception as e:
        raise HTTPException(500, f"Proxy error: {str(e)}")
```
CORS and static files
We enable CORS for development and handle a curious case: when frameworks request static files without specifying the app, we use the `referer` header to guess which app they belong to and redirect accordingly.
```python
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)


@app.get("/{filename:path}")
async def catch_static_files(filename: str, request: Request):
    static_extensions = (
        '.svg', '.png', '.jpg', '.jpeg', '.gif', '.ico',
        '.webp', '.woff', '.woff2', '.ttf', '.eot'
    )
    if not filename.lower().endswith(static_extensions):
        raise HTTPException(404, "Not found")
    if not deployed_apps:
        raise HTTPException(404, f"File '{filename}' not found")

    # Guess the owning app from the referer header
    referer = request.headers.get('referer', '')
    target_app = None
    for app_name in deployed_apps.keys():
        if f"/app/{app_name}" in referer:
            target_app = app_name
            break
    if not target_app:
        target_app = list(deployed_apps.keys())[0]

    redirect_url = f"/app/{target_app}/{filename}"
    return Response(
        status_code=307,
        headers={"Location": redirect_url}
    )


if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=5500)
```
And that's it. The complete system works like this: you make a request to `/app/{app_name}/`, and the backend checks whether there's an active container. If there isn't, it starts one (and only one, no matter how many requests arrive simultaneously). Once the container is ready, it proxies your request to it. If the container receives no traffic for 15 seconds, it shuts down automatically; when the next request arrives, it starts up again. Exactly like AWS Lambda or Vercel.
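You can see this cold-start behavior with two back-to-back requests (the app name here is hypothetical; it assumes the app was already built via `/build/lambda`):

```python
import time

import httpx

url = "http://127.0.0.1:5500/app/my-next-app/"  # hypothetical app name

t0 = time.perf_counter()
r1 = httpx.get(url, timeout=30.0)  # cold start: the container boots first
t1 = time.perf_counter()
r2 = httpx.get(url, timeout=30.0)  # warm: the container is already running
t2 = time.perf_counter()

print(r1.status_code, f"cold: {t1 - t0:.2f}s")
print(r2.status_code, f"warm: {t2 - t1:.2f}s")
```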
Running lambda-gateway
To interact with this FastAPI backend I built a basic frontend with Next.js and React. Before running lambda-gateway, make sure to create a `.env.local` file in the frontend's root folder containing this:
```
NEXT_PUBLIC_BACKEND_URL="http://127.0.0.1:5500"
```
Or whatever backend URL you've configured. Also make sure you've run `npm install` beforehand to avoid errors when running it.
Once everything is configured, run lambda-gateway and you'll see an interface like this:
*Testing lambda-gateway with Next.js and Vite+React*
Conclusion
And that's how I built lambda-gateway. It started as an experiment to understand how serverless platforms work internally, and ended up being a functional implementation that simulates the basic concepts: event-driven activation, concurrency, auto-scaling (well, more like auto-shutdown) and application hosting.
The complete code is available on GitHub under Apache 2.0 license, so you can clone it, modify it, break it and improve it as you like. If you find bugs or have ideas to improve it, pull requests are welcome.
Obviously this isn't a replacement for Vercel or AWS Lambda - there are a thousand things I didn't implement (data persistence, monitoring, robust logging, secrets management, CDN, etc.). But as a learning exercise, it helped me understand much better how these platforms work internally. I hope it helps you too.
If you have questions or comments, you can find me on my portfolio.