Functions as a service (FaaS): code that runs on demand, triggered by an event (e.g. an HTTP request or a click).

  • scales automatically with demand
  • no paying for idle time
  • no server maintenance

But:

  • execution timeouts (long-running jobs can get killed)
  • cold-start latency whenever a new worker has to spin up

Main options:

  • AWS Lambda (annoying to set up, hardware limitations, expensive, hard to get big machines/GPUs; minimal handler sketch below)
  • Coiled
  • Modal
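
For reference, a Lambda function is just a handler invoked with the triggering event; a minimal sketch (handler only, deployment wiring via the console/SAM omitted):

import json

def handler(event, context):
	# event carries the trigger payload (e.g. an API Gateway request)
	# context carries runtime metadata (request ID, remaining time, ...)
	return {
		"statusCode": 200,
		"body": json.dumps({"received": event}),
	}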

Coiled

pip install coiled "dask[complete]"
coiled login
  • run a script via the CLI
coiled run --region europe-west1 --gpu python script.py
  • run a single function in Python
import coiled

# fit() stands in for your own training routine
@coiled.function(
	# see https://cloud.coiled.io/settings/setup/infrastructure
	workspace="devs",
	region="europe-west1",
	vm_type="g2-standard-4",
	# or ask for a GPU instead of naming a VM type:
	# gpu=True, # interpreted as 1 GPU
	# worker_gpu_type="nvidia-tesla-t4", # or a specific GPU type
	idle_timeout="5 minutes", # shut the VM down after 5 minutes of idleness
)
def train_expensive_model(**kwargs):
	return fit(**kwargs)

train_expensive_model() # executes on a fresh cloud VM, not locally
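
Decorated functions also support asynchronous fan-out; a sketch assuming Coiled Functions' .submit interface (lr is an illustrative hyperparameter forwarded to fit):

# submit returns a future immediately; .result() blocks until done
# there is also .map(iterable) for fanning out over many inputs
future = train_expensive_model.submit(lr=1e-3)
print(future.result())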
Modal

pip install modal
modal setup
  • specify images in Python (rather than a Dockerfile) through method chaining
    • can also install packages with mamba or uv
    • can pull base images from a registry or use .from_dockerfile("Dockerfile") (sketch below)
image = (
	modal.Image.debian_slim(python_version="3.10")
	.apt_install("git")
	.pip_install("torch==2.2.1")
	.run_commands("git clone https://github.com/modal-labs/agi && echo 'ready to go!'")
)
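
Base images can also come straight from a registry; a sketch using modal.Image.from_registry (the CUDA tag is illustrative):

cuda_image = (
	# add_python layers a Python runtime onto a non-Python base image
	modal.Image.from_registry("nvidia/cuda:12.1.1-devel-ubuntu22.04", add_python="3.10")
	.pip_install("torch==2.2.1")
)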
  • run a single function in Python
import modal
 
app = modal.App("app-name")
 
@app.function(
	image=image,
	# request a specific GPU; could even do modal.gpu.A100(size="80GB", count=8)
	gpu="A100",
	cpu=8.0, # ask for 8 CPU cores
	memory=(1024, 32768), # in MB, ask for at least 1GB RAM (32GB limit)
)
def train_expensive_model(**kwargs):
	return fit(**kwargs)
 
@app.local_entrypoint()
def main():
	train_expensive_model.local()  # run the function body locally
	train_expensive_model.remote() # run the function remotely on Modal
	# .map(iterable) fans out one container per input, e.g.:
	# results = list(train_expensive_model.map(param_list))
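
Run the entrypoint with the Modal CLI (script.py is whatever file holds the app):

modal run script.py    # executes main(); remote calls run on Modal
modal deploy script.py # deploy the app to run persistently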