Functions as a service (FaaS): code that runs on demand, triggered by an event (e.g. an HTTP request or a click).

  • scales automatically with demand
  • no paying for idle time
  • no server maintenance

But:

  • execution timeouts (long-running jobs can get killed)
  • cold-start latency whenever a new worker has to spin up

Main options:

  • AWS Lambda (annoying to set up, hardware limitations, expensive, hard to get big machines/GPUs; minimal handler sketch below)
  • Coiled
  • Modal
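
For reference, a Lambda function is just a handler invoked with the triggering event; a minimal sketch (handler only, deployment wiring via the console/SAM omitted):

import json

def handler(event, context):
	# event carries the trigger payload (e.g. an API Gateway request)
	# context carries runtime metadata (request ID, remaining time, ...)
	return {
		"statusCode": 200,
		"body": json.dumps({"received": event}),
	}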

Coiled

pip install coiled "dask[complete]"
coiled login
  • run a script via the CLI
coiled run --region europe-west1 --gpu python script.py
  • run a single function in Python
import coiled

# fit() stands in for your own training routine
@coiled.function(
	# see https://cloud.coiled.io/settings/setup/infrastructure
	workspace="devs",
	region="europe-west1",
	vm_type="g2-standard-4",
	# or ask for a GPU instead of naming a VM type:
	# gpu=True, # interpreted as 1 GPU
	# worker_gpu_type="nvidia-tesla-t4", # or a specific GPU type
	idle_timeout="5 minutes", # shut the VM down after 5 minutes of idleness
)
def train_expensive_model(**kwargs):
	return fit(**kwargs)

train_expensive_model() # executes on a fresh cloud VM, not locally
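
Decorated functions also support asynchronous fan-out; a sketch assuming Coiled Functions' .submit interface (lr is an illustrative hyperparameter forwarded to fit):

# submit returns a future immediately; .result() blocks until done
# there is also .map(iterable) for fanning out over many inputs
future = train_expensive_model.submit(lr=1e-3)
print(future.result())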
Modal

pip install modal
modal setup
  • specify images in Python (rather than a Dockerfile) through method chaining
    • can also install packages with mamba or uv
    • can pull base images from a registry or use .from_dockerfile("Dockerfile") (sketch below)
image = (
	modal.Image.debian_slim(python_version="3.10")
	.apt_install("git")
	.pip_install("torch==2.2.1")
	.run_commands("git clone https://github.com/modal-labs/agi && echo 'ready to go!'")
)
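
Base images can also come straight from a registry; a sketch using modal.Image.from_registry (the CUDA tag is illustrative):

cuda_image = (
	# add_python layers a Python runtime onto a non-Python base image
	modal.Image.from_registry("nvidia/cuda:12.1.1-devel-ubuntu22.04", add_python="3.10")
	.pip_install("torch==2.2.1")
)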
  • run a single function in Python
import modal
 
app = modal.App("app-name")
 
@app.function(
	image=image,
	# request a specific GPU; could even do modal.gpu.A100(size="80GB", count=8)
	gpu="A100",
	cpu=8.0, # ask for 8 CPU cores
	memory=(1024, 32768), # in MB, ask for at least 1GB RAM (32GB limit)
)
def train_expensive_model(**kwargs):
	return fit(**kwargs)
 
@app.local_entrypoint()
def main():
	train_expensive_model.local()  # run the function body locally
	train_expensive_model.remote() # run the function remotely on Modal
	# .map(iterable) fans out one container per input, e.g.:
	# results = list(train_expensive_model.map(param_list))
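
Run the entrypoint with the Modal CLI (script.py is whatever file holds the app):

modal run script.py    # executes main(); remote calls run on Modal
modal deploy script.py # deploy the app to run persistently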