Multi Cloud Serverless Cold starts

Measuring cold starts on AWS, GCP and Azure with Prometheus

Published: Sunday, Nov 28, 2021 Last modified: Monday, Dec 9, 2024

Continuing from my Serverless DX survey, there is now a new arm64 branch.

Since AWS claims arm64 (Graviton) is cheaper and faster than amd64, I wondered how to test this objectively. My previous survey focused on developer experience (DX), which is not objectively measurable!

I thought it might be fun to test managed runtime cold starts using my Prometheus setup at home.

Measuring cold starts

blackbox.yml is the blackbox exporter configuration:

  http_cold:
    prober: http
    timeout: 5s
    http:
      preferred_ip_protocol: ip4
      fail_if_body_not_matches_regexp:
        - "<title>Count: 1</title>"
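
To make the check concrete, here is a rough Python sketch of what that module does per probe: fetch the page with a 5 second timeout, then succeed only if the body contains the "Count: 1" title. It is not how the blackbox exporter is implemented, and the names `body_is_cold` and `probe` are made up for illustration.

```python
import re
import time
import urllib.request

# Same pattern as fail_if_body_not_matches_regexp in the http_cold module.
COLD_PATTERN = re.compile(r"<title>Count: 1</title>")


def body_is_cold(body: str) -> bool:
    """Return True when the page reports a count of exactly one."""
    # Unanchored search: the title only has to appear somewhere in the body.
    return COLD_PATTERN.search(body) is not None


def probe(url: str, timeout: float = 5.0):
    """Fetch url and return (success, duration_seconds).

    Hypothetical helper, loosely mirroring probe_success and
    probe_duration_seconds; not the exporter's real logic.
    """
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
        success = body_is_cold(body)
    except OSError:
        # Connection errors, timeouts and non-2xx responses count as failures.
        success = False
    return success, time.monotonic() - start
```

Calling `probe("https://count.dabase.com/")` would return a success flag and the elapsed seconds, which is roughly the pair of values the exporter reports for each target.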

The HTTP function I have running across the major clouds counts up each time it is invoked. How do I know it’s a cold start?

The count is one when it’s a cold start!
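
Why this works: the counter lives in the memory of the running instance, so a freshly started instance always begins at zero and serves "Count: 1" first. A minimal sketch of such a function, assuming a plain Python HTTP server rather than any particular cloud runtime (`render_page`, `CountHandler` and port 8080 are hypothetical, and this is not the code deployed at the endpoints in this post):

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Module-level state survives only as long as this process (the warm instance).
count = 0


def render_page() -> str:
    """Increment the per-instance counter and return the HTML page."""
    global count
    count += 1
    return (
        f"<html><head><title>Count: {count}</title></head>"
        f"<body>{count}</body></html>"
    )


class CountHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = render_page().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    # Port 8080 is an arbitrary choice for local testing.
    HTTPServer(("", 8080), CountHandler).serve_forever()
```

The first request after the process starts gets a title of "Count: 1"; every later request served by the same instance gets a higher number, so the probe's regexp no longer matches.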

My home Prometheus monitor is configured with my http function endpoints:

  - job_name: sls
    params:
      module: [http_cold]
    static_configs:
      - targets:
          - https://sam.dabase.com/
          - https://sarm.dabase.com/
        labels:
          cloud: aws
      - targets:
          - https://asia-east2-idiotbox.cloudfunctions.net/Countpage
          - https://count.dabase.com/
        labels:
          cloud: gcp
      - targets:
          - https://counttesting.azurewebsites.net/
        labels:
          cloud: azure

The query to plot the cold starts is:

probe_duration_seconds{job="sls"} and probe_success{job="sls"} == 1

Unsuccessful probes are typically warm invocations, where the count is already above one, so we ignore those.

Preliminary results

Azure is unsurprisingly slow since it’s hosted in Hong Kong, while my blackbox exporter (BBE) runs at home in Singapore.

Is arm64 faster than x86_64? … it would appear so!

Yellow is arm64, green is x86_64.

But … the fastest, surprisingly, is Google Cloud Platform Cloud Run!

fastest cold start

🤯